Selectome, Looking for Darwinian Evolution in the Tree of Life
Darwinian selection is the force that drives evolutionary diversification and functional change in biology. Recent whole-genome studies of a few closely related species have found abundant recent Darwinian selection in the fruit fly but little in humans. In a study of 884 genes in 11 species with much larger divergence times, we found abundant Darwinian selection in vertebrates, including human. Such results have been shown to be of biomedical relevance: for example, genes that have evolved under Darwinian selection are often involved in complex diseases. Yet, because of computational limitations, the global incidence of Darwinian selection in genomes remains uncharacterized.
We recently developed Selectome, a database of Darwinian selection. Populating and updating this database poses a major computational challenge. High-quality detection of Darwinian selection requires two likelihood computations per edge of the gene tree. A tree of m sequences has m - 3 internal edges, and the number of available sequences is increasing exponentially. The available software for these computations, Codeml from the PAML package, has been optimized neither for large computations nor for next-generation high-performance computing systems.
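To make the scale of the problem concrete, the counts in the paragraph above can be sketched in a few lines of Python. This is only an illustration, not part of Selectome or Codeml: the function names are hypothetical, the edge count assumes an unrooted, fully resolved (binary) gene tree, and the last function assumes that the two likelihood computations per edge are the null and alternative models of a likelihood ratio test, which the text does not spell out.

```python
def internal_edges(m: int) -> int:
    """Internal edges of an unrooted, fully resolved tree with m sequences.

    Assumption: the gene tree is binary and unrooted, so it has m - 3
    internal edges (and 2m - 3 edges in total).
    """
    if m < 3:
        raise ValueError("an unrooted binary tree needs at least 3 sequences")
    return m - 3


def likelihood_runs(m: int, runs_per_edge: int = 2) -> int:
    """Likelihood computations needed to test every internal edge once."""
    return runs_per_edge * internal_edges(m)


def likelihood_ratio_statistic(lnl_null: float, lnl_alt: float) -> float:
    """Test statistic 2 * (lnL_alt - lnL_null).

    Assumption: the two runs per edge are a null and an alternative model
    whose log-likelihoods are compared in a likelihood ratio test.
    """
    return 2.0 * (lnl_alt - lnl_null)


if __name__ == "__main__":
    # Cost grows linearly with tree size, while sequence counts grow exponentially.
    for m in (10, 100, 1000):
        print(m, "sequences:", likelihood_runs(m), "likelihood computations")
```

For a single gene tree of 1000 sequences this already gives 1994 likelihood computations, and the database contains many such trees.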
Consequently, an important aim is to produce new software, FastCodeml, optimized for this specific test of Darwinian selection on supercomputers, including efficient computation on large and very large (>1000 genes) trees.
- Prof. Marc Robinson-Rechavi, University of Lausanne
- Dr. Nicolas Salamin, University of Lausanne
- Dr. Heinz Stockinger, Swiss Institute of Bioinformatics
- Sébastien Moretti, Swiss Institute of Bioinformatics
- Dr. Hannes Schabauer, University of Lausanne