Protein superfamily evolution: algorithms and applications

Speaker : Kimmen Sjölander, Professor, Department of Bioengineering, University of California, Berkeley

Time : 02:00 pm

Starting Date : 22/10/2015

Location : Retrovirus room – LWOFF (22), Institut Pasteur, Paris

Protein superfamilies evolve through diverse mechanisms, including insertions, deletions, point mutations, gene duplications and domain architecture rearrangements. Each of these events can modify the function or structure of the encoded protein. It is not surprising, therefore, that bioinformatics algorithms for virtually every conceivable task depend on accurately reconstructed phylogenetic trees.

In this talk, I will present the SATCHMO (Simultaneous Alignment and Tree Construction using Hidden Markov mOdels) algorithm. SATCHMO is designed to handle extreme levels of sequence and structural divergence. It uses hidden Markov models (HMMs) and other statistical modeling techniques to model the conserved structure within nested subgroups, and HMM-HMM scoring and alignment to construct a hierarchical tree and output a multiple sequence alignment. On benchmark datasets of structurally aligned proteins, SATCHMO outperforms MUSCLE, MAFFT and ClustalW.

In the second half of this talk, I will present phylogenomic methods for ortholog identification. Ortholog identification is fundamental to phylogenetic tree estimation as well as to automatic function prediction. Most standard orthology prediction methods employ computationally efficient graph-based approaches, but their accuracy is generally lower than that of phylogenomic approaches. Benchmark experiments using the TreeFam dataset show the superior performance of our methods, particularly in cases where proteins contain “promiscuous” domains.

