Using multiple interdependency to separate functional from phylogenetic correlations in protein alignments.
Elisabeth R. M. Tillier and Thomas W. H. Lui
Bioinformatics, 2003 19: 750-755
Abstract
Motivation: Multiple sequence alignments of homologous proteins are
useful for inferring their phylogenetic history and for revealing functionally
important regions in the proteins. Functional constraints may lead to co-variation
of two or more amino acids in the sequence, such that a substitution at
one site is accompanied by compensatory substitutions at another site.
It is not sufficient to find the statistical correlations between sites
in the alignment because these may be the result of several undetermined
causes. In particular, phylogenetic clustering will lead to many strong
correlations.
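To make the notion of a statistical correlation between two alignment sites concrete, the sketch below computes the mutual information between pairs of columns in a toy alignment. Mutual information is used here only as one common measure of co-variation; the abstract does not name the measure used, and the function names and toy sequences are hypothetical illustrations, not the authors' program.

```python
from collections import Counter
from math import log2


def column(alignment, i):
    """Return the residues found at site i of every aligned sequence."""
    return [seq[i] for seq in alignment]


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns.

    Probabilities are estimated from observed frequencies, so the value
    reflects only the sequences in this particular alignment.
    """
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        p_ab = n_ab / n
        p_a = count_a[a] / n
        p_b = count_b[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


# Hypothetical toy alignment: four sequences, three sites.
toy_alignment = [
    "AKL",
    "AKV",
    "DEL",
    "DEV",
]

# Sites 0 and 1 always change together; sites 0 and 2 vary independently.
mi_linked = mutual_information(column(toy_alignment, 0), column(toy_alignment, 1))
mi_unlinked = mutual_information(column(toy_alignment, 0), column(toy_alignment, 2))
print(mi_linked, mi_unlinked)
```

In this toy case the high score for sites 0 and 1 cannot by itself be attributed to a functional constraint: if the first two and last two sequences came from two separate clades, shared ancestry alone would produce the same pattern.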
Results: A procedure is developed to detect statistical correlations stemming from functional interaction by removing the strong phylogenetic signal that leads to the correlations of each site with many others in the sequence. Our method relies upon the accuracy of the alignment, but it does not require any assumptions about the phylogeny or the substitution process. The effectiveness of the method was verified using computer simulations, and the method was then applied to predict functional interactions between amino acids in the Pfam database of alignments.
Availability: The program and supplementary figures and tables are available
from the site
http://www.uhnresearch.ca/labs/tillier/software.htm#2.
Contact: e.tillier@utoronto.ca
Supplementary figures:
supplementary.ps