The function of a protein constrains its evolutionary path through sequence space. Collections of sequence homologs record the outcomes of millions of evolutionary experiments in which the protein changed within these constraints. Deciphering the evolutionary record held in these sequences, and using it for prediction and engineering, remains a formidable challenge. The potential value of solving it has grown with the arrival of low-cost, high-throughput genome sequencing (Knop D et al., 2015). The central difficulty is separating genuine co-evolution couplings from the noisy set of observed correlations. We address it by inferring residue pair couplings with a maximum entropy model of the protein sequence, constrained by the statistics of the multiple sequence alignment. Unexpectedly, we find that the strength of these inferred couplings is a very good predictor of residue proximity in folded structures. Indeed, the highest-scoring residue couplings are accurate enough, and evenly enough distributed, to define the 3D protein structure (Ravi B et al., 2013).
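The text does not state the model's equations, so the following is only a sketch of the pairwise form that maximum entropy sequence models commonly take. All symbols are assumptions introduced here for illustration: $A_i$ is the residue at alignment column $i$, $L$ is the alignment length, $e_{ij}$ are the pair couplings, $h_i$ are single-site fields, $Z$ is the normalisation constant, and $f_i$, $f_{ij}$ are the single-column and column-pair frequencies observed in the multiple sequence alignment.

```latex
% Pairwise maximum entropy distribution over sequences (assumed form)
P(A_1, \ldots, A_L) = \frac{1}{Z}
  \exp\!\Big( \sum_{1 \le i < j \le L} e_{ij}(A_i, A_j) + \sum_{i=1}^{L} h_i(A_i) \Big)

% Constraints: model marginals must reproduce the alignment statistics
P_i(A_i) = f_i(A_i), \qquad P_{ij}(A_i, A_j) = f_{ij}(A_i, A_j)
```

Under this form, a pair of columns can be strongly correlated in the alignment and still receive a weak coupling $e_{ij}$ if the correlation is explained by a chain of other couplings, which is how such a model would separate direct from indirect co-variation.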
Sequence variation data across the human proteome can be used to gain functional insight into 3D protein structures. We examined 3D positional conservation in 4,715 proteins and 3,951 homology models using genetic variation data from over 140,000 people, comprising 860,292 missense and 465,886 synonymous variants. Sixty percent of protein structures contain at least one intolerant 3D site, defined by a significant depletion of observed relative to expected missense variation. Structural intolerance data were compared with shallow mutagenesis data for 1,026 proteins and with functional readouts from deep mutational scanning for PPARG, MAPK1/ERK2, UBE2I, SUMO1, PTEN, CALM1, CALM2, and TPK1. The 3D structural intolerance analysis revealed distinct features for ligand binding pockets and for orthosteric and allosteric sites. Extensive data on human genetic diversity therefore support a proteome-wide definition of functional 3D sites (Valverde ME et al., 2015).
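The idea of an intolerant 3D site, a spatial neighbourhood with fewer missense variants than expected, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the method used in the study: the function names, the 5 Å radius, the per-residue expected counts (taken as given), and the use of a Poisson tail probability as the depletion test.

```python
import math


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam); a small value suggests depletion."""
    return sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k + 1))


def neighbours(coords, centre, radius):
    """Indices of residues whose coordinates lie within `radius` of residue `centre`."""
    return [i for i, xyz in enumerate(coords)
            if math.dist(xyz, coords[centre]) <= radius]


def site_intolerance(coords, observed, expected, centre, radius=5.0):
    """Observed/expected missense ratio and Poisson depletion p-value for one 3D site.

    `observed` and `expected` are hypothetical per-residue missense counts;
    how the expected counts are derived is outside the scope of this sketch.
    """
    members = neighbours(coords, centre, radius)
    obs = sum(observed[i] for i in members)
    exp = sum(expected[i] for i in members)
    if exp <= 0:
        raise ValueError("expected missense count must be positive")
    return obs / exp, poisson_cdf(obs, exp)


if __name__ == "__main__":
    # Toy data: three residues packed together and one far away (made-up values).
    coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0), (30.0, 0.0, 0.0)]
    observed = [0, 1, 1, 6]
    expected = [3.0, 4.0, 3.0, 5.0]
    ratio, p_value = site_intolerance(coords, observed, expected, centre=0)
    print(f"observed/expected = {ratio:.2f}, depletion p = {p_value:.4f}")
```

Pooling counts over a spatial neighbourhood rather than a stretch of sequence is what makes the site "3D": residues far apart in the chain but close in the fold contribute to the same test. A real analysis would also need to correct for testing many sites per protein.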