Media release
Structural biology: Predicting the structures of our proteins
Accurate structure predictions for the human proteome (the complete set of proteins encoded by the human genome), produced using AlphaFold, are described in a paper published in Nature this week. The resulting dataset provides a confident prediction of the structural position for nearly 60% of the amino acids within the human proteome, and the predictions will be made freely available to the community via a public database hosted by the European Bioinformatics Institute (EMBL-EBI).
Determining the structure of proteins can provide valuable information for understanding biological processes and could inform drug development. Given the importance of understanding the human proteome for health and medicine, intensive efforts have been made to determine these protein structures. However, after decades of research only 17% of the human proteome’s amino acids — the subunits that are linked together to form proteins — have been included within an experimentally determined structure. Experimental structure determination requires overcoming many time-consuming hurdles and, as such, obtaining more extensive coverage of the proteome remains a key challenge.
Kathryn Tunyasuvunakool, John Jumper, Demis Hassabis and colleagues applied the state-of-the-art machine learning method AlphaFold to predict the structures of proteins covering almost the entire human proteome (98.5% of all human proteins). The authors found that AlphaFold was able to make a confident prediction of the structural position of 58% of the amino acids in the human proteome. Within this, the position of 35.7% of amino acids was predicted with a very high degree of confidence, which is roughly double the 17% covered by experimental structures. At the protein level, for 43.8% of proteins AlphaFold produced a confident prediction covering at least three quarters of the amino acid sequence.
The authors conclude that large-scale, accurate structure prediction will become an important tool, allowing new scientific questions to be addressed from a structural perspective, and that the predictions made by AlphaFold will help to further illuminate the role of proteins.