What is the protein folding problem?
It is theoretically possible to predict the 3D structure of a protein just from its amino acid sequence. However, this is extremely challenging because of the sheer number of possible conformations. Artificial intelligence is ideally suited to this problem.
Protein folding problem: predicting protein structure from sequence
The protein folding problem encompasses two interrelated challenges: understanding the process by which a protein chain folds, and accurately predicting a protein’s final folded structure.
In 1972 Christian Anfinsen shared the Nobel Prize in Chemistry for proposing that, in its standard physiological environment, a protein’s structure is determined by the sequence of amino acids that make it up. This came to be known as Anfinsen’s dogma.
This hypothesis was important, because it suggested we should be able to reliably predict a protein’s structure from its sequence. Decades of research into structural biology have since shown that Anfinsen was largely correct.
The computational challenge
However, it turns out that predicting protein structure from sequence is not so simple. This is because of a second concept called Levinthal’s paradox.
In the 1960s, Cyrus Levinthal showed that there is a very large number of possible conformations a protein chain could theoretically adopt. If a protein were to explore them all one by one, it would take an incomprehensibly long time, longer than the age of the Universe.
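To give a feel for the scale of Levinthal's argument, the arithmetic can be sketched in a few lines of Python. This is a back-of-the-envelope illustration, not Levinthal's own calculation: the chain length (100 residues), the number of conformations per residue (3) and the sampling rate (10^13 conformations per second) are all assumed values chosen for illustration, and the function name `levinthal_estimate` is hypothetical.

```python
import math


def levinthal_estimate(n_residues, states_per_residue=3, samples_per_second=1e13):
    """Return (log10 of conformation count, log10 of years needed to sample them all).

    All three parameters are illustrative assumptions, not measured values.
    Working in log10 keeps the numbers manageable.
    """
    # Each residue independently picks one of `states_per_residue` conformations,
    # so the total count is states_per_residue ** n_residues.
    log10_conformations = n_residues * math.log10(states_per_residue)

    # Time to visit every conformation once at the assumed sampling rate.
    seconds_per_year = 365.25 * 24 * 3600
    log10_years = (
        log10_conformations
        - math.log10(samples_per_second)
        - math.log10(seconds_per_year)
    )
    return log10_conformations, log10_years


log10_conf, log10_years = levinthal_estimate(100)
print(f"about 10^{log10_conf:.0f} conformations, about 10^{log10_years:.0f} years to sample them all")
```

Under these assumptions the estimate comes out at roughly 10^48 conformations and roughly 10^27 years, many orders of magnitude more than the roughly 1.4 × 10^10 years the Universe has existed. Real proteins typically fold in milliseconds to seconds, so they cannot be searching conformations exhaustively.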
Nevertheless, Anfinsen’s findings inspired a search for an efficient system that could reliably identify the most likely native structure of a protein, based solely on its amino acid sequence. While challenging, this was at least theoretically possible.
The role of artificial intelligence
This is where artificial intelligence comes in. Modern machine learning methods can help identify complex relationships in large datasets, enabling prediction of protein structures.
Crucially, Anfinsen’s dogma implies that predicting the folded state of a protein does not necessarily require an understanding of the folding process. That is, it should be possible to predict the final 3D shape of a protein without predicting the sequence of movements that leads to this shape – sidestepping Levinthal’s paradox.