How have AlphaFold2’s predictions of protein structure been validated?
AlphaFold2’s ability to predict protein structure was first demonstrated when it triumphed in the CASP14 assessment of structure predictions. Since then it has been validated by multiple lines of evidence from structural biology experiments, including studies using X-ray crystallography, cryogenic electron microscopy and cross-linking mass spectrometry.
AlphaFold’s success in CASP
Critical Assessment of Structure Prediction (CASP) is an experimental test of protein structure predictions. It has been carried out every two years since 1994. The assessment is open to anyone.
CASP entrants submit predicted structures for a set of target proteins. The structures of these proteins have already been determined experimentally, by X-ray crystallography, nuclear magnetic resonance (NMR) or cryogenic electron microscopy (cryo-EM), but they are not released to the public until the assessment is over. The predicted structures are then compared against these experimental structures.
Google DeepMind entered structure predictions from AlphaFold2 into CASP14 in 2020. The software outperformed all the other entrants by a wide margin.

Previously, overall structure prediction accuracy, measured by the Global Distance Test total score (GDT_TS, on a scale from 0 to 100), had only reached about 60. AlphaFold2 scored over 90. This score meant the predicted protein structures closely matched the experimentally resolved structures. CASP coordinators proclaimed that the protein-folding problem had been “largely solved”, at least for single protein chains.
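To make the GDT_TS score more concrete, the sketch below computes a simplified version of it. GDT_TS is commonly defined as the average, over distance cutoffs of 1, 2, 4 and 8 Å, of the percentage of residues whose predicted C-alpha atom lies within that cutoff of its experimental position. The function name `gdt_ts` and the toy coordinates are illustrative assumptions, not part of any AlphaFold or CASP tool. The sketch also assumes the two structures are already superposed, whereas the real assessment searches over many superpositions to maximise the residue count at each cutoff, so this version gives at best a lower bound.

```python
import math

def gdt_ts(predicted, experimental, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS for two already-superposed C-alpha traces.

    Both arguments are equal-length lists of (x, y, z) coordinates in
    angstroms, one per residue. Hypothetical helper for illustration only.
    """
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("need two equal-length, non-empty coordinate lists")

    # Per-residue distance between predicted and experimental positions.
    distances = [math.dist(p, q) for p, q in zip(predicted, experimental)]

    # Fraction of residues within each cutoff, then the mean, scaled to 0-100.
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in cutoffs
    ]
    return 100.0 * sum(fractions) / len(cutoffs)


# Toy example: four residues displaced by 0.5, 1.5, 3.0 and 10.0 angstroms.
experimental = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
predicted = [(0.0, 0.5, 0.0), (3.8, 1.5, 0.0), (7.6, 3.0, 0.0), (11.4, 10.0, 0.0)]

score = gdt_ts(predicted, experimental)
print(f"GDT_TS = {score:.2f}")
```

In the toy example, one residue falls within 1 Å, two within 2 Å, and three within both 4 Å and 8 Å, so the score is the mean of 25%, 50%, 75% and 75%. A score above 90 therefore requires nearly all residues to sit within a couple of ångströms of their experimental positions.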
Google DeepMind had previously entered an earlier version of AlphaFold in CASP13 in 2018. It took first place, but only by a small margin, and its predictions were not accurate enough for the protein structure prediction problem to be considered solved.
Google DeepMind did not directly participate in CASP15 in 2022. However, all the top performers used modified or customised versions of AlphaFold2. Because Google DeepMind released the source code for AlphaFold, other researchers were able to build on it and in some cases outperform the standard version of the software (Elofsson, 2023; Kryshtafovych et al., 2023).
Subsequent evidence from structural biology
In CASP14, AlphaFold2 succeeded in predicting the structures of dozens of proteins. However, there are millions of proteins in nature, so researchers have since subjected the software to further validation.
Structural biology experiments demonstrate that AlphaFold2 structures (or well-defined parts of the predicted structures, like protein domains) work well as search models for molecular replacement in X-ray crystallography (Barbarin-Bocahu and Graille, 2022; McCoy et al., 2022; Millán et al., 2021). This implies the AlphaFold2 structures closely resemble the protein crystal structures.
AlphaFold2 structures fit well into experimental cryo-EM electron density maps (Chojnowski, 2022; Giri et al., 2023). This again suggests a good match between structure predictions and the experimental data.
AlphaFold2 structures are still a good fit when proteins are in solution, as opposed to crystallised. Using AlphaFold2 models to interpret nuclear magnetic resonance (NMR) data obtained in solution suggested an excellent fit in the vast majority of cases (Fowler and Williamson, 2022; Tejero et al., 2022). Interestingly, this indicates that AlphaFold2 models are not strongly biased towards predicting a crystal state, even though AlphaFold2 was trained mainly on data derived from protein crystals.
Figure 10. Specialized acyl carrier protein
Notably, AlphaFold’s prediction (AlphaFold ID: AF-Q6N882-F1) demonstrates a closer match to the NMR structure (green, PDB ID: 2LPK) than the corresponding X-ray crystal structure (grey, PDB ID: 3LMO) (Tejero et al., 2022)
Cross-linking mass spectrometry experiments showed that the majority of AlphaFold2 structure predictions were correct for both single protein chains and protein-protein complexes in situ (Bartolec et al., 2023; McCafferty et al., 2023).
Taken together, these data validate AlphaFold2’s accuracy. They also suggest that AlphaFold2 models can be useful for a variety of research applications.