Computers are helping scientists design proteins, the molecules built from chains of amino acids that do important jobs in the body. The way a chain of amino acids folds into a 3D shape determines what the protein can do, such as fighting disease. Creating new proteins for medicines is hard because it is difficult to predict which order of amino acids will fold into the right shape. Working out the amino acid sequence needed for a specific 3D protein structure is called inverse protein folding.
Machine learning (ML) is being used to make better predictions for inverse protein folding. Scientists train ML models on large amounts of data about known proteins so that the models can predict which amino acid sequences will fold into a desired shape. A new ML tool called MapDiff was created by researchers from the University of Sheffield, AstraZeneca, and the University of Southampton. In tests, MapDiff was better at predicting the right amino acid sequences than other leading AI tools.
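The article does not say how these predictions were scored. One common measure for inverse folding is sequence recovery: the share of positions where the predicted amino acid matches the one in the natural protein. The sketch below assumes that measure and uses made-up toy sequences; the function name `sequence_recovery` is hypothetical and is not taken from MapDiff.

```python
def sequence_recovery(predicted: str, native: str) -> float:
    """Return the fraction of positions where predicted and native residues match."""
    if len(predicted) != len(native):
        raise ValueError("sequences must have the same length")
    if not native:
        raise ValueError("sequences must be non-empty")
    # Count position-by-position matches between the two sequences.
    matches = sum(p == n for p, n in zip(predicted, native))
    return matches / len(native)


# Toy example with invented sequences (one-letter amino acid codes).
native = "MKTAYIAKQR"
predicted = "MKTAYLAKQK"  # differs from native at positions 6 and 10
print(f"{sequence_recovery(predicted, native):.2f}")  # prints 0.80
```

A higher score means the tool recovered more of the natural sequence from the shape alone, although a lower-scoring sequence can sometimes still fold into the same shape.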
Advancing protein engineering
MapDiff’s success is a step toward faster design of proteins for new vaccines, gene therapies, and other treatments. It complements tools like AlphaFold, which predicts a protein’s 3D shape from its amino acid sequence. MapDiff does the opposite: it starts with the desired shape and finds an amino acid sequence that should fold into it. This could help scientists make proteins that bind to specific targets in the body, like a key fitting a lock, which is important for creating new drugs.
MapDiff could lead to new ways to design proteins for medical uses, addressing a big challenge in biology. This work builds on earlier projects, like an AI called DrugBAN, which predicts whether a drug will interact with its target protein. MapDiff is described in a study published in Nature Machine Intelligence. The results show promise, but more work is needed to make it ready for real-world use. If successful, this tool could speed up the discovery of new medicines, making treatments better and faster to develop.