Scientists are using artificial intelligence (AI) to design proteins, which are molecules made of amino acids, the building blocks of life. An AI language model for proteins can generate protein sequences, similar to how a chatbot creates text.
You can follow basic instructions to build a simple protein language model and use it to create a short amino acid sequence, as reported in Nature News.
When the resulting sequence was tested with AlphaFold, an AI tool that predicts protein structures, the predicted structure looked realistic, but the protein was unlikely to work in a lab because AlphaFold showed low confidence in the prediction.
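To make the idea of a protein language model concrete, here is a minimal sketch in Python. It is not the model from the Nature News report. It swaps in a bigram (Markov chain) model, which only learns how often one amino acid follows another and is far simpler than the neural networks behind real protein language models. The training sequences are made-up toy strings, and the function names `train_bigram` and `generate` are hypothetical, chosen only for illustration.

```python
import random

# One-letter codes for the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
START, END = "^", "$"  # marker tokens for sequence start and end


def train_bigram(sequences):
    """Count how often each residue (or END) follows each residue (or START)."""
    counts = {}
    for seq in sequences:
        tokens = [START] + list(seq) + [END]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts.setdefault(prev, {})
            counts[prev][nxt] = counts[prev].get(nxt, 0) + 1
    return counts


def generate(counts, max_len=30, seed=None):
    """Sample a sequence one residue at a time, weighted by the bigram counts."""
    rng = random.Random(seed)
    prev, out = START, []
    while len(out) < max_len:
        options = counts.get(prev)
        if not options:
            break
        residues = list(options)
        weights = [options[r] for r in residues]
        nxt = rng.choices(residues, weights=weights, k=1)[0]
        if nxt == END:
            break
        out.append(nxt)
        prev = nxt
    return "".join(out)


# Made-up toy sequences (assumption: illustrative only, not real proteins).
TOY_SEQUENCES = [
    "MKTAYIAKQRQISFVKSHFSRQ",
    "MKVLAAGIVGLLLAQWSHA",
    "MSTNPKPQRKTKRNTNRRPQDV",
    "MKKLLPTAAAGLLLLAAQPAMA",
]

model = train_bigram(TOY_SEQUENCES)
print(generate(model, max_len=30, seed=0))
```

A sequence sampled this way only mimics local letter statistics, so there is no reason to expect it to fold or function. That is the same gap between "looks plausible" and "works in a lab" that the AlphaFold confidence check exposed.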
New AI tools are making protein design more accessible by letting users give instructions in plain English to design proteins or other molecules, such as potential drugs. This makes it easier for people with less expertise to join the bio-AI revolution, a growing field where AI is applied to biology. Unlike older AI models that needed specialized inputs such as protein sequences, these new models accept plain-text descriptions and can generate novel proteins, such as enzymes, which speed up chemical reactions, or antibodies, which help fight infections.
Text-to-protein breakthroughs
Chinese researchers developed a model called Pinal that designs proteins from text prompts. Trained on descriptions of 1.7 billion proteins, Pinal produced functional enzymes and glowing proteins unlike any found in nature. For example, when asked to design an alcohol-metabolizing enzyme, Pinal produced designs that worked in lab tests, though less efficiently than natural enzymes. Other models, like ESM-3 and MP4, also use text to design proteins, including some that interact with ATP, the molecule cells use to carry energy. These advances could lead to new drugs, such as treatments for obesity.
AI is also helping scientists “talk” to cells. Tools like CellWhisperer and Cell2Sentence analyze cell data and describe cell types or predict drug effects using plain language. These models simplify complex biology tasks, making them accessible to more researchers. While some experts call these tools experimental, they see potential in combining AI with biological data. Challenges remain, like crafting the right text prompts, as poorly worded instructions can lead to flawed designs. Still, these tools are a promising step toward easier and more precise protein and drug design.