Beyond AlphaFold: A.I. excels at creating new proteins
Over the past two years, machine learning has revolutionized protein structure prediction. Now, three papers in Science describe a similar revolution in protein design.
In the new papers, biologists at the University of Washington School of Medicine show that machine learning can be used to create protein molecules much more accurately and quickly than previously possible. The scientists hope this advance will lead to many new vaccines, treatments, tools for carbon capture, and sustainable biomaterials.
“Proteins are fundamental across biology, but we know that all the proteins found in every plant, animal, and microbe make up far less than one percent of what is possible. With these new software tools, researchers should be able to find solutions to long-standing challenges in medicine, energy, and technology,” said senior author David Baker, professor of biochemistry at the University of Washington School of Medicine and recipient of a 2021 Breakthrough Prize in Life Sciences.
Proteins are often referred to as the “building blocks of life” because they are essential for the structure and function of all living things. They are involved in virtually every process that takes place inside cells, including growth, division, and repair. Proteins are made up of long chains of chemicals called amino acids. The sequence of amino acids in a protein determines its three-dimensional shape. This intricate shape is crucial for the protein to function.
Recently, powerful machine learning algorithms including AlphaFold and RoseTTAFold have been trained to predict the detailed shapes of natural proteins based solely on their amino acid sequences. Machine learning is a type of artificial intelligence that allows computers to learn from data without being explicitly programmed. It can be used to model scientific problems that are too complex for humans to work through unaided.
To go beyond the proteins found in nature, Baker’s team members broke down the challenge of protein design into three parts and used new software solutions for each.