What is the protein folding problem that has left researchers stuck for nearly 50 years?
Knowing the 3D shape of proteins is so important for our understanding of various diseases and vaccine development. However, these shapes are fantastically complex and difficult to predict. Researchers have spent years trying to determine the 3D structure of proteins.
Thanks to AI systems like AlphaFold, it’s now much easier and faster to predict protein shapes. AlphaFold is currently leading the way in protein folding research and has been described as a “revolution in biology.”
In this episode of Short and Sweet AI, I explore the protein folding problem in more detail and how AlphaFold is accelerating our understanding of protein structures.
Hello to you who are curious about AI. I’m Dr. Peper and today I’m talking about AlphaFold.
One of biology’s most difficult challenges, one that researchers have been stuck on for nearly 50 years, is how to determine a protein’s 3D shape from its amino-acid sequence. It’s known as “the protein folding problem.”
When I first came across the subject, I thought, OK, that’s a biology problem and maybe AI will solve it, but there’s no big story here. I was wrong.
Some biologists spend months, years, or even decades performing experiments to determine the precise shape of a protein. Sometimes they never succeed. But they persist because having the ability to know how a protein folds up can accelerate our ability to understand diseases, develop new medicines and vaccines, and crack one of the greatest challenges in biology.
Why is protein folding so important? Protein structures contain as much information as DNA, if not more. Their 3D shapes are fantastically complex. Proteins are made up of strings of amino acids, called the building blocks of life. In order to function, the strings twist and fold into precise, delicate shapes that turn or wrap around each other. These strings can even merge into bigger, megaplex structures.
Only then can these proteins function in the way necessary to build and sustain life. A protein’s shape defines what the protein can do and what it cannot do.
But there’s an astronomical number of ways a protein can fold into its final 3D structure. This is the heart of what’s called Levinthal’s paradox. Cyrus Levinthal, a molecular biologist, published a paper in 1969 called “How to Fold Graciously.” He found there are so many degrees of freedom in an unfolded chain of amino acids that the molecule has an astronomical number of possible configurations. Yet real proteins fold into the right shape almost instantly, which is why it’s called a paradox.
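To get a feel for how astronomical that number is, here is a rough back-of-the-envelope sketch in Python. The inputs are illustrative assumptions, not figures from this episode: a chain of 100 amino acids, 3 possible states per amino acid, and a protein that could somehow try 10 trillion shapes per second.

```python
# Back-of-the-envelope Levinthal estimate.
# All three inputs below are illustrative assumptions, not measured values.
STATES_PER_RESIDUE = 3        # assumed number of shapes each amino acid can take
CHAIN_LENGTH = 100            # assumed length of a small protein
SAMPLES_PER_SECOND = 1e13     # assumed rate of trying new shapes
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def count_conformations(chain_length, states_per_residue=STATES_PER_RESIDUE):
    """Number of shapes if every residue picks its state independently."""
    return states_per_residue ** chain_length


def years_to_search(conformations, samples_per_second=SAMPLES_PER_SECOND):
    """Years needed to try every shape once at the given sampling rate."""
    return conformations / samples_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    total = count_conformations(CHAIN_LENGTH)
    years = years_to_search(total)
    print(f"Possible conformations: {total:.2e}")
    print(f"Years to try them all:  {years:.2e}")
```

Under these assumptions the count comes out to roughly 5 x 10^47 shapes, and trying them one by one would take on the order of 10^27 years, far longer than the age of the universe. Change the assumed numbers and the total shifts, but it stays astronomical, which is the point of the paradox.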
There are an estimated 200 million known proteins, with 30 million new ones discovered every year. Each one has a unique 3D shape which determines how it works and what it does. Over the last 50 years, biologists have determined the exact 3D structure of only a tiny fraction of known proteins.
The protein folding problem led to a global competition called CASP, which stands for Critical Assessment of protein Structure Prediction. Scientists measure and compare their research efforts using computer-based predictions. The competition started in 1994 to improve computational methods for accurately predicting a protein’s 3D shape.
DeepMind, an AI research lab owned by Google, has made headlines for creating the deep learning neural networks AlphaGo and AlphaZero, which beat the world’s leading chess and Go champions. I’ve talked about them in previous episodes. Protein folding has been called the challenge of a lifetime, and the researchers at DeepMind wanted to use AI not only for game playing but to make a real-world impact. So, DeepMind went to work creating AlphaFold, a deep learning computer system, to solve the protein folding problem.
In 2018, AlphaFold entered the CASP competition for the first time. It achieved the highest score for accurately predicting various protein structures, scoring 60 out of a possible 100 points. But the AlphaFold researchers thought they could improve the accuracy and developed the deep learning neural network even further.
In addition to using a data set with 170,000 protein structures, DeepMind supercharged the algorithm. They added data about physics, geometry, and evolutionary history into their training model. The algorithm analyzed any buried relationships or patterns and was able to determine highly accurate structures in a matter of days, even hours. It could predict a protein’s shape down to the width of an atom.
The turning point came in the CASP competition in November 2020. AlphaFold, as well as teams from Microsoft and the Chinese tech company Tencent, competed to predict protein structures considered to be moderately difficult. The best performance of the other teams was 75 points on a 100-point scale. AlphaFold performed so unbelievably well, it was called a revolution in biology. AlphaFold scored 90 out of 100.
One researcher had been looking for the structure of a protein for 10 years, a full decade. AlphaFold’s predictions gave him the protein’s 3D structure in half an hour. You can’t make this stuff up. His exuberance is understandable when he says: “This will change medicine. It will change research. It will change bioengineering. It will change everything.”
Here are a few more comments made by experts that convey why AlphaFold is not just a big story but rivals the discovery of DNA.
One researcher said, “I nearly fell off my chair when I saw these results.” Another proclaimed, “It’s a breakthrough of the first order, certainly one of the most significant results of my lifetime.” Another commented, “…a stunning advance…It’s occurred decades before many people in the field would have predicted.”
And John Moult, a professor who helped to create the CASP competition, describes it as a dream come true. He said, “I always hoped I would live to see this day. But it wasn’t always obvious I was going to make it.”
Thanks for listening. I hope you found this helpful. If you like this episode, please leave a review and subscribe, because then you’ll receive my episodes weekly. From Short and Sweet AI, I’m Dr. Peper.