AI Tool Dramatically Speeds Up the Study of Protein Dynamics
Researchers say they have developed an artificial intelligence tool for analyzing how proteins move and interact that is faster and more accurate than current methods, according to a study, “DeepFRET, a software for rapid and automated single-molecule FRET data classification using deep learning,” published today in eLife.
“Single-molecule Förster Resonance energy transfer (smFRET) is an adaptable method for studying the structure and dynamics of biomolecules. The development of high throughput methodologies and the growth of commercial instrumentation have outpaced the development of rapid, standardized, and automated methodologies to objectively analyze the wealth of produced data,” write the investigators.
“Here we present DeepFRET, an automated, open-source standalone solution based on deep learning, where the only crucial human intervention in transiting from raw microscope images to histograms of biomolecule behavior, is a user-adjustable quality threshold. Integrating standard features of smFRET analysis, DeepFRET consequently outputs the common kinetic information metrics.
“Its classification accuracy on ground truth data reached >95% outperforming human operators and commonly used threshold, only requiring ~1% of the time. Its precise and rapid operation on real data demonstrates DeepFRET’s capacity to objectively quantify biomolecular dynamics and the potential to contribute to benchmarking smFRET for dynamic structural biology.”
The software, which is freely available, dramatically speeds up the study of protein dynamics and makes it accessible to research teams across the world, rather than restricting it to a few laboratories with specialist expertise, note the scientists.
One of the main tools for studying protein motion is single-molecule Förster Resonance Energy Transfer (smFRET). It works by labeling two or more parts of the molecule with different fluorescent tags; when two tags come into close proximity, the resulting change in fluorescence can be detected with a microscope. In this way, the movement of proteins can be visualized and measured down to the nanometer level.
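The distance dependence behind this readout is commonly described by the Förster relation, E = 1 / (1 + (r / R0)^6), where r is the distance between the two tags and R0 is the distance at which energy transfer is 50% efficient. The relation is standard textbook material rather than something spelled out in the study. A minimal Python sketch, assuming an illustrative R0 of 5 nm and a hypothetical function name:

```python
def fret_efficiency(distance_nm: float, forster_radius_nm: float = 5.0) -> float:
    """Return the FRET efficiency E = 1 / (1 + (r / R0)^6).

    forster_radius_nm (R0) defaults to 5 nm, an illustrative value only;
    the real R0 depends on the dye pair and is not given in the article.
    """
    if distance_nm < 0 or forster_radius_nm <= 0:
        raise ValueError("distance must be >= 0 and R0 must be > 0")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


if __name__ == "__main__":
    # Efficiency falls steeply as the tags move apart around R0.
    for r in (2.5, 5.0, 7.5):
        print(f"r = {r:.1f} nm -> E = {fret_efficiency(r):.3f}")
```

Because of the sixth-power term, a change of a few nanometers around R0 swings the efficiency from close to 1 to close to 0, which is why the method is sensitive at the nanometer scale.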
“Some of the challenges with smFRET include the very large data that are produced, and the steps that researchers need to take to process the images before analysis,” explains lead author Johannes Thomsen, PhD, who carried out this study as a research assistant at the University of Copenhagen, Denmark. “Machine learning technologies, especially deep neural networks, have significantly improved our ability to understand large datasets without the need for human intervention. We wanted to see whether employing these technologies to smFRET data would allow automated, fast characterization of protein motions, independently of human experts.”
The team chose to use a type of deep learning model called a deep neural network (DNN). Deep learning is a branch of machine learning that takes the raw form of the data and looks for patterns with no prior “knowledge.” It has the advantage of learning useful features from raw data without time-intensive pre-processing, and offers a “less opinionated” evaluation of the data, compared with the more subjective analysis by humans.
A DNN has the further advantage that it can learn to recognize important aspects of the data and then classify the data into groups. Although developing a DNN is a computationally intensive process that can take time, once trained the model can be used easily, including by non-experts, on any computer.
The tool, DeepFRET, imports raw microscope images, locates the two different fluorescence signals, corrects for background noise and, with limited human help, produces a chart showing the motion of the molecules within the sample. When tested with simulated and real data, its accuracy at detecting meaningful patterns in the data was more than 95%, outperforming human operators while needing only about 1% of the time: DeepFRET evaluated a single piece of data (a trace) in around 50 milliseconds, whereas human reviewers spent an average of five seconds per trace.
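The steps described here (background correction, a quality cut-off, and a summary chart) can be illustrated with a short Python sketch. It is not DeepFRET's code or API. Every name in it is hypothetical, the per-trace confidence score stands in for the output of a trained neural network, the 0.85 threshold is arbitrary, and the efficiency is the simple uncorrected ratio of acceptor signal to total signal:

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Trace:
    donor: Sequence[float]      # raw donor intensity per frame
    acceptor: Sequence[float]   # raw acceptor intensity per frame
    donor_bg: float             # estimated donor background level
    acceptor_bg: float          # estimated acceptor background level
    confidence: float           # hypothetical classifier score in [0, 1]


def apparent_fret(trace: Trace) -> List[float]:
    """Background-correct both channels and return A / (D + A) per frame."""
    values = []
    for d_raw, a_raw in zip(trace.donor, trace.acceptor):
        d = d_raw - trace.donor_bg
        a = a_raw - trace.acceptor_bg
        total = d + a
        if total > 0:  # skip frames with no usable signal
            values.append(a / total)
    return values


def pooled_fret(traces: Sequence[Trace], quality_threshold: float = 0.85) -> List[float]:
    """Pool FRET values from traces whose score meets the threshold."""
    pooled: List[float] = []
    for trace in traces:
        if trace.confidence >= quality_threshold:
            pooled.extend(apparent_fret(trace))
    return pooled


def histogram(values: Sequence[float], bins: int = 10) -> List[int]:
    """Count values in equal-width bins over [0, 1]; out-of-range values are ignored."""
    counts = [0] * bins
    for v in values:
        if 0.0 <= v <= 1.0:
            counts[min(int(v * bins), bins - 1)] += 1
    return counts


if __name__ == "__main__":
    traces = [
        Trace(donor=[110, 60], acceptor=[60, 110], donor_bg=10, acceptor_bg=10, confidence=0.97),
        Trace(donor=[500, 500], acceptor=[20, 20], donor_bg=10, acceptor_bg=10, confidence=0.40),
    ]
    print(histogram(pooled_fret(traces)))
```

In this toy input only the first trace passes the threshold, so the histogram is built from its two frames alone. Raising or lowering `quality_threshold` is the single knob a user would turn in a workflow of this shape.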
“We have developed a machine learning method that can automatically, rapidly and reproducibly analyze recordings of the choreography of protein motions, with simple user interface that works on different operating systems,” concludes senior author Nikos Hatzakis, PhD, associate professor at the University of Copenhagen, and affiliate associate professor at the Novo Nordisk Foundation Center for Protein Research, University of Copenhagen.
“The method works equally to or better than existing methods, and requires only minimal contribution by humans. It therefore offers a tool for people with limited expertise, which we hope will contribute to the standardization and rapid expansion of this field of study.”