A team of New York University computer scientists has created a neural network that can explain how it reaches its predictions. The work sheds light on how neural networks, the engines that drive artificial intelligence and machine learning, arrive at their outputs, illuminating a process that has largely been concealed from users.
The breakthrough centers on a specific use of neural networks that has become popular in recent years: tackling challenging biological questions. Among these are examinations of the intricacies of RNA splicing, the focal point of the study, which plays a role in transferring information from DNA to functional RNA and protein products.
“Many neural networks are black boxes — these algorithms cannot explain how they work, raising concerns about their trustworthiness and stifling progress into understanding the underlying biological processes of genome encoding,” says Oded Regev, a computer science professor at NYU’s Courant Institute of Mathematical Sciences and the senior author of the paper, which appears in the Proceedings of the National Academy of Sciences. “By harnessing a new approach that improves both the quantity and the quality of the data for machine-learning training, we designed an interpretable neural network that can accurately predict complex outcomes and explain how it arrives at its predictions.”
Regev and the paper’s other authors, Susan Liao, a faculty fellow at the Courant Institute, and Mukund Sudarshan, a Courant doctoral student at the time of the study, created a neural network based on what is already known about RNA splicing.
Specifically, they developed a model — the data-driven equivalent of a high-powered microscope — that allows scientists to trace and quantify the RNA splicing process, from input sequence to output splicing prediction.
“Using an ‘interpretable-by-design’ approach, we’ve developed a neural network model that provides insights into RNA splicing — a fundamental process in the transfer of genomic information,” notes Regev. “Our model revealed that a small, hairpin-like structure in RNA can decrease splicing.”
The researchers confirmed the insights their model provides through a series of experiments. The results matched the model’s discovery: whenever the RNA molecule folded into a hairpin configuration, splicing was halted, and as soon as the researchers disrupted the hairpin structure, splicing was restored.
The research was supported by grants from the National Science Foundation (MCB-2226731), the Simons Foundation, the Life Sciences Research Foundation, an Additional Ventures Career Development Award, and a PhRMA Fellowship.