AI models are powerful, but are they biologically plausible?
Artificial neural networks, ubiquitous machine-learning models that can be trained to complete many tasks, are so called because their architecture is inspired by the way biological neurons process information in the human brain.
About six years ago, scientists discovered a new type of more powerful neural network model known as a transformer. These models can achieve unprecedented performance, such as by generating text from prompts with near-human-like accuracy. A transformer underlies AI systems such as ChatGPT and Bard, for example. While incredibly effective, transformers are also mysterious: Unlike with other brain-inspired neural network models, it hasn’t been clear how to build them using biological components.
Now, researchers from MIT, the MIT-IBM Watson AI Lab, and Harvard Medical School have produced a hypothesis that may explain how a transformer could be built using biological elements in the brain. They suggest that a biological network composed of neurons and other brain cells called astrocytes could perform the same core computation as a transformer.
Recent research has shown that astrocytes, non-neuronal cells that are abundant in the brain, communicate with neurons and play a role in some physiological processes, like regulating blood flow. But scientists still lack a clear understanding of what these cells do computationally.
With the new study, published this week in open-access format in the Proceedings of the National Academy of Sciences, the researchers explored the role astrocytes play in the brain from a computational perspective, and crafted a mathematical model that shows how they could be used, along with neurons, to build a biologically plausible transformer.
Their hypothesis provides insights that could spark future neuroscience research into how the human brain works. At the same time, it could help machine-learning researchers explain why transformers are so successful across a diverse set of complex tasks.
“The brain is far superior to even the best artificial neural networks that we have developed, but we don’t really know exactly how the brain works. There is scientific value in thinking about connections between biological hardware and large-scale artificial intelligence networks. This is neuroscience for AI and AI for neuroscience,” says Dmitry Krotov, a research staff member at the MIT-IBM Watson AI Lab and senior author of the research paper. More