Mastering the Game by Trial and Error: How machine learning could help to achieve targeted interactions with biological neuronal networks
Sreedhar Saseendran Kumar and colleagues from IMTEK, the Bernstein Center Freiburg, and the Machine Learning Lab of the Department of Computer Science have approached this question with the help of “reinforcement learning.” In this machine learning strategy, the machine progressively optimizes its stimulation of a network by trial and error and is rewarded based on the outcome. Their study is part of a project of the Cluster of Excellence BrainLinks-BrainTools and has now been published in the scientific journal PLoS Computational Biology.
“The fluctuation in the ongoing activity of a neuronal network is one reason why responses to identical stimuli vary,” Kumar explains. This makes it a challenge for the machine to autonomously identify the optimal time to stimulate the network. The ‘machine’ Kumar refers to is a software agent capable of initiating actions in response to activity recorded from a network of neurons. The network’s responses to these actions yield rewards, which the machine tries to maximize over time. “We want to find the right strategy to maximize the total response in a test period. But at the beginning, we do not even know what the value of that maximum is. We need to stimulate at the most suitable time to evoke strong responses and at the same time avoid interruptions by ongoing activity.”
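Kumar’s description maps onto the standard reinforcement learning loop of observations, actions, and rewards. The Python sketch below illustrates only that loop under toy assumptions: record_activity, stimulate, and the reward values are hypothetical stand-ins for the recorded network state, the stimulation decision, and the evoked response strength, not the signals or code used in the study.

```python
# Minimal sketch of the closed-loop setup described above; names and
# numbers are hypothetical stand-ins, not the authors' implementation.
import random

def record_activity():
    """Stand-in for observing ongoing activity, e.g. time since the
    last spontaneous burst, binned into 10 states."""
    return random.randint(0, 9)

def stimulate(state, action):
    """Stand-in for the network's reaction (action 0 = wait, 1 = stimulate).
    Responses grow with recovery time but are sometimes pre-empted
    by ongoing activity."""
    if action == 0:
        return 0.0
    return 0.0 if random.random() < 0.3 else state / 9.0

total_reward = 0.0
for step in range(1000):                 # one "test period"
    state = record_activity()            # observe the network
    action = random.choice([0, 1])       # a learning rule would go here
    total_reward += stimulate(state, action)

print("Cumulative evoked response over the test period:", total_reward)
```

The quantity the agent must maximize is exactly this cumulative reward; the learning problem is to replace the random choice with a policy that picks the better action in each observed state.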
Since the factors involved in this variability and the mechanisms that determine these interactions are poorly understood, it is not possible to develop mathematical models to compute the optimal stimulation time. One of the strengths of this new approach by Kumar and colleagues is that it does not require a model of the system at all. Controllers based on reinforcement learning learn to interact optimally with the network just by trial and error.
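To make the trial-and-error idea concrete, the sketch below plugs a model-free learning rule into a toy version of the task. Tabular Q-learning is used here as one standard example of such a rule; it updates value estimates from observed rewards alone and needs no model of the network. This is an illustration of the principle, not necessarily the controller from the study, and the environment is again a hypothetical stand-in.

```python
# Model-free learning by trial and error on a toy "network": tabular
# Q-learning, which never builds a model of the system it controls.
import random

N_STATES, ACTIONS = 10, (0, 1)        # states; 0 = wait, 1 = stimulate
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1     # learning rate, discount, exploration
Q = [[0.0, 0.0] for _ in range(N_STATES)]

def step_network(state, action):
    """Hypothetical network: waiting lets it recover; stimulating evokes
    a response that grows with recovery but may be pre-empted."""
    if action == 0:
        return min(state + 1, N_STATES - 1), 0.0
    reward = 0.0 if random.random() < 0.3 / (state + 1) else state / (N_STATES - 1)
    return 0, reward                  # stimulation resets the recovery clock

state = 0
for _ in range(20000):
    # epsilon-greedy: mostly exploit current estimates, sometimes explore
    if random.random() < EPS:
        action = random.choice(ACTIONS)
    else:
        action = max(ACTIONS, key=lambda a: Q[state][a])
    next_state, reward = step_network(state, action)
    # one-step Q-learning update: nudge the estimate toward the observed
    # reward plus the best value attainable from the next state
    Q[state][action] += ALPHA * (reward + GAMMA * max(Q[next_state]) - Q[state][action])
    state = next_state

print("Stimulate in state s?", [int(Q[s][1] > Q[s][0]) for s in range(N_STATES)])
```

After enough trials the table typically favors waiting in early states and stimulating once the toy network has recovered, which is the trade-off between acting too early and waiting too long that the quoted passage describes.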
“The lack of computational models made it difficult for us to ascertain whether what the machine had learned was the optimal solution,” says Kumar. To circumvent this, the researchers drew on prior studies of such networks and identified a simpler way to describe the network’s response behavior, from which the optimal stimulation timing could be predicted. “We asked if the same timing could be learned autonomously through reinforcement learning.” The researchers have now shown that this is indeed the case.
Kumar and colleagues now want to extend the framework to handle more demanding goals. For this purpose, they will have to take further factors into account: “We will give the machine more information from multiple locations in the network and allow it more options for action,” Kumar explains. One challenge for the method is the dimensionality of the problem: learning becomes more difficult as the number of contributing factors grows. Evaluating the quality of the strategies that the machine learns autonomously is just as challenging.
Image Caption:
One decisive element in the game of football is finding the right balance between offense and defense. A similar trade-off arises in the stimulation of neuronal networks, where a stimulus interacts with the inherent activity of the network and the efficacy of the stimulation depends on the right timing.
Contact:
Prof. Dr. Ulrich Egert
Director, Bernstein Center Freiburg &
Biomicrotechnology, Dept. of Microsystems Engineering
Faculty of Engineering
Phone: +49 (0)761 / 203 – 7524
Fax: +49 (0)761 / 203 –
E-Mail: ulrich.egert@imtek.uni-freiburg.de
Michael Veit
Science Communicator, Bernstein Center Freiburg
Phone: +49 (0)761 / 203 – 9322
Fax: +49 (0)761 / 203 – 9559
E-Mail: michael.veit@bcf.uni-freiburg.de