A neurocomputational model of reward-based motor learning

Ragonetti, Gianmarco

dc.date.accessioned	2018-12-13T10:46:22Z
dc.date.available	2018-12-13T10:46:22Z
dc.description	2015 - 2016	it_IT
dc.description.abstract	This thesis deals with computational models of the nervous system involved in motor reinforcement learning. Its novel contribution is an experimental methodology for evaluating human learning rates, which we compared with the results of a computational model derived from an in-depth analysis of the literature. Rewards and punishments are stimuli that can drive the performance of the action being learned for better or for worse. They do so by strengthening or weakening the connections between a combination of sensory input stimuli and a combination of motor activation outputs, thereby attributing a value to them. A reward or punisher can originate from innate needs (hunger, thirst, etc.), arising from hardwired structures in the brain (hypothalamus), but it can also come from an initially neutral cue (from the cortex or sensory inputs) that acquires the ability to produce value after learning (for example, monetary value or approval). We call the former primary values and the latter learned values. The efficacy of a stimulus as a reinforcer or punisher depends on the specific context in which the action takes place (motivating operation). It is claimed that values drive learning through dopamine firing, and that learned values acquire this ability after repeated pairings with innate primary values, in a classical Pavlovian conditioning paradigm. Under some hypotheses, we propose a computational model made of: A block located in the cortex that maps sensory combinations (posterior cortex) to possible actions (motor cortex). The weights of the network correspond to the probability of a movement given a sensory combination as input. 
Rewards/punishments alter these probabilities through a selection rule we implemented in the basal ganglia for action selection; A block for the production of values (critic), for which we evaluated two scenarios. In the first, the block handles only innate rewards and consists of the VTA (ventral tegmental area), the lateral hypothalamus (innate rewards) and the lateral habenula (innate punishments). In the second scenario we added the structures for learning rewards: the amygdala, which learns to produce a dopamine activation at the onset of an initially neutral stimulus, and the ventral striatum, which learns to predict the occurrence of the innate reward, cancelling its dopamine activation. Innate reward is fundamental to the learned value system: even in a well-trained system, if the learned reward stimulus can no longer predict the innate reward stimulus (because the latter occurs late or not at all), and this happens frequently, it may lose its reinforcing or weakening ability. This phenomenon is called acquisition extinction and depends strictly on the context (motivating operation). Validation of the model started in Emergent, which provides a biologically accurate model of neuron networks and learning mechanisms; the model was then ported to the more versatile Matlab in order to prove the system's ability to learn a specific task. In this simple task the system has to learn to choose between two possible actions, given a group of stimuli of varying cardinality: 2, 4 and 8. We evaluated the task in the two scenarios described, one with innate rewards and one with learned rewards. Finally, several experiments were performed to evaluate the human learning rate: volunteers had to learn to press the right keyboard buttons when visual stimuli appeared on a monitor, in order to get an auditory and visual reward. The experiments were carefully designed so that the results of a simple artificial neural network would be comparable with those of human performers. 
The strategy was to select a reduced set of responses and a set of visual stimuli that were as simple as possible (edges), thus bypassing the problem of a complex hierarchical representation of information by collapsing it into one layer. The results were then fitted with an exponential and a hyperbolic function. Both fits showed that the human learning rate is slow compared to that of the artificial network and decreases with the number of stimuli to be learned. [edited by Author]	it_IT
dc.language.iso	en	it_IT
dc.subject.miur	ING-INF/05 SISTEMI DI ELABORAZIONE DELLE INFORMAZIONI	it_IT
dc.contributor.coordinatore	De Santis, Alfredo	it_IT
dc.description.ciclo	XV n.s.	it_IT
dc.contributor.tutor	Marcelli, Angelo	it_IT
dc.contributor.cotutor	Viggiano, Andrea	it_IT
dc.identifier.Dipartimento	Ingegneria dell'Informazione ed Elettrica e Matematica Applicata	it_IT
dc.title	A neurocomputational model of reward-based motor learning	it_IT
dc.contributor.author	Ragonetti, Gianmarco
dc.date.issued	2017-09-22
dc.identifier.uri	http://hdl.handle.net/10556/3028
dc.identifier.uri	http://dx.doi.org/10.14273/unisa-1317
dc.type	Doctoral Thesis	it_IT
dc.subject	Neurocomputational models	it_IT
dc.subject	Motor learning	it_IT
dc.subject	reward-based learning	it_IT
dc.publisher.alternative	Università degli Studi di Salerno	it_IT

Files in this item

Name:: tesi_di_dottorato_G_Ragognetti.pdf
Size:: 2.333Mb
Format:: PDF
Description:: doctoral thesis

Name:: abstract in inglese G. Ragogne ...
Size:: 25.21Kb
Format:: PDF
Description:: abstract in English, edited by ...

Name:: abstract in italiano G. Ragogn ...
Size:: 124.8Kb
Format:: PDF
Description:: abstract in Italian, edited by ...

This item appears in the following Collection(s)

Informatica ed Ingegneria dell'Informazione
