This talk will present a model in which, during decision making, the cortico-basal-ganglia circuit computes the probabilities that the alternatives under consideration are correct, according to Bayes' theorem. The model suggests how the equation of Bayes' theorem maps onto the functional anatomy of a circuit involving the cortex, basal ganglia and thalamus. The talk will also relate the model's predictions to experimental data, ranging from detailed properties of individual neurons in the circuit to the effects of disrupting this circuit's computations on behaviour.
Recent developments in decision-making research have restored attention to the classic but long neglected topic of planning: the selection of actions based on a projection and evaluation of their potential outcomes. This renewed interest raises the need for an updated computational account of planning, one that makes contact with contemporary views of cognitive and neural information processing. I'll discuss two interrelated projects, both aimed at contributing toward such an account. The central gambit in both projects is to consider how planning might arise from domain-general operations for probabilistic inference. One project focuses on the core procedures involved in planning, modeling these in terms of Bayesian inversion. This approach yields a novel, unifying view of some important neurophysiological observations, and reveals a surprising continuity with drift-diffusion models of simple choice. The second project focuses on hierarchical representations in planning, applying principles from Bayesian model selection to understand how such representations might arise from experience. In addition to laying out the theoretical approach, I will describe some behavioral and neuroimaging results in which we have begun to test specific predictions.
Sensory stimuli are frequently ambiguous and uncertain. Considerable recent research has focused on how animals can generate more accurate estimates of a parameter of interest by integrating visual information across time. I will argue that the same circumstances that lead animals to integrate information across time, namely ambiguous and uncertain stimuli, lead them to integrate information across sensory modalities. My laboratory has developed a novel multisensory decision task that uses dynamic, time-varying auditory and visual stimuli. We have collected data from rats and humans on the task and report three main findings. First, we have found that for multisensory stimuli, both species show improvements in accuracy that are close to the statistically optimal prediction. Next, we report that subjects make use of time in a similar way for unisensory and multisensory stimuli, and for reliable and unreliable stimuli. Finally, we report that synchronous activation of auditory and visual circuitry likely does not drive the improvements in accuracy, since a comparable improvement was evident even when auditory and visual stimuli were presented asynchronously.
Taken together, these findings identify two possible strategies, integrating across time and integrating across sensory modalities, that can help animals overcome sensory uncertainty to make better decisions. Because the inherent variability of cortical neurons renders all stimuli to some degree uncertain, integrating over time or across modalities may be a strategy that is apparent in many circumstances.
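As a toy illustration of the "statistically optimal prediction" for multisensory accuracy, here is a minimal sketch under the standard ideal-observer assumption of independent Gaussian cues (an assumption not stated in the abstract): each cue is weighted by its inverse variance, so the combined estimate is always more reliable than either cue alone.

```python
import math

def combined_sigma(sigma_a, sigma_v):
    """SD of the optimal (inverse-variance weighted) combination of two
    independent Gaussian cues; always below either unisensory SD."""
    va, vv = sigma_a ** 2, sigma_v ** 2
    return math.sqrt(va * vv / (va + vv))

# Equally reliable cues: discrimination threshold improves by a factor of sqrt(2)
print(round(combined_sigma(1.0, 1.0), 4))  # -> 0.7071
```

Deviations from this benchmark (e.g. with asynchronous cues) are what make the near-optimal performance reported above informative.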
Substantial recent work has explored multiple mechanisms of decision-making in humans and other animals. Functionally and anatomically distinct modules have been identified, and their individual properties have been examined using intricate behavioural and neural tools. I will discuss the background of these studies, and show fMRI results that suggest closer and more complex interactions between the mechanisms than originally conceived. In some circumstances, model-free methods seize control after much less experience than would seem normative; in others, temporal difference prediction errors, which are epiphenomenal for the model-based system, are nevertheless present and apparently effective. Finally, I will show that model-free and model-based methods on occasion both cower in the face of Pavlovian influences, and will try to reconcile this as a form of robust control.
Sensory systems need to identify, quickly and accurately, the composition of noisy, time-varying sensory scenes. We will consider what this fundamentally implies for the response properties of sensory neurons. In particular, the way sensory neurons should integrate their input and compete with each other depends strongly on assumptions about the sensory noise and the temporal statistics of sensory stimuli. Past models assumed Gaussian noise and static stimuli, which led to "predictive coding", i.e. the notion that sensory neurons should respond to the difference between their sensory input and a prediction of this input by other neurons. This implies subtractive lateral inhibition and/or center/surround receptive fields. However, sensory inputs are neither static nor corrupted by Gaussian noise. They are dynamic, strictly positive, and corrupted by signal-dependent noise (i.e. noise whose variance increases with the mean).
This implies that sensory neurons should compete, not by inhibiting each other through lateral inhibition, as commonly assumed, but by selectively shunting the inputs to other neurons that they can predict. This results in "divisive normalization" and a profound reshaping of sensory receptive fields by the context and past/surrounding stimuli. Many puzzling contextual and adaptive effects on sensory receptive fields can be explained in this manner. Thus, the concept of "receptive field" in sensory processing is meaningless and should be replaced by a "predictive field". We will show how this model accounts for recent data reported in early olfactory and visual processing. We will consider how these "predictive fields" could be learnt and measured experimentally.
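The contrast between the two competition schemes can be sketched in a few lines. The functions and numbers below are illustrative stand-ins, not the model from the talk: subtraction removes the predicted component, while division shunts (scales down) the response to predictable input.

```python
def subtractive_pc(x, prediction):
    # classic predictive coding: respond only to the unexplained residual
    return [xi - pi for xi, pi in zip(x, prediction)]

def divisive_norm(x, prediction, sigma=1.0):
    # divisive scheme: predictable input shunts the response multiplicatively
    return [xi / (sigma + pi) for xi, pi in zip(x, prediction)]

x, pred = [4.0, 2.0], [2.0, 0.0]
print(subtractive_pc(x, pred))  # [2.0, 2.0]
print(divisive_norm(x, pred))   # [1.333..., 2.0]
```

Note the qualitative difference: under division, a well-predicted input is attenuated in proportion to the prediction rather than cancelled, which is what reshapes receptive fields contextually.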
Interactions between frontal cortex and basal ganglia are instrumental in supporting motivated control over action and learning. Computational models have been proposed at multiple levels of description, from biophysics up to algorithmic approaches. I will describe recent attempts to link across levels of description to develop on the one hand, mechanistic neural models with sufficient detail to make predictions about electrophysiology, pharmacology and genetic manipulations, and on the other hand, higher level computational descriptions which often have normative interpretations and, pragmatically, are more suited to quantitatively fit behavioral data. By fitting outputs of neural models with reduced versions, one can derive predictions about how parametric variation of particular neural mechanisms should give rise to observable change in latent computational parameters -- even if the two levels are not perfectly isomorphic. Examples include the impact of dopamine on learning and choice incentive, prefrontal-subthalamic modulation of decision thresholds, and hierarchical control over actions across multiple corticostriatal circuits. In each case, the (optimistic) result is a better understanding of the domain than that afforded by either level of model alone.
Single-neuron activity in prefrontal cortex (PFC) is often tuned to mixtures of multiple task-related aspects. Such mixed selectivity is highly heterogeneous, seemingly disordered and difficult to interpret.
Because of its prominence in PFC, it is natural to ask whether such heterogeneity plays a role in subserving the cognitive functions ascribed to this area. We addressed this question by analyzing the neural activity recorded in PFC during an object sequence memory task. We first show that mixed selectivity neurons can be as informative as highly selective cells. Each task-relevant aspect can be decoded from the population of recorded neurons even when the selectivity to that aspect is eliminated from individual cells. We then show that the recorded mixed selectivity neurons actually offer a significant computational advantage over specialized cells in terms of the repertoire of input-output functions that are implementable by readout neurons. The superior performance is due to the fact that the recorded mixed selectivity neurons respond to highly diverse non-linear mixtures of the task-relevant variables. This property of the responses is a signature of the high-dimensionality of the neural representations.
We report that the recorded neural representations actually have maximal dimensionality. Crucially, we also observed that this dimensionality is predictive of animal behavior. Indeed, in error trials the measured dimensionality of the neural representations collapses. Surprisingly, in these trials it was still possible to decode all task-relevant aspects, indicating that the errors are not due to a failure in coding or remembering the sensory stimuli, but instead in the way the information about the stimuli is mixed in the neuronal responses. Our findings suggest that the focus of attention should be moved from neurons that exhibit easily interpretable response tuning to the widely observed, but rarely analyzed, mixed selectivity neurons. Work done with M. Rigotti, O. Barak, M. Warden, N. Daw, X-J Wang, E.K. Miller.
All thalamic inputs that are relayed to cortex come in axons that also send a branch to motor structures. Thus, cortex receives information from sensory receptors about the body and the world, and about subcortical activity, from first order thalamic relays (see Sherman abstract), and about cortical processing of those inputs from higher order relays. In addition, cortex also receives from all of these inputs copies of instructions for upcoming actions (efference copies) that are on their way to execution in the motor branches. That is, essentially all the information that cortex receives from thalamus, i.e. most of the information that cortex receives, concerns sensorimotor contingencies (O'Regan and Noë, 2001, Behav. Brain Sci., 24, 939-973), not purely sensory information. 'Sensory' here refers to past events, 'motor' to future ones. Wolpert & Miall (1996, Neural Netw., 9, 1265-1279) discuss how efference copies generate "forward models" of upcoming actions. Thalamus, as a gate controlling information transfer to cortex, controls the generation of ubiquitous cortical forward models. Vukadinovic (2012, EJN, 34, 1031-9) relates the thalamic gate to the control of forward models, arguing that a closed gate prevents actions from being recognized as generated by the organism, i.e. the self, and suggesting that this links functional and structural abnormalities of the thalamus to some symptoms of schizophrenia; Rolfs et al. (2011, Nature Neuroscience, 14, 252-256) demonstrate the role of forward models in attention. The ubiquity of efference copies in thalamocortical circuits suggests that key problems of the self and of attention depend on readily identifiable thalamocortical pathways.
Many objects around us have values which have been acquired through our life-long history. This suggests that the values of individual objects are stored in the brain as long-term memories. We discovered that such object-value memories are represented in part of the basal ganglia including the tail of the caudate nucleus (CDt) and the substantia nigra pars reticulata (SNr). We had monkeys experience many visual objects repeatedly, each of which was consistently associated with a large reward (high-valued) or a small reward (low-valued). After learning sessions across several days, CDt and SNr neurons started showing differential responses to the objects. Many of the CDt neurons showed excitatory responses to high-valued objects more strongly than to low-valued objects. SNr neurons were inhibited by high-valued objects and excited by low-valued objects. These responses occurred even though rewards were given in a non-contingent manner. Many of the SNr neurons projected to the superior colliculus, suggesting that the reward-dependent visual signals are used for controlling saccadic eye movements. Indeed, when these visual objects were presented simultaneously, the monkeys tended to look at high-valued objects even though no reward was given. Thus, the CDt-SNr-SC system enables animals to choose and look at high-valued objects automatically.
I will describe a range of models, from cellular to cortical scales, that illuminate how we accumulate evidence and make simple decisions. Large networks composed of individual spiking neurons can capture biophysical details of synaptic transmission and neuromodulation, but their complexity renders them opaque to analysis. Employing methods of mean field and dynamical systems theory, I will argue that these high-dimensional stochastic differential equations can be approximated by simple drift-diffusion (DD) processes like those used to fit behavioral data in cognitive psychology. The DD models are analytically tractable, coincide with optimal methods from statistical decision theory, and prompt new experiments as well as questions about why we fail to optimize. If time permits, I will describe work in progress on a multi-area model of attention and decision making.
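A minimal simulation of the drift-diffusion process referred to above, under textbook assumptions (constant drift, additive white noise, symmetric absorbing thresholds); all parameter values are illustrative, not fitted to any data from the talk.

```python
import random

def dd_trial(drift=0.2, threshold=1.0, dt=0.01, sigma=1.0, seed=None):
    """One drift-diffusion trial: evidence x integrates a constant drift
    plus white noise until it hits +threshold ("correct") or -threshold.
    Returns (choice, decision_time)."""
    rng = random.Random(seed)
    x, t = 0.0, 0.0
    while abs(x) < threshold:
        x += drift * dt + sigma * (dt ** 0.5) * rng.gauss(0.0, 1.0)
        t += dt
    return (x >= threshold), t

# Accuracy approaches the analytic 1 / (1 + exp(-2*drift*threshold/sigma**2)),
# here about 0.60 -- one reason the DD reduction is analytically tractable.
n = 2000
accuracy = sum(dd_trial(seed=i)[0] for i in range(n)) / n
print(round(accuracy, 3))
```

The same closed-form accuracy/speed expressions are what allow DD models to be fit to behavioral data and compared against optimality benchmarks.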
The talk will draw on joint work with Fuat Balci, Rafal Bogacz, Jonathan Cohen, Philip Eckhoff, Sam Feng, Mike Schwemmer, Eric Shea-Brown, Patrick Simen, Marieke van Vugt, KongFatt Wong-Lin and Miriam Zacksenhouse.
Research supported by NIMH and AFOSR.
Several authors have previously discussed the use of loglinear models, often called maximum entropy models, for analyzing spike train data to detect synchrony. The usual loglinear modeling techniques, however, do not allow for time-varying firing rates that typically appear in stimulus-driven (or action-driven) neurons, nor do they incorporate non-Poisson history effects or covariate effects. I will outline a generalization of the usual approach, which combines point process regression models of individual-neuron activity with loglinear models of multiway synchronous interaction (Kass, Kelly, and Loh, 2011, Annals of Applied Statistics; Kelly and Kass, 2012, Neural Computation). I will also describe a method, based on Bayesian control of false discoveries, for assessing the large number of pairs of neurons that are typically examined in a single experiment. Preliminary physiological results come from Utah array recordings in V1.
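As a deliberately simplified illustration of synchrony detection (ignoring the time-varying rates, history effects, and covariates that the generalized approach handles), one can compare a pair's observed coincidence count with the prediction of the independence (first-order loglinear) model; the rates and synchrony level below are invented for the example.

```python
import random

random.seed(0)
T = 10000                 # time bins
p1, p2 = 0.05, 0.08      # baseline firing probabilities per bin
spikes1, spikes2 = [], []
for _ in range(T):
    sync = random.random() < 0.01   # injected excess synchrony
    spikes1.append(1 if (sync or random.random() < p1) else 0)
    spikes2.append(1 if (sync or random.random() < p2) else 0)

observed = sum(a * b for a, b in zip(spikes1, spikes2))
r1, r2 = sum(spikes1) / T, sum(spikes2) / T
expected = T * r1 * r2    # coincidence count predicted by independence
print(observed, round(expected, 1))  # observed well above the independence count
```

With many pairs tested this way, the excess over `expected` must be assessed with multiplicity control, which is where the Bayesian false-discovery machinery mentioned above enters.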
Our natural environments contain too much information for the visual system to represent. Therefore, attentional mechanisms are necessary to mediate the selection of behaviorally relevant information. Much progress has been made to further our understanding of the modulation of neural processing in visual cortex. However, our understanding of how these modulatory signals are generated and controlled is still poor. In the first part of my talk, I will discuss recent functional magnetic resonance imaging and transcranial magnetic stimulation studies directed at topographically organized frontal and parietal cortex in humans to reveal the mechanisms underlying space-based control of selective attention. In the second part of my talk, I will discuss recent monkey physiology studies that suggest an important function of a thalamic nucleus, the pulvinar, in controlling the routing of information through visual cortex during spatial attention. Together, these studies indicate that a large-scale network of high-order cortical as well as thalamic brain regions is involved with the control of space-based selection of visual information from the environment.
The first ingredient in any reinforcement learning model is the state space -- a description of the task in terms of a sequence of situations (states) that (hopefully, if they possess the Markov property) embody in them all the information needed to determine the probability of immediate rewards and state transitions given an action. For many tasks, the state space is not trivial, and must be learned. I will first demonstrate, using a simple perceptual judgement task, that state spaces are themselves learned, and then argue that the orbitofrontal cortex (OFC), a region well known for its pervasive yet subtle influence on decision making, encodes a map of the states of the current task and their inter-relations. This map provides a state space for reinforcement learning elsewhere in the brain, and is especially critical in complex tasks. I will use this hypothesis to explain recent experimental findings in an odor guided choice task (Takahashi et al, Nature Neuroscience 2011) as well as classic findings in reversal learning and extinction. In addition, I will lay out a number of testable experimental predictions that can distinguish our theory from other accounts of OFC function.
Work with Robert C. Wilson, Samuel J. Gershman and Geoffrey Schoenbaum.
A diverse array of studies has shown that discrimination thresholds for sensory variables are often proportional to the stimulus magnitude, a phenomenon known as Weber's law. Typical explanations invoke a finely tuned combination of a nonlinear neural representation and noise in neural responses. For instance, one such explanation assumes that neural responses are sensitive to the logarithm of the sensory variable (or the ratio of sensory variables), corrupted by noise with fixed variance. Here we propose a purely computational explanation for Weber's law which does not require an appeal to internal representation or noise. Rather, we suggest that it arises purely from the statistical nature of the problem faced by the brain. For example, imagine having to estimate the number of items in a scene. If we treat the items as blobs of activity in feature maps, the total sum of the activity provides a numerosity estimate. If the variability within the feature map is independent, the variance of the estimate scales with the mean numerosity, in contrast to Weber's law, which predicts that the variance scales with the square of the mean. However, the independence assumption is problematic because activity in such maps is likely to be scaled by global parameters which vary from trial to trial, such as the overall luminosity of the image. It is easy to show that the presence of such global scaling parameters correlates neural activity in such a way that the variance of the total sum scales with the square of the mean, thus yielding Weber's law. This simple intuition can be generalized to more complex models by considering inference in scale mixture models, such as the Gaussian scale mixture or Gamma-Poisson mixture models, which precisely replicate Weber's law when the variance of the scale parameter is large enough.
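The variance-scaling argument can be checked with a small simulation; the rate and gain variability below are illustrative assumptions, not values from the abstract. Independent Poisson blobs alone would give a coefficient of variation that shrinks like one over the square root of the mean, but a shared trial-to-trial gain keeps it roughly constant, as Weber's law requires.

```python
import math
import random

random.seed(1)

def poisson(lam):
    """Poisson sample via Knuth's algorithm (adequate for small lam)."""
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= random.random()
        if p <= L:
            return k
        k += 1

def numerosity_samples(n_items, trials=5000, gain_sd=0.3):
    """Summed activity of n_items Poisson 'blobs' sharing a per-trial
    global gain (e.g. overall luminance). The shared gain correlates the
    blobs, so the SD of the sum grows linearly with the mean."""
    rate = 10.0
    totals = []
    for _ in range(trials):
        g = max(0.0, random.gauss(1.0, gain_sd))  # global scaling parameter
        totals.append(sum(poisson(rate * g) for _ in range(n_items)))
    return totals

def cv(xs):
    m = sum(xs) / len(xs)
    var = sum((x - m) ** 2 for x in xs) / len(xs)
    return math.sqrt(var) / m

# Weber's law: the coefficient of variation stays roughly constant as the
# mean numerosity quadruples (without the shared gain it would halve)
print(round(cv(numerosity_samples(4)), 2), round(cv(numerosity_samples(16)), 2))
```

Setting `gain_sd=0` recovers the independent-Poisson case, where the coefficient of variation falls with set size instead of staying constant.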
Neuroeconomics seeks to characterize the computational and neurobiological basis of different types of decisions. This talk will discuss a series of studies designed to understand how the brain makes simple choices, such as whether to choose an apple or an orange, as well as the quality of the resulting decision. This includes understanding how the brain assigns value to stimuli at the time of choice, how values are computed to make a choice and to generate the motor movements necessary to implement it, and how these basic processes extend to more complex choice situations.
Glutamatergic inputs in thalamus and cortex can be classified into two categories: Class 1 (driver) and Class 2 (modulator). Following the logic that identifying driver pathways in thalamus and cortex permits insights into information processing leads to the conclusion that there are two types of thalamic relay: first order nuclei like the LGN receive driver input from a subcortical source (i.e., retina), whereas higher order nuclei like the pulvinar relay driver input from layer 5 of one cortical area to another. This thalamic division is also seen in other sensory systems: for the somatosensory system, first order is VPM/L and higher order is POm; and for the auditory system, first order is MGBv and higher order is MGBd. Furthermore, this first and higher order classification extends beyond sensory systems. Indeed, it appears that most of thalamus by volume consists of higher order relays. Many, and perhaps all, direct driver connections between cortical areas are paralleled by an indirect cortico-thalamo-cortical (transthalamic) driver route involving higher order thalamic relays. Such thalamic relays play a heretofore unappreciated role in cortical functioning, and this assessment challenges and extends conventional views regarding both the role of thalamus and the mechanisms of corticocortical communication. Evidence for this transthalamic circuit, as well as speculations as to why these two parallel routes exist, will be offered.
The question I will address in the lecture is how information is retrieved from memory when there are no precise item-specific cues. Real life examples are when you try to recall the names of your class-mates, or your favorite writers, or places to see in Rome. I hypothesize that in this situation, retrieval occurs in an associative manner, i.e. each recalled item is triggering the retrieval of a subsequent one. Mathematically this problem can be reduced to random graphs, and general results about the retrieval capacity of the recall can be derived. The main conclusion of the analysis is that retrieval capacity is severely limited, such that only a small fraction of items can be recalled, with characteristic power-law scaling with the total number of items in memory. Theoretical results can be compared to free recall experiments and surprisingly good agreement is observed.
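A toy version of the associative-retrieval idea makes the limited capacity concrete. The specific transition rule used here (jump to the most similar item, excluding the one just visited) is an illustrative assumption, not necessarily the model in the talk; the point is only that such a deterministic associative walk soon enters a cycle, so the recalled set grows much more slowly than the number of stored items.

```python
import random

def recall_walk(n_items, rng):
    """Associative recall as a walk on a random similarity graph: each
    recalled item triggers its most similar associate (excluding the item
    just visited). The walk soon enters a cycle, so only a fraction of
    the stored items is ever recalled."""
    sim = [[0.0] * n_items for _ in range(n_items)]
    for i in range(n_items):
        for j in range(i + 1, n_items):
            sim[i][j] = sim[j][i] = rng.random()
    recalled, current, prev = {0}, 0, -1
    for _ in range(10 * n_items):
        nxt = max((j for j in range(n_items) if j not in (current, prev)),
                  key=lambda j: sim[current][j])
        prev, current = current, nxt
        recalled.add(current)
    return len(recalled)

rng = random.Random(2)
for n in (16, 64, 256):
    print(n, recall_walk(n, rng))  # recalled count grows much slower than n
```

Averaging such runs over many random similarity matrices is the kind of experiment one would compare against the analytic power-law scaling mentioned above.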
Visual attention is believed to enhance neuronal activity across cortical areas through dynamic interactions between top down and bottom up pathways. To test the hypothesis that attention enhances bottom-up processing, we studied the influence of attention at the very first processing stage in primary visual cortex -the geniculocortical synapse. Animals were trained to attend to one of two drifting gratings and report the occurrence of a contrast change. While recording from identified neurons in cortical layer 4C, we delivered brief, electrical shocks to retinotopically-matching regions of the LGN. Shocks were delivered while animals attended towards or away from the receptive fields of recorded neurons during a time window just prior to the contrast change. Importantly, stimulation levels were set such that half of the stimulation trials resulted in a monosynaptic spike. Our results reveal a significant influence of attention on geniculocortical communication, as the majority of cortical neurons in layer 4C show an increase in the probability of generating an electrically-evoked spike when monkeys attend to the stimulus overlapping their receptive field. Attention also reduces the timing jitter of postsynaptic responses within and between cortical neurons. To our knowledge, these results represent the first study of attention at a synaptic level, and demonstrate that attention can enhance neuronal communication at the very first synapse in visual cortex.
Work done in collaboration with Farran Briggs and George R. Mangun. This work was supported by NIH grants EY018683, EY013588, MH055714, and NSF grant 1228535.
Cognitive tasks require the joint activity of a large population of neurons. Hence there is no a priori reason to find single neurons with easily interpretable activity profiles. Yet when we record from single neurons, we look for, and bias our models by, precisely these neurons. Here, we use data from the delayed vibrotactile discrimination task from the Romo laboratory to highlight the "not easily interpretable" neurons [1,2]. We compare three different models to data recorded from the prefrontal cortex of monkeys performing this task. The first model is a highly organized linear attractor model, where the internal connectivity is dictated by the neurons' tuning curves. The second model is a randomly connected network with chaotic activity, where the only training is done on the readout. The third model is an intermediate obtained by training the internal and external connectivity of an initially random network. We show that the data most resemble neurons from the intermediate model, but that some "orderly" features of the data are present in the chaotic model as well. Initially random networks are able to perform a working memory and decision making task after training. Our results suggest that prefrontal networks may begin in a random state relative to the task, and initially rely on a modified readout for task performance. As training proceeds, more tuned neurons with less time-varying responses should emerge as the networks become more structured. Furthermore, our results provide a cautionary note about interpreting seemingly ordered features of the data as hinting at specific network structures.
[1] Barak O, Tsodyks M, Romo R. Neuronal population coding of parametric working memory. Journal of Neuroscience 2010;30(28):9424.
[2] Brody CD, Hernandez A, Zainos A, Romo R. Timing and neural encoding of somatosensory parametric working memory in macaque prefrontal cortex. Cerebral Cortex 2003;13(11):1196-1207.
Many sensory neurons are modulated, but not driven, by a second sensory input. Recently, Billock & Tsou (IMRF, 2011) showed that enhanced unisensory cells in rattlesnake optic tectum and cat cortex could be simulated by the enhanced firing rate produced when excitatory cells synchronize their firing. This oscillatory binding-like mechanism also accounts for the Principle of Inverse Enhancement, which is a key principle of sensory integration. The rattlesnake case is interesting because it involves mutual enhancement between neurons responding to visible light and to heat (infrared light). This is so close to a color vision problem that it suggests an approach to modeling the mysterious nonlinear enhancements and suppressions found for combining information in early color mechanisms. No matter how you slice it, color vision is expressible as a combination of three channels - the legacy of trichromacy - but the channels get more complicated and less well understood as you move from lower to higher order mechanisms. Some of the more mysterious channels involve nonlinear combinations of lower-order mechanisms. For example, one of the hue channels - the Yellow lobe of the Yellow-Blue channel - is thought to be constructed from a nonlinear combination of L- and M-cone driven mechanisms and shows signs of an expansive (superadditive) nonlinearity driven by the interaction. Similarly, chromatic brightness is like an enhanced (spectrally broadened) version of the luminance channel and is thought to involve nonlinear interactions of hue and luminance mechanisms. Using De Valois et al.'s (JOSA, 1966) macaque data, I modeled yellowness as a synchronization between excitatory Y+B- and G+R- P-cells. I modeled brightness as a synchronization between excitatory hue and luminance mechanisms. The result is like a hybrid of oscillatory sensory binding (which creates coherent percepts) and classic sensory integration (which focuses on influences of one sense on another).
Here, the influences are profound, resulting in a new percept fused from its sensory inputs.
People do not need supervision or incentives to learn rules. When learning stimulus-action mappings through reinforcement, they structure their policy into abstract rules (Collins & Koechlin 2012, Frank & Badre 2011), even when this does not afford any immediate advantage (Collins & Frank in press). We further investigate how and why individuals build such rules, using our structured reinforcement learning model to derive and test behavioral predictions. Subjects learned to select correct actions in response to stimuli presented in three different contexts. There were two stimulus-action rules, one valid in two contexts and the other in only a third context (though each rule was equally frequent across trials). Subsequent phases introduced new stimuli in old contexts and new contexts with old stimuli.
Consistent with model predictions, subjects transferred their self-constructed rule structure to new situations, at different processing levels. First, they gathered rule-specific, rather than context-specific, knowledge about new stimuli, thus learning faster by clustering contexts cueing the same rule. Second, they learned faster in new contexts by generalizing known rules across stimuli. Finally, when faced with a new context, subjects were more likely to try rules that were valid across multiple contexts than those that applied to only one, controlling for rule frequency. These results confirm our model's predictions, and show that the seemingly suboptimal strategy of building complex structure affords long term advantages given the opportunity to generalize across contexts.
Parkinson's disease (PD) is a neurodegenerative disorder affecting the basal ganglia (BG), a set of small subcortical nervous system nuclei. The hallmark of the disease is a dopaminergic denervation of the input stage of the BG, altering activity patterns along movement-related, BG-mediated pathways in the brain, thereby inducing movement disorders such as tremor at rest, bradykinesia, akinesia, and rigidity. It is still unclear how dopamine depletion causes these motor symptoms. Experimental studies have shown that abnormally synchronized oscillatory activities - rhythmic bursting activity at the neurocellular level and beta frequency band oscillations at the network level - emerge in PD at multiple levels of the BG-cortical loops and are correlated with motor symptoms. We propose a computational model of the BG using a novel unicellular mechanism to explain the induction of bursting activity and beta band oscillations in the network. We show how a single change in the dopaminergic level at the input stage of the BG can switch the model from its physiological state to the pathological state. This computational model also suggests a simple mechanism for high-frequency deep brain stimulation.
During perceptual decision-making, human behavior often varies beyond what can be explained by variability in the underlying sensory evidence. This variability might be due to noise in the inference or selection process, but no clear evidence has been presented in favor of either alternative.
We combine a multi-sample categorization task and ideal observer modeling to show that, in three different conditions, noise during the inference process alone best explains the behavioral variability of human subjects. This result supports the idea that perceptual inference is generally intractable and requires approximation, and puts in doubt the hypothesis that decisions are made by sampling from a posterior.
Neurophysiologically-based models of the visual system typically formalize information processing as a sequence of filters applied to visual input. But are all visual computations naturally formalized in this way? Humans parse the world into compositionally structured objects with part-whole and other relational attributes. These kinds of objects are naturally modeled with data structures like graphs, grammars and programs. How can a neural system realize and reason about such data structures? Here we use motion perception as a case study: we develop a Bayesian theory of inference over "motion trees" that express the compositional structure of motion stimuli, and apply it to classic experiments on the perception of moving dot patterns.
Visual sensation arises from the activity of large populations of neurons; however, sampling that population activity often involves trade-offs, either in the time resolution of the recording technique or in the sampling density. One approach to the analysis of population coding is to simulate ensemble data based on real neural responses in order to fill in the sampling gaps. To test which features of real neural responses and interactions are important for population coding, one needs a performance measure. Here we use the pursuit system as a model to test theories of cortical population coding of visual motion. Pursuit behavior offers a valuable performance metric for such a population model, since eye movement can be well-characterized, the neural circuitry is well-known, and pursuit initiation is tightly coupled to responses in area MT. We have synthesized an MT population based on extracellular single-unit recordings in macaques. MT neurons encode information about motion direction in the first few spikes of their response to movement, but single neuron thresholds for discriminating motion direction are about 10 times larger than those for pursuit and perception, indicating that target motion is estimated from the joint activity of the cortical population. We use data-driven simulations that preserve the heterogeneity of feature selectivity, dynamics, and temporal spike count correlations to compute the time course of population information about motion direction. Simulation allows us to compare several models of neural interactions. We compute the Cramér-Rao bound on the variance of target direction estimates over time in comparison to pursuit behavior. We find that the precision of motion estimates is influenced by the degree to which the simulation preserves the natural features of MT responses.
For example, preserving the natural heterogeneity in neural response dynamics improves direction estimation by a factor of 2, whereas preserving the natural temporal correlations in spike count, rather than treating spiking as Poisson, decreases the precision of population estimates by a factor of 2. This analysis allows us to quantify the impact of each response feature on population information, informing our understanding of the nature of the sensory code.
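As a point of reference for what such a bound looks like, here is a toy Cramér-Rao calculation for an idealized population of independent Poisson neurons with von Mises tuning. All parameters are illustrative, and unlike the abstract's simulations this sketch ignores heterogeneity and correlations.

```python
import numpy as np

# Illustrative parameters (not fitted to MT data)
N, r_max, base, kappa, T = 100, 30.0, 2.0, 2.0, 0.1   # neurons, peak Hz, baseline Hz, tuning width, window (s)
prefs = np.linspace(0, 2 * np.pi, N, endpoint=False)  # uniform preferred directions

def rate(theta):
    """Von Mises tuning curves evaluated at motion direction theta."""
    return base + r_max * np.exp(kappa * (np.cos(theta - prefs) - 1.0))

def rate_deriv(theta):
    """Analytic derivative of the tuning curves with respect to theta."""
    return -r_max * kappa * np.sin(theta - prefs) * np.exp(kappa * (np.cos(theta - prefs) - 1.0))

theta = 0.7  # true motion direction (rad)
# Fisher information per neuron for Poisson spike counts in a window of length T
fisher = T * rate_deriv(theta) ** 2 / rate(theta)
crb_best_single = 1.0 / np.sqrt(fisher.max())  # best single neuron's CRB (rad)
crb_population = 1.0 / np.sqrt(fisher.sum())   # pooled population CRB (rad)
```

For independent neurons the Fisher information simply sums, so the population bound is several times tighter than the best single neuron's, in the spirit of the roughly 10-fold gap between single-neuron and pursuit thresholds; correlations of the kind preserved in the abstract's simulations modify this picture.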
Work with O. Barak, M. Warden, N.D. Daw, X.-J. Wang, E.K. Miller, S. Fusi.
When designing tasks that address sequential decision making, one usually makes the rewards explicit and interprets observed behavior in terms of optimality criteria with respect to the generative model of the task variables. The situation in naturalistic tasks is different, in that the costs and benefits implicit in the observed behavior are not known. Thus, we ask: for which reward assignments to the component tasks is the observed behavior optimal? Here we express behavior as a combination of concurrent goals in the context of optimal control, which has the distinct advantage of expressing behavioral goals as reward functions. We show that, in such a setting, a specific formulation of inverse reinforcement learning can be derived that allows the recovery of reward weights, which quantify how much individual component tasks contribute to the overall behavior. We show how to recover the component reward weights for individual tasks and demonstrate through simulations that good estimates can already be obtained from minimal amounts of observation.
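The flavor of such weight recovery can be sketched with a deliberately tiny maximum-entropy-style inverse RL example (our own illustration, not the formulation derived in the talk): behavior is modeled as a Boltzmann distribution over candidate trajectories whose reward is a weighted sum of per-task features, and the weights are recovered by matching feature expectations.

```python
import numpy as np

rng = np.random.default_rng(0)
n_traj, n_tasks = 8, 3
Phi = rng.normal(size=(n_traj, n_tasks))   # per-task feature counts of candidate trajectories
w_true = np.array([1.0, 0.2, 0.5])         # hidden weights generating the behavior

def policy(w):
    """Boltzmann distribution over trajectories with reward Phi @ w."""
    z = Phi @ w
    e = np.exp(z - z.max())
    return e / e.sum()

# Observed behavior, summarized by its expected feature counts
phi_demo = policy(w_true) @ Phi

# Recover weights by gradient ascent on the (concave) log-likelihood:
# the gradient is demonstrated minus model feature expectations
w = np.zeros(n_tasks)
for _ in range(100_000):
    w += 0.05 * (phi_demo - policy(w) @ Phi)
```

At convergence the model's feature expectations match the demonstrations, and the recovered weights quantify how much each component task contributed to the overall behavior.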
We apply this framework to a sequence of experiments involving human participants in a multiple-objective navigation task in a virtual environment. Participants were given different task specifications leading to different walking behavior. We show that the recovered intrinsic reward weights reflect the given instructions on a trial-by-trial basis for individual subjects, but also that subjects have systematic biases that lead them to assign rewards to tasks they were not instructed to perform. Finally, we show how individual participants' eye movements on a trial-by-trial basis relate to the inferred reward weights.
Two critical difficulties plague neural implementations of recurrent auto-associative memories, such as area CA3 of the hippocampus. First, as stressed by Fusi (2005), synapses have only limited dynamic ranges. Memory thus has a palimpsest character, with traces degrading as new patterns are stored. Second, synapses sharing pre- or post-synaptic partners are significantly correlated (Song et al., 2005). This oft-ignored fact severely complicates recall, as the evidence at a synapse can only be interpreted in light of other synapses to the same neuron.
Here, we suggest systems- and circuit-level solutions to these problems in the context of the hippocampus. First, since traces degrade over time, pattern age needs to be considered for successful recall. As age is a form of unfamiliarity, reflected in the activity of a subpopulation of perirhinal neurons, we construct a dual system combining the hippocampus (for recollection) and perirhinal cortex (for familiarity), and show that it provides an efficient solution to this problem (Savin et al., 2011). Second, at the circuit level, (approximately) optimal retrieval dynamics predict a close link between CA3 synaptic plasticity and the homeostatic mechanisms reported in this region (Zhang and Linden, 2003), stabilization from feedback inhibition, and experimentally observed dendritic nonlinearities. Overall, our results provide a unifying view of various aspects of hippocampal circuitry, offering high-level solutions to synapse-level concerns.
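The first difficulty can be made concrete with a toy palimpsest simulation in the spirit of bounded, binary synapses (an illustration, not the model in the talk): each new pattern stochastically overwrites a fraction of the synapses, so the Hebbian trace of older patterns decays geometrically with age.

```python
import numpy as np

rng = np.random.default_rng(1)
N, P, q = 200, 60, 0.5   # neurons, patterns stored, plasticity probability

W = rng.integers(0, 2, size=(N, N)).astype(float)   # bounded binary synapses, states {0, 1}
patterns = rng.choice([-1.0, 1.0], size=(P, N))

for xi in patterns:                        # store patterns one after another
    H = np.outer(xi, xi)                   # Hebbian template: +1 potentiate, -1 depress
    update = rng.random((N, N)) < q        # each synapse is plastic with probability q
    W = np.where(update, (H > 0).astype(float), W)

# Trace strength of each stored pattern in the final weight matrix
signals = np.array([np.mean((2 * W - 1) * np.outer(xi, xi)) for xi in patterns])
```

The most recent pattern leaves a trace of roughly q, and each subsequent storage multiplies older traces by about (1 - q), so the earliest patterns become unreadable. Recall must therefore weigh synaptic evidence by pattern age, which is the systems-level problem the perirhinal familiarity signal is proposed to solve.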
Diffusion models allow choice behavior to be understood by analogy to simple physical processes: specifically, Poisson spike processes, which neurons seem to implement approximately. Psychophysical laws, on the other hand, serve to organize diverse behavioral phenomena into simple regularities. A very simple form of diffusion model can account for several psychophysical laws, including Weber's law of choice accuracy in two-alternative decision making (Link, 1992). This law states that response accuracy should remain constant as the intensity of the two stimuli being compared changes proportionally. It has been known to apply in experiments involving perceptual-intensity discrimination since the 1850s, and has more recently been shown to hold in temporal-duration discrimination. In its account of Weber's law, this simple diffusion model furthermore predicts a number of salient properties of behavioral response time (RT) data, some of which, to our knowledge, have never previously been predicted or tested. For example, the model gives a simple account of the constancy of the coefficient of variation (CV) of RTs that is typically observed across psychophysical task conditions in decision making (Wagenmakers & Brown, 2007) and timing (Simen et al., 2011). At the same time, it predicts dramatic violations of this constancy when task conditions differ in key respects. The model accounts for a decrease in this CV with increasing task practice as a byproduct of reward-rate optimization, and it predicts a response-time corollary of Weber's law: that RTs should decrease as the two stimulus intensities increase proportionally. Most notably, simple diffusion models predict an upper bound on the CV of observed RTs that is equal to the square root of 2/3 under all conditions. We have begun testing the model's behavioral predictions, and the evidence gathered so far suggests that they hold.
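The Weber's-law account and its RT corollary follow directly from the standard closed-form expressions for a two-boundary diffusion. In the sketch below, the mapping of drift to the intensity difference and of diffusion variance to the intensity sum is our illustrative assumption in the spirit of Link (1992), not necessarily the talk's exact parameterization.

```python
import numpy as np

def ddm(mu, sigma2, A):
    """Accuracy and mean decision time of a drift-diffusion process with
    drift mu, diffusion variance sigma2, and absorbing bounds at +/-A."""
    accuracy = 1.0 / (1.0 + np.exp(-2.0 * mu * A / sigma2))
    mean_dt = (A / mu) * np.tanh(mu * A / sigma2)
    return accuracy, mean_dt

# Assumed mapping: drift ~ intensity difference, variance ~ intensity sum
I1, I2, A = 1.2, 0.8, 2.0
acc_base, rt_base = ddm(I1 - I2, I1 + I2, A)
acc_scaled, rt_scaled = ddm(4 * (I1 - I2), 4 * (I1 + I2), A)  # both intensities x4
```

Scaling both intensities proportionally leaves the ratio mu/sigma2, and hence accuracy, unchanged (Weber's law), while the mean decision time shrinks by the same scaling factor: the predicted response-time corollary.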