-
Primate PFC and agate model

Here's a quick summary of the new Hazy, Frank & O'Reilly (2021) primate-oriented synthesis of a range of data on BG / PFC:
This is a nice story with clear divisions of labor, and it makes sense according to the classical BG gating model. But as discussed above, this classical output gating model doesn't really fit with the Rubicon goal-directed model, which requires BG gating of active maintenance of goal / plan states. Also, the primate-specific nature of the layer 3 active maintenance mechanism calls into question the generality of this framework for the more foundational Rubicon goal-management model, which we think is very much present in rodents, is the functional origin of OFC / ACC / SMA-like brain areas, and carries forward into primates too.

Thus, in this context, it may make more sense to think of the layer 3 agate mechanism as an evolutionarily late adaptation for holding onto lots of bits of information during fluid online processing of complex internal mental models. This is not needed in the core rodent-level goal-driven dynamics, where the BG plays a central role.
-
Updated synthesis across rodent, primate, and MD / VThal

Note: Updated based on lit review below: #56 (comment)
-
Turning off active maint at the "Reckoning"

This perennial question of how the active maintenance gets turned off re-emerges here: when we get our US (or give up on it), how does that then reset the maintenance?
-
Mechanisms of maintenance in layer 5, rodent

The emerging story from the top references below is that maintenance occurs via an SRN-like chain of MD <-> mPFC (also termed prelimbic cortex, PrL, a putative rodent analog of dlPFC in primates) interactions, over relatively long time scales (e.g., each mPFC neuron being active for roughly 2-5 secs), with a dynamically updating population of neurons chaining along "relay race" style. Computationally, this is like the reservoir computing dynamic of a randomly shifting pattern, but with much longer update windows, presumably due to the time dynamics of NMDA, bursting, etc. in the MD <-> PFC circuits. It gives each point in time a unique timestamp, in the form of the pattern of activity present at that moment, and thus provides a good timing signal.

This rat-level dynamic differs from the classical primate dlPFC delay period activity, where a more "canonical" sustained activity pattern is maintained over time, presumably via the specialized layer 3 circuits. However, the delay windows are relatively long -- 5 sec or more. At shorter intervals, it is possible that individual neurons could span the gap.

The Halassa lab studies show that MD -> PFC has a contextual gating-like effect, helping to translate cues into task rule behavior. MD receives cue information from a subset of cue-selective PFC cells, and then projects that back to task-selective PFC cells that are cue-independent but mediate actual task performance. They talk a lot about MD -> inhibitory effects, but my sense is that this is similar to how all thalamic projections are likely to drive both excitation and inhibition in ways that cancel out for task-relevant cells but inhibit task-irrelevant ones.

A key missing part of this story is the potential role of the drivers into the MD vs. the CT top-down "predictive" inputs -- for the cue-selective parts, it would seem that the drivers are task-relevant cues (e.g., from OFC?). Others could have motor output drivers. Need to go back and find the papers on that!

The MD <-> PFC nature of this maintenance circuit suggests a natural way in which the stronger layer 3 recurrent maintenance could evolve. Overall, this literature provides an important additional ingredient to the "agate" model above -- sequences of SRN updates instead of some kind of more static sustained activity. It also shows that in PFC the time scale of these CT / SRN updates is much longer than in posterior cortex (2+ sec vs 200 msec), providing a clue for the long-sought bridge to longer behaviorally-relevant timescales.

Note that in well-practiced tasks, OFC is not important (but mPFC / PrL is reliably important), and it has a separate MD <-> OFC pathway -- we can plausibly assume similar dynamics across all areas.

RikhyeGilraHalassa18
SchmittWimmerNakajimaEtAl17
NakajimaSchmittHalassa19

Another paper in the above series, showing PFC -> BG -> visTRN control over visual processing -- another mechanism of top-down control. Need to look back at this again later.

ChernyshevaSychFominsEtAl21

Summary paper: ParnaudeauBolkanKellendonk18

BolkanStujenskeParnaudeauEtAl17

This is ref 8 from above.

Same group as 7: SpellmanRigottiAhmariEtAl15

Note: this and the above paper used a much simpler deterministic alternation paradigm, and they readily admit that this prevents anything interesting from being seen during the delay period. Thus, the usual task state maintenance that should be happening during the delay period (as in the Halassa lab papers) is missing from these papers.

AkhlaghpourWiskerkeChoiEtAl16

Ref 14 from above, recording in dmStr. To some extent, it shows a sequence of delay-period activity like mPFC, but actually most information encoding was at the start:
Thus, it seems likely that BG is doing early gating and then likely reflecting delay-period activity?
YangShiWangEtAl14

This is a "direct hit" for large-N electrophysiology in the rat, doing a delayed alternation Y-maze task. They do not show sustained delay period activity, but rather a sequence of firing over time that looks more like a "relay race" -- a chain of more short-term windows of activity. This would give better timing information, and should be something that the SRN-style CT layers can support.
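To make the "relay race" timing idea concrete, here is a minimal toy sketch. It assumes (hypothetically) that each neuron is active for one fixed window and that successive neurons start at evenly staggered onsets; the function names and parameter values are illustrative only and are not taken from the paper or from any model code.

```python
def relay_pattern(t, n_neurons=10, window=3.0, stagger=1.0):
    """Return the set of neuron indices active at time t (seconds).

    Hypothetical toy: neuron i is active on [i*stagger, i*stagger + window).
    """
    return frozenset(
        i for i in range(n_neurons)
        if i * stagger <= t < i * stagger + window
    )


def is_unique_timestamp(times, **kw):
    """True if every sampled time yields a distinct active population."""
    patterns = [relay_pattern(t, **kw) for t in times]
    return len(set(patterns)) == len(patterns)


# Sampling once per stagger interval gives a distinct pattern each time,
# which is the sense in which the chain acts as a timing signal.
samples = [0.5 + k for k in range(12)]
print(is_unique_timestamp(samples))
```

In this toy, the timing resolution is limited by the stagger between onsets: two samples that fall within the same stagger interval see the same active population. A static sustained pattern, by contrast, would return the same set at every time point and carry no timing information.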
WangStradtmanWangEtAl08

Tom Hazy pointed out this paper a while ago -- they find "classical" NMDAR-based sustained maintenance mechanisms in layer 5 PFC in the rat. These are the same NR2B subunit type as in the layer 3 primate circuits, enriched in mPFC in rat layer 5 relative to other cortical areas. These NMDA currents have a longer decay time (81 ms vs 43 ms in V1) -- TODO: could try faster tau in posterior models.
DembrowChitwoodJohnston10
DembrowJohnston14

Review paper of above.

GilmartinBalderstonHelmstetter14

Lots of interesting data on PL / mPFC for trace conditioning, and ACh modulation thereof.

SreenivasanDEsposito19

Broad review paper.

GuoYamawakiSvobodaEtAl18
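As a quick worked example of what the 81 ms vs 43 ms NMDA decay times noted for WangStradtmanWangEtAl08 would mean, the sketch below compares how much conductance remains after a given interval. It assumes a simple single-exponential decay, which is a simplification on my part and not something stated in the notes; the function name is hypothetical.

```python
import math

TAU_MPFC_MS = 81.0  # mPFC layer 5 decay time quoted in the notes
TAU_V1_MS = 43.0    # V1 decay time quoted in the notes


def fraction_remaining(t_ms, tau_ms):
    """Fraction of conductance left after t_ms, assuming exp(-t/tau) decay."""
    return math.exp(-t_ms / tau_ms)


for t in (50.0, 100.0, 200.0):
    pfc = fraction_remaining(t, TAU_MPFC_MS)
    v1 = fraction_remaining(t, TAU_V1_MS)
    print(f"t={t:5.0f} ms  mPFC={pfc:.3f}  V1={v1:.3f}")
```

Under this assumption, at 100 ms the mPFC value is roughly 0.29 while the V1 value is roughly 0.10, so the difference in tau compounds quickly over an interval on the order of one alpha cycle.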
-
Synapse-level NMDA active maintenance: fast weights binding

Todo: some evidence for this in above papers -- makes a lot of sense computationally -- explore in model!
-
If I'm understanding it correctly, all this is right in line with the way I was thinking about the role of the BG way back when I was writing my dissertation: phasic dopamine acts as a signal to instantiate a previously-learned "set" (action-plan selection).
-
Scope of BG Gating: Spirals and Loops

To make the BOA model, we need to know how to connect the xFC -> BG -> thal <-> xFC circuits. The simplest conceptual version of the Rubicon model is one global BG gating event that engages goal representations distributed across OFC, ACC, and dlPFC. However, there is a big literature on all the different pathways, and things are very different between primates and rodents, so we need to sort through this literature to determine a good starting point and what to anticipate in the design for future expansion into the primate level.

Primate loops and spirals

The original parallel loops paper is Alexander, DeLong & Strick (1986), identifying the following loops:

Note that there are both open and closed (looping back to itself) aspects to the loops. This figure from Bosch-BoujuHylandParr-Brownlie13 shows more clearly how there is a "spiraling" structure to the open parts of the loop: "higher" (dlPFC) and more "motivational" areas (OFC, ACC) drive "lower" areas:

This spiral is well documented by Haber and colleagues across the BG and dopaminergic pathways -- here's HaberMcFarland02, which has a good summary of relevant data:

And Tom Hazy's rendition of the thalamic connectivity from OReillyHazy22 (Thalamus chapter), usefully comparing the primate and rodent pathways:

And here's a detailed map of the thalamus from RovoUlbertAcsady12:

The green areas in VL / VA / VM and parts of MD indicate areas where BG inhibition is the primary driver, while red areas have cortical drivers, with the Pulvinar (Pul) areas being the prototype -- these are the predictive learning parts of the thalamus. Parts of the MD also have this cortical-driver property.

Rodent BG thalamocortical circuits

CollinsAnastasiadesMarlinEtAl18 and AnastasiadesCollinsCarter21 provide a good updated accounting of at least the prelimbic (PL) based circuit, which is the rough rodent analog of dlPFC.
NOTE: 20% of L5 are "CT" not "PT" -- could potentially lump these in with L6 pure CT
NOTE: this is consistent with Guo -- and with idea that CT is not directly gated by BG (or anything else).
Also, MD targets layer 2/3 much more strongly -- VM is fairly uniform.
VP -> MD: Root et al, 2015

RootMelendezZaborszkyEtAl15 is a major review of ventral pallidum (VP), which is a major output of the ventral striatum (nucleus accumbens) -- it is the equivalent of GPe for the
Fig8: VPvm showed phasic inhibition (hence disinhibition of MD) at onset of approach.
-
A Strong Rubicon MD / VThal Hypothesis

The Collins et al (2018), Anastasiades et al (2021), and Root et al (2015) papers in rodents (quoted above) together support the general idea from agate that MD and VM (Vthal) thalamic pathways have distinct cortical laminar targets, with MD targeting layer 3 "active maintenance" cells with more focal connectivity, while VM drives very broad modulatory impacts across the entire layer, including layer 5. Furthermore, MD activation of VIP+ is disinhibitory of NMDA Ca++ maintenance, while VM is net inhibitory, while providing transient PT activation. Notably, and in distinction with posterior cortical areas, layer 6 CT neurons do not receive direct input from either MD or VM.

Putting all of this together with the basic Rubicon framework:
Keeping MD gating separated and uniquely capable of driving layer 3 active maintenance in dlPFC, OFC, ACC provides a nice extra level of robustness for the goal-engaged state, vs. the "hustle and bustle" of Vthal gating during the task. Having a special VM -> turn off maintenance pathway is also critical for this long-sought mechanism. However, one missing piece is a way of turning off maintenance in OFC, ACC -- the MD does have some balanced effects on NDNF+ cells and could certainly serve as a toggle. Also, Tom points out in other papers some indication of diversity in the MD projections, with some more focal and some broader -- so here's a potential resolution:
-
Credit assignment for MD gating pathway across temporal delay

If we postulate MD gating driving active maintenance of goal reps in PFC areas, then we have to confront the same temporal credit assignment problem as always: how does DA at the final outcome drive learning that will sensibly affect gating at the start of goal selection next time around? There is a circular dependency: you have to already gate in a PFC rep and maintain it in some relatively consistent form over time, in order to associate that PFC rep with the BG Go pathway at the time of phasic DA at the outcome. The promise of avoiding these kinds of circular dependencies was one attraction of the non-BG gated agate model.

This is the mother of all learning problems -- the crux of all hard bootstrapping problems in action learning: how do you know what to do at the start (early in time) based on what happens at the end (later in time)? We have discussed this in many different contexts over the years, e.g., in terms of inverting forward (predictive) models, and PBWM, etc. Some of these options are discussed in #63 as well:

The CS-US bootstrap

By front-loading everything on a small fixed set of USs (unconditioned stimuli: motivationally privileged outcomes), we can potentially short-circuit the catch-22. Here's the general scenario:

There is some ambiguity in the recording data above about whether stable task-relevant bridging activity is actually present in rodent PL etc., but with the right paradigms, I think they did actually find it -- some paradigms were too degenerate to find something distinctive. Logically, this is necessary.

Ok, I think this is enough of a direction to proceed at this point.
-
BG Matrix Match Computation

At a very general ACT-R production-system level, the BG Matrix should detect "match" or "conditional" situations: fire when the object is A and position is B, etc. In the BOA Rubicon case, VS needs to detect when the CS matches a currently relevant drive state (e.g., food when hungry), and not just respond to any CS.

A general solution to this problem, consistent with striatum physiology and anatomy, and existing theories (Bogacz? Graybiel?), is that there are sparse and systematic patterns of conjunctive connectivity: e.g., a distinct subset of neurons receives from each different drive input, and, orthogonally, from different CS inputs, so distinct neurons can compute the conjunction (match) of drive * CS. This strategy requires high-threshold AND-like behavior, so the neurons don't fire for either drive or CS independently, but only for the conjunction. This is consistent with the quiet, inhibited behavior of striatal neurons. Thus, BG is uniquely situated for doing this AND "match" like computation.

Note that this is distinct from the broadly integrative conjunctive nature of hippocampal representations. Although maybe EC2 has some similar characteristics?
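Here is a minimal sketch of the high-threshold AND-like "match" computation described above. It is a toy under stated assumptions: the wiring table, the unit names, the unit-strength inputs, and the threshold of 1.5 are all hypothetical choices for illustration, not parameters from the BOA model.

```python
# Hypothetical sparse wiring: each Matrix unit receives exactly one drive
# input and one CS input (one unit per wired drive * CS conjunction).
UNITS = [("hunger", "food_cs"), ("thirst", "water_cs")]


def active_units(drives, css, units=UNITS, threshold=1.5):
    """Return the wired (drive, CS) units whose summed input exceeds threshold.

    With inputs in [0, 1] and a threshold between 1 and 2, a unit cannot
    fire from its drive or its CS alone, only from the conjunction.
    """
    return [
        (d, c) for d, c in units
        if drives.get(d, 0.0) + css.get(c, 0.0) > threshold
    ]


hungry = {"hunger": 1.0, "thirst": 0.0}
print(active_units(hungry, {"food_cs": 1.0}))   # matching CS
print(active_units(hungry, {"water_cs": 1.0}))  # CS for a non-active drive
```

The design choice here is that selectivity comes from two things together: the sparse wiring (which conjunctions exist at all) and the high threshold (which keeps single inputs subthreshold).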
-
PT thalamic gating dynamics

We finally need to confront the long-postponed challenge of how to properly simulate the effect of the BG -> thalamus gating signal on the PT neurons. There is a bit of a catch-22 situation here: you need the PT -> thalamus activity to allow thalamus to get active when the BG inhibition is removed, but you also want thalamus -> PT to have a special effect on the PT neurons (uniquely turning on maintenance currents in the basic Rubicon maintenance case).

Biologically, the MD thalamic inputs are likely to have more of an effect on apical dendrites in layer 3, where NMDA maintenance channels are concentrated. In the axon neuron, we can drive NMDA channels via the MD input selectively -- this works!

Also, there is a diversity of PT neuron firing profiles, with some showing active maintenance and others only phasic firing at putative BG gating time. We need to sort that out eventually, but for now we can have OFC and ACC just do the maintenance and not get to the other Vthal gating yet, which is likely to have more phasic effects + maintenance.
-
Discrete high-threshold behavior is inherently brittle

The Rubicon model, which posits a discrete gating event of goal engagement (triggered by VS -> VP -> MD gating of OFC, ACC, dlPFC) and relies upon the stable bridging activity strategy, creates a significant modeling challenge because any such discrete, rare event in a neural system is inherently brittle. Most existing RL and other AI models avoid this problem by choosing actions stochastically on a moment-by-moment basis, driven by current state representations, with no longer term goals or plans. However, it is critical to not continue to avoid this challenge, but rather to tackle it head-on, with full awareness of the nature of the problems and ways to make it as robust as possible.

The source of the brittleness is fairly obvious at a general level: if the gating event doesn't happen or happens at the wrong time, then you either miss the opportunity or lock the system into a bad goal state that drives behavior over some period of time. Because there can only be one goal active at a time (in terms of shaping current proactive behavior in a coordinated way), an ill-timed gating event can trigger the loss of an important prior goal -- if you just keep updating all the time, nothing proactive and coordinated will ever emerge. Likewise, never engaging (characteristic of major depressive disorder) results in a complete lack of proactive behavior. "The Gambler" by Kenny Rogers summarizes the situation well :)

(where 'em are goals, not cards)

The extensive experience with the PBWM BG-gating model shows that as a general-purpose working memory maintenance mechanism, gating is not very robust. At an abstract computational level, it is equivalent to a POMDP (partially-observable Markov decision process; ToddNivCohen08), which is a notoriously badly-behaved beast, due to extreme combinatorial explosion. In PBWM, we had to have many "stripes" of parallel maintenance to amortize the brittleness in any one. Likewise, typical LSTMs operate with gates at a per-unit level, not one gate controlling many units, achieving a very high level of parallelism and thus robustness.

Indeed, it is this brittleness of discrete gating that makes the CT / SRN / reservoir computing model more appealing as a basic working memory system: it is continuously updating and doesn't depend on discrete gating. But it also cannot bridge longer time gaps. Thus, the current framework that combines this continuous model with the discrete gating updates can give the best of both.

Here are some key properties of the biological system that help mitigate the challenge:

Note that all of these factors only work with a restricted small subset of a priori known important things. This suggests that discrete BG gating and the stable bridging strategy may only work well in this one case of goal-based gating. Thus, all further expanded, flexible use of such a strategy in humans must somehow ride on top of this core goal-driven foundation. Consequently, getting this goal-based system right, with all its special case logic, may be critical for doing anything coherent in a neural system over time.
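As a back-of-the-envelope illustration of the earlier point about PBWM needing many parallel "stripes" to amortize the brittleness of any one, the sketch below computes the chance that at least one of several gates succeeds. It assumes each stripe gates correctly with the same probability and independently of the others, which is my simplifying assumption and almost certainly too optimistic for a real network; the function name is hypothetical.

```python
def p_any_success(p_single, n_stripes):
    """Probability that at least one of n independent gates succeeds.

    Assumes identical, independent per-stripe success probability p_single.
    """
    return 1.0 - (1.0 - p_single) ** n_stripes


for n in (1, 2, 4, 8):
    print(n, p_any_success(0.5, n))
```

The same arithmetic shows why a single global goal gate gets no such benefit: with one gate there is nothing to average over, so robustness has to come from other properties of the system.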
-
Rubicon gating logic diagram
-
Novelty driven default approach behavior

In the initial BOA model, the scaffolding for initial exploration of CSs was driven by the instinct actions, but these did not result in a VS BG "Go" gating signal, and thus there was nothing for the dopamine to respond to. It is more plausible that initial exploration is driven by a bias to approach novel stimuli, and novelty-driven dopamine etc. is widely documented. Here's some of the motivation and background for the "Novelty Value" component added to the 2010 version of the PVLV model:
Across a range of papers summarized below, BG (VS) and BLA areas consistently show up as having strong novelty coding, and, interestingly, the hippocampus does not. These novelty signals ultimately end up in medial frontal areas, but this likely reflects the input from VS and BLA.

Computationally, given that the CS -> US coding that drives VS gating arrives from the BLA, it would be convenient if BLA itself provided the novelty signal. A simple version of this would be that the "US" associated with the CS is "novelty" itself, and by default, all novel CSs end up activating that US. Then, as an actual CS -> US association is learned in the BLA, it ends up out-competing the initial novelty-driven CS -> US association.

It is also important, as well documented by the latent inhibition phenomenon, that CSs not associated with any other US end up producing a decreasing novelty response over time, which has the effect of making it more difficult to then associate them with something positive.
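The competition between a default "novelty US" and a learned CS -> US association can be sketched in a few lines. This is a toy under loudly stated assumptions: the multiplicative novelty decay, the delta-rule US learning, the winner-take-all comparison, and all parameter values are hypothetical choices of mine, not the BLA implementation.

```python
def expose(state, us_present, lrate=0.2, nov_decay=0.3):
    """One CS exposure; state is {"novelty": w_nov, "us": w_us}.

    Assumed dynamics: the novelty weight habituates on every exposure, and
    the US weight follows a simple delta rule toward the actual outcome.
    """
    target = 1.0 if us_present else 0.0
    return {
        "novelty": state["novelty"] * (1.0 - nov_decay),
        "us": state["us"] + lrate * (target - state["us"]),
    }


def winner(state):
    """Which association currently dominates the CS response."""
    return "us" if state["us"] > state["novelty"] else "novelty"


state = {"novelty": 1.0, "us": 0.0}
for trial in range(1, 5):
    state = expose(state, us_present=True)
    print(trial, winner(state), state)
```

With these arbitrary defaults, the learned association overtakes novelty on the third paired exposure. When the CS is never paired with a US, only the novelty weight changes and it simply fades, which captures the decreasing novelty response; the sketch does not capture the further latent-inhibition effect that later learning becomes harder.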
-
ACh acetylcholine cholinergic lit review (Extinction especially)

It is clear from various sources that ACh signals reward salience: a non-RPE (non-prediction discounted), positively-rectified (fires for both positive and negative valences) signal for both US and CS onset. As such, it is a valuable modulator of both BG gating (via the cholinergic interneurons (CIN) in the BG) and learning in the BOA system. There is extensive cholinergic modulation of the BLA and medial PFC areas.

One critical outstanding question: does ACh fire for the absence of an expected US? If so, it can continue to be a useful learning modulator during extinction learning as well as acquisition. If not, things get more complicated.

SturgillHegedusLiEtAl20

Sturgill, J. F., Hegedus, P., Li, S. J., Chevy, Q., Siebels, A., Jing, M., Li, Y., Hangya, B., & Kepecs, A. (2020). Basal forebrain-derived acetylcholine encodes valence-free reinforcement prediction error (p. 2020.02.17.953141). bioRxiv. https://doi.org/10.1101/2020.02.17.953141

Summary: Key review of reward salience function of ACh and distinction between different types of neurons within the basal forebrain, which have different response properties.

CrimminsLingawiChiengEtAl23

Crimmins, B. E., Lingawi, N. W., Chieng, B. C., Leung, B. K., Maren, S., & Laurent, V. (2023). Basal forebrain cholinergic signaling in the basolateral amygdala promotes strength and durability of fear memories. Neuropsychopharmacology, 48(4), Article 4. https://doi.org/10.1038/s41386-022-01427-w

Summary: NBM (nucleus basalis of Meynert) facilitates acquisition and possibly extinction learning. HDB (horizontal limb of the diagonal band of Broca) appears to only facilitate acquisition -- everything is weaker without it.

WilsonFadel17

Wilson, M. A., & Fadel, J. R. (2017). Cholinergic regulation of fear learning and extinction. Journal of Neuroscience Research, 95(3), 836–852. https://doi.org/10.1002/jnr.23840

Summary: Has a lot of review info on basic anatomy etc., and shows clear evidence of ACh involvement in extinction learning.
-
Lateral Habenula

ArakiMcGeerKimura88

Araki, M., McGeer, P. L., & Kimura, H. (1988). The efferent projections of the rat lateral habenular nucleus revealed by the PHA-L anterograde tracing method. Brain Research, 441(1), 319–330. https://doi.org/10.1016/0006-8993(88)91410-2

Summary: clearly shows projections to Substantia innominata (SI -- part of NBM), HDB cholinergic nuclei, in addition to RMTg and MD. Also main inputs are from VP.

Sutherland82

Sutherland, R. J. (1982). The dorsal diencephalic conduction system: A review of the anatomy and functions of the habenular complex. Neuroscience & Biobehavioral Reviews, 6(1), 1–13. https://doi.org/10.1016/0149-7634(82)90003-3

Might LHb itself have ACh neurons?? Generally consistent with Araki et al.

HikosakaSesackLecourtierEtAl08

Hikosaka, O., Sesack, S. R., Lecourtier, L., & Shepard, P. D. (2008). Habenula: Crossroad between the basal ganglia and the limbic system. Journal of Neuroscience, 28(46), 11825–11829. https://doi.org/10.1523/JNEUROSCI.3463-08.2008

Good overall review, citing Araki et al, confirming ACh projections.

MathisLecourtier17

Mathis, V., & Lecourtier, L. (2017). Role of the lateral habenula in memory through online processing of information. Pharmacology Biochemistry and Behavior, 162, 69–78. https://doi.org/10.1016/j.pbb.2017.07.004

LavezziZahm11

Lavezzi, H. N., & Zahm, D. S. (2011). The mesopontine rostromedial tegmental nucleus: An integrative modulator of the reward system. Basal Ganglia, 1. http://www.ncbi.nlm.nih.gov/pubmed/22163100
-
ACh gating of PFC for gating nonlinearity

Functionally, the PFC dynamic during gating is highly nonlinear and thus parameter sensitive. Given that we reliably have ACh at gating, and that it is known to modulate PFC excitability in a pathway-specific fashion, it is likely that we can use it to make gating more reliable. The existing evidence shows strong ACh modulation of layer 3 NMDA active maintenance circuits.

A simple mechanism is that ACh is needed to enable activation of a new WM active state from thalamic gating -- without the ACh, existing NMDA currents can maintain, but nothing new can be activated. Mechanistically, ACh modulates K channels -- try that.

GalvinYangPaspalasEtAl20 (Arnsten, Wang senior authors)

Galvin, V. C., Yang, S. T., Paspalas, C. D., Yang, Y., Jin, L. E., Datta, D., Morozov, Y. M., Lightbourne, T. C., Lowet, A. S., Rakic, P., Arnsten, A. F. T., & Wang, M. (2020). Muscarinic M1 receptors modulate working memory performance and activity via KCNQ potassium channels in the primate prefrontal cortex. Neuron, 106(4), 649-661.e4. https://doi.org/10.1016/j.neuron.2020.02.030

YangPaspalasJinEtAl13

Yang, Y., Paspalas, C. D., Jin, L. E., Picciotto, M. R., Arnsten, A. F. T., & Wang, M. (2013). Nicotinic α7 receptors enhance NMDA cognitive circuits in dorsolateral prefrontal cortex. Proceedings of the National Academy of Sciences, 110(29), 12078–12083. https://doi.org/10.1073/pnas.1307849110
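The "simple mechanism" proposed above (ACh is required to load a new WM state, while existing maintenance persists without it) reduces to a small piece of update logic. The sketch below is a purely boolean abstraction with hypothetical names; it says nothing about the K-channel biophysics, which would be the actual thing to try in the model.

```python
def update_maintenance(maint, thal_gate, ach, new_pattern):
    """Return the maintained pattern after one gating opportunity.

    Assumed rule: new content is loaded only when thalamic gating and ACh
    coincide; in every other case the existing maintained pattern persists.
    """
    if thal_gate and ach:
        return new_pattern
    return maint


current = "goal_A"
current = update_maintenance(current, thal_gate=True, ach=False, new_pattern="goal_B")
print(current)  # gating without ACh leaves the old state in place
current = update_maintenance(current, thal_gate=True, ach=True, new_pattern="goal_B")
print(current)  # gating with ACh loads the new state
```

Requiring the coincidence of two signals is one way to make a nonlinear gating step less sensitive to spurious thalamic activity, at the cost of missing a gating opportunity if ACh is absent.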
-
STN and SKCa channels for pausing (sAHP)

The STNp pausing behavior looks like the following, as documented physiologically, and required for providing a BG gating window:
The following mechanism is consistent with available data:
To model this, we have the following variables and dynamics:
AdelmanMaylieSah12

Adelman, J. P., Maylie, J., & Sah, P. (2012). Small-conductance Ca2+-activated K+ channels: Form and function. Annual Review of Physiology, 74, 245–269. https://doi.org/10.1146/annurev-physiol-020911-153336
-
PPTg temporal derivative mechanism

The PPTg temporal derivative could be modulated such that it is engaged when goal-engaged, but not during exploration. Need to know more about the connectivity and computations in this area. Key points:

Huerta-OcampoDautanGutEtAl21

Huerta-Ocampo, I., Dautan, D., Gut, N. K., Khan, B., & Mena-Segovia, J. (2021). Whole-brain mapping of monosynaptic inputs to midbrain cholinergic neurons. Scientific Reports, 11, 9055. https://doi.org/10.1038/s41598-021-88374-6

This looks like the current definitive study of inputs to PPTg (PPN, LDT). An absolute goldmine!

LDT is our primary region of interest for "limbic" control, i.e., the goal activation system. Cortical inputs: single largest source to LDT, followed by superior colliculus (strong novelty input!), Raphe, and Gi (gigantocellular reticular nucleus -- some kind of major motor coordination brainstem area).

DautanSouzaHuerta-OcampoEtAl16

Dautan, D., Souza, A. S., Huerta-Ocampo, I., Valencia, M., Assous, M., Witten, I. B., Deisseroth, K., Tepper, J. M., Bolam, J. P., Gerdjikov, T. V., & Mena-Segovia, J. (2016). Segregated cholinergic transmission modulates dopamine neurons integrated in distinct functional circuits. Nature Neuroscience, 19(8), Article 8. https://doi.org/10.1038/nn.4335

This is another key ref -- Dautan is the man. TODO: fill in later.

OmelchenkoSesack05

Omelchenko, N., & Sesack, S. R. (2005). Laterodorsal tegmental projections to identified cell populations in the rat ventral tegmental area. The Journal of Comparative Neurology, 483(2), 217–235. https://doi.org/10.1002/cne.20417

This is one of the key refs from the Mollick et al PVLV paper.
-
Superior Colliculus and novelty / onset sensory coding

DuttaGutfreund14

Dutta, A., & Gutfreund, Y. (2014). Saliency mapping in the optic tectum and its relationship to habituation. Frontiers in Integrative Neuroscience, 8. https://www.frontiersin.org/articles/10.3389/fnint.2014.00001

Shows stimulus-specific adaptation (SSA) in OT of owls, related to same in SC of mammals.

BoehnkeBergMarinoEtAl11

Boehnke, S. E., Berg, D. J., Marino, R. A., Baldi, P. F., Itti, L., & Munoz, D. P. (2011). Visual adaptation and novelty responses in the superior colliculus. European Journal of Neuroscience, 34(5), 766–779. https://doi.org/10.1111/j.1460-9568.2011.07805.x

Shows rapid adaptation in all types of SC neurons. Unfortunately, it didn't seem to have a novel stimulus comparison condition, but at least we know that showing the same stimulus repeatedly results in much lower neural responding in SC, producing a kind of short term visual novelty signal.

VT (visual, transient) cells don't activate in saccades, and show rapid attenuation to repeated stimuli. VT is dark blue, VS red. Most decrease on the 2nd repetition. Responses to unexpected brighter / dimmer and absent stimuli were consistent with a raw adaptation model.

WolfLintzCostabileEtAl15

Wolf, A. B., Lintz, M. J., Costabile, J. D., Thompson, J. A., Stubblefield, E. A., & Felsen, G. (2015). An integrative role for the superior colliculus in selecting targets for movements. Journal of Neurophysiology, 114(4), 2118–2131. https://doi.org/10.1152/jn.00262.2015

todo

KanedaIsa13

Kaneda, K., & Isa, T. (2013). GABAergic mechanisms for shaping transient visual responses in the mouse superior colliculus. Neuroscience, 235, 129–140. https://doi.org/10.1016/j.neuroscience.2012.12.061

todo

ComoliCoizetBoyesEtAl03

Comoli, E., Coizet, V., Boyes, J., Bolam, J. P., Canteras, N. S., Quirk, R. H., Overton, P. G., & Redgrave, P. (2003). A direct projection from superior colliculus to substantia nigra for detecting salient visual events. Nature Neuroscience, 6, 974–980. http://www.ncbi.nlm.nih.gov/pubmed/12925855

Title pretty much says it all!
-
Extinction, BLANovelty, and BLANothing

There are 3 related cases that require 2 additional BLA layers:

New plan: just make the first pool in BLAPosAcqD1 the novelty pool, driven by a constant input pool of neurons, and then everything else is just aligned according to number of USs etc.
-
Extinction and "paying the effort costs"

A key principle of the Rubicon framework is that effort costs must be properly accounted for and "paid" at the time of a US outcome (but otherwise relatively discounted during the goal engaged state). But in terms of more fine-grained credit assignment, it should be clear that effort costs are associated directly with the dlPFC action plan, not the US outcome per se. You could have 2 different action plans for accomplishing the same US outcome that vary in effort, and you would obviously want to select the lower effort plan. This is accomplished in part by having the dlPFC and ACCcost layers predict the Effort outcome (while OFC only predicts US and overall US value), and thus they should represent it and drive VS gating accordingly.

There are several related questions here:
-
Challenges in temporally-delayed opponent-process learning

Both the BG and BLA require 2 "challenging" (non-trivial, biologically complicated) learning dynamics:

This all suggests that we might want to rethink the cortical trace learning too, which has so far not demonstrated any dramatic improvements in temporally-extended predictive learning cases where it should theoretically be beneficial. Currently, the trace is a sender * receiver Ca co-product, but we could try a longer sender-only trace that is multiplicative on the current-trial Ca co-product instead.
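To pin down the difference between the two trace schemes mentioned above, here is one possible reading of each as a toy computation over a short sequence of per-trial Ca values. Everything here is an assumption for illustration: the exponential running-average form, the decay constants, and the function names are hypothetical and are not the actual axon learning rule.

```python
def coproduct_trace(send_ca, recv_ca, decay=0.5):
    """Current scheme (as I read it): a decaying trace of sender*receiver Ca."""
    trace, out = 0.0, []
    for s, r in zip(send_ca, recv_ca):
        trace = decay * trace + s * r
        out.append(trace)
    return out


def sender_trace_times_coproduct(send_ca, recv_ca, decay=0.9):
    """Proposed variant: a slower sender-only trace multiplies the
    current-trial sender*receiver co-product."""
    s_trace, out = 0.0, []
    for s, r in zip(send_ca, recv_ca):
        s_trace = decay * s_trace + s
        out.append(s_trace * (s * r))
    return out


send = [1.0, 1.0, 1.0]   # sender active on all three trials
recv = [0.0, 0.0, 1.0]   # receiver only active on the last trial
print(coproduct_trace(send, recv))
print(sender_trace_times_coproduct(send, recv))
```

In this reading, the co-product trace only starts accumulating once the receiver is active, while the sender-only variant lets earlier sender activity scale up the learning signal on the trial where the receiver finally responds. Note that under a literal multiplicative reading, the variant still produces nothing on trials where the current co-product is zero.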
-
Considering the magnitude of the architectural change from before, when PPTg was just doing a temporal derivative, how does the current picture square with what's currently known about connectivity, in terms of, e.g., the size of the fiber bundles that traverse from one nucleus to another? Just a random thought that occurred to me. I think I need to review my neuroanatomy. If you have suggestions for a good current source, I'd greatly appreciate it.
-
OFC prediction is too static
OFC should predict US and PV values. But over the course of a goal engaged episode, these are relatively static in most cases: new info on this does not come in frequently, and is mainly based on the BLA US learning. This doesn't give much for OFC predictive learning to do, especially on a trial-by-trial basis. In the
One general idea is that OFC should "track progress toward the goal" -- this would amount to a kind of distance-like representation that increments steadily upward toward the overall US outcome that is predicted at a given point in time. Unlike effort, where the hypothalamus very plausibly provides increments (and integrals) of metabolic effort expended, it is unclear what kind of low-level brainstem mechanism would naturally provide a tracking toward the ultimate OFC value. It is kind of like a TD chaining dynamic with discounting that drives a progressive increment toward the final value.
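The "TD chaining with discounting" intuition can be sketched as follows. This is only an illustration under stated assumptions: a fixed number of steps to the outcome, a discount factor `gamma`, and no intermediate reward; the function names are hypothetical and nothing here is taken from the model code.

```python
def progress_values(us_value, n_steps, gamma=0.9):
    """Discounted value at each step, V_t = gamma**(n_steps - t) * us_value.

    The value rises steadily with t and equals us_value at the outcome step.
    """
    return [us_value * gamma ** (n_steps - t) for t in range(n_steps + 1)]


def td_errors(values, gamma=0.9):
    """TD error with no intermediate reward: delta_t = gamma * V_{t+1} - V_t."""
    return [gamma * values[t + 1] - values[t] for t in range(len(values) - 1)]


if __name__ == "__main__":
    vals = progress_values(us_value=1.0, n_steps=5)
    print(vals)             # increasing toward 1.0
    print(td_errors(vals))  # approximately zero once the chain is fully learned
```

Once such a chain is fully learned, the TD errors along the way vanish, so the progress signal itself, rather than any error signal, carries the "distance to goal" information.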
-
PL as central, defining locus of control
BarattaSeligmanMaier23
Baratta, M. V., Seligman, M. E. P., & Maier, S. F. (2023). From helplessness to controllability: Toward a neuroscience of resilience. Frontiers in Psychiatry, 14. https://www.frontiersin.org/journals/psychiatry/articles/10.3389/fpsyt.2023.1170417
-
MD connectivity in rodent
KuramotoPanFurutaEtAl17
Kuramoto, E., Pan, S., Furuta, T., Tanaka, Y. R., Iwai, H., Yamanaka, A., Ohno, S., Kaneko, T., Goto, T., & Hioki, H. (2017). Individual mediodorsal thalamic neurons project to multiple areas of the rat prefrontal cortex: A single neuron-tracing study using virus vectors. Journal of Comparative Neurology, 525(1), 166–185. https://doi.org/10.1002/cne.24054
AnastasiadesCollinsCarter21
Anastasiades, P. G., Collins, D. P., & Carter, A. G. (2021). Mediodorsal and ventromedial thalamus engage distinct L1 circuits in the prefrontal cortex. Neuron. https://doi.org/10.1016/j.neuron.2020.10.031
CollinsAnastasiadesMarlinEtAl18
Collins, D. P., Anastasiades, P. G., Marlin, J. J., & Carter, A. G. (2018). Reciprocal circuits linking the prefrontal cortex with dorsal and ventral thalamic nuclei. Neuron, 98(2), 366-379.e4. https://doi.org/10.1016/j.neuron.2018.03.024
PerryLomiMitchell21
Perry, B. A. L., Lomi, E., & Mitchell, A. S. (2021). Thalamocortical interactions in cognition and disease: The mediodorsal and anterior thalamic nuclei. Neuroscience & Biobehavioral Reviews, 130, 162–177. https://doi.org/10.1016/j.neubiorev.2021.05.032
-
In developing the new `examples/ofcacc` model, I am now re-confronting the long history of issues about active maintenance, output gating and relationship between the BG and PFC.
The broadest issues concern the role of BG in driving active maintenance of goals / task / other info in PFC, which was the core idea in the original PBWM model: BG gating toggles active maintenance in PFC.
The BG learns from phasic DA in a way that no other area can, and thus is in a position to provide a learned value-based selection process, via opposing Go vs. NoGo pathways. In the classical BG action selection model, the BG drives a "Go" gating of "good" actions, and a "NoGo" inhibition of "bad" actions (i.e., Thorndike's Law of Effect), and this gating signal functions directly to immediately initiate the action by disinhibiting the corresponding cortical area.
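As a rough illustration of the classical Go vs. NoGo scheme described above, the following sketch shows opponent weight updates driven by phasic DA and a simple gating decision. It is a deliberately simplified, assumed formulation (single scalar weights, a linear update, a fixed threshold); the function names and the `lrate` and `thr` parameters are hypothetical.

```python
def update_go_nogo(go_w, nogo_w, da, lrate=0.1):
    """DA bursts (da > 0) strengthen Go and weaken NoGo; DA dips (da < 0) do the reverse."""
    go_w = max(0.0, go_w + lrate * da)
    nogo_w = max(0.0, nogo_w - lrate * da)
    return go_w, nogo_w


def gate(go_w, nogo_w, thr=0.0):
    """Gating (disinhibition of the cortical target) when Go net of NoGo exceeds a threshold."""
    return (go_w - nogo_w) > thr


if __name__ == "__main__":
    go, nogo = 0.5, 0.5
    print(gate(go, nogo))                  # balanced pathways: no gating
    go, nogo = update_go_nogo(go, nogo, da=1.0)
    print(gate(go, nogo))                  # after a DA burst, Go wins
```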
However, this classical model does not capture the well-established early, goal-driven action plan selection dynamic that is actually evident in most BG / PFC recordings from rodents to primates (in well-learned tasks): The action selection takes place at the very start of a trial (or as soon as relevant information is available), and it is somehow then maintained over the subsequent discrete motor actions (e.g., running down a Y maze), until a desired outcome is achieved, or the goal is abandoned.
This dynamic is consistent with the general goal-selection / goal-engaged framework of O'Reilly, 2020 (aka the Rubicon model of Heckhausen & Gollwitzer, 1987 -- might as well use that name going forward -- it is cool), where behavior is organized into alternating periods of goal selection, followed by execution of the goal to the point of either success or failure / goal abandonment. This is the guiding framework for the `ofcacc` model. Its advantages are:
Goal selection involves a coordinated evaluation of outcome (US, in OFC), action plan (dlPFC / SMA), and net utility (gains vs. effort, time, uncertainty and other cost factors, in ACC), all of which are evaluated by the BG. By re-using common OFC and ACC representations, the BG's job is made simpler and more plausible: it can learn over time about the relative values and costs of these codes, and make consistent decisions based on those evaluations. It can perhaps integrate various contextual and circumstantial factors, but largely it can rely on the "auditor" evaluations provided by OFC and ACC.
During goal-engaged execution, the system isn't constantly second-guessing and micro-managing everything: a "best effort" is made to achieve the goal. This leads to the strong asymmetries in value functions highlighted in the above article. Nevertheless, the system is continuously tracking progress toward the outcome, and insufficient progress can result in goal abandonment, leading to negative dopamine dips and associated "disappointment".
In either success or failure / goal abandonment, the "final reckoning" occurs, where the accumulated time and effort is fully registered by the ACC, and the exact nature of the OFC outcome is also learned, along with the associated phasic DA as a function of prior expectations (RPE). A key, motivating property of this goal-driven framework is that US outcomes are relatively rare, and thus represent special moments where learning should be concentrated, to update the estimates for subsequent goal-selection events. This reflects a basic assumption that, generally speaking, a temporally-extended sequence of actions is required to achieve most desired outcomes (often with little immediate prospect of reward along the way), and that the best point at which to evaluate these action sequences is at the point when an outcome has been achieved (or foresworn).
This Rubicon model contrasts with a more uniform, continuous mode of action selection in standard RL models, which just involve a continuous sequence of state / action decisions and evaluations. The classical BG model is more consistent with this continuous mode, as the "actor" system. A high-level claim and test of this goal-driven framework is that, in real-world naturalistic tasks (not necessarily Atari games!), the goal-oriented parsing of time and effort in the Rubicon framework results in better outcomes, more reliably and quickly, compared to the continuous-mode RL models.
Thus, the primary challenge here is to get the various brain parts to do their jobs properly in the context of the overall Rubicon model.
The `ofcacc` model uses the `approach` environment to operationalize the most essential aspects of this process:
At the start of a behavioral trial, the model sees a CS stimulus by virtue of looking at a given location, at some initial distance, and it has an internal "drive" body state that requires a specific US to satisfy it.
If the CS is known to be associated with the desired US, then the model should gate in a plan to approach and consume the US. This should be driven by BG gating evaluating the match of US and drive (strength of relevant OFC rep?), while the ACC should reflect the time-cost associated with the distance to achieve the goal. In more complex versions, it would be good to have different US's at different distances, with a range of internal drive strengths, such that ACC utility signals are really needed to select the best overall choice.
If the current "view" (CS, distance) does not produce a BG Go signal, then a left or right exploratory looking action should be taken, to find another option. There is some question as to whether this exploration behavior is itself a kind of goal-driven action, but for now, we can treat it as a default-level behavior produced in the absence of an engaged approach goal.
Once a goal is engaged, and approach & consume behavior is properly executed (which can be initially coached by subcortical instincts, or random exploration of some sort if we really want to go there), then the special outcome-driven learning occurs, based on an accumulated time counter and the DA signal from actually getting the US in the context of the drive, subtracting the time cost.
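The trial structure above can be walked through in a toy form. The sketch below is an illustration only, not the `approach` environment itself: gating is reduced to a simple match between the CS's associated US and the current drive, and `us_value`, `time_cost_per_step`, and `expected_net` are assumed parameters with hypothetical names.

```python
def run_approach_trial(cs_us, drive, distance, expected_net,
                       us_value=1.0, time_cost_per_step=0.05):
    """Toy one-trial walk-through: gate a goal, approach, then do the final reckoning."""
    # Goal selection: engage only if the CS predicts the US that satisfies the drive.
    if cs_us != drive:
        # No Go signal: the default exploratory look left / right would happen here.
        return {"gated": False, "steps": 0, "da": 0.0}

    # Goal-engaged execution: step toward the US, accumulating a time counter.
    steps = 0
    while distance > 0:
        distance -= 1
        steps += 1

    # Final reckoning: net outcome is US value minus accumulated time cost,
    # and DA is the difference from the prior expectation (RPE).
    net = us_value - time_cost_per_step * steps
    return {"gated": True, "steps": steps, "da": net - expected_net}


if __name__ == "__main__":
    print(run_approach_trial("water", "water", distance=4, expected_net=0.5))
    print(run_approach_trial("food", "water", distance=4, expected_net=0.5))
```

In the fuller versions discussed above, with different USs at different distances and varying drive strengths, the match test would be replaced by a learned BG evaluation of the OFC and ACC signals.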
In this context, the simplistic, original PBWM-style "BG gates in active maintenance of OFC, ACC, and SMA reps" seems like all we need (modulo a few details about what happens at the point of US / reckoning).
The challenge at hand is thus to reconcile this with all the new info about BG / PFC summarized in Hazy Frank O'Reilly 2021 and captured in the nascent `agate` framework, and figure out how this all might generalize and work in broader contexts, etc.
Key figure for PFC / BG / thalamus spiral loops from O'Reilly, 2020: