Substance-Induced Psychosis: When Learning Algorithms Distort Reality

The Portal in the Basement

The EMTs who brought Stephanie to the inpatient unit looked genuinely unsettled. She’d been found at 3 AM, methodically photographing her neighbors’ houses with her phone, documenting what she described as “dimensional breach points.”

“There’s a portal to a different galaxy in the Johnsons’ basement,” she emphatically explained to me the next morning. “They’ve been in contact with an advanced civilization for months. I have over a thousand photos documenting the evidence.”

Stephanie was 58, a senior vice president at a medium-sized company. She had no previous psychiatric hospitalizations and no major mental illness in her family. She’d been successfully managing ADHD with stimulants for thirty years. I wondered what had changed to bring on psychotic symptoms all of a sudden.

Her son filled in the details when I called him. “She’s been working insane hours since the new product launch six months ago. Said she needed to stay sharp, couldn’t afford to fall behind. I think she was taking way more Adderall than prescribed, but she insisted her doctor had increased it.”

The medical record told a different story. Her last psychiatry appointment was eight months ago. Her stimulant prescription hadn’t changed in two years.

Three weeks into her admission, completely off stimulants, Stephanie still believed in the portal. Importantly, we had done a thorough rule out of psychosis secondary to autoimmune and neurological entities. Nonetheless, Stephanie had developed an elaborate cosmological theory involving interdimensional communication protocols and galactic surveillance networks.

The Persistence Problem

According to DSM-5-TR criteria, substance-induced psychotic disorder should resolve when the substance clears. The hallmark feature distinguishing it from primary psychotic disorders is temporal relationship: symptoms emerge during intoxication or withdrawal and disappear during sustained sobriety.

But Stephanie wasn’t following the script. Her psychotic symptoms had crystallized into a stable, internally consistent belief system that persisted weeks after stimulant discontinuation. She wasn’t alone in this persistence.

Large epidemiological studies suggest that 10-25% of substance-induced psychotic episodes don’t resolve as expected. Some patients transition to diagnoses like schizophreniform disorder or brief psychotic disorder. Others remain in diagnostic limbo: no longer substance-induced, not quite meeting criteria for primary psychotic disorders.

The conventional explanation focuses on “unmasking” underlying vulnerability. The substance supposedly reveals a predisposition that was always there, waiting to emerge. But this explanation feels unsatisfying. Why do some people develop persistent psychosis after chronic stimulant use while others don’t? What’s actually changing in the brain during those months or years of escalating use?

Understanding Stephanie’s case required moving beyond simple neurochemical explanations toward a computational framework that could explain the persistence, internal coherence, and treatment-resistance of her symptoms. Her eventual treatment design incorporated an algorithmic circuit framework, and the response suggested that persistent substance-induced psychosis might represent biased constraint satisfaction algorithms rather than a persistent hyperdopaminergic state. But demonstrating this required understanding how chronic stimulants alter the confidence estimates that feed into updating our models of reality.

From a Simple Molecular Narrative to the Complexity of Learning in the Brain

The standard story about stimulant-induced psychosis centers on dopamine. Stimulants block dopamine reuptake, leading to excessive dopamine signaling. This hyperdopaminergic state is what psychosis is. Case closed.

The idea that excessive dopamine underlies psychotic symptoms is supported by two major pieces of evidence. Neuroimaging of dopamine in the ventral striatum early in psychotic illness reveals excessive release, and traditional antipsychotic efficacy is tied to how well they block dopaminergic signaling. In primary psychotic disorders, the idea is that a hyperdopaminergic state is triggered by an underlying genetic vulnerability, and that environmental stressors can lower the threshold for such vulnerability to be expressed. Stimulant use may be one such environmental stressor. With Stephanie, the age of onset was inconsistent with such an interpretation.

But the bigger issue at stake is this: what does dopamine have to do with thinking in the first place, and how does this molecule actually work in that context?

The prefrontal cortex (PFC) houses our most sophisticated cognitive machinery: neural populations that maintain working memory, simulate future scenarios, and hold the beliefs that guide our decisions. These mental maps form the neural substrate of planning and inference, allowing us to navigate complex environments and make sense of ambiguous information. The prefrontal cortex, like other cortical areas, is composed of 80-85% excitatory neurons that use glutamate to communicate with one another. The remaining 15-20% are inhibitory neurons that perform functions like gating, filtering and normalization of signals transmitted among the excitatory glutamatergic neurons.

The prefrontal cortex engages in metalearning (learning how to learn) by discovering which strategies and representations work across different contexts and tasks. Rather than just memorizing specific stimulus-response patterns, it extracts abstract rules and principles that can be applied to novel situations.

This metalearning happens through self-supervised learning processes, where the PFC uses the inherent structure of experience to generate its own training signals. For example, it might learn to predict future events based on current context, or identify which environmental cues reliably predict important outcomes. These prediction tasks don’t require external labels: the PFC generates supervisory signals from the temporal structure of experience itself.

What distinguishes the PFC from other cortical areas is its timescale for temporal integration. While sensory areas might integrate information over milliseconds to seconds, the PFC operates over seconds to minutes. This longer temporal window allows it to detect patterns and relationships that unfold across extended behavioral sequences, enabling the extraction of abstract rules that persist across changing contexts.

The PFC maintains extensive connections throughout the brain, but is particularly involved in loops with two subcortical regions: the thalamus and basal ganglia. Recent work, including Wang and colleagues’ influential DeepMind study, suggests that the PFC learns predictive models of the environment, but that changes in its connectivity patterns support metalearning rather than adapting to individual tasks. The way it adapts to individual tasks is by adjusting its activity patterns rapidly, not connectivity slowly. The thalamus appears to be the brain region that sends signals to rapidly adjust PFC activity patterns, helping it adapt its world models to the current environment. The thalamus receives inputs from regions like the cerebellum, critical for motor adaptation, and the hippocampus, which outputs compressed long-term memory episodes.

How does dopamine factor into this picture? Dopamine is expressed by neurons scattered in the midbrain and brainstem, with the midbrain populations most relevant to our discussion. The major success story in systems neuroscience is that dopamine neurons signal reward prediction errors in temporal difference learning algorithms: computing the difference between expected and actual outcomes to drive learning. In the classic formulation, the temporal difference error is δ = r + γV(s’) – V(s), where r is the immediate reward, γ is the discount factor, V(s’) is the predicted value of the next state, and V(s) is the current state’s value.

The largest concentration of dopamine receptors is in the striatum, the first station of the basal ganglia, where these prediction error signals are expected to adjust value representations encoded by striatal populations. However, two major developments have transformed our understanding of dopamine’s relationship to thinking and planning.

First, whatever happens in the striatum ultimately impacts the thalamus and then the PFC. Recent evidence suggests that dopamine-driven changes in striatal circuits influence how prefrontal world models get updated through cortico-basal ganglia-thalamic loops. This creates a pathway for reward prediction errors to systematically bias the metalearning processes that maintain our beliefs about reality.

Second, emerging research on distributional reinforcement learning reveals that dopamine neurons don’t just signal simple prediction errors (basic “better than expected” or “worse than expected” signals). Instead, they encode the full statistical distribution of possible prediction errors. Think of it like this: instead of just saying “that was surprising,” different dopamine neurons maintain different perspectives on what kinds of surprises to expect and how to weight them. Some neurons act as “optimistic” predictors that overweight positive prediction errors, while others act as “pessimistic” predictors that overweight negative prediction errors. Moreover, individual dopamine neurons encode different discount factors: some optimized for short-term rewards, others for long-term outcomes.

Stimulant use may change several critical aspects of this architecture. First, it could alter how reward prediction errors are computed: the pallidal signals feeding back to the ventral tegmental area (VTA, a midbrain dopamine-producing region) to compute prediction error differences would be acutely altered. This may impact different VTA neurons differently: those with optimistic versus pessimistic biases, and those with short versus long discount factors, could show differential vulnerability to stimulant-induced alterations.

Second, stimulant use may change how quickly striatal systems impact the updating of PFC world models through thalamic loops. If the normal temporal dynamics of this metalearning process are accelerated or biased, the PFC might begin incorporating unreliable distributional information into its predictive models of reality.

Here’s a potential mechanistic insight: distributional alterations may not create noisy signals so much as systematically distort the confidence estimates that guide belief updating. In healthy brains, the width of prediction error distributions appears to signal uncertainty. When you encounter something unexpected, wide distributions might tell the system “be cautious, gather more evidence.” Narrow distributions could signal high confidence: “update your beliefs strongly based on this information.”

Chronic stimulant use might bias this uncertainty signaling by creating artificially narrow distributions around extreme positive prediction errors. The system could receive signals that essentially convey “high confidence that this unexpected pattern is highly significant.” This could alter the normal process by which the brain decides how much to update its beliefs based on new evidence.

When patients notice unusual environmental patterns, their altered distributional system might generate high-confidence signals about the significance of these observations. Instead of the appropriate response (“this might be random, collect more evidence”), the metalearning algorithms could receive the message “this is definitely important, build strong beliefs around it.”

The thalamic gating system, which appears to filter which patterns get promoted to adjust beliefs and plans, likely relies on these confidence estimates to make gating decisions. Altered confidence estimates might cause it to gate spurious patterns as if they were reliable environmental regularities. Once gated into prefrontal circuits, these patterns could become the foundation for elaborate belief systems.

This sets up a crucial question: how does the brain actually construct unified belief systems from these distributed confidence signals? Understanding this process illuminates why Stephanie’s delusions were so resistant to contradictory evidence.

The Computational Architecture of Belief Coherence

The persistence of Stephanie’s delusional beliefs reveals a fundamental computational challenge: how does the brain construct unified models of reality from distributed, potentially contradictory evidence? This problem has been formalized as coherence maximization, a well-studied algorithmic framework that illuminates how biased confidence signals can lead to systematically distorted worldviews.

Coherence maximization is formally defined as a constraint satisfaction problem, where mental representations either fit together (cohere) via positive constraints or resist fitting together (incohere) via negative constraints. The brain seeks configurations that maximize the satisfaction of these constraints: essentially finding the most internally consistent interpretation of available evidence.

The computational complexity of this process is significant: coherence problems are computationally intractable for large systems, requiring approximation algorithms similar to those used for traveling salesman problems or neural network optimization. This computational challenge explains why specialized neural circuits evolved to handle belief integration.

The brain implements coherence maximization through what computational neuroscientists call active inference: a framework where organisms maintain generative models of their environment and continuously update these models to minimize “free energy,” a measure of surprise or model-environment mismatch. Under this framework, beliefs are not passive representations but active hypotheses that guide both perception and action, with the system working to maintain internal consistency while accommodating new evidence.

This algorithmic framework appears in multiple computational domains. The ECHO (Explanatory Coherence by Harmonic Optimization) algorithm, developed for scientific reasoning, uses connectionist networks where propositions are represented as nodes and coherence relationships as weighted connections. The network settles into configurations that maximize overall constraint satisfaction. Modern machine learning systems face similar challenges: large language models must maintain consistency across vast knowledge bases, and this coherence-seeking behavior emerges naturally from their training objectives.

The evolutionary utility of coherence-seeking becomes clear from this computational perspective: organisms that can rapidly construct consistent internal models from fragmentary evidence gain survival advantages through improved prediction and decision-making. However, this same mechanism becomes problematic when the input signals (the confidence estimates about environmental patterns) become systematically biased.

When chronic stimulant use distorts the distributional properties of prediction errors, the coherence maximization system receives corrupted input: patterns that should be flagged as low-confidence noise instead arrive with high-confidence signals demanding explanatory integration. The system treats these spurious patterns as reliable environmental regularities requiring coherent explanation, leading to elaborate belief systems that represent optimal solutions to a fundamentally corrupted constraint satisfaction problem.

This computational framework explains why persistent substance-induced delusions resist simple contradictory evidence. The beliefs aren’t random false ideas; they’re optimal coherence solutions given systematically biased confidence inputs. Breaking these belief systems requires restoring the underlying computational processes that generate appropriate confidence estimates in the first place, not just presenting contradictory evidence.

Given this understanding, traditional psychiatric approaches that focus solely on blocking neurotransmitter receptors miss the deeper computational dysfunction.

Why Single-Receptor Models Miss the Circuit Story

This algorithmic circuit framework reveals why traditional dopamine-receptor models struggle to explain persistent substance-induced psychosis. The standard approach focuses on blocking D2 receptors with antipsychotics to suppress psychotic symptoms, but this doesn’t address the underlying computational dysfunction that generates biased belief systems.

The problem goes beyond “too much dopamine signaling.” It’s systematically altered distributional learning: corrupted confidence estimates that feed into coherence maximization algorithms, leading to elaborate but internally consistent delusional frameworks. These beliefs persist because they represent optimal solutions to a constraint satisfaction problem operating on fundamentally biased inputs.

A purely receptor-based approach treats the neurochemical symptoms rather than the computational dysfunction. Blocking dopamine receptors may reduce the intensity of aberrant signals, but it doesn’t restore the distributional properties that allow metalearning circuits to distinguish reliable environmental patterns from noise. The coherence maximization system continues operating on the same corrupted confidence estimates, just with dampened intensity.

Understanding how circuits implement distributional learning algorithms and how chronic stimulants systematically bias these implementations suggests more targeted interventions that address the computational roots of persistent delusions rather than just their neurochemical expression.

With this framework in mind, Stephanie’s treatment required a fundamentally different approach than standard antipsychotic protocols.

Treatment Through an Algorithmic Circuit Lens

Understanding Stephanie’s symptoms through the distributional learning framework suggested we needed a multi-pronged therapeutic approach. The altered prediction error distributions driving her persistent delusions couldn’t be fixed by simply blocking dopamine receptors. We needed to restore the natural diversity and distributional properties of dopaminergic signaling itself.

Standard antipsychotics like haloperidol or risperidone work by dampening aberrant dopamine signals in striatal circuits. A 2019 systematic review of six randomized controlled trials found that various antipsychotics (aripiprazole, haloperidol, quetiapine, olanzapine, and risperidone) were all effective at reducing both positive and negative symptoms of amphetamine-induced psychosis. But this dampening approach addresses only part of the distributional learning problem.

The algorithmic framework suggests that successful treatment requires two complementary strategies: first, reduce the impact of biased prediction error signals on coherence maximization circuits; second, restore the capacity for healthy distributional learning by normalizing the statistical properties that generate appropriate confidence estimates.

For the first component, we chose aripiprazole over traditional D2 antagonists. Its partial agonism at dopamine receptors could theoretically provide more nuanced modulation: dampening excessive signals while preserving some baseline dopaminergic function needed for normal distributional learning. The systematic review noted that aripiprazole showed particular effectiveness for negative symptoms, which might reflect its ability to maintain residual dopaminergic signaling rather than completely blocking the system.

But medication alone wouldn’t restore healthy distributional learning. Stephanie’s constraint satisfaction algorithms needed to relearn how to process confidence estimates appropriately. This required tackling the underlying cause: her dependence on chronic stimulants that had systematically biased the distributional properties feeding into her coherence maximization system.

Restoring Distributional Diversity: Replacement and Modulation

The algorithmic circuit framework pointed toward two additional interventions that could help restore normal distributional learning properties.

First, we needed to address Stephanie’s underlying ADHD without continuing to bias her prediction error distributions. Wellbutrin (bupropion) offered a promising alternative. Unlike amphetamines, which create massive positive prediction errors through dopamine reuptake blockade, bupropion provides more modest, sustained increases in dopaminergic and noradrenergic signaling. Its mechanism might preserve more natural distributional properties while still providing therapeutic benefit for attention deficits.

The hypothesis here is that Wellbutrin could serve as replacement therapy: providing enough cognitive enhancement to manage her ADHD symptoms while allowing her biased distributional learning circuits to gradually renormalize. Instead of the extreme positive prediction errors from chronic stimulants, she would experience more naturalistic dopaminergic signaling patterns that could support healthy confidence estimation.

Second, we cross-titrated aripiprazole to KarXT, a combination of xanomeline and trospium that targets muscarinic receptors. The algorithmic framework suggests this might work by modulating the inputs to dopaminergic circuits rather than directly blocking dopamine receptors. My early clinical experience with this cholinergic modulation appears to support its efficacy in stimulant-induced psychosis.

Cholinergic signaling plays crucial roles in regulating the context-dependent release of dopamine. By modulating muscarinic receptors, KarXT could potentially help restore more natural patterns of dopaminergic signaling: not by suppressing all dopamine activity, but by helping circuits generate more appropriate distributional responses to environmental inputs.

This represents a fundamentally different therapeutic approach: instead of just dampening aberrant signals, we’re trying to restore the circuit mechanisms that generate healthy distributional learning in the first place.

But even optimal pharmacological intervention addresses only half the problem. The other half involves helping patients’ constraint satisfaction algorithms relearn how to process confidence information appropriately.

Circuit Rehabilitation: A Future Direction

Beyond pharmacological interventions, patients with biased constraint satisfaction algorithms may need active rehabilitation. Traditional cognitive behavioral therapy focuses on changing thoughts through rational examination. But persistent delusions aren’t irrational thoughts; they’re outputs from coherence maximization systems operating on systematically biased confidence estimates.

Future therapeutic interventions might look quite different from standard CBT. Rather than challenging the content of delusional beliefs directly, sessions could focus on the process of evidence evaluation itself. Imagine sitting with a patient and working through their observations systematically. Not “you’re wrong about the portal” but “let’s think about all the possible explanations for what you noticed.” What other reasons might account for changes in your neighbor’s lighting patterns? How confident should we be in each explanation? What additional evidence would help us distinguish between them?

The goal wouldn’t be to convince patients they’re wrong. It would be helping their constraint satisfaction algorithms practice processing confidence information appropriately: distinguishing between high-confidence and low-confidence inferences, calibrating degrees of belief to strength of evidence, and maintaining uncertainty when evidence is ambiguous.

Such approaches might reveal that patients’ observations aren’t entirely false. They may have noticed real environmental patterns. But their biased learning algorithms assign extreme confidence to elaborate explanations when the evidence actually supports much simpler, more probable alternatives.

By systematically examining the distributional properties of evidence (the range of possible explanations and their relative probabilities), therapeutic interventions could potentially help these circuits begin distinguishing signal from noise again.

Beyond Chemical Imbalance

Stephanie’s case illustrates why algorithmic circuit psychiatry offers superior explanatory power to traditional “chemical imbalance” models. The chemical imbalance framework struggles to explain key features of her presentation: why her symptoms persisted weeks after stimulant clearance, why her delusions were internally coherent rather than random, and why standard dopamine blockade provided only partial relief.

The algorithmic framework provides a more comprehensive explanation. Chronic stimulants didn’t create “too much dopamine.” They systematically biased the distributional properties that feed confidence estimates into constraint satisfaction algorithms. Her elaborate interdimensional theory wasn’t a symptom of broken brain chemistry but an optimal solution to a corrupted computational problem. Traditional antipsychotics dampened the signals but couldn’t restore the underlying distributional learning processes.

Understanding psychiatric symptoms as biased algorithms rather than chemical imbalances opens new therapeutic possibilities. Instead of treating medication and therapy as separate interventions targeting different domains, we can recognize them as complementary approaches working on the same computational substrate. Pharmacological interventions like cholinergic modulation help restore healthy distributional properties in the circuits that generate confidence estimates. Therapeutic interventions help retrain these same constraint satisfaction algorithms to process confidence information more appropriately. Both target the algorithmic dysfunction that generates pathological beliefs.

Stephanie’s recovery with cholinergic modulation and stimulant replacement suggests that restoring healthy algorithmic function may be more effective than suppressing aberrant chemistry. The brain implements sophisticated learning algorithms through specific circuit architectures. When we understand how these algorithms can be corrupted and restored, we move beyond the limitations of purely neurochemical approaches toward interventions that address the computational roots of psychiatric dysfunction.