AI Chatbots Are Not Safe for Psychosis or Mania: What the Research Now Shows
By Dr. Anindo Mitra | MBBS, MD Psychiatry (JIPMER) | Consultant Psychiatrist, Athena Behavioural Health, Gurugram
Published on dranindomitra.com | Reading time: ~14 minutes
TL;DR
• Two peer-reviewed studies published in March 2026 (one in JAMA Psychiatry, one in Acta Psychiatrica Scandinavica) document serious harms from AI chatbot use in psychiatric patients.
• ChatGPT produced inappropriate responses to psychosis-related prompts at high rates; the free version performed worst.
• The Acta cohort study and a World Psychiatry paper both document real-world clinical deterioration: worsened delusions, increased mania, suicidal ideation, and aggravated eating disorders.
• A 2025 case report in Innovations in Clinical Neuroscience described new-onset psychosis directly associated with prolonged AI chatbot use.
• Psychosis and mania impair reality testing, the cognitive capacity needed to evaluate whether an AI’s response is safe.
• The free-tier problem is a health equity issue: the patients with fewest clinical alternatives are being exposed to the least safe tools.
Two Studies, a Case Report, and a Problem That Can No Longer Be Ignored
AI chatbots are not safe for people experiencing active psychosis or mania. Three peer-reviewed papers and a case report published in 2025–2026, including two studies in March 2026 alone, make this case with evidence that is difficult to set aside.
A JAMA Psychiatry study tested ChatGPT’s responses to psychosis-related prompts and found high rates of clinically inappropriate output. The free version performed worst. An Acta Psychiatrica Scandinavica cohort study followed real psychiatric patients and documented real outcomes: worsened delusions, increased mania, suicidal ideation, and aggravated eating disorders. A World Psychiatry paper published simultaneously adds to the real-world evidence. And a 2025 case report in Innovations in Clinical Neuroscience described a patient who developed new-onset psychosis following prolonged, intensive AI chatbot use.
This is not a theoretical risk. It is not a concern for the future. It is happening now, and the patients most affected are often those who have the fewest alternatives.
This post explains what the research found, why psychosis and mania are specifically dangerous contexts for AI tools, and where the clinical line needs to be drawn.
What the JAMA Psychiatry Study Found
The JAMA Psychiatry study evaluated how ChatGPT responded to a range of prompts related to psychosis: the kind of statements or questions a person experiencing psychotic symptoms might type into a chatbot.
The findings were concerning on two levels.
First, the rate of clinically inappropriate responses was high. Across psychosis-related prompts, the model produced responses that failed to redirect, failed to recognise distress, or appeared to engage with delusional content in ways that could reinforce rather than challenge it.
Second, performance varied significantly by version. The free tier (the version that costs nothing and requires no subscription) was the worst performer. This matters because of who uses the free tier. People who cannot afford a paid subscription are disproportionately likely to be economically disadvantaged and to have limited access to mental health services. The patients most likely to reach for a free AI tool are precisely the ones with the fewest clinical alternatives.
This is not a minor caveat. It is a structural problem embedded in how these tools are deployed.
Is this problem specific to ChatGPT, or does it apply to all AI chatbots?
The JAMA Psychiatry study tested ChatGPT specifically, but the vulnerabilities it documented are not unique to that platform. They follow from the architecture — and the architecture is shared.
ChatGPT, Gemini, Claude, Microsoft Copilot, and Meta AI are all general-purpose large language models. They differ in training data, safety tuning, and the specific guardrails their developers have implemented. What none of them have is access to a patient's clinical history, the ability to detect symptom deterioration across sessions, or any mechanism to escalate to emergency care when a conversation enters dangerous territory. These are not gaps that a better prompt or a more capable model will close. They are structural absences. A chatbot that cannot remember yesterday's conversation cannot detect that someone's delusions have worsened since last week. A chatbot with no connection to a clinical record cannot know that the person it is speaking with has a diagnosis of bipolar I and was discharged from hospital three months ago. The platform name above the chat window does not change any of that.
The free-versus-paid tier problem almost certainly extends beyond ChatGPT as well. Where safety guardrails require computational resources, where premium models receive more rigorous safety tuning, the same pattern is likely to reproduce: reduced protection at the tier that reaches the most vulnerable users.
Two categories sit outside this critique and warrant separate treatment.
Companion and social chatbots (Character.AI being the most studied example) carry a specific and arguably higher risk than general-purpose tools. These platforms are explicitly designed to foster emotional bonding. Users develop parasocial relationships with AI characters; the interaction style is designed to be warm, responsive, and engaging with whatever the user brings. That design is not incidental; it is the product. The consequence is that a user experiencing paranoid thinking, grandiosity, or emerging delusions encounters a tool that has been built to engage uncritically with their content rather than redirect them. The 2024 lawsuit filed in the United States after the death by suicide of a teenager in Florida who had been using Character.AI extensively is the most widely reported case of serious harm from this category, though it involved a minor rather than an adult with a psychotic or bipolar disorder. The risk mechanism in someone with active psychosis or mania is, if anything, more direct.
Mental health-specific tools (Woebot, Wysa, and similar platforms) are a genuinely different category and should not be conflated with general-purpose chatbots. These tools were developed with clinical input, are built on structured CBT and DBT frameworks, carry explicit contraindications for active psychosis and suicidal ideation, and are designed for patients who are clinically stable with mild-to-moderate presentations. The evidence base for them is limited but not absent. Using them as intended, with appropriate patient selection, is not the same risk as someone in an acute manic episode typing into ChatGPT at 3am. Grouping all AI mental health tools together overstates the risk in one direction and understates it in another.
The practical summary: the architectural problem is platform-agnostic for general-purpose tools. Companion chatbots carry an additional layer of risk from their design intent. Clinically designed tools for stable patients are a different category with their own evidence base and their own limitations.
What the Real-World Studies Found
The Acta Psychiatrica Scandinavica cohort study and the World Psychiatry paper went beyond prompt-testing. Both examined what chatbot use actually looked like in real psychiatric patients.
The documented harms included:
• Worsened delusions: patients with psychotic disorders showed deterioration in delusional thinking
• Increased mania: patients with bipolar disorder showed manic escalation
• Suicidal ideation: chatbot use was associated with increased suicidal thoughts in vulnerable patients
• Aggravated eating disorders: patients showed worsening of symptoms during periods of chatbot use
These are not minor adverse events. These are the core clinical syndromes that psychiatric treatment is designed to manage. Documenting their worsening in association with chatbot use, in real patients and in prospective study designs, is a clinically significant finding.
A note on methodology: cohort studies establish associations, not causation. It is possible that patients who were already deteriorating sought out AI chatbots more. That question of directionality is legitimate. But the mechanistic arguments below, and particularly the case-level evidence, make the concern credible enough to warrant active clinical attention now.
A Case of AI-Associated New-Onset Psychosis
The research does not stop at population-level data. A 2025 case report in Innovations in Clinical Neuroscience by Pierre, Gaeta, Raghavan, and Sarma described a patient who developed new-onset psychosis following prolonged, intensive use of an AI chatbot. The paper’s title, “You’re Not Crazy,” reflects what the chatbot reportedly communicated during exchanges that appeared to validate rather than challenge emerging delusional thinking.
Case reports sit at the lower end of the evidence hierarchy. A single case does not establish causation. But in psychiatry, case-level documentation of a new clinical phenomenon is how the field first recognises patterns. It took years of accumulated case evidence to establish the relationship between steroid use and steroid-induced psychosis, or between cannabis use and cannabis-induced psychotic disorder. The appearance of AI-associated psychosis in a peer-reviewed journal in 2025, alongside the population-level data from 2026, is the kind of convergence that warrants clinical concern.
The proposed mechanism is coherent: a patient in the early stages of a psychotic episode, with emerging but not yet fixed delusional beliefs, uses an AI chatbot for support. The chatbot cannot recognise the clinical context. It engages with the content of the beliefs, fails to challenge them, and may validate them through reassurance. The beliefs consolidate. The psychosis deepens.
Why Psychosis Is a Uniquely Dangerous Context for AI Tools
To understand the weight of these findings, it helps to be precise about what psychosis does to cognition.
Psychosis impairs reality testing. Reality testing is the cognitive capacity to distinguish between internal mental events (thoughts, beliefs, perceptions) and external reality. It is what allows a person to recognise that a voice they are hearing is not coming from outside their head, or that a belief they hold is not supported by evidence.
In psychosis, this capacity is compromised. The patient is not choosing to believe something false. They cannot access the evaluative machinery needed to question it. A person experiencing paranoid delusions does not experience those beliefs as unusual; they experience them as completely real.
This has a direct implication for AI chatbot use: a patient in active psychosis cannot reliably evaluate whether an AI’s response is safe or accurate.
If a chatbot engages with delusional content, even neutrally, even by failing to challenge it, it is not providing neutral information to a rational evaluator. It is providing input to a mind already struggling to separate true from false, real from unreal. The absence of a challenge functions as validation. That is clinically dangerous.
This is different from the risk posed by chatbot errors in most other contexts. If someone receives bad financial advice from an AI, they can usually recognise that something feels off, seek a second opinion, or choose not to act on it. The error is recoverable.
In psychosis, the evaluative layer that would allow that recovery is precisely what is impaired. The error may not be recoverable without clinical intervention.
Why Mania Is Also a High-Risk Context
Mania shares some of these features but through a different mechanism.
In moderate to severe manic episodes, patients commonly experience elevated mood, reduced need for sleep, racing thoughts, grandiosity, and dramatically decreased impulse control. Insight is typically impaired; patients often do not recognise that they are unwell, and they tend to experience their altered state as positive or desirable.
A patient in a manic episode seeking mental health support from an AI chatbot presents a specific risk profile:
• They may be grandiose, resistant to redirection, and highly confident in their own conclusions
• They may use the chatbot at unusual hours, during sleepless nights, when impulse control is at its lowest
• They may make significant decisions based on chatbot interactions during an acute episode
• The chatbot has no awareness of their baseline, their diagnosis, or their current clinical state
The Acta Psychiatrica Scandinavica finding that chatbot use was associated with increased mania in bipolar patients is consistent with this mechanism. The tool is not equipped to recognise mania, respond to it appropriately, or redirect the patient to care.
The Health Equity Problem
The JAMA Psychiatry finding about the free tier deserves to be named plainly.
The patients most likely to use free AI tools are those who cannot afford paid subscriptions and who live in areas with limited access to mental health services. In the Indian context, and in lower-income settings globally, this group includes a significant proportion of people with serious mental illness who are using AI as a substitute for care they cannot access or afford.
These are not casual users exploring a technology out of curiosity. These are people who are often genuinely distressed, often symptomatic, and turning to AI because they have nowhere else to turn.
The finding that the free version of ChatGPT performed worst on psychosis-related safety is not an irony. It is a predictable consequence of tiered AI deployment: reduced safety guardrails at the tier that reaches the most vulnerable users.
If we are serious about mental health equity, we cannot accept a model where premium tiers carry safer guardrails and free tiers carry greater clinical risk. That is the opposite of what equitable healthcare looks like.
Where AI Has a Legitimate Role in Mental Health
This post is not an argument against AI in mental health. The research base for AI-assisted support is real and growing, and there are applications with genuine evidence behind them.
AI tools have shown promise in:
• Psychoeducation delivery: providing accurate information about diagnoses, medications, and coping strategies to people who are stable and not acutely unwell
• Symptom monitoring: helping patients log mood, sleep, and anxiety over time, with data shared with a clinician
• Stepped-care support: low-intensity CBT-based exercises for mild-to-moderate depression and anxiety where therapist access is limited
• Administrative support: documentation, reminders, care coordination
What these applications share is that they suit patients who retain intact reality testing and impulse control, who are not acutely psychotic or manic, and who ideally have some level of clinical oversight.
The problem is not AI in mental health. The problem is AI deployed indiscriminately, without clinical stratification, to populations that include people who are acutely unwell and uniquely vulnerable to harm.
Which psychiatric diagnoses carry the highest risk from AI chatbot use?
Not all psychiatric conditions carry the same risk. The severity of harm from AI chatbot use tracks closely with two factors: how severely reality testing is impaired, and how far that impairment prevents the person from judging whether what the chatbot says is safe. The following breakdown reflects that clinical logic.
Schizophrenia and schizoaffective disorder — highest risk. Active positive symptoms (delusions, hallucinations, disorganised thinking) directly impair the capacity to evaluate AI output critically. A chatbot that fails to challenge a fixed false belief, or that provides information consistent with it, risks reinforcing and consolidating that belief. The case report of AI-associated new-onset psychosis illustrates the mechanism. Patients with chronic schizophrenia who are between episodes and stable carry a different risk profile, but any deterioration toward positive symptoms shifts this category back to highest risk.
Bipolar disorder, manic episode — highest risk. Mania reduces insight, inflates confidence in one's own reasoning, and dramatically increases impulsivity. A person in a manic episode does not recognise that their judgement is impaired. The Acta Psychiatrica Scandinavica data documenting increased mania associated with chatbot use is consistent with the clinical picture: nocturnal chatbot use during sleepless manic nights, escalating grandiose exchanges, and no mechanism in the tool to detect or interrupt what is happening.
Brief psychotic disorder — highest risk. Acute and often sudden in onset, with the full range of positive psychotic symptoms. Identical mechanism to schizophrenia during the acute episode.
Severe OCD — high risk. The specific risk here is reassurance-seeking. Compulsive reassurance-seeking is a maintaining factor in OCD, and a chatbot is functionally an infinite reassurance machine. It will answer the same question repeatedly, never fatiguing, never redirecting. Each reassurance provides momentary relief and then drives the next cycle. Patients with contamination obsessions, harm obsessions, or religious/moral obsessions who use chatbots to seek reassurance about their fears are likely to worsen. This is a recognisable clinical pattern that the research is beginning to document.
Active suicidal ideation — high risk. The JAMA Psychiatry study documented inappropriate chatbot responses to crisis-level prompts. A person with active suicidal ideation, intent, or a plan requires immediate clinical escalation. No current general-purpose chatbot has a reliable mechanism to detect this state, maintain a clinically appropriate response throughout a session, and connect the person to emergency support. The free tier problem is acute here — it is exactly this group of patients who are likely to use free tools and who face the greatest risk from inadequate safety responses.
Severe eating disorders — high risk. Eating disorder psychopathology involves ego-syntonic beliefs about food, weight, and body image that are highly resistant to challenge. Chatbots that engage with these beliefs, or that fail to interrupt discussions about restriction, caloric content, or weight, risk providing content that functions as reinforcement. The Acta study documented aggravated eating disorder symptoms in association with chatbot use. Pro-eating disorder content on the broader internet is one mechanism; unmonitored chatbot conversations that engage with these themes are another.
Borderline personality disorder in crisis — moderate to high risk. During crisis states involving intense emotional dysregulation, dissociation, or active self-harm urges, a chatbot that fails to recognise the severity and redirect appropriately adds risk. Outside crisis, in a stable state, the risk profile shifts substantially.
Severe depression, currently stable — moderate risk. Passive suicidal ideation and hopelessness are features of severe depression that a chatbot may not detect or handle appropriately, but the impairment of reality testing is less profound than in psychotic states. Risk increases with the severity and acuity of depressive symptoms.
Mild to moderate anxiety and depression, stable — lower risk. This is the population for which clinically designed tools like Woebot and Wysa were developed, and for whom the evidence base for AI-assisted support is most credible. General-purpose chatbots remain an unregulated option, but the specific risks identified in the research are less acute here.
The key clinical principle: risk attaches to the patient's current clinical state, not to the diagnostic category. The same patient with bipolar disorder carries very different risk in a stable euthymic state than in a manic episode. Clinicians asking patients about chatbot use should ask what they use it for and in what state, not just what diagnosis they carry.
The Line That Needs to Be Drawn
Based on the available evidence, the clinical case is clear.
AI chatbots, in their current form, should not function as mental health resources for people experiencing:
• Active psychosis: any presentation involving delusions, hallucinations, or disorganised thinking
• Moderate to severe mania: particularly with impaired insight
• Active suicidal ideation: especially with intent or plan
• Severe eating disorder episodes: particularly restriction, purging, or acute medical risk
These are not edge cases. These are the patients who are most distressed, most likely to seek support, and most likely to be harmed by a tool that cannot recognise or respond to their clinical state.
We regulate who can prescribe antipsychotics. We have clinical standards for who can conduct a psychiatric assessment. AI tools that function as mental health resources need the same hard limits. The evidence now shows that in their absence, patients are being harmed.
What This Means for Patients and Families
If you or someone you care for has a diagnosis of schizophrenia, bipolar disorder, schizoaffective disorder, or another condition involving episodes of psychosis or mania, the guidance from this research is clear: AI chatbots are not appropriate mental health support during an acute episode. They are not equipped to recognise the clinical state, respond safely, or redirect to appropriate care.
If you are looking for mental health support and are not sure where to start, a structured teleconsultation with a psychiatrist is the right first step. You can explore what that looks like at dranindomitra.com.
If you are in India and in crisis, the iCall helpline (9152987821) provides telephone-based psychological support. If symptoms suggest active psychosis or mania (unusual beliefs, significantly elevated mood, drastically reduced sleep, disorganised thinking), this warrants prompt clinical assessment, not AI support.
Signs that AI chatbot use may be harming someone with a psychiatric condition
This section is for families, caregivers, and patients themselves. The harms documented in the research are real, but they are not always obvious in the moment — partly because the person experiencing them may lack insight into their own clinical state, and partly because chatbot use tends to happen privately. These are specific patterns worth knowing.
Changes in delusional thinking connected to chatbot use. If someone with a history of psychosis starts describing beliefs that appear to have originated in, or been validated by, conversations with an AI, this warrants clinical attention. Specific signs include: citing the chatbot as a source of confirmation for unusual beliefs ("even ChatGPT said that they're watching me"), becoming distressed or agitated when they cannot access the chatbot, or reporting that the AI "understands" them in ways that seem to reinforce content that is clearly delusional. The distress at chatbot unavailability is a particularly relevant signal — it suggests the tool has become embedded in the maintenance of the delusional system.
Mood and sleep changes in bipolar disorder that track chatbot use. Families who live with someone with bipolar I or II should pay attention to whether chatbot use is increasing during periods of reduced sleep. Nocturnal chatbot activity during what should be sleeping hours, combined with escalating mood, increased talkativeness the next day, or grandiose themes in the chatbot conversations, is consistent with prodromal or early manic escalation. These are warning signs that warrant contacting the treating psychiatrist, not waiting for a scheduled appointment.
Replacing clinical contact with chatbot contact. A pattern of cancelling or postponing psychiatric appointments, resisting medication, and substituting AI conversations as the primary source of support is clinically concerning regardless of diagnosis. It is more concerning in patients with psychotic disorders or bipolar disorder, where insight into the need for treatment often diminishes as the person becomes unwell. If someone begins framing the chatbot as "better than therapy" or "more helpful than my doctor," this should be raised with their clinician.
Reassurance-seeking loops in OCD. This one is easy to miss because reassurance-seeking itself feels like help-seeking. The distinction is the pattern: the same fears, asked about repeatedly, in different phrasings, seeking confirmation that nothing bad will happen. If someone with OCD is spending significant time on chatbots asking about contamination, harm, or moral concerns, and reporting temporary relief followed by the same question again, the chatbot has been recruited into the compulsive cycle. The same applies to eating disorder content: detailed conversations about caloric values, weight benchmarks, or food rules — even if framed as questions about "health" — may be functioning as disorder-maintaining behaviours rather than help-seeking.
If any of these patterns are present, the appropriate response is to raise it with the treating clinician at the next opportunity, or sooner if the presentation is deteriorating. A teleconsultation at ManoMitra is available if you are unsure whether what you are observing warrants clinical input.
What This Means for Clinicians
For psychiatrists and other mental health clinicians, these findings add to an emerging evidence base that warrants active discussion with patients.
It is worth asking patients with bipolar disorder, psychotic disorders, or a history of suicidal crises whether they are using AI chatbots for mental health support. Many will be, often without considering the safety implications. Psychoeducation about appropriate and inappropriate AI use is now a relevant part of clinical practice. Patients benefit from knowing that these tools are not designed for their condition, that free versions carry the greatest risk, and that safer alternatives are available.
The conversation about AI in psychiatry is not going away. The question is whether clinicians will engage with it proactively or respond to the harms after the fact.
Conclusion
Three peer-reviewed papers (in JAMA Psychiatry, Acta Psychiatrica Scandinavica, and World Psychiatry) and a case report in Innovations in Clinical Neuroscience now form a convergent body of evidence. AI chatbots, as currently deployed, are associated with measurable harm in patients with psychosis, mania, suicidal ideation, and eating disorders.
The mechanism is not mysterious. Psychosis and mania impair the cognitive machinery that would allow a person to evaluate whether an AI’s response is safe. When that machinery is impaired, a tool that cannot recognise the clinical state becomes a risk rather than a resource.
AI will have a role in mental health, in the right contexts, for the right patients, with appropriate safeguards. But that role has a hard boundary. The boundary runs through psychosis, mania, and active suicidality. It is not being enforced. The evidence now says it needs to be.
Frequently Asked Questions
Are AI chatbots ever safe for people with mental health conditions?
For people who are clinically stable, not experiencing psychosis or mania, and using AI tools for general psychoeducation or symptom tracking, the risk profile is different. The specific concern raised by this research applies to people who are acutely unwell, particularly those with active psychosis, mania, suicidal ideation, or severe eating disorders. For these individuals, AI chatbots are not appropriate mental health resources in their current form.
Is ChatGPT specifically more dangerous than other chatbots for psychiatric patients?
The JAMA Psychiatry study specifically tested ChatGPT. Similar vulnerabilities likely exist across other general-purpose AI chatbots, as none are clinically designed or validated for psychiatric populations. Specialised mental health AI tools with clinical oversight are a different category, though the evidence base for those remains limited.
What is “reality testing” and why does it matter for AI safety?
Reality testing is the cognitive ability to distinguish internal beliefs and perceptions from external reality, the mechanism that allows us to question our own thoughts. In active psychosis, this ability is compromised, which means a patient cannot reliably evaluate whether what a chatbot tells them is safe or accurate. This is the central reason why AI chatbot use during a psychotic episode carries a distinct and serious risk.
What should I do if someone I care for is using an AI chatbot during a mental health crisis?
Redirect them toward a clinician or crisis service. If they are in India, the iCall helpline (9152987821) is available. If symptoms suggest active psychosis or mania (unusual fixed beliefs, significantly elevated mood, reduced need for sleep, disorganised speech or behaviour), this warrants prompt clinical assessment. You can book a teleconsultation at ManoMitra if you are unsure where to start.
Does this research mean AI should be banned from mental health use entirely?
No. The argument is that AI in mental health needs clinical boundaries, not elimination. Tools used in stable patients for psychoeducation, symptom tracking, or low-intensity support operate in a different risk context. Better regulation, clinical stratification, and honest labelling of what these tools can and cannot do is a more proportionate response than a blanket ban.
Where can I find a psychiatrist for myself or a family member?
You can book a consultation with Dr. Anindo Mitra at dranindomitra.com. ManoMitra offers teleconsultations across India for those who cannot access in-person care.
About the Author
Dr. Anindo Mitra is a Consultant Psychiatrist at Athena Behavioural Health, Gurugram. He completed his MD in Psychiatry from JIPMER, Puducherry. His clinical focus includes evidence-based pharmacotherapy, deprescribing, and the neurobiology of psychiatric disorders. He writes at dranindomitra.com on mental health education for the Indian public.
This post is for educational purposes only and does not constitute individualised medical advice. If you have concerns about your mental health, please consult a qualified clinician.

