
When ChatGPT Becomes Your Doctor: The Regulatory Blind Spot Harming the Most Vulnerable
Main takeaway: People use general‑purpose AI to self‑diagnose. Current rules don’t clearly say who is responsible when that goes wrong. We need proportionate accountability and simple safety guardrails.
1) Regulatory reality — in one minute
At 2 a.m., two people upload the same mole photo to an AI.
Near a hospital: the AI is a convenience. In a primary‑care desert: the AI becomes the decision.
Today’s law:
- U.S. (FDA): A tool is a medical device only if it shows medical intended use in labeling, promotion, design, or how it’s distributed. [21 CFR § 801.4; 21 U.S.C. § 321(h)]
- EU (MDR): Same idea—intended purpose by the manufacturer. [MDR Art. 2; MDCG 2019‑11 rev.1]
- Important nuance: FDA can infer “objective intent” from websites, app‑store text, sales decks, etc., and has cited these in warning letters (e.g., Exer Labs, Feb 10, 2025). [warning letter]
The gap: People still use general AIs for medical decisions, but the product isn’t classified as a medical device. That leaves unclear responsibility when harm happens.
2) Why this matters
Clinical accuracy is uneven
- Med‑PaLM 2: up to 86.5% on USMLE‑style multiple choice; open‑ended clinical prompts score lower (often around 65%, depending on task and scoring). [Nature Medicine 2025]
- Symptom‑checker apps: ~57% appropriate triage (95% CI 52–61%). [BMJ Open 2019]
Access is unequal
- 86.9 million Americans live in primary‑care shortage areas (HPSAs). [HRSA Q4 FY 2025]
- For many, AI advice becomes de facto medical counsel.
A 58‑year‑old with chest tightness reads “could be anxiety or muscle strain,” waits, and dies of a heart attack. No doctor–patient relationship. No app marketed as medical. No obviously liable “manufacturer.” [1]
3) The liability landscape (plain English)
- Malpractice? Usually no—there’s no clinician involved.
- Product liability? Possibly—design defect or failure to warn (foreseeable misuse). [Restatement (Second) § 402A; Restatement (Third) § 2]
- Reality check: Success is uncertain for multi‑use tools that don’t claim a medical purpose. As of Oct 17, 2025, no U.S. court has issued a final decision in a case about injury from a general‑purpose AI used without health claims; several such cases are pending.
4) Existing levers (regulation and “soft law”)
U.S.
- Cures Act § 3060 / FDCA § 520(o): Some software sits outside device rules; FDA also uses enforcement discretion. [FDA Digital Health]
- Q‑Submission (Pre‑Sub): Non‑binding FDA feedback on red‑flag routing and uncertainty displays. [FDA Q‑Submission]
- Multiple‑Function Device Policy (2020): For apps mixing medical and non‑medical features. [FDA Guidance 2020]
EU AI Act (dates to know)
- Feb 2, 2025: Prohibitions & AI‑literacy measures start.
- Aug 2, 2025: General‑purpose AI (GPAI) provider duties begin.
- Aug 2, 2026: Most obligations apply.
- Aug 2, 2027: High‑risk obligations for AI in regulated products ramp in.
- Tools: Art. 95 (voluntary codes of conduct) and Art. 57 (AI regulatory sandboxes). [Art. 53; Art. 57; Art. 95; Art. 113]
Other jurisdictions (very short)
- Japan (PMDA): Treats qualifying AI as SaMD; sandbox “DASH” and change‑management pathways. [PMDA DASH]
- China (NMPA): AI/algorithms regulated under existing medical‑device law. [NMPA]
5) A simple “Tier 2” middle ground
When usage shows meaningful health risk, attach proportionate duties, even if the AI isn’t a regulated device. (How the triggers below could be computed is sketched just after the table.)
| Trigger | Threshold | How to measure |
|---|---|---|
| Health‑query volume | ≥ 1,000,000 health queries/month per jurisdiction | Internal, privacy‑safe metrics |
| Red‑flag prevalence | ≥ 5% include sentinel clusters (e.g., chest pain + shortness of breath) | Audited intent‑classifier logs |
| Escalation friction | ≥ 30% drop‑off after a red‑flag prompt | UX funnel analytics |
| Harm signal | Any serious adverse event (SAE) linked to advice plus a failed escalation | Complaint/SAE log + root‑cause analysis |
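These triggers are deliberately cheap to measure. As a rough illustration, the sketch below (Python; the metrics schema and all field names are hypothetical, not any real platform’s telemetry) evaluates the four thresholds from monthly, per‑jurisdiction aggregates only, which is also what keeps the telemetry privacy‑safe: no query text or user identifiers are needed.

```python
from dataclasses import dataclass

# Hypothetical monthly, per-jurisdiction aggregates. Field names are illustrative;
# only counts are collected (data minimization), never raw queries or user IDs.
@dataclass
class MonthlyHealthMetrics:
    health_queries: int               # queries classified as health-related
    red_flag_queries: int             # queries containing sentinel symptom clusters
    red_flag_prompts_shown: int       # sessions shown an escalation prompt
    red_flag_escalations: int         # sessions that completed escalation
    saes_with_failed_escalation: int  # serious adverse events linked to advice + failed escalation

def tier2_triggers(m: MonthlyHealthMetrics) -> dict:
    """Evaluate the four Tier-2 triggers against the thresholds in the table above."""
    red_flag_rate = m.red_flag_queries / m.health_queries if m.health_queries else 0.0
    drop_off = (1 - m.red_flag_escalations / m.red_flag_prompts_shown
                if m.red_flag_prompts_shown else 0.0)
    return {
        "health_query_volume": m.health_queries >= 1_000_000,
        "red_flag_prevalence": red_flag_rate >= 0.05,
        "escalation_friction": drop_off >= 0.30,
        "harm_signal": m.saes_with_failed_escalation > 0,
    }
```

For example, a jurisdiction with 1.2 million health queries, 6% red‑flag prevalence, and 35% drop‑off after escalation prompts would trip three of the four triggers.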
What Tier 2 requires (guardrails)
- Detect medical intent (with cautious thresholds).
- Communicate uncertainty (no faux certainty; link to evidence).
- Route red flags (one‑click to nurse lines/telehealth).
- Track equity (HPSA escalation parity, ≤ 8th‑grade reading level, multilingual support).
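The equity guardrail can be reduced to one auditable number. Here is a minimal sketch, assuming the platform can tag red‑flag sessions as HPSA or non‑HPSA at a coarse geographic level; the function and its inputs are hypothetical.

```python
def hpsa_escalation_parity(hpsa_escalations: int, hpsa_prompts: int,
                           other_escalations: int, other_prompts: int) -> float:
    """Ratio of red-flag escalation completion rates: HPSA users vs. everyone else.

    A ratio well below 1.0 means users in shortage areas are shown escalation
    prompts but complete them less often (for example, no reachable telehealth
    option), which is exactly the inequity Tier 2 asks platforms to surface.
    """
    hpsa_rate = hpsa_escalations / hpsa_prompts if hpsa_prompts else 0.0
    other_rate = other_escalations / other_prompts if other_prompts else 0.0
    return hpsa_rate / other_rate if other_rate else float("nan")
```

Publishing this ratio alongside the calibration cards described in section 6 would let outsiders compare platforms without seeing any raw query data.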
6) Making safety real (how to build it)
- Intent classifier: Use conservative thresholds to avoid missing red flags (a routing sketch follows this list).
- Adversarial defenses: Apply OWASP LLM Top‑10 mitigations (e.g., strip jailbreak instructions). [OWASP LLM Top 10]
- Output handling: Always say “not a diagnosis,” present ranges, and avoid definitive labels.
- Calibration monitoring: Post calibration cards quarterly; alarm on drift; default to escalation if uncertain (a calibration sketch follows this list). [Kadavath et al., 2022]
- Privacy: Keep telemetry GDPR‑compliant—data minimization, purpose limitation, short retention, DPIA where required. [GDPR Art. 5]
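To make the intent‑classifier and output‑handling items concrete, here is a minimal routing sketch. Everything in it is an assumption for illustration: the classifier interface, the 0.15 threshold, the sentinel clusters, and the escalation link are placeholders, not recommended values.

```python
# Minimal sketch of conservative red-flag routing. The classifier interface, the
# threshold, the sentinel clusters, and the escalation link are all placeholders.
RED_FLAG_THRESHOLD = 0.15  # deliberately low: prefer false alarms over missed emergencies
SENTINEL_CLUSTERS = {"chest_pain_with_dyspnea", "stroke_signs", "anaphylaxis"}

def handle_query(text: str, classifier) -> dict:
    """Escalate on any plausible red flag; answer everything else with explicit uncertainty."""
    # Hypothetical interface: returns {cluster_name: probability} for the query.
    scores = classifier.score_clusters(text)
    flagged = {c for c, p in scores.items()
               if c in SENTINEL_CLUSTERS and p >= RED_FLAG_THRESHOLD}
    if flagged:
        return {
            "action": "escalate",
            "message": ("Some of what you describe can be serious. "
                        "Please contact a clinician now; here is a one-click option."),
            "escalation_link": "https://example.org/nurse-line",  # placeholder endpoint
        }
    return {
        "action": "respond_with_uncertainty",
        "message": ("This is not a diagnosis. Several causes are possible; here is what "
                    "tends to distinguish them and when to seek in-person care."),
    }
```

The key design choice is the asymmetric threshold: a 15% estimated probability of a sentinel cluster is enough to escalate, because the cost of a missed emergency dwarfs the cost of an unnecessary nurse‑line prompt.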
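For the calibration‑monitoring item, one workable drift metric is expected calibration error (ECE) computed over a periodically spot‑reviewed sample of answers. This is a generic sketch, not the method of the cited paper; the 10‑bin choice and the 0.10 alarm bound are assumptions.

```python
import numpy as np

def expected_calibration_error(confidences: np.ndarray, correct: np.ndarray,
                               n_bins: int = 10) -> float:
    """Standard ECE: per-bin |accuracy - mean confidence|, weighted by bin size.

    `confidences` are per-answer model confidences in [0, 1]; `correct` is a
    boolean array from periodic expert spot review (the review process itself
    is outside this sketch).
    """
    edges = np.linspace(0.0, 1.0, n_bins + 1)[1:-1]  # interior bin edges
    bin_ids = np.digitize(confidences, edges)        # values in 0..n_bins-1
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            ece += mask.mean() * abs(correct[mask].mean() - confidences[mask].mean())
    return float(ece)

# Quarterly calibration card: publish ECE each quarter; if it drifts past an agreed
# bound (say 0.10), raise an alarm and default borderline cases to escalation.
```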
7) Broader fixes (AI isn’t the only answer)
- Telehealth reimbursement parity and after‑hours coverage
- Community health‑worker and nurse‑line expansion
- Mobile clinics and zero‑rated links to local services
- Public health‑literacy campaigns
8) What to do next
Lawmakers & regulators
- Clarify how “intended use” applies to general AIs.
- Use sandboxes (EU) and pre‑subs (U.S.) to test guardrails fast.
Platforms & developers
- Implement Tier‑2 guardrails where usage shows risk.
- Publish calibration and equity metrics.
- Document incident response and red‑flag routing in your quality management system (QMS).
Health systems & payers
- Offer low‑friction escalation endpoints (nurse lines, telehealth).
- Partner with platforms to close the loop for red‑flag users.
Conclusion
People in care deserts will keep asking general AIs health questions. Without proportionate guardrails, they carry the risk alone. We can keep innovation moving and add basic protections. The fixes are practical: detect intent, show uncertainty, route red flags, and measure equity—then prove it with data.
Transparency note: Composite scenarios used; links point to official sources. [1] The composite myocardial‑infarction vignette reflects patterns in the rural‑care‑delay literature; no specific platform is identified.
Sources (selection)
- FDA intended use — 21 CFR § 801.4; definition of “device” — 21 U.S.C. § 321(h)
- FDA Clinical Decision Support Guidance (2022) — link
- EU MDR Art. 2; MDCG 2019‑11 rev.1 — MDR; MDCG
- EU AI Act — Arts. 53, 57, 95, 113 — Art. 53; Art. 57; Art. 95; Art. 113
- HRSA Health Professional Shortage Areas (Q4 FY 2025) — link
- Med‑PaLM 2 performance — Nature Medicine (2025)
- Symptom‑checker triage accuracy — BMJ Open (2019)
- Objective‑intent enforcement example — FDA Warning Letter to Exer Labs (Feb 10, 2025) — link
- LLM calibration/miscalibration — Kadavath et al., 2022
- LLM security guardrails — OWASP LLM Top 10