When Doctors Outsource Clinical Judgment to ChatGPT: The Governance Gap We Can't Ignore
A response to Jim Katzaman's "AI at the Bedside: Doctors Face a New Crossroads"
Foreword: Jim Katzaman wrote about mid-career physicians navigating AI adoption, asking important questions about skill gaps, trust, and shifting roles. He’s right to ask those questions. In fact, his questions reminded me of one of the most urgent problems we’re facing right now. This piece is part one of my response.
What Happened in That Exam Room
A few days ago, my friend, a cancer survivor, went to see her oncologist for a cancer risk consultation. The kind of appointment where you expect decades of medical training, clinical reasoning, and careful judgment to converge.
Instead, she watched her oncologist open ChatGPT.
Not as a reference tool. Not as a second opinion. As the primary decision-making system.
The doctor used ChatGPT to look up cancer risk calculation methods, typed in her patient information to estimate risk levels, asked for medication options, checked side effects, and confidently prescribed medication based on the chatbot’s output.
My friend sat there thinking: “Is my cancer prevention plan being generated by the same tool people use to write vacation itineraries?”
She works in AI. She builds AI systems. She loves the technology.
But this crossed a line.
This Was NOT Evidence-Based Medicine
Let me be very clear about what that oncologist was using:
NOT an FDA-approved clinical decision support system.
NOT a validated risk calculator.
NOT a regulated medical device.
NOT evidence-based medicine.
This was ChatGPT. An unregulated chatbot. Being used to calculate disease risk and prescribe medication.
Jim’s article asks whether doctors can trust AI while protecting patient safety. That’s a good question.
But here’s a better one: Why are we even having a conversation about “trust” when doctors are using consumer AI tools for clinical decisions without any governance infrastructure in place?
The Governance Gap Isn’t Abstract Anymore
AI governance can seem abstract: a risk here, a hypothetical scenario there.
But when something like this happens between two humans, between a doctor and a patient? That’s no longer abstract. That’s personal.
Algorithms are personal.
I spent a year as AI Policy Advisor to Senator Bill Cassidy, Chairman of the Senate HELP Committee. I sat in meetings about healthcare AI regulation. I attended conferences on clinical AI deployment.
And then I heard about this oncologist, and I realized: the governance gap isn’t closing fast enough.
Because if clinicians are already:
Offloading risk calculations to general-purpose LLMs.
Relying on unvalidated outputs for medication choices.
Using consumer AI tools instead of FDA-cleared systems.
Then we’re not “integrating AI into healthcare.”
We’re outsourcing clinical judgment to systems never designed for medicine and hoping no one gets hurt.
What Jim Got Right (And Where We Need to Go Deeper)
Jim’s article captures the physician perspective beautifully. The skill gaps are real. The training needs are urgent. The identity questions about what it means to be a doctor in an AI era matter deeply.
AI isn’t just a technological shift. It’s a cultural, ethical, and professional one.
Which is why we need AI governance infrastructure in place BEFORE deployment.
The oncologist using ChatGPT? That’s not a skill gap problem. That’s not a training problem.
That’s a governance failure.
And Jim’s questions about physician adaptation become even more critical when we realize doctors are making these choices right now, in real time, without the frameworks they need.
What Should Have Happened
Before any AI tool touches patient care, health systems need four things:
1. Risk Classification
Is this high-stakes clinical decision-making?
Cancer risk assessment and medication prescription? That’s high stakes.
Administrative scheduling? That’s low stakes.
You don’t use the same governance framework for both.
The oncologist should have known ChatGPT is explicitly NOT cleared for clinical use. OpenAI’s own terms of service say so.
2. Vendor Evaluation
Is this tool FDA-cleared for this specific use case?
Does it have clinical validation studies?
Can it produce audit trails?
What happens when it makes a mistake?
ChatGPT has none of this. It’s a general-purpose language model trained on internet text. It wasn’t designed for medicine. It wasn’t tested for medicine. It has no medical liability framework.
Using it for cancer risk assessment is like using a calculator app to perform surgery. The tool isn’t designed for the task.
3. Usage Policies
What can AI be used for in this health system?
What can’t it be used for?
Who approves new tools?
What’s the training requirement before clinicians can use them?
Most health systems don’t have these policies yet. So doctors are making up their own rules. Some use ChatGPT for clinical decisions. Some don’t. It’s chaos.
4. Oversight Mechanisms
Who’s monitoring what AI tools are being used?
What happens when things go wrong?
How do we investigate adverse events?
Where’s the accountability?
Right now, in most health systems, the answer is: Nobody. Nothing. We don’t. Nowhere.
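To make the first and third items concrete, here is a minimal sketch in Python of how a health system might write a risk classification and usage policy down as something checkable, rather than something each clinician improvises. Every use case, risk tier, and tool name in it is a hypothetical placeholder, not a reference to any real product, policy, or regulation.

```python
# Minimal illustrative sketch: an AI usage policy encoded as data, so
# "which tool is allowed for which task" becomes a checkable rule.
# All use cases, tiers, and tool names are hypothetical placeholders.

RISK_TIERS = {
    "cancer_risk_assessment": "high",   # clinical decision-making
    "medication_selection": "high",
    "visit_note_drafting": "medium",    # clinician reviews the output
    "appointment_scheduling": "low",    # administrative
}

APPROVED_TOOLS_BY_TIER = {
    "high": {"validated_clinical_risk_calculator"},  # cleared, validated systems only
    "medium": {"ambient_scribe_tool"},
    "low": {"scheduling_assistant", "ambient_scribe_tool"},
}

def is_use_permitted(use_case: str, tool: str) -> bool:
    """Allow a tool only if it is approved for the use case's risk tier."""
    tier = RISK_TIERS.get(use_case)
    if tier is None:
        return False  # unknown use cases default to "not permitted"
    return tool in APPROVED_TOOLS_BY_TIER[tier]

# A general-purpose chatbot is on no approved list, so a high-stakes use
# like cancer risk assessment is rejected outright.
print(is_use_permitted("cancer_risk_assessment", "general_purpose_chatbot"))  # False
print(is_use_permitted("appointment_scheduling", "scheduling_assistant"))     # True
```

Even a toy policy like this forces the questions that matter: who decides the tiers, who maintains the approved list, and what happens when a clinician reaches for a tool that isn’t on it.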
The Efficiency Trap
The oncologist probably thought she was being efficient. Innovative, even.
Jim’s article talks about how AI saves time on research and clerical work. That’s true. AI can absolutely make healthcare more efficient.
But efficiency without safety isn’t innovation. It’s risk.
The article summarizes this dilemma beautifully: “Studies show humans get tired after a hundred patient consultations, while AI gets better with time when faced with the same algorithms.”
That’s partially true. AI doesn’t get tired.
But AI also doesn’t know when it’s wrong. It doesn’t have clinical judgment. It doesn’t understand context. It can’t tell the difference between a research paper and a Reddit post when it’s synthesizing information.
And ChatGPT specifically? It hallucinates. Regularly. Confidently. With no indication that it’s making things up.
You want that system calculating your cancer risk?
The Trust Question Needs Context
Jim asks: “Can doctors rely on AI while protecting patient safety?”
It’s a crucial question, and it connects directly to the governance gap I’m talking about.
The question assumes AI tools in healthcare are validated, regulated, and appropriate for clinical use. Most aren’t.
So the question becomes: “How do we help doctors understand which AI tools are trustworthy for clinical decisions, and what governance infrastructure do we need to make that clear?”
Dr. Aguilar said patients now trust AI more than their doctors. That’s concerning.
Not because AI can’t be trustworthy. It can be, if it’s designed, tested, and deployed correctly.
But because right now, most AI in healthcare isn’t any of those things. And if patients are trusting AI blindly while doctors are using unvalidated tools, we have a perfect storm of risk.
Part II, where I discuss what AI should look like in healthcare, is available now.
Read Jim’s full piece here.
P.S. If your team is dabbling with AI but nobody actually understands what’s being built, we need to talk. This is exactly what my AI governance consulting fixes...before you become another statistic in that 95%.