MIT researchers develop humble AI to reduce overconfidence in medical decisions

2026-03-25
2 min read
New framework helps artificial intelligence systems admit uncertainty and work as collaborative coaches rather than authoritative oracles.

Artificial intelligence (AI) holds promise for helping doctors diagnose patients and choose personalized treatments. However, current AI systems often act overconfidently. They may give incorrect advice while appearing highly reliable, which can lead doctors to follow wrong suggestions even when their own judgment differs.

One solution is to make AI more humble. Humble systems would clearly show when they are not confident in a diagnosis or recommendation. They would then encourage doctors to gather extra information, such as additional tests or specialist opinions, before making a final decision.

Researchers at MIT created a framework that guides developers in building such humble AI. The approach adds several computational modules to existing systems. A key module is the Epistemic Virtue Score, which acts as a self-awareness check: it evaluates whether the AI's confidence matches the actual quality and completeness of the available data. Epistemic virtue means being intellectually honest about the limits of one's knowledge.

Intellectually honest AI

When the system detects that its confidence exceeds what the evidence supports, it pauses and flags the uncertainty. It may request specific tests, patient history, or consultation with a specialist. The goal is for AI to function as a coach or co-pilot rather than an all-knowing oracle. This keeps human doctors in control and encourages creative thinking.
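The article does not publish the framework's internals, but the behavior it describes can be sketched in a few lines. The class name, the evidence-completeness heuristic, and the `margin` threshold below are all illustrative assumptions, not the researchers' actual method:

```python
from dataclasses import dataclass

@dataclass
class HumilityCheck:
    """Hypothetical self-awareness check: compare a model's stated
    confidence against a crude score of how complete the evidence is.
    Purely illustrative; not the MIT framework's implementation."""
    margin: float = 0.1  # allowed gap before flagging (assumed value)

    def evidence_score(self, available: set, required: set) -> float:
        # Fraction of required inputs actually present (toy heuristic)
        return len(available & required) / len(required)

    def review(self, confidence: float, available: set, required: set) -> dict:
        support = self.evidence_score(available, required)
        if confidence > support + self.margin:
            # Confidence exceeds what the evidence supports:
            # pause, defer to the clinician, and request the missing inputs.
            return {"defer": True, "request": sorted(required - available)}
        return {"defer": False, "request": []}

check = HumilityCheck()
result = check.review(
    confidence=0.92,
    available={"vitals", "bloodwork"},
    required={"vitals", "bloodwork", "imaging", "history"},
)
# Only 2 of 4 required inputs are present (support 0.5), so a 0.92
# confidence triggers a deferral and a request for the missing data.
```

The point of the sketch is the control flow, not the scoring: instead of emitting an answer, the system returns a deferral plus a concrete request, keeping the clinician in the loop.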

The work also addresses problems with training data. Many AI models learn from electronic health records that were not designed for this purpose. These records often lack important context and exclude patients from rural areas or different backgrounds, which can introduce bias. Workshops at MIT bring together doctors, patients, data scientists, and others to examine datasets carefully and reduce such inequities.

The researchers now plan to test the framework on large hospital databases, including the MIMIC database from Beth Israel Deaconess Medical Center. The same ideas could apply to AI tools that analyze X-ray images or recommend treatments in emergency rooms.

This approach aims to create safer, more thoughtful artificial intelligence that supports rather than overrides human decision-making in medicine.

This research is published in BMJ Health &amp; Care Informatics.

#AIApplications


