GPT-4 performs as well as human radiologists
Oct. 02, 2024.
The diagnostic performance of GPT-4 on MRI scan reports of brain tumors is as good as that of human radiologists.
Researchers at Osaka Metropolitan University compared the diagnostic performance of GPT-4-based ChatGPT with that of human radiologists (two board-certified neuroradiologists and three general radiologists) on 150 preoperative brain tumor MRI reports and concluded that GPT-4 performed as well as the radiologists.
The study is described in a research paper published in European Radiology.
“GPT-4 exhibited good diagnostic capability, comparable to neuroradiologists in differentiating brain tumors from MRI reports,” conclude the researchers. “GPT-4 can be a second opinion for neuroradiologists on final diagnoses and a guidance tool for general radiologists and residents.”
The accuracy of both sets of diagnoses was evaluated based on the actual diagnoses of the tumors after removal. The evaluation results show a 73 percent accuracy for GPT-4, compared to 72 percent for neuroradiologists and 68 percent for general radiologists.
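As a rough illustration of how such an accuracy figure is computed, the sketch below counts the share of cases in which a predicted diagnosis matches the diagnosis confirmed after removal. The function name `accuracy` and the four toy cases are hypothetical and are not taken from the study, which covered 150 reports and may have scored matches differently.

```python
def accuracy(predicted, confirmed):
    """Return the fraction of cases where the prediction matches the confirmed diagnosis."""
    if len(predicted) != len(confirmed):
        raise ValueError("need exactly one prediction per case")
    if not confirmed:
        raise ValueError("need at least one case")
    # Count exact label matches, case by case.
    matches = sum(p == c for p, c in zip(predicted, confirmed))
    return matches / len(confirmed)


# Hypothetical toy data for illustration only (NOT from the study).
confirmed = ["glioblastoma", "meningioma", "schwannoma", "metastasis"]
predicted = ["glioblastoma", "meningioma", "meningioma", "metastasis"]

print(f"{accuracy(predicted, confirmed):.0%}")  # 3 of 4 correct: prints 75%
```

With this definition, an accuracy of 73 percent on 150 reports corresponds to roughly 110 matching diagnoses.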
The researchers emphasize that the diagnostic accuracy of GPT-4 seems to increase with the skill level of the human source of the input report: the accuracy with neuroradiologist reports was 80 percent, compared to 60 percent when using general radiologist reports.
Graduate student Yasuhito Mitsuyama, the lead author of the paper, said in the Osaka Metropolitan University press release that in the future “we intend to study large language models in other diagnostic imaging fields with the aims of reducing the burden on physicians, improving diagnostic accuracy, and using AI to support educational environments.”
Future AI applications in medicine
The instance of GPT-4 used in the study is a few months old (May 24 version). In view of the fast pace of development of Artificial Intelligence (AI) technology, and in particular the ongoing development of AI systems with enhanced reasoning ability, it seems likely that more spectacular results could materialize soon.
It seems plausible that AI systems could play a growing and eventually leading role in medical research and clinical practice.
For a fascinating overview of current and future AI applications in medicine, written by top experts in the field, see “The AI Revolution in Medicine: GPT-4 and Beyond” (2023).
2 thoughts on “GPT-4 performs as well as human radiologists”
The 'Large' in LLM is the answer. Such models, if tweaked properly and restricted from 'hallucinations', which typically stem from the need to generate answers regardless of accuracy (they are not trained to say "I don't know", isn't it sad), can do rigorous analysis.
I think this is a good point. Speaking of condensed matter, Philip Anderson quipped "more is different." Same here. When Big Matter or Big Data is big enough, interesting things begin to happen. I don't think language models are sufficient for real AGI, but they are necessary and might play a bigger role than we think at this moment.