Researchers develop system to identify AI-generated radiology reports

by Gus Iversen, Editor in Chief | March 17, 2026
Researchers at the University at Buffalo have developed a system designed to distinguish radiology reports written by clinicians from those generated by artificial intelligence, a capability intended to help detect falsified medical documentation and fraudulent insurance claims.

The work was led by Nalini Ratha, SUNY Empire Innovation Professor in the department of computer science and engineering at the University at Buffalo, alongside Ph.D. students Arjun Ramesh Kaushik and Tanvi Ranga. The team presented its findings at the GenAI4Health workshop held during the Conference on Neural Information Processing Systems in December.

“With generative AI becoming more capable of producing remarkably convincing radiology reports, there’s a greater risk of fabricated reports being used to falsify medical histories and support fraudulent claims,” Ratha said. “Radiology reports have highly specialized structure, vocabulary and stylistic norms, making general-purpose detectors unreliable. Therefore, our goal was to build a detection framework designed specifically for radiology that can distinguish clinician-written medical documentation from synthetic text before it reaches clinical or insurance workflows.”

As part of the project, the researchers assembled a data set containing 14,000 pairs of chest X-ray reports: one written by radiologists and one generated by AI. Synthetic reports were produced in two ways: by paraphrasing existing reports using large language models and by generating reports directly from radiographs using vision-language models.

The data set focuses on the findings section of reports, which typically contains detailed clinical observations and domain-specific terminology.

Using this data set, the team built a detection framework based on a BERT–Mamba architecture designed to separate stylistic patterns from clinical content. According to the researchers, large language models often replicate medical terminology accurately but differ from clinicians in writing style.

“AI systems leave subtle stylistic fingerprints such as patterns in phrasing, punctuation, and word choice that differ from how radiologists naturally write. By disentangling style from content and treating it as its own measurable feature, our model was able to detect those patterns with exceptional precision,” Kaushik said.

In testing, the system achieved Matthews correlation coefficient scores ranging from 92% to 100% when distinguishing human-written and AI-generated reports. The model also identified synthetic reports generated by AI systems it had not encountered during training.
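
For readers unfamiliar with the metric, the Matthews correlation coefficient summarizes a binary classifier's confusion matrix in a single value between -1 and 1, often reported as a percentage, where 1 (100%) means every report was labeled correctly. The sketch below is a minimal illustration of the standard formula, not the researchers' evaluation code; the function name and the counts in the usage lines are hypothetical and were chosen only to show how a score of 92% could arise.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Standard MCC from confusion-matrix counts.

    tp: AI-generated reports correctly flagged as AI
    tn: clinician-written reports correctly passed as human
    fp: clinician-written reports wrongly flagged as AI
    fn: AI-generated reports wrongly passed as human
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Conventional fallback when a row or column of the matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts, not taken from the study.
perfect = matthews_corrcoef(tp=100, tn=100, fp=0, fn=0)
near_perfect = matthews_corrcoef(tp=96, tn=96, fp=4, fn=4)
print(f"{perfect:.2f} {near_perfect:.2f}")
```

Unlike plain accuracy, the coefficient stays informative when one class heavily outnumbers the other, which is one reason it is commonly used for detection tasks of this kind.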

“What we found is LLMs tend to write in polished, expansive language, while clinicians write in concise, direct terms,” Ranga said.

The researchers plan to expand the data set to include additional radiology categories and a broader range of AI models as they prepare the framework for public release.
