Researchers develop system to identify AI-generated radiology reports

by Gus Iversen, Editor in Chief | March 17, 2026


Researchers at the University at Buffalo have developed a system that distinguishes radiology reports written by clinicians from those generated by artificial intelligence, a capability intended to help detect falsified medical documentation and fraudulent insurance claims.

The work was led by Nalini Ratha, SUNY Empire Innovation Professor in the department of computer science and engineering at the University at Buffalo, alongside Ph.D. students Arjun Ramesh Kaushik and Tanvi Ranga. The team presented its findings at the GenAI4Health workshop held during the Conference on Neural Information Processing Systems in December.

“With generative AI becoming more capable of producing remarkably convincing radiology reports, there’s a greater risk of fabricated reports being used to falsify medical histories and support fraudulent claims,” Ratha said. “Radiology reports have highly specialized structure, vocabulary and stylistic norms, making general-purpose detectors unreliable. Therefore, our goal was to build a detection framework designed specifically for radiology that can distinguish clinician-written medical documentation from synthetic text before it reaches clinical or insurance workflows.”

As part of the project, the researchers assembled a data set of 14,000 pairs of chest X-ray reports, each pair consisting of one report written by a radiologist and one generated by AI. The synthetic reports were produced in two ways: by paraphrasing existing reports using large language models and by generating reports directly from radiographs using vision-language models.

The data set focuses on the findings section of reports, which typically contains detailed clinical observations and domain-specific terminology.

Using this data set, the team built a detection framework based on a BERT–Mamba architecture designed to separate stylistic patterns from clinical content. According to the researchers, large language models often replicate medical terminology accurately but differ from clinicians in writing style.

“AI systems leave subtle stylistic fingerprints such as patterns in phrasing, punctuation, and word choice that differ from how radiologists naturally write. By disentangling style from content and treating it as its own measurable feature, our model was able to detect those patterns with exceptional precision,” Kaushik said.

In testing, the system achieved Matthews correlation coefficient scores ranging from 92% to 100% when distinguishing human-written and AI-generated reports. The model also identified synthetic reports generated by AI systems it had not encountered during training.
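For readers unfamiliar with the metric, the Matthews correlation coefficient summarizes a binary classifier's true and false positives and negatives in a single value between -1 and 1, where 1 is perfect agreement and 0 is no better than chance. A score reported as 92% corresponds to 0.92. The following is a minimal Python sketch of the standard formula. The function name and the counts are hypothetical illustrations, not the researchers' code or results.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion-matrix counts.

    tp/tn: correctly flagged AI-generated / human-written reports
    fp/fn: human reports flagged as AI / AI reports missed
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: return 0.0 when a row or column of the matrix is empty.
    return numerator / denominator if denominator else 0.0


# Hypothetical counts for illustration only (not taken from the study).
score = matthews_corrcoef(tp=96, tn=96, fp=4, fn=4)
print(f"MCC = {score:.2f} ({score:.0%})")  # prints "MCC = 0.92 (92%)"
```

Unlike plain accuracy, the coefficient stays low unless the classifier does well on both classes, which is one reason it is often preferred for detection tasks.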

“What we found is LLMs tend to write in polished, expansive language, while clinicians write in concise, direct terms,” Ranga said.

The researchers plan to expand the data set to include additional radiology categories and a broader range of AI models as they prepare the framework for public release.


