CAIM Talk Apr 16th 2026: Kayhan Batmanghelich

CAIM talk by Kayhan Batmanghelich on vision-language models

Date: April 16th, 2026, 4pm

Location: Medical University of Vienna, Anna Spiegel Research Building, Seminar Room Level 3

Speaker: Kayhan Batmanghelich, Boston University

Title: From Generation to Auditing: Foundation Models in Medical Imaging

Abstract: Recent advances in foundation models are creating new opportunities in medical imaging by enabling capabilities that go beyond prediction alone. In this talk, I will focus on two such directions. First, generative models offer a powerful mechanism for improving data efficiency, providing a building block for model explainability, and serving as strong priors for solving ill-posed inverse problems. Second, foundation models can themselves be audited and improved to make them more reliable for clinical use. Through these examples, I will discuss how foundation models can support both the creation of richer imaging representations and the rigorous analysis of AI systems built on top of them.

The first part of the talk will focus on controllable generative modeling for volumetric medical imaging, with an emphasis on lung CT. I will present a unified line of work on high-resolution 3D CT generation that combines radiology reports with optional anatomical or abnormality-specific segmentation prompts, enabling flexible and fine-grained control over the synthesis process without requiring exhaustive annotations. These methods address core challenges in volumetric generation, including memory efficiency, long-context conditioning, and preservation of anatomically meaningful structures such as vessels, airways, fissures, and localized abnormalities. I will further show how such controllable generative models can be used not only for realistic image synthesis and data augmentation, but also as strong priors for downstream inverse problems, including training-free reconstruction of 3D CT volumes from sparse X-ray projections by integrating a text-conditioned diffusion prior with a differentiable physics-based forward model.

The second part of the talk will focus on auditing foundation models and uncovering their hidden failure modes. I will present LADDER, a framework that uses large language models to discover systematic biases in pretrained vision systems by reasoning over language derived from model activations, metadata, and preprocessing pipelines. Unlike traditional slice discovery methods that rely on predefined attribute banks, LADDER uses language as an interface for domain knowledge and common-sense reasoning, allowing it to identify a broader range of bias-inducing factors, including those introduced during data preparation. I will also briefly highlight our broader vision for domain-specific foundation models through Mammo-FM, a foundation model for mammography that unifies diagnosis, localization, report generation, and risk prediction within a single clinically grounded framework. Together, these projects illustrate a broader agenda for medical imaging AI: foundation models that are not only powerful and controllable, but also transparent, auditable, and clinically reliable.

Key papers

S. Ghosh et al. (2026) Mammo-FM: Breast-specific foundational model for Integrated Mammographic Diagnosis, Prognosis, and Reporting. arXiv:2512.00198
Code/GitHub:
S. Ghosh et al. (2024) Mammo-CLIP: A Vision Language Foundation Model to Enhance Data Efficiency and Robustness in Mammography. International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 632-642.
Code/GitHub: https://shantanu-ai.github.io/projects/MICCAI-2024-Mammo-CLIP/
S. Ghosh et al. (2025) LADDER: Language-driven slice discovery and error rectification in vision classifiers. Findings of the Association for Computational Linguistics: ACL 2025, pp. 22935-22970.
Code/GitHub: https://shantanu-ai.github.io/projects/ACL-2025-Ladder/index.html

Bio Sketch: Kayhan Batmanghelich, Ph.D., is an Assistant Professor in the Department of Electrical and Computer Engineering at Boston University. His research focuses on the intersection of artificial intelligence and healthcare, with an emphasis on medical imaging, explainable AI, and multimodal learning. He develops domain-specific foundation models that integrate radiological imaging, clinical data, and molecular information to support diagnosis, prognosis, and therapeutic decision-making. Dr. Batmanghelich has led multiple research projects supported by the NIH, NSF, and industry sponsors, and collaborates closely with clinicians to translate machine learning innovations into clinical workflows. He is a recipient of the NSF CAREER Award and a Google Faculty Research Award, and is a Junior Faculty Fellow at the Hariri Institute for Computing.

This is part of the CAIM Talks series.