CAIM talk by Kayhan Batmanghelich on vision-language models
Date: April 16th, 2026, 4pm (TBC)
Location: Medical University of Vienna, Anna Spiegel Research Building, Seminar Room Level 3
Speaker: Kayhan Batmanghelich, Boston University
Title: From Generation to Auditing: Foundation Models in Medical Imaging
Abstract: Recent advances in foundation models are creating new opportunities in medical imaging by enabling capabilities that go beyond prediction alone. In this talk, I will focus on two such directions. First, generative models offer a powerful mechanism for improving data efficiency, providing a building block for model explainability, and serving as strong priors for solving ill-posed inverse problems. Second, foundation models can themselves be audited and improved to make them more reliable for clinical use. Through these examples, I will discuss how foundation models can support both the creation of richer imaging representations and the rigorous analysis of AI systems built on top of them.
The first part of the talk will focus on controllable generative modeling for volumetric medical imaging, with an emphasis on lung CT. I will present a unified line of work on high-resolution 3D CT generation that combines radiology reports with optional anatomical or abnormality-specific segmentation prompts, enabling flexible and fine-grained control over the synthesis process without requiring exhaustive annotations. These methods address core challenges in volumetric generation, including memory efficiency, long-context conditioning, and preservation of anatomically meaningful structures such as vessels, airways, fissures, and localized abnormalities. I will further show how such controllable generative models can be used not only for realistic image synthesis and data augmentation, but also as strong priors for downstream inverse problems, including training-free reconstruction of 3D CT volumes from sparse X-ray projections by integrating a text-conditioned diffusion prior with a differentiable physics-based forward model.
The second part of the talk will focus on auditing foundation models and uncovering their hidden failure modes. I will present LADDER, a framework that uses large language models to discover systematic biases in pretrained vision systems by reasoning over language derived from model activations, metadata, and preprocessing pipelines. Unlike traditional slice discovery methods that rely on predefined attribute banks, LADDER uses language as an interface for domain knowledge and common-sense reasoning, allowing it to identify a broader range of bias-inducing factors, including those introduced during data preparation. I will also briefly highlight our broader vision for domain-specific foundation models through Mammo-FM, a foundation model for mammography that unifies diagnosis, localization, report generation, and risk prediction within a single clinically grounded framework. Together, these projects illustrate a broader agenda for medical imaging AI: foundation models that are not only powerful and controllable, but also transparent, auditable, and clinically reliable.
Bio Sketch: Kayhan Batmanghelich, Ph.D., is an Assistant Professor in the Department of Electrical and Computer Engineering at Boston University. His research focuses on the intersection of artificial intelligence and healthcare, with an emphasis on medical imaging, explainable AI, and multimodal learning. He develops domain-specific foundation models that integrate radiological imaging, clinical data, and molecular information to support diagnosis, prognosis, and therapeutic decision-making. Dr. Batmanghelich has led multiple research projects supported by the NIH, NSF, and industry sponsors, and collaborates closely with clinicians to translate machine learning innovations into clinical workflows. He is a recipient of the NSF CAREER Award and the Google Faculty Research Award, and is a Junior Faculty Fellow at the Hariri Institute for Computing.
This is part of the CAIM Talks series.