Research AI AMIE Becomes Multimodal: Medical Diagnosis with Images, Text, and Intelligent Dialogue

AMIE, Google’s research diagnostic AI, is entering a new era: It can now intelligently process and reason about medical images in real-time conversations. Built on the multimodal Gemini 2.0 Flash model, this upgraded version emulates how real clinicians gather and interpret data – including photos, lab results, and documents – to form diagnoses.

In an expert study covering 105 multimodal diagnostic scenarios, AMIE was evaluated against primary care physicians (PCPs). AMIE matched or exceeded the physicians in diagnostic accuracy, clinical reasoning, and even empathy. Specialists also rated AMIE’s interpretation of visual data and its management plans higher than those of the PCPs.

A robust simulation environment allowed researchers to accelerate development and testing. Simulated patient interactions, complete with dermatological images (from the SCIN dataset) and ECG data (from PTB-XL), helped train AMIE to ask for the right information at the right time – and make the most of it.

Preliminary evaluations with Gemini 2.5 Flash, the newer and more capable model, showed further gains: Top-3 diagnostic accuracy rose from 59% to 65%, a gain of six percentage points, and the quality of proposed management plans improved as well. Importantly, AMIE maintained its already low rate of hallucinations – statements unsupported by provided data.
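
For readers unfamiliar with the metric, top-3 accuracy counts a case as correct when the true diagnosis appears among the model's three highest-ranked guesses. The Python sketch below illustrates the idea only. The function name, the toy cases, and the exact string matching are all assumptions made for illustration; they are not AMIE's evaluation code or data, and the study may have judged matches differently.

```python
from typing import Sequence


def top_k_accuracy(
    ranked_predictions: Sequence[Sequence[str]],
    truths: Sequence[str],
    k: int = 3,
) -> float:
    """Fraction of cases whose true diagnosis is among the first k ranked guesses.

    Assumption: a guess matches only if its label equals the true label exactly.
    """
    if len(ranked_predictions) != len(truths):
        raise ValueError("need one ranked list per ground-truth diagnosis")
    if not truths:
        raise ValueError("need at least one case")
    hits = sum(
        truth in predictions[:k]
        for predictions, truth in zip(ranked_predictions, truths)
    )
    return hits / len(truths)


# Hypothetical toy cases, invented for this example.
ranked = [
    ["eczema", "psoriasis", "tinea"],                       # truth ranked 2nd
    ["atrial fibrillation", "atrial flutter", "sinus tachycardia"],  # ranked 1st
    ["contact dermatitis", "eczema", "scabies", "psoriasis"],        # ranked 4th
    ["urticaria"],                                          # ranked 1st
]
truths = ["psoriasis", "atrial fibrillation", "psoriasis", "urticaria"]

top3 = top_k_accuracy(ranked, truths, k=3)
top1 = top_k_accuracy(ranked, truths, k=1)
print(f"top-1: {top1:.0%}, top-3: {top3:.0%}")
```

On these four invented cases, three true diagnoses fall within the top three guesses and two are ranked first, so top-3 accuracy is higher than top-1. The same logic explains why a top-3 figure such as 65% says the correct diagnosis was on the shortlist, not that it was the first suggestion.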

This advancement brings AMIE a step closer to real-world use. While the current study used simulated patients and environments, a clinical trial with Beth Israel Deaconess Medical Center is already underway to test AMIE in real medical settings.

Still, the research team emphasizes that real-time, video-based doctor-patient communication remains unmatched – chat-based systems inherently limit nuance and non-verbal cues. Development of real-time audio-visual capabilities is part of AMIE’s roadmap.

These multimodal upgrades join recent advances like longitudinal care reasoning, moving AMIE toward becoming a comprehensive conversational AI for healthcare.

By teaching AMIE to “see,” Google is expanding what diagnostic AI can do – not just listen, but also look, interpret, and advise responsibly. This work reflects a commitment to safe, validated, and useful AI support for clinicians and patients alike.