The next leap for AI scribes provides eyes in the clinic

Publicly released:
Australia; SA

The introduction of vision-enabled artificial intelligence (AI) to medical scribes – the recording devices used by doctors to document meetings with patients in real time – could increase the accuracy of patient notes and save valuable time for clinicians. Researchers found that a vision-enabled AI scribe, employing a combination of Google’s Gemini model and Ray-Ban Meta smart glasses, substantially improved the documentation accuracy of pharmacist-patient consultations and reduced omissions and errors in clinical notes.

News release

From: Flinders University

The introduction of vision-enabled artificial intelligence (AI) to medical scribes – the recording devices used by doctors to document meetings with patients in real time – could increase the accuracy of patient notes and save valuable time for clinicians.

A Flinders University study, published in npj Digital Medicine, has found that AI medical scribes already reduce some administrative work that takes time away from patients, but these devices have the capacity to do more when fitted with visual recording apparatus.

Researchers from Flinders’ College of Medicine and Public Health found that a vision-enabled AI scribe, employing a combination of Google’s Gemini model and Ray-Ban Meta smart glasses, substantially improved the documentation accuracy of pharmacist-patient consultations and reduced omissions and errors in clinical notes.

“AI scribes are already helping clinicians by listening to consultations, but healthcare involves far more than spoken words,” says research author Bradley Menz, an academic pharmacist in Flinders’ College of Medicine and Public Health.

“A lot of clinically important information is visual. Important visual cues during consultations include patients’ medicine containers, prescriptions and devices, as well as their body language. When an AI system can use both what it hears and what it sees in these consultations, it captures more of the details that matter for patient care.”

In the study, 10 clinical pharmacists recorded 110 ‘mock’ medication-history interviews, which contained more than 100 different medicine containers, including tablets, capsules, injections and creams.

Researchers wore Ray-Ban Meta smart glasses to record the interviews before passing the video footage to the AI scribe, which was developed using Google’s Gemini AI model.

An AI scribe that analysed both video and audio achieved 98 per cent accuracy, compared with 81 per cent when the same system processed only audio information.

A significant benefit was capturing medication strength and form, which are crucial details for safe dosing. The AI scribe with video input captured this information 97 per cent of the time, while audio-only recordings fell to 28 per cent.

“This is an augmented tool, not a replacement for clinical judgement,” says Mr Menz. “The clinician still needs to review and sign off the document.

“The AI scribe can contain a verification step, take screenshots of medication packages, and generate a full spoken transcript, giving the health professional a much stronger basis for checking what the AI has produced.”

Senior author Associate Professor Ashley Hopkins says the study may point to the next stage of AI scribe usage in health care.

“AI scribes have gained traction because they reduce the burden of documentation and give clinicians more time with their patients. These findings suggest that the next step – when the scribe can see as well as hear – produces a more accurate and complete draft,” says Associate Professor Hopkins. “This means less time editing AI documentation and even more time focusing on patient care.

“These findings suggest the next step may be that all scribe systems can interpret visual information as well as speech, which could open the door to wider clinical uses.”

The authors say the study has some limitations and underlines the need for human oversight and careful governance before these tools are adopted more broadly. The paper also highlights privacy, consent, data security and workflow integration as important issues that will need to be addressed as vision-enabled AI scribes move closer to practice.

The paper – *Vision-Enabled AI scribes reduce omissions in clinical conversations: evidence from simulated medication histories, by Bradley Menz, Nicholas Scarfo, Natansh Modi (University of South Australia), Erik Cornelisse, Lee Li, Jin Quan Eugene Tan, Jimit Gandhi (University of South Australia), Dorsa Maher, Dib Kousa, Kezia Daniel, Vidya Menon, Stephen Bacchi, Ross McKinnon, Michael Wiese (University of South Australia), Andrew Rowland, Michael Sorich and Ashley Hopkins – was published in npj Digital Medicine (2026). https://doi.org/10.1038/s41746-026-02494-9* Please note that this is an unedited version of the manuscript, released to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Multimedia

Bradley Menz and Associate Professor Ashley Hopkins, Flinders University

Attachments


Research Springer Nature, Web page
Journal/conference: npj Digital Medicine
Research: Paper
Organisation/s: Flinders University, Adelaide University
Funder: The PhD scholarship of B.D.M. is supported by the National Health and Medical Research Council, Australia (APP2030913). A.M.H. holds an Emerging Leader Investigator Fellowship, National Health and Medical Research Council, Australia (APP2008119). M.J.S. is supported by a Beat Cancer Research Fellowship from the Cancer Council South Australia. S.B. is supported by a Fulbright Scholarship.