Some AI chatbots may be as good as we are at judging mood from faces

Publicly released:
International
CC-0

Using famous painted and photographed portraits combined with binary-choice questions about the subject’s mood or feelings, Israeli researchers found that some artificial intelligence (AI) chatbots (ChatGPT-4o, Grok and Gemini) interpret emotions almost indistinguishably from humans, while Claude and Mistral are less similar. The team compared five chatbots with humans across five scenarios and 43,200 simulations, judging facial pictures based on variation in perceived competence, benevolence, integrity, and demographics. In judging mood, the chatbots were more consistent than people, the scientists say, and more discerning: people tend to collapse individual features together into a global 'good person' impression, while the chatbots assessed the different dimensions separately. The researchers hope their findings will help develop emotionally sensitive AI chatbots in the future.

News release

From: The Royal Society

A closer look at how large language models 'trust' humans: patterns and biases

Large language models increasingly advise on human-related decisions, yet we know little about how they “trust” people. Across five real-world scenarios and 43,200 simulations, models generally reward higher competence and integrity, echoing patterns seen in human participants. Yet their trust is much more “by-the-book”: responses are more extreme, more internally consistent, and less shaped by the human halo effect that blends traits into a single global impression. Models also differ from one another, with scenario-dependent weighting of trust cues. Most concerning, demographic labels can systematically shift some models’ trust, underscoring fairness risks in real-world deployment.

Can’t read my poker face: AI models may be able to read your face like an open book. Using famous painted and photographed portraits combined with binary-choice questions about the subject’s mood or feelings, researchers found that some AI models, including ChatGPT-4o, Grok and Gemini, interpreted emotions almost indistinguishably from humans, while Claude and Mistral showed some divergence. The findings “offer a foundation for the development of emotionally competent agents capable of operating in socially nuanced environments”, the authors said.

Attachments

Note: Not all attachments are visible to the general public. Research URLs will go live after the embargo ends.

Research: The Royal Society, Web page (the URL will go live at some point after the embargo ends)
Journal/conference: Proceedings of the Royal Society A
Research: Paper
Organisation/s: The Hebrew University of Jerusalem, Israel
Funder: The authors received no funding for this study.