ChatGPT is getting better at citing research, but it still sometimes makes stuff up

Publicly released:
International
Photo by Jonathan Kemper on Unsplash

ChatGPT's newest model is better at citing real scientific studies to support its advice, but it still references fake articles, according to international researchers. To test the AI chatbot's capacity to help create medical training content, the researchers put both GPT-3.5 and its newer model, GPT-4, to the test: they asked the models questions about learning health systems, asked them to cite journal articles to back up their claims, and then verified whether the cited sources were legitimate. The researchers report that 98% of the references GPT-3.5 gave were fake, compared with 20.6% for GPT-4. They conclude that GPT-3.5 should not be used to help create medical training content, and that GPT-4 should only be used with humans manually verifying its claims.
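The verification step the researchers describe lends itself to partial automation. As an illustration only (the study summary does not describe the researchers' exact checking method), a cited article's DOI can be looked up in the Crossref registry, which returns metadata for registered articles and a 404 for identifiers that do not exist. A minimal Python sketch, with a made-up example DOI:

    import requests

    def doi_exists(doi: str) -> bool:
        # Query the public Crossref API; a 200 response means the DOI is
        # registered, a 404 means it is not. Illustrative only -- not the
        # method used in the JAMA Network Open study.
        resp = requests.get(f"https://api.crossref.org/works/{doi}", timeout=10)
        return resp.status_code == 200

    # Hypothetical DOI a chatbot might cite (made up for this example)
    citation_doi = "10.1001/jamanetworkopen.2023.0000"
    print("registered" if doi_exists(citation_doi) else "not found in Crossref")

Note that a failed lookup is only a red flag, not proof of fabrication: real references can lack DOIs, and fabricated ones can reuse real-looking identifiers, which is why the researchers recommend human verification.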

Attachments


Research: JAMA Network Open, web page (the URL will go live after the embargo ends)
Journal/conference: JAMA Network Open
Research: Paper
Organisation/s: Learning Health Community, USA
Funder: None reported.