What the study found
The paper presents a protocol for evaluating ChatGPT’s ability to generate disease-centric biomedical associations. It also describes a self-consistency strategy for assessing generative reliability across ChatGPT models, and a use case for semantic verification using retrieval-augmented generation (RAG), a method in which an AI system retrieves relevant information and uses it to ground its answer.
Why the authors say this matters
The authors say the workflow helps establish whether content generated by other large language models is true, and that it exposes hallucination, meaning AI-generated information that is not supported by evidence. The study suggests this is useful for verifying biomedical associations and for addressing the limits of exact matching against ontology terms.
What the researchers tested
The researchers outlined how to generate associations, validate biological entities using biomedical ontologies, and verify associations using literature. They also included a self-consistency strategy across ChatGPT models and a RAG-enabled workflow powered by open-source large language models for semantic verification.
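The summary does not say how self-consistency is scored, so the following is only a minimal sketch under one plausible assumption: the same prompt is run several times, and each association is scored by the fraction of runs in which it appears. The function name, the (disease, entity) pair format, and the sample data are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def self_consistency(runs):
    """Score each association by the fraction of runs that produced it.

    `runs` is a list of runs; each run is a list of (disease, entity) pairs.
    This scoring rule is an assumption for illustration, not the paper's method.
    """
    if not runs:
        return {}
    counts = Counter()
    for run in runs:
        # Normalise case and whitespace, and count each pair once per run.
        counts.update({(d.strip().lower(), e.strip().lower()) for d, e in run})
    return {pair: n / len(runs) for pair, n in counts.items()}


# Hypothetical outputs from three repeated generations for one disease.
runs = [
    [("disease x", "gene a"), ("disease x", "gene b")],
    [("Disease X", "Gene A")],
    [("disease x", "gene a"), ("disease x", "gene c")],
]
scores = self_consistency(runs)
for pair, score in sorted(scores.items()):
    print(pair, round(score, 2))
```

Under this assumption, an association that appears in every run scores 1.0, while one that appears only once is a candidate for closer verification.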
What worked and what didn't
The abstract states that the protocol can support generation, ontology-based validation, literature verification, and semantic verification through RAG. It also states that the workflow is designed to address ontology exact-match limitations and to expose hallucination. The abstract does not provide quantitative results.
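To illustrate what an exact-match limitation can look like, here is a small sketch using a toy, made-up ontology. The identifier, the synonym list, and the helper names are hypothetical; the paper's actual ontologies and matching code are not described in the abstract.

```python
# Toy ontology for illustration only; the ID and synonyms are invented.
TOY_ONTOLOGY = {
    "TOY:0001": {
        "label": "type 2 diabetes mellitus",
        "synonyms": {"t2dm", "non-insulin-dependent diabetes"},
    },
}


def exact_label_match(term):
    """Return the ontology ID whose label equals the term exactly, else None."""
    key = term.strip().lower()
    for term_id, entry in TOY_ONTOLOGY.items():
        if entry["label"] == key:
            return term_id
    return None


def label_or_synonym_match(term):
    """Return the ontology ID whose label or listed synonym equals the term."""
    key = term.strip().lower()
    for term_id, entry in TOY_ONTOLOGY.items():
        if key == entry["label"] or key in entry["synonyms"]:
            return term_id
    return None


for term in ["Type 2 diabetes mellitus", "T2DM", "adult-onset diabetes"]:
    print(term, exact_label_match(term), label_or_synonym_match(term))
```

In this toy case the abbreviation is missed by label matching but found through the synonym list, and the third phrase is missed by both. Cases like the last one are the kind of gap that a semantic verification step is meant to address.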
What to keep in mind
This is a protocol paper, so the abstract describes an evaluation workflow rather than reporting outcome data. The only limitation the available summary mentions is that of exact ontology matching, which is the stated reason for adding semantic verification.
Key points
- The paper presents a protocol for evaluating ChatGPT in disease-centric biomedical association generation.
- It includes validation of biological entities using biomedical ontologies and verification using literature.
- The authors describe a self-consistency strategy to assess reliability across ChatGPT models.
- A RAG-enabled workflow with open-source large language models is used for semantic verification.
- The authors say the workflow can help expose hallucination and address ontology exact-match limitations.
Disclosure
- Research title: Protocol for evaluating ChatGPT in biomedical association generation