Monday, July 28, 2025
Researchers at the National Institutes of Health (NIH) have developed an artificial intelligence (AI) agent powered by a large language model (LLM) that creates more accurate and informative descriptions of biological processes and their functions in gene set analysis than current systems.
The system, called GeneAgent, cross-checks its own initial predictions—also known as claims—for accuracy against information from established, expert-curated databases and returns a verification report detailing its successes and failures. The AI agent can help researchers interpret high-throughput molecular data and identify relevant biological pathways or functional modules, which can lead to a better understanding of how different diseases and conditions affect groups of genes individually and together.
AI-generated content is produced by LLMs trained on enormous amounts of text data from across the internet. LLMs use those data to recognize patterns and predict what words might follow each other in a sentence. However, LLMs are not designed to verify truth, meaning AI-generated content can be false, misleading, or fabricated, a phenomenon called AI hallucinations. Additionally, LLMs are prone to circular reasoning—fact-checking their generated results against their own data—which makes them sound more confident in the output even when the information is false.
Staving off AI hallucinations is important when using LLM tools for gene set analysis—the process of generating collective functional descriptions of grouped genes and their potential interactions. Previous studies that taught LLMs to answer genomic questions or summarize biological processes in a given gene set did not explicitly address hallucinations in the generated content.
GeneAgent mitigates this issue by taking its own claims and independently comparing them to established knowledge compiled in external, expert-curated databases. The research team first tested GeneAgent on 1,106 gene sets sourced from existing databases with known functions and process names. For each gene set, GeneAgent generated an initial list of functional claims. It then used its self-verification agent module to cross-check these claims against the curated databases and create a verification report that noted whether each claim was supported, partially supported, or refuted.
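The generate-then-verify loop described above can be pictured with a small sketch. This is not GeneAgent's implementation: the gene names, the CURATED lookup table, and the overlap rule used to pick a verdict are all hypothetical stand-ins chosen for illustration, and the real system queries external expert-curated databases rather than an in-memory dictionary.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    SUPPORTED = "supported"
    PARTIALLY_SUPPORTED = "partially supported"
    REFUTED = "refuted"


@dataclass(frozen=True)
class Claim:
    """A functional claim: these genes take part in this process."""
    genes: frozenset
    process: str


# Hypothetical stand-in for an expert-curated database:
# process name -> genes annotated to that process (placeholder names).
CURATED = {
    "process X": {"GENE_A", "GENE_B", "GENE_C"},
    "process Y": {"GENE_D"},
}


def verify(claim: Claim, curated: dict) -> Verdict:
    """Assumed rule: compare the claimed genes with the curated annotation."""
    annotated = curated.get(claim.process, set())
    if claim.genes and claim.genes <= annotated:
        return Verdict.SUPPORTED            # every claimed gene is annotated
    if claim.genes & annotated:
        return Verdict.PARTIALLY_SUPPORTED  # only some claimed genes match
    return Verdict.REFUTED                  # no curated evidence found


def build_report(claims: list, curated: dict) -> list:
    """Pair each claim with its verdict, mirroring a verification report."""
    return [(claim, verify(claim, curated)) for claim in claims]


claims = [
    Claim(frozenset({"GENE_A", "GENE_B"}), "process X"),
    Claim(frozenset({"GENE_A", "GENE_D"}), "process X"),
    Claim(frozenset({"GENE_A"}), "process Y"),
]
report = build_report(claims, CURATED)
for claim, verdict in report:
    print(f"{claim.process} {sorted(claim.genes)}: {verdict.value}")
```

The point of the sketch is that each verdict comes from a source outside the model that produced the claim, which is what separates this kind of check from an LLM re-reading its own output.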
To assess the accuracy of the self-verification step, the researchers next asked two human experts to manually review 10 randomly selected gene sets, comprising 132 claims in total, and judge whether GeneAgent’s self-verification reports were correct, partially correct, or incorrect. The experts determined that 92% of GeneAgent’s self-verification decisions were correct, indicating high performance in self-verification, especially when compared to GPT-4. Their detailed review confirmed the model’s effectiveness in minimizing hallucinations and generating more reliable analytical narratives.
The research team also examined a real-world application of GeneAgent to animal-model gene sets. When applied to seven novel gene sets derived from mouse melanoma cell lines, GeneAgent offered valuable insight into novel functionalities for specific genes. Such insights could support knowledge discovery, for example by pointing to potential new drug targets for diseases like cancer.
While LLM-based agents such as GeneAgent are still limited by the information available to them and cannot reason as humans do, GeneAgent’s capacity for self-driven fact-checking shows remarkable promise in mitigating AI hallucinations.
About the National Library of Medicine (NLM): NLM is a leader in research in biomedical informatics and data science and the world’s largest biomedical library. NLM conducts and supports research in methods for recording, storing, retrieving, preserving, and communicating health information. It creates resources and tools that are used billions of times each year by millions of people to access and analyze molecular biology, biotechnology, toxicology, environmental health, and health services information. Additional information is available at https://www.nlm.nih.gov.
About the National Institutes of Health (NIH): NIH, the nation’s medical research agency, includes 27 Institutes and Centers and is a component of the U.S. Department of Health and Human Services. NIH is the primary federal agency conducting and supporting basic, clinical, and translational medical research, and is investigating the causes, treatments, and cures for both common and rare diseases. For more information about NIH and its programs, visit www.nih.gov.
NIH…Turning Discovery Into Health®
Reference
Wang, Z., Jin, Q., Wei, CH. et al. GeneAgent: self-verification language agent for gene-set analysis using domain databases. Nat Methods (2025). https://doi.org/10.1038/s41592-025-02748-6