LLMs and machine learning for genomics research

2024-12-04
Large language models (LLMs) such as GPT-4, along with other machine learning methods, are helping to advance genomics research.

Researchers at UC San Diego have shown that large language models (LLMs) like GPT-4 can speed up functional genomics.

Functional genomics studies what genes do and how they work together. A common method called gene set enrichment compares a new group of genes against databases of known gene sets to infer its function. However, this method cannot detect biology that is not already recorded in those databases.
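
To make the enrichment idea concrete, here is a minimal sketch of one common way to score the overlap between a new gene group and a known gene set: a one-sided hypergeometric test. It is an illustration only, not the procedure used in the study. The function name, the placeholder gene identifiers, and the background size are all assumptions.

```python
from math import comb


def enrichment_p_value(query_genes, reference_set, background_size):
    """Return (overlap, p) where p is the chance of seeing at least this
    much overlap if the query genes were drawn at random from the background.

    Assumes both gene collections come from the same background of
    `background_size` genes (hypothetical setup for illustration).
    """
    query = set(query_genes)
    reference = set(reference_set)
    overlap = len(query & reference)
    n = len(query)          # size of the new gene group
    K = len(reference)      # size of the known gene set
    N = background_size     # total genes considered

    # Sum the hypergeometric tail: P(X >= overlap).
    tail = sum(
        comb(K, i) * comb(N - K, n - i)
        for i in range(overlap, min(n, K) + 1)
    )
    return overlap, tail / comb(N, n)


if __name__ == "__main__":
    # Placeholder identifiers, not real genes.
    new_group = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
    known_set = ["GENE_A", "GENE_B", "GENE_C", "GENE_X", "GENE_Y"]
    overlap, p = enrichment_p_value(new_group, known_set, background_size=20)
    print(f"overlap={overlap}, p={p:.4f}")
```

A gene group with no counterpart in the reference database scores a p-value near 1 against every entry, which is the blind spot described above.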

Using artificial intelligence (AI), specifically LLMs, could cut down the time researchers spend on this task.

The researchers describe the methods and results of this study in a paper published in Nature Methods.

The team tested five LLMs and found that GPT-4 performed best, naming gene functions correctly 73% of the time. When given random genes, GPT-4 correctly declined to name a function 87% of the time, avoiding made-up answers, known as hallucinations. It also gave explanations for its choices.

The study suggests more work is needed, but LLMs could transform genomics by quickly creating new scientific ideas. The researchers made a website to help others use LLMs in their work.

Machine learning finds new patterns in the genome

In related news, researchers at the University of Toronto are using machine learning to study how human chromosomes are organized. This organization can affect health and diseases such as cancer.

The researchers developed a method called "Signature," which uses machine learning to find new patterns in the genome, the complete set of genetic material in a human. A paper published in Nature Communications describes the development of Signature and some preliminary tests.

Signature combines imaging with chromosome conformation capture (Hi-C), a technique that yields billions of reads of genetic data, making it possible to study many interactions at once. The researchers analyzed 62 data sets, each with over 3.8 million possible chromosome interactions.

"In supervised learning, you know your target. In unsupervised, you let the data speak," notes a researcher. The team used network clustering in the unsupervised approach to find patterns in the data.

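As a rough illustration of what unsupervised clustering on an interaction network can look like, the sketch below groups regions that are linked by strong interactions, using connected components as the simplest possible stand-in. This is not the Signature method, whose details are in the paper. The region names, scores, and threshold are hypothetical.

```python
def cluster_interactions(edges, threshold):
    """Group nodes connected by interactions scoring at least `threshold`.

    `edges` is an iterable of (region_a, region_b, score) tuples.
    Uses union-find to compute connected components; no labels are needed,
    so the grouping comes from the data alone.
    """
    parent = {}

    def find(node):
        # Register unseen nodes, then walk to the root with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for a, b, score in edges:
        root_a, root_b = find(a), find(b)
        if score >= threshold and root_a != root_b:
            parent[root_a] = root_b  # merge the two groups

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return sorted(sorted(group) for group in groups.values())


if __name__ == "__main__":
    # Hypothetical interaction scores between chromosome regions.
    interactions = [
        ("r1", "r2", 0.9),
        ("r2", "r3", 0.8),
        ("r3", "r4", 0.1),
        ("r4", "r5", 0.7),
    ]
    for cluster in cluster_interactions(interactions, threshold=0.5):
        print(cluster)
```

Real analyses would typically use a dedicated community-detection algorithm rather than plain connected components; the point here is only that groups emerge from the interaction data without a predefined target.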


