Biomedical researchers often use RNA sequencing, a technique to measure which genes are active in individual cells, to create detailed maps of tissues, organs, and diseases. These maps come from millions of cells and require both biological knowledge and programming skills to analyze. To help, researchers at the Austrian Academy of Sciences and the Medical University of Vienna have created CellWhisperer, an AI method and software tool. It connects gene expression, the process where genes produce proteins or other molecules, with descriptive text from over a million biological samples. This lets users investigate complex biology through a simple chat box in English, avoiding the need for code.
From genes to text - and vice versa
CellWhisperer applies multimodal deep learning, an AI approach that combines different data types such as numbers from gene activity and words from descriptions, to data curated from public databases. It enables text-based searches, such as asking to see immune cells from inflamed colons in autoimmune diseases. The tool also includes a large language model that mimics conversations between biologists and data experts. Users can ask which genes are active in specific cells and receive comments on what this might mean biologically. Built into a user-friendly web interface based on the CELLxGENE browser, it is freely accessible online and open source on GitHub, with example datasets such as one on human embryonic development.
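For readers who want a feel for how a text-based search over cells could work in principle, here is a minimal sketch in Python. It assumes that a text query and each cell's gene activity have already been turned into vectors in one shared space, and it ranks cells by cosine similarity to the query. The vectors, cell names, and the function `rank_cells` are made up for illustration; this is not CellWhisperer's actual code or interface, and the article does not say that the tool works exactly this way.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def rank_cells(query_vec, cell_vecs):
    """Return (cell_id, score) pairs sorted from best to worst match.

    Hypothetical helper: assumes text and cells share one embedding space.
    """
    scores = {cell_id: cosine(query_vec, vec) for cell_id, vec in cell_vecs.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Toy, invented embeddings standing in for the output of a trained model.
cell_embeddings = {
    "cell_A": [0.9, 0.1, 0.0],
    "cell_B": [0.1, 0.8, 0.1],
    "cell_C": [0.7, 0.3, 0.1],
}
# Invented vector standing in for an embedded text query.
query_embedding = [1.0, 0.2, 0.0]

ranking = rank_cells(query_embedding, cell_embeddings)
for cell_id, score in ranking:
    print(f"{cell_id}: {score:.3f}")
```

In a real system the vectors would come from trained models rather than being typed in by hand, and the ranking would run over very large numbers of cells.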
To show its value, the tool was applied to single-cell data from human embryos, where it identified time points, cell groups, and marker genes linked to organ formation. Some of these marker genes were already known, while others are new candidates. As a proof of concept, CellWhisperer supports exploratory research by giving quick insights into datasets, though its results need to be checked with standard methods. It marks a step toward AI assistants that help scientists understand cell behavior and the foundations of disease.
The researchers have described CellWhisperer and outlined the methods and results of this study in a paper published in Nature Biotechnology.