Generative AI for databases

Jul. 10, 2024.

An easier way to analyze complex tabular data

About the writer

Amara Angelica

Electronics engineer and inventor

A new tool enables someone to perform complicated statistical analyses on tabular data using just a few keystrokes (credits: MIT News; iStock)

MIT researchers have developed a new tool that makes it easier for database users to perform complicated statistical analyses of tabular data, without the need to know what’s going on behind the scenes.

GenSQL, a generative AI system for databases, could help users make predictions, detect anomalies, guess missing values, fix errors, or generate synthetic data with just a few keystrokes.

GenSQL combines a tabular dataset with a generative probabilistic AI model. Such models can account for uncertainty and adjust their decision-making based on new data.

GenSQL can also produce and analyze synthetic data that mimic the real data in a database—useful where sensitive data cannot be shared, such as patient health records, or when real data are sparse.
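To make the idea of pairing a table with a probabilistic model more concrete, here is a minimal Python sketch. It is not GenSQL's syntax or its modeling method, which the article does not describe. It assumes a hypothetical two-column table (age, salary) and a deliberately simple model, a normal distribution for age plus a linear-Gaussian model for salary given age. The function names are also made up for illustration. The same fitted model is used for three of the tasks mentioned above: guessing a missing value, scoring how unusual a row is, and generating synthetic rows.

```python
import math
import random
import statistics

# Hypothetical toy table of (age, salary in thousands); not data from the article.
rows = [(25, 40.0), (30, 48.0), (35, 55.0), (40, 63.0), (45, 70.0), (50, 79.0)]
ages = [a for a, _ in rows]
salaries = [s for _, s in rows]

# Fit the assumed model: age ~ Normal, salary | age ~ Normal(intercept + slope * age, sigma).
mean_age = statistics.fmean(ages)
mean_sal = statistics.fmean(salaries)
age_sigma = statistics.pstdev(ages)
slope = (sum((a - mean_age) * (s - mean_sal) for a, s in rows)
         / sum((a - mean_age) ** 2 for a in ages))
intercept = mean_sal - slope * mean_age
sigma = statistics.pstdev([s - (intercept + slope * a) for a, s in rows])


def predict_salary(age):
    """Guess a missing salary as the model's expected value for this age."""
    return intercept + slope * age


def log_density(age, salary):
    """Log of p(salary | age); very low values flag rows the model finds unusual."""
    z = (salary - predict_salary(age)) / sigma
    return -0.5 * z * z - math.log(sigma * math.sqrt(2 * math.pi))


def synthetic_row(rng):
    """Sample a new (age, salary) row from the fitted model."""
    age = rng.gauss(mean_age, age_sigma)
    return (age, rng.gauss(predict_salary(age), sigma))


if __name__ == "__main__":
    rng = random.Random(0)
    print("imputed salary at age 42:", round(predict_salary(42), 1))
    print("typical row score:", log_density(40, 63.0))
    print("suspicious row score:", log_density(40, 200.0))
    print("synthetic rows:", [synthetic_row(rng) for _ in range(3)])
```

A real system would use a much richer model over many columns. The point of the sketch is only that one probabilistic model can serve prediction, anomaly detection, and synthetic data generation at once.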

Extending SQL

This new tool is built on top of SQL, a programming language for database creation and manipulation that was introduced in the late 1970s and is used by millions of developers worldwide.

Compared with popular AI-based approaches for data analysis, GenSQL is faster and produces more accurate results, the researchers say. The models are also explainable, so users can read and edit them.

Next, the researchers want to apply GenSQL more broadly to conduct large-scale modeling of human populations. With GenSQL, they can generate synthetic data to draw inferences about things like health and salary while controlling what information is used in the analysis.

ChatGPT-like AI expert

In the long run, the researchers want to enable users to make natural language queries in GenSQL. Their goal is to develop a ChatGPT-like AI expert that one could talk to about any database and that grounds its answers in GenSQL queries.

The research was recently presented at the ACM Conference on Programming Language Design and Implementation. It is funded in part by the Defense Advanced Research Projects Agency (DARPA), Google, and the Siegel Family Foundation.

Citation: Mathieu Huot et al., 20 June 2024, Proceedings of the ACM on Programming Languages, Volume 8, Issue PLDI (open access)
