Easier way to analyze complex tabular data
MIT researchers have developed a new tool that makes it easier for database users to perform complicated statistical analyses of tabular data, without the need to know what’s going on behind the scenes.
GenSQL, a generative AI system for databases, could help users make predictions, detect anomalies, guess missing values, fix errors, or generate synthetic data with just a few keystrokes.
GenSQL combines a tabular dataset with a generative probabilistic AI model, which can account for uncertainty and adjust its decision-making based on new data.
GenSQL can also produce and analyze synthetic data that mimic the real data in a database. This is useful when sensitive data, such as patient health records, cannot be shared, or when real data are sparse.
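To make the idea of pairing a table with a probabilistic model concrete, here is a minimal Python sketch. It is not GenSQL and does not use GenSQL's syntax or models. It assumes a toy two-column table of made-up age and salary values and a simple bivariate Gaussian, and all function names are illustrative. Even a model this small can score unusual rows, predict a missing value along with its uncertainty, and generate synthetic rows.

```python
import math
import random


def fit_gaussian(rows):
    """Fit a bivariate Gaussian to (x, y) rows using sample moments."""
    n = len(rows)
    mx = sum(x for x, _ in rows) / n
    my = sum(y for _, y in rows) / n
    vxx = sum((x - mx) ** 2 for x, _ in rows) / (n - 1)
    vyy = sum((y - my) ** 2 for _, y in rows) / (n - 1)
    vxy = sum((x - mx) * (y - my) for x, y in rows) / (n - 1)
    return {"mx": mx, "my": my, "vxx": vxx, "vyy": vyy, "vxy": vxy}


def log_density(m, x, y):
    """Log probability density of a row; low values flag possible anomalies."""
    det = m["vxx"] * m["vyy"] - m["vxy"] ** 2
    dx, dy = x - m["mx"], y - m["my"]
    quad = (m["vyy"] * dx * dx - 2 * m["vxy"] * dx * dy + m["vxx"] * dy * dy) / det
    return -0.5 * quad - math.log(2 * math.pi) - 0.5 * math.log(det)


def predict_y(m, x):
    """Conditional mean and variance of y given x (fills in a missing y)."""
    mean = m["my"] + (m["vxy"] / m["vxx"]) * (x - m["mx"])
    var = m["vyy"] - m["vxy"] ** 2 / m["vxx"]
    return mean, var


def sample(m, n, rng):
    """Draw n synthetic rows: x from its marginal, then y given x."""
    rows = []
    for _ in range(n):
        x = rng.gauss(m["mx"], math.sqrt(m["vxx"]))
        mean, var = predict_y(m, x)
        rows.append((x, rng.gauss(mean, math.sqrt(var))))
    return rows


# Hypothetical table: (age in years, salary in thousands). Not real data.
table = [(25, 40), (30, 52), (35, 58), (40, 71), (45, 77), (50, 92)]
model = fit_gaussian(table)

print(log_density(model, 40, 71))   # a typical row
print(log_density(model, 25, 92))   # an unusual row scores much lower
print(predict_y(model, 42))         # predicted salary and its variance
print(sample(model, 3, random.Random(0)))  # three synthetic rows
```

The variance returned by predict_y is what lets such a model report how uncertain a prediction is, rather than giving only a single number.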
Extending SQL
This new tool is built on top of SQL, a programming language for database creation and manipulation that was introduced in the late 1970s and is used by millions of developers worldwide.
Compared to popular AI-based approaches for data analysis, GenSQL is faster and produces more accurate results, the researchers say. The probabilistic models it uses are also explainable, so users can read and edit them.
Next, the researchers want to apply GenSQL more broadly to conduct large-scale modeling of human populations. With GenSQL, they can generate synthetic data to draw inferences about things like health and salary while controlling what information is used in the analysis.
ChatGPT-like AI expert
In the long run, the researchers want to enable users to make natural language queries in GenSQL. Their goal: develop a ChatGPT-like AI expert one could talk to about any database, which grounds its answers using GenSQL queries.
The research was recently presented at the ACM Conference on Programming Language Design and Implementation. It is funded in part by the Defense Advanced Research Projects Agency (DARPA), Google, and the Siegel Family Foundation.
Citation: Mathieu Huot et al., 20 June 2024, Proceedings of the ACM on Programming Languages, Volume 8, Issue PLDI, https://doi.org/10.1145/3656409 (open access)