Move over Transformers, the Titans are coming!

2025-01-16
A new AI method could overcome some important limitations of the transformer architecture and help develop LLMs with long-term memory.

A new Artificial Intelligence (AI) paper by Google researchers, titled "Titans: Learning to Memorize at Test Time" and published on arXiv, is attracting attention. Some observers describe it as "the successor to the Transformer architecture," hoping that the new approach could overcome some important limitations of the transformer architecture introduced in the seminal 2017 paper "Attention Is All You Need," which sparked the current wave of AI advances.

There isn't yet much material besides the arXiv paper itself, but more detailed explanations are likely to come. Meanwhile, AI commentators are dissecting the paper on social media. Matthew Berman argues on X that "this is huge for AI." Transformers, the backbone of most AI today, "struggle with long-term memory due to quadratic memory complexity," he says. "Titans aims to solve this with massive scalability."
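To make the quadratic-complexity point concrete, here is a minimal sketch (illustrative only, not code from the paper or Berman's post): the attention score matrix compares every token with every other token, so it has shape (n, n) and quadruples in size each time the context doubles.

```python
# Illustrative sketch: the attention score matrix has shape (n, n),
# so its memory footprint quadruples every time the context doubles.
import numpy as np

def attention_scores(Q, K):
    """Scaled dot-product attention scores; Q and K have shape (n, d)."""
    d = Q.shape[-1]
    return (Q @ K.T) / d ** 0.5   # (n, n) -- the quadratic term

for n in (1024, 2048, 4096):
    Q = K = np.random.randn(n, 64).astype(np.float32)
    S = attention_scores(Q, K)
    print(n, round(S.nbytes / 1e6, 1), "MB")  # ~4x growth per doubling
```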

The core concept of the "Titans" AI architecture is to integrate short-term and long-term memory capabilities within a neural network, effectively addressing the limitations of existing models like Transformers and Recurrent Neural Networks (RNNs).
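As a rough illustration of how the two memories might be combined at read time (a hypothetical PyTorch sketch loosely inspired by the paper's description; the linear layer standing in for the long-term memory is our simplification): tokens recalled from long-term memory are prepended to the current window, and ordinary attention then runs over both.

```python
import torch
import torch.nn.functional as F

d = 32
long_term_memory = torch.nn.Linear(d, d, bias=False)  # toy stand-in for the
                                                      # paper's neural memory

def attend(query, context):
    """Plain scaled dot-product attention: the short-term memory."""
    scores = query @ context.T / d ** 0.5
    return F.softmax(scores, dim=-1) @ context

def titans_style_block(segment):
    """segment: (n, d). Recall from long-term memory, then attend over the
    recalled tokens together with the current segment."""
    with torch.no_grad():
        recalled = long_term_memory(segment)          # long-term recall
    context = torch.cat([recalled, segment], dim=0)   # memory as context
    return attend(segment, context)                   # short-term attention

out = titans_style_block(torch.randn(16, d))          # -> (16, d)
```

The paper also describes learnable "persistent memory" tokens added to the context; they are omitted here for brevity.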

Longer-term memory

Researcher Ali Behrouz, the first author of the Titans paper, has posted a long X thread to explain "Titans: a new architecture with attention and a meta in-context memory that learns how to memorize at test time."

Behrouz argues that Titans are more effective than Transformers and modern linear RNNs, and can scale effectively to larger context windows, with better performance than current large language models (LLMs).

Behrouz tries to explain things intuitively. Attention, he says, "performs as a short-term memory, meaning that we need a neural memory module with the ability to memorize long past to act as a long-term, more persistent, memory."
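What "memorizing at test time" means, as we read the paper: the long-term memory is itself a small network whose weights keep being updated while the model processes input, via a gradient step on an associative-recall loss, with momentum (the paper's "surprise" signal) and weight decay (forgetting). A toy sketch, with illustrative constants:

```python
import torch

d = 32
memory = torch.nn.Linear(d, d, bias=False)   # M: the long-term memory
momentum = torch.zeros_like(memory.weight)   # accumulated "surprise"
eta, theta, alpha = 0.9, 0.1, 0.01           # illustrative constants

def memorize(k, v):
    """One test-time update of the memory on a (key, value) pair."""
    global momentum
    loss = ((memory(k) - v) ** 2).sum()      # associative-recall loss
    (grad,) = torch.autograd.grad(loss, memory.weight)
    momentum = eta * momentum - theta * grad # surprising tokens push harder
    with torch.no_grad():
        memory.weight.mul_(1 - alpha)        # decay: gradually forget
        memory.weight.add_(momentum)         # write the update

# Keys and values would come from learned projections of the input stream.
for _ in range(100):
    memorize(torch.randn(d), torch.randn(d))
```

In the paper, the learning-rate, momentum, and decay terms are themselves data-dependent and learned; the fixed constants here are just for illustration.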

Titans use a memory mechanism that can retain and access information over very long sequences. Because the long-term memory is a fixed-size module updated as tokens arrive, rather than attention computed over the entire history, Titans scale more cheaply than Transformers as inputs grow longer. This could improve document summarization, long-term narrative understanding, and the ability to maintain context in dialogues over extended periods.
