back Back

Move over Transformers, the Titans are coming!

Jan. 16, 2025.
2 mins. read. 1 Interactions

A new AI method could overcome some important limitations of the transformer architecture and help develop LLMs with long term memory.

About the Writer

Giulio Prisco

122.33785 MPXR

Giulio Prisco is Senior Editor at Mindplex. He is a science and technology writer mainly interested in fundamental science and space, cybernetics and AI, IT, VR, bio/nano, crypto technologies.

A new Artificial Intelligence (AI) paper by Google researchers, titled “Titans: Learning to Memorize at Test Time” and published in arXiv, is picking up attention. Some observers are describing this as “the successor to the Transformer architecture.” They are expressing hopes that the new approach could overcome some important limitations of the transformer architecture introduced in the seminal 2017 paper “Attention Is All You Need,” which sparked the current wave of AI advances.

There isn’t yet much besides the arXiv paper itself, but more explanations are likely to come. Meanwhile, AI commentators are dissecting and analyzing the paper on social media. Matthew Berman argues on X that “this is huge for AI.” Transformers, the backbone of most AI today, “struggle with long-term memory due to quadratic memory complexity,” he says. “Titans aims to solve this with massive scalability.”

The core concept of the “Titans” AI architecture is to integrate short-term and long-term memory capabilities within a neural network, effectively addressing the limitations of existing models like Transformers and Recurrent Neural Networks (RNNs).

Longer term memory

Researcher Ali Behrouz, the first author of the Titans paper, has posted a long X thread to explain “Titans: a new architecture with attention and a meta in-context memory that learns how to memorize at test time.”

Behrouz argues that Titans are more effective than Transformers and modern linear RNNs, and can effectively scale to larger context window, with better performance than current large language models (LLMs).

Behrouz tries to explain things intuitively. Attention, he says, “performs as a short-term memory, meaning that we need a neural memory module with the ability to memorize long past to act as a long-term, more persistent, memory.”

Titans uses a memory mechanism that is able to retain and access information over very long sequences. Therefore, Titans is less computationally expensive than Transformers with longer inputs. Titans could improve document summarization, long-term narrative understanding, or the ability to maintain context in dialogues over extended periods.

Let us know your thoughts! Sign up for a Mindplex account now, join our Telegram, or follow us on Twitter

Comment on this article

0 Comments

0 thoughts on “Move over Transformers, the Titans are coming!

1

Like

Dislike

Share

Comments
Reactions
💯 💘 😍 🎉 👏
🟨 😴 😡 🤮 💩

Here is where you pick your favorite article of the month. An article that collected the highest number of picks is dubbed "People's Choice". Our editors have their pick, and so do you. Read some of our other articles before you decide and click this button; you can only select one article every month.

People's Choice
Bookmarks