
Grandmaster-Level Chess Without Search

Apr. 11, 2024.

DeepMind's transformer model approaches the strength of Stockfish 16 in chess, achieving a Lichess blitz Elo rating of 2895 without explicit search.

About the Writer

Wendwossen Dufera

Wendwossen is a young tech enthusiast with a vision for AI and blockchain to drive growth in less developed nations, ensuring global inclusivity for the impending singularity. He is committed to bridging technological gaps and fostering equal opportunities worldwide.


Credit: Tesfu Assefa

Artificial Intelligence (AI) has been a significant player in the world of chess for decades, with IBM’s Deep Blue making headlines in 1997 for defeating world champion Garry Kasparov. More recently, AI advancements have led to systems like AlphaZero and Stockfish 16, which combine search with neural-network evaluation to improve their gameplay.

Research in the area remains active, as shown by a recent paper from Google DeepMind. The DeepMind researchers trained a transformer model with 270 million parameters using supervised learning on a dataset of 10 million chess games. Each board position in these games was annotated with action-values provided by the powerful Stockfish 16 engine, yielding approximately 15 billion data points.
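As a rough back-of-the-envelope check on the scale of this dataset, the two rounded figures quoted above can be divided to estimate how many annotated (position, move) pairs an average game contributes. This is only a sketch using the article's rounded numbers; the constant and function names are illustrative, and the paper's exact counts may differ.

```python
# Rounded figures quoted in the article; the exact counts in the paper may differ.
NUM_GAMES = 10_000_000
NUM_DATA_POINTS = 15_000_000_000


def data_points_per_game(num_data_points: int, num_games: int) -> float:
    """Average number of annotated (position, move) pairs per game."""
    return num_data_points / num_games


print(data_points_per_game(NUM_DATA_POINTS, NUM_GAMES))  # 1500.0
```

In other words, every game yields on the order of 1,500 training examples, because each position is annotated once per legal move rather than once per game.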

In the world of chess, a player’s skill level is often measured using the Elo rating system. An average club player might have an Elo rating of around 1500, while a world champion’s rating is typically over 2800. A Lichess blitz rating of 2895, the figure reported in the paper, indicates a very high level of skill, in the range of the strongest human players, although online Lichess ratings are not directly comparable to official over-the-board ratings.
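To give a feel for what a rating gap of this size means, here is a minimal sketch of the classical Elo expected-score formula. It is illustrative only: Lichess itself uses the Glicko-2 rating system rather than plain Elo, and the function name and example ratings are assumptions chosen to match the numbers in the text.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score (between 0 and 1) of player A against player B
    under the classical Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Illustrative ratings taken from the text above.
club_player = 1500
model = 2895

# Two equally rated players are each expected to score half a point per game.
print(expected_score(club_player, club_player))  # 0.5

# A gap of nearly 1400 points puts the expected score very close to 1.
print(expected_score(model, club_player))
```

Under this model, every 400-point gap multiplies the stronger player's odds by ten, which is why a 2895-rated player would be expected to win almost every game against a 1500-rated one.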

The model was able to achieve a Lichess blitz Elo rating of 2895 when playing against human opponents, and it was also successful in solving a series of challenging chess puzzles. Remarkably, these achievements were made without any domain-specific tweaks or explicit search algorithms.

In terms of performance, the model outperformed AlphaZero’s policy and value networks (without MCTS) and GPT-3.5-turbo-instruct. The researchers found that strong chess performance only arises at sufficient scale. They also conducted an extensive series of ablations of design choices and hyperparameters to validate their results.

The researchers concluded that it is possible to distill a good approximation of Stockfish 16 into a feed-forward neural network via standard supervised learning at sufficient scale. This work contributes to the growing body of literature showing that complex and sophisticated algorithms can be distilled into feed-forward transformers. This implies a paradigm shift away from viewing large transformers as mere statistical pattern recognizers to viewing them as a powerful technique for general algorithm approximation.

The paper also discusses the limitations of the model. While the largest model achieves very good performance, it does not completely close the gap to Stockfish 16. All scaling experiments point towards closing this gap eventually with a large enough model trained on enough data. However, the current results do not allow the researchers to claim that the gap can certainly be closed.

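Another limitation discussed is that the predictors see the current state but not the complete game history. This leads to some fundamental technical limitations that cannot be overcome without small domain-specific heuristics or augmenting the training data and observable information. For example, a draw by threefold repetition depends on positions that occurred earlier in the game, so it cannot be detected from the current board alone.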

Finally, when using a state-value predictor to construct a policy, the researchers consider all possible subsequent states that are reachable via legal actions. This requires having a transition model T(s, a), and may be considered a version of 1-step search. While the main point is that the predictors do not explicitly search over action sequences, the researchers limit the claim of ‘without search’ to their action-value policy and behavioral cloning policy.
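The difference between an action-value policy and a state-value policy can be sketched on a toy game. Everything below is hypothetical and is not the paper's code: chess is replaced by a pile of stones where each move removes 1 to 3 stones and whoever takes the last stone wins, and the two `toy_*` functions stand in for learned predictors.

```python
from typing import Callable, List


def legal_actions(state: int) -> List[int]:
    """Legal moves in the toy game: remove 1, 2 or 3 stones."""
    return [a for a in (1, 2, 3) if a <= state]


def transition(state: int, action: int) -> int:
    """Transition model T(s, a): the state reached after playing `action`."""
    return state - action


def toy_action_value(state: int, action: int) -> float:
    """Stand-in for a learned Q(s, a): win probability for the player to move."""
    return 1.0 if (state - action) % 4 == 0 else 0.0


def toy_state_value(state: int) -> float:
    """Stand-in for a learned V(s): win probability for the player to move."""
    return 0.0 if state % 4 == 0 else 1.0


def action_value_policy(state: int, q: Callable[[int, int], float]) -> int:
    # Scores each legal move directly from the current state.
    # No successor states are generated, so no transition model is needed.
    return max(legal_actions(state), key=lambda a: q(state, a))


def state_value_policy(state: int, v: Callable[[int], float]) -> int:
    # Needs T(s, a): generates every successor state, evaluates it from the
    # opponent's point of view, and picks the move that is worst for them.
    return min(legal_actions(state), key=lambda a: v(transition(state, a)))


print(action_value_policy(10, toy_action_value))  # 2
print(state_value_policy(10, toy_state_value))    # 2
```

Both policies pick the same move here, but only the state-value version has to step the game forward through `transition`, which is why it can be read as a form of 1-step search.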

In conclusion, the paper presents a significant advancement in the field of AI and chess, demonstrating that a complex, search-based algorithm, such as Stockfish 16, can be well approximated with a feed-forward neural network via standard supervised learning. This has implications for the broader field of AI, suggesting that complex and sophisticated algorithms can be distilled into feed-forward transformers, leading to a paradigm shift in how we view and utilize large transformers.


