Grandmaster-Level Chess Without Search

Artificial Intelligence (AI) has been a significant player in the world of chess for decades, with systems like IBM’s Deep Blue making headlines in 1997 for defeating world champion Garry Kasparov. More recently, engines such as AlphaZero and Stockfish 16 have used machine learning techniques to improve their play.

Research in this area remains active, as a recent paper from Google DeepMind shows. The DeepMind researchers trained a 270-million-parameter transformer model with supervised learning on a dataset of 10 million chess games. Each game in the dataset was annotated with action-values provided by the Stockfish 16 engine, yielding approximately 15 billion data points.
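The jump from millions of games to billions of data points comes from annotating every legal move in every position. Here is a minimal sketch of that expansion on toy data. The helper names, the string-based positions, and the lookup tables standing in for a chess library and for Stockfish 16 are all assumptions made for illustration, not the paper's code.

```python
from typing import Callable, Iterable, List, Tuple

State = str    # stand-in for a board position
Action = str   # stand-in for a move
Example = Tuple[State, Action, float]


def build_action_value_dataset(
    positions: Iterable[State],
    legal_moves: Callable[[State], List[Action]],
    oracle_value: Callable[[State, Action], float],
) -> List[Example]:
    """Emit one (state, action, value) example per legal move in each position."""
    dataset: List[Example] = []
    for state in positions:
        for action in legal_moves(state):
            dataset.append((state, action, oracle_value(state, action)))
    return dataset


# Toy stand-ins (hypothetical): two positions with three legal moves in total.
TOY_MOVES = {"s0": ["a", "b"], "s1": ["c"]}
TOY_VALUES = {("s0", "a"): 0.6, ("s0", "b"): 0.4, ("s1", "c"): 0.5}

dataset = build_action_value_dataset(
    ["s0", "s1"],
    TOY_MOVES.__getitem__,
    lambda state, action: TOY_VALUES[(state, action)],
)
```

Two positions produce three training examples here; the same multiplication over every position and every legal move is what makes the annotated dataset so much larger than the game count.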

In chess, a player’s skill level is commonly measured using the Elo rating system. An average club player might have an Elo rating of around 1500, while a world champion’s rating is typically over 2800. A Lichess blitz Elo rating of 2895, the figure reported in the paper, indicates a very high level of skill, comparable to that of the top human players in the world.
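To get a feel for what such rating gaps mean, the standard Elo model gives the expected score of a player rated R_A against one rated R_B as 1 / (1 + 10^((R_B - R_A) / 400)). The short sketch below applies that formula; the function name is illustrative, and real rating pools (including online ones) may use refinements of this basic model.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score (win = 1, draw = 0.5, loss = 0) of player A against player B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Equal ratings give an even contest; a large gap is close to a sure win.
even_match = expected_score(1500, 1500)
model_vs_club_player = expected_score(2895, 1500)
```

Under this model, a 400-point advantage already corresponds to an expected score of about 0.91, so the gap between 1500 and 2895 leaves the weaker player with almost no chances.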

The model was able to achieve a Lichess blitz Elo rating of 2895 when playing against human opponents, and it was also successful in solving a series of challenging chess puzzles. Remarkably, these achievements were made without any domain-specific tweaks or explicit search algorithms.

Credit: Tesfu Assefa

In terms of performance, the model outperformed AlphaZero’s policy and value networks (without MCTS) and GPT-3.5-turbo-instruct. The researchers found that strong chess performance only arises at sufficient scale. They also conducted an extensive series of ablations of design choices and hyperparameters to validate their results.

The researchers concluded that it is possible to distill a good approximation of Stockfish 16 into a feed-forward neural network via standard supervised learning at sufficient scale. This work contributes to the growing body of literature showing that complex and sophisticated algorithms can be distilled into feed-forward transformers. This implies a paradigm shift away from viewing large transformers as mere statistical pattern recognizers to viewing them as a powerful technique for general algorithm approximation.

The paper also discusses the limitations of the model. While the largest model achieves very good performance, it does not completely close the gap to Stockfish 16. All scaling experiments point towards closing this gap eventually with a large enough model trained on enough data. However, the current results do not allow the researchers to claim that the gap can certainly be closed.

Another limitation discussed is that the predictors see the current state but not the complete game history. This leads to some fundamental technical limitations that cannot be overcome without small domain-specific heuristics or augmenting the training data and observable information.

Finally, when using a state-value predictor to construct a policy, the researchers consider all possible subsequent states that are reachable via legal actions. This requires having a transition model T(s, a), and may be considered a version of 1-step search. While the main point is that the predictors do not explicitly search over action sequences, the researchers limit the claim of ‘without search’ to their action-value policy and behavioral cloning policy.
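The difference between the two kinds of policy can be sketched in a few lines. This is an illustration on toy data, not the paper's implementation: the function names and lookup tables are hypothetical, and the sketch assumes the state value is given from the perspective of the side to move in that state, so the mover prefers the successor that is worst for the opponent.

```python
from typing import Callable, Dict, List, Tuple

State = str
Action = str


def action_value_policy(
    state: State,
    legal_actions: Callable[[State], List[Action]],
    q_value: Callable[[State, Action], float],
) -> Action:
    """Pick the move with the highest predicted Q(s, a); no transition model needed."""
    return max(legal_actions(state), key=lambda action: q_value(state, action))


def state_value_policy(
    state: State,
    legal_actions: Callable[[State], List[Action]],
    transition: Callable[[State, Action], State],
    state_value: Callable[[State], float],
) -> Action:
    """Expand each legal move with T(s, a), then score the resulting state.

    Assumption: state_value(s') is from the viewpoint of the player to move
    in s' (the opponent), so the mover picks the minimum.
    """
    return min(
        legal_actions(state),
        key=lambda action: state_value(transition(state, action)),
    )


# Toy stand-ins (hypothetical) for a chess library and trained predictors.
MOVES: Dict[State, List[Action]] = {"s0": ["a", "b"]}
NEXT: Dict[Tuple[State, Action], State] = {("s0", "a"): "sa", ("s0", "b"): "sb"}
Q: Dict[Tuple[State, Action], float] = {("s0", "a"): 0.7, ("s0", "b"): 0.3}
V: Dict[State, float] = {"sa": 0.3, "sb": 0.7}

best_by_q = action_value_policy("s0", MOVES.__getitem__, lambda s, a: Q[(s, a)])
best_by_v = state_value_policy(
    "s0", MOVES.__getitem__, lambda s, a: NEXT[(s, a)], V.__getitem__
)
```

Only the second function calls the transition model, which is why the state-value policy can be read as a one-step look-ahead while the action-value policy cannot.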

In conclusion, the paper presents a significant advancement in the field of AI and chess, demonstrating that a complex, search-based algorithm, such as Stockfish 16, can be well approximated with a feed-forward neural network via standard supervised learning. This has implications for the broader field of AI, suggesting that complex and sophisticated algorithms can be distilled into feed-forward transformers, leading to a paradigm shift in how we view and utilize large transformers.

Let us know your thoughts! Sign up for a Mindplex account now, join our Telegram, or follow us on Twitter

Breaking Ground in 3D Modeling: Unveiling 3D-GPT

Researchers from the Australian National University, University of Oxford, and Beijing Academy of Artificial Intelligence have collaboratively developed a groundbreaking framework known as 3D-GPT for instruction-driven 3D modeling.

The framework leverages large language models (LLMs) to dissect procedural 3D modeling tasks into manageable segments and appoints the appropriate agent for each task.

The paper begins by highlighting the increasing use of generative AI systems in various fields such as medicine, news, politics, and social interaction. These systems are becoming more widespread and are used to create content across different formats. However, as these technologies become more prevalent and integrated into various applications, concerns arise regarding public safety. Consequently, evaluating the potential risks posed by generative AI systems is becoming a priority for AI developers, policymakers, regulators, and civil society.

To address this issue, the researchers introduce 3D-GPT, a framework that utilizes large language models (LLMs) for instruction-driven 3D modeling. The framework positions LLMs as proficient problem solvers that can break down the procedural 3D modeling tasks into accessible segments and appoint the apt agent for each task.

The 3D-GPT framework integrates three core agents: the task dispatch agent, the conceptualization agent, and the modeling agent. They work together to achieve two main objectives. First, they enhance initial scene descriptions by evolving them into detailed forms while dynamically adapting the text based on subsequent instructions. Second, they integrate procedural generation by extracting parameter values from enriched text to effortlessly interface with 3D software for asset creation.

The task dispatch agent identifies the functions required for each instructional input. For instance, when presented with an instruction such as “translate the scene into a winter setting”, it pinpoints functions like add_snow_layer() and update_trees(). This selection step enables efficient task coordination between the conceptualization and modeling agents. From a safety perspective, the task dispatch agent ensures that only appropriate and safe functions are selected for execution, thereby mitigating potential risks associated with the deployment of generative AI systems.
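A dispatch step of this kind can be sketched as a prompt that lists the available functions, followed by a filter on the model's reply. Everything below is an assumption made for illustration: the function descriptions, the add_clouds entry, the prompt wording, and the stub that stands in for a real LLM call are not taken from the paper.

```python
from typing import Callable, Dict, List

# Hypothetical function library with one-line descriptions.
FUNCTION_DOCS: Dict[str, str] = {
    "add_snow_layer": "Cover terrain and objects with snow.",
    "update_trees": "Change tree type, foliage, and season.",
    "add_clouds": "Add clouds to the sky.",
}


def build_dispatch_prompt(instruction: str, docs: Dict[str, str]) -> str:
    """List the known functions and ask which ones the instruction needs."""
    listing = "\n".join(f"- {name}: {doc}" for name, doc in docs.items())
    return (
        f"Available functions:\n{listing}\n\n"
        f"Instruction: {instruction}\n"
        "Reply with a comma-separated list of function names."
    )


def dispatch(
    instruction: str, docs: Dict[str, str], llm: Callable[[str], str]
) -> List[str]:
    """Return the requested functions, keeping only names that exist in the library."""
    reply = llm(build_dispatch_prompt(instruction, docs))
    requested = [name.strip() for name in reply.split(",")]
    return [name for name in requested if name in docs]


def stub_llm(prompt: str) -> str:
    """Stand-in for a real model; it also returns one unknown function name."""
    return "add_snow_layer, update_trees, delete_everything"


selected = dispatch(
    "translate the scene into a winter setting", FUNCTION_DOCS, stub_llm
)
```

Filtering the reply against the known library is one simple way to make sure that only functions that actually exist reach the later agents.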

The conceptualization agent enriches the user-provided text description into detailed appearance descriptions. After the task dispatch agent selects the required functions, the user input text and the corresponding function-specific information are sent to the conceptualization agent, which returns augmented text. In terms of safety, the conceptualization agent plays a vital role in ensuring that the enriched text descriptions accurately represent the user’s instructions, thereby preventing potential misinterpretations or misuse of the 3D modeling functions.

Credit: Tesfu Assefa

The modeling agent deduces the parameters for each selected function and generates Python code scripts to invoke Blender’s API. The generated Python code script interfaces with Blender’s API for 3D content creation and rendering. Regarding safety, the modeling agent ensures that the inferred parameters and the generated Python code scripts are safe and appropriate for the selected functions. This process helps to avoid potential safety issues that could arise from incorrect parameter values or inappropriate function calls.
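The last step, turning inferred parameters into a script, might look roughly like the sketch below. It is a simplified illustration: the parameter names and ranges are hypothetical, the parameter values are hard-coded where a real system would ask an LLM to infer them, and the generated line calls the placeholder function from the article rather than any actual Blender API call.

```python
from typing import Dict, Tuple

# Hypothetical parameter specification: allowed (min, max) range per parameter.
PARAM_SPECS: Dict[str, Dict[str, Tuple[float, float]]] = {
    "add_snow_layer": {"depth": (0.0, 1.0), "coverage": (0.0, 1.0)},
}


def validate(function: str, params: Dict[str, float]) -> None:
    """Reject unknown functions, unknown parameters, and out-of-range values."""
    if function not in PARAM_SPECS:
        raise ValueError(f"unknown function: {function}")
    spec = PARAM_SPECS[function]
    for name, value in params.items():
        if name not in spec:
            raise ValueError(f"unknown parameter: {name}")
        low, high = spec[name]
        if not low <= value <= high:
            raise ValueError(f"{name}={value} outside [{low}, {high}]")


def render_call(function: str, params: Dict[str, float]) -> str:
    """Validate the parameters, then emit one line of Python source."""
    validate(function, params)
    args = ", ".join(f"{name}={value!r}" for name, value in sorted(params.items()))
    return f"{function}({args})"


# Values a model might infer from "a thick blanket of snow" (hard-coded here).
script_line = render_call("add_snow_layer", {"depth": 0.3, "coverage": 0.9})
```

Checking values against a declared range before emitting code is one way to catch the incorrect parameter values and inappropriate function calls mentioned above.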

The researchers conducted several experiments to showcase the proficiency of 3D-GPT in consistently generating results that align with user instructions. They also conducted an ablation study to systematically examine the contributions of each agent within their multi-agent system.

Despite its promising results, the framework has several limitations. These include limited curve control and shading design, dependence on procedural generation algorithms, and challenges in processing multi-modal instructions. Future research directions include LLM 3D fine-tuning, autonomous rule discovery, and multi-modal instruction processing.

In summary, the research paper introduces a novel framework that holds promise in enhancing human-AI communication in the context of 3D design and delivering high-quality results.
