Scientific exploration is one of the many domains where deep neural networks, particularly Transformers, are expected to achieve their next breakthrough, given the notable success they have already demonstrated in computer vision and language modeling. However, these networks still appear limited in their ability to perform logic tasks. Such tasks, which include traditional vision or language tasks whose input has a combinatorial nature, make representative data sampling challenging. This has motivated the machine learning community to focus heavily on reasoning tasks, including explicit tasks in the logical domain (such as arithmetic and algebra, the algorithmic CLRS benchmark, or LEGO) and implicit reasoning in other modalities (such as Pointer Value Retrieval or CLEVR for vision models, and LogiQA and GSM8K for language models).
Since these tasks remain difficult for standard Transformer architectures, it is natural to investigate whether they can be handled more efficiently with alternative methods, such as making better use of the Boolean nature of the task. Standard Transformer training tends to generalize in undesirable ways on such tasks, and the learned solutions are hard to interpret. This raises the question of how to improve the generalization and interpretability of these models. A team of researchers from Apple and EPFL has proposed an answer: the Boolformer, which they describe as the first Transformer-based neural network trained to solve problems in symbolic logic. The Boolformer can predict compact formulas for complex functions that were not seen during training, and it generalizes consistently to functions and data more complex than those seen during training.
The Boolformer predicts Boolean formulas, which can be viewed as symbolic expressions built from the three basic logic gates: AND, OR, and NOT. The model is trained on synthetically generated functions: the truth table of each function serves as the input, and its formula serves as the target. This setup, with full control over the data generation process, helps achieve both generalization and interpretability. The researchers from Apple and EPFL demonstrate the strong performance of this approach in both theoretical and real-world settings, and lay the foundation for future work in this area.
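To make the input/target pairing concrete, here is a minimal Python sketch of how a formula over AND, OR, and NOT gives rise to a truth table. The nested-tuple encoding and the names `evaluate` and `truth_table` are assumptions chosen for illustration; they are not taken from the Boolformer codebase.

```python
from itertools import product

# Hypothetical encoding: a formula is either an int (the index of an input
# variable) or a tuple such as ("and", a, b), ("or", a, b), or ("not", a).
def evaluate(formula, inputs):
    """Evaluate a formula tree on one assignment of the input variables."""
    if isinstance(formula, int):
        return inputs[formula]
    op, *args = formula
    if op == "not":
        return not evaluate(args[0], inputs)
    if op == "and":
        return evaluate(args[0], inputs) and evaluate(args[1], inputs)
    if op == "or":
        return evaluate(args[0], inputs) or evaluate(args[1], inputs)
    raise ValueError(f"unknown operator: {op}")

def truth_table(formula, n_vars):
    """List every input assignment together with the formula's output."""
    return [(bits, evaluate(formula, bits))
            for bits in product([False, True], repeat=n_vars)]

# Target formula: (x0 AND x1) OR (NOT x2)
target = ("or", ("and", 0, 1), ("not", 2))

# In the setup described above, a table like this would play the role of the
# model input, and the formula itself would play the role of the target.
table = truth_table(target, n_vars=3)
for bits, out in table:
    print(bits, "->", out)
```

A table over n variables has 2^n rows, which is why the size of the input grows quickly with the number of variables.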
The research makes several contributions. By training Transformers on synthetic datasets to perform end-to-end symbolic regression of Boolean functions, the researchers show that the Boolformer can predict a compact formula when given the full truth table of an unseen function. They also demonstrate that the Boolformer can handle noisy and incomplete data by feeding it truth tables with flipped bits and irrelevant variables. In addition, they evaluate the Boolformer on various real-world binary classification tasks from the PMLB database and show that it is competitive with classic machine learning approaches such as Random Forests while providing interpretable predictions. Finally, they apply the model to the well-studied task of modeling gene regulatory networks (GRNs) in biology, where the Boolformer is competitive with current state-of-the-art approaches while running inference several times faster.
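As a rough illustration of what "flipped bits and irrelevant variables" could look like, the Python sketch below corrupts a clean truth table. The helper `corrupt`, its parameters, and the choice to flip only output bits are assumptions made for this example; the authors' actual noise model may differ.

```python
import random

def corrupt(table, flip_prob, n_irrelevant, seed=0):
    """Return a noisy copy of a truth table (hypothetical helper).

    table: list of (input_bits, output_bit) pairs.
    flip_prob: probability of flipping each output bit (assumed noise model).
    n_irrelevant: number of random input columns that do not affect the output.
    """
    rng = random.Random(seed)
    noisy = []
    for bits, out in table:
        # Append random columns that carry no information about the output.
        extra = tuple(rng.random() < 0.5 for _ in range(n_irrelevant))
        # Flip the output label with probability flip_prob.
        if rng.random() < flip_prob:
            out = not out
        noisy.append((bits + extra, out))
    return noisy

# Clean truth table of x0 AND x1.
clean = [((a, b), a and b) for a in (False, True) for b in (False, True)]

# Noisy version: 3 irrelevant variables, each output flipped with prob. 0.1.
noisy = corrupt(clean, flip_prob=0.1, n_irrelevant=3)
```

A model that is robust in this sense should still recover a formula that depends only on the first two variables.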
Their code and models are open source and publicly available on their GitHub. The team has made it easy for anyone who wants to contribute to get set up and start working. Do check out their work.
However, some constraints remain, and they point to new areas for research. First, the quadratic cost of self-attention limits the model’s effectiveness on high-dimensional functions and large datasets, capping the number of input points at one thousand. Second, because the logical functions in the training set did not explicitly include the XOR gate, the model is limited in the compactness of the formulas it predicts and in its ability to express complex formulas such as parity functions. This limitation stems from the simplification step used in the generation procedure, which required rewriting the XOR gate in terms of AND, OR, and NOT. Adapting the generation of simplified formulas to include XOR gates, as well as operators with higher parity, is left as future work by the research team. Third, the formulas predicted by the model are restricted to single-output functions and gates with a fan-out of one (multi-output functions are predicted independently, component-wise).
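The second and third limitations can be illustrated with a small Python sketch. It rewrites XOR using only AND, OR, and NOT, chains the rewrite to build a parity formula, and counts gates in the resulting tree. The tuple encoding, the particular rewrite `(a OR b) AND NOT (a AND b)`, and the helper names are assumptions for illustration, not the authors' generation procedure.

```python
from itertools import product

def xor_expanded(a, b):
    """XOR using only AND, OR, NOT: (a OR b) AND NOT (a AND b)."""
    return ("and", ("or", a, b), ("not", ("and", a, b)))

def parity_formula(n_vars):
    """Chain expanded XORs over variables 0..n_vars-1 (int leaves)."""
    formula = 0
    for i in range(1, n_vars):
        formula = xor_expanded(formula, i)
    return formula

def count_gates(formula):
    """Count gates in the tree; with fan-out one, shared parts are duplicated."""
    if isinstance(formula, int):
        return 0
    return 1 + sum(count_gates(arg) for arg in formula[1:])

def evaluate(formula, inputs):
    """Evaluate a formula tree on one assignment of the input variables."""
    if isinstance(formula, int):
        return inputs[formula]
    op, *args = formula
    if op == "not":
        return not evaluate(args[0], inputs)
    if op == "and":
        return evaluate(args[0], inputs) and evaluate(args[1], inputs)
    return evaluate(args[0], inputs) or evaluate(args[1], inputs)  # "or"

# Check that the expanded formula really computes parity on 4 variables.
p4 = parity_formula(4)
is_parity = all(evaluate(p4, bits) == (sum(bits) % 2 == 1)
                for bits in product([False, True], repeat=4))

# Gate counts for parity on 2..6 variables under this chained rewrite.
sizes = [count_gates(parity_formula(n)) for n in range(2, 7)]
print(is_parity, sizes)
```

Because each operand of an expanded XOR appears twice and the tree cannot share sub-formulas, the gate count in this sketch roughly doubles with every added variable, which gives some intuition for why parity-like functions are hard to express compactly without a native XOR gate.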
In conclusion, the Boolformer is a new breakthrough in machine learning that makes the field more accessible, efficient, and performant, while unlocking the potential of AI in newer domains and helping advance science and knowledge in the process.
Do not forget to check out the paper and their GitHub.