Japanese AI scientists release a fully open LLM

2025-01-27
2 min read.
AI scientists in Japan have developed a new LLM with about 172 billion parameters and have released both the model's parameters and its training data to the public.

Artificial intelligence (AI) scientists at the National Institute of Informatics (NII) in Japan have developed a new large language model (LLM) with about 172 billion parameters. The new LLM goes by the name "llm-jp-3-172b-instruct3."

The scientists trained the model on 2.1 trillion tokens. The NII has shared both the parameters and the training data of the model with the public.

The NII scientists tested the model against others, such as GPT-3.5, on two benchmarks: "llm-jp-eval" for Japanese language skills and "llm-leaderboard" for general language understanding. The model outperformed GPT-3.5 on both. It was built on the mdx computing platform with help from ABCI, a supercomputer system.

For training, the NII scientists used texts in Japanese, English, Chinese, and Korean, as well as program code. They sourced Japanese texts from Common Crawl, the National Diet Library, Wikipedia, and research project summaries, among other sources. The model uses the LLaMA-2 architecture.
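The article names the architecture but not the exact hyperparameters. As a rough illustration only, a LLaMA-2-style decoder at this scale might be parameterized as follows with the Hugging Face transformers library; every dimension below is a hypothetical placeholder, not the published llm-jp-3 configuration.

```python
from transformers import LlamaConfig, LlamaForCausalLM

# Hypothetical LLaMA-2-style configuration at roughly 172B scale.
# The real hyperparameters are in the model's published config file;
# every value here is an illustrative placeholder.
config = LlamaConfig(
    vocab_size=99_000,             # placeholder tokenizer size
    hidden_size=12_288,            # placeholder embedding width
    intermediate_size=38_464,      # placeholder feed-forward width
    num_hidden_layers=96,          # placeholder depth
    num_attention_heads=96,        # placeholder head count
    max_position_embeddings=4_096, # placeholder context length
)

# Instantiating at full size needs hundreds of gigabytes of memory,
# so this line is shown for completeness rather than meant to be run:
# model = LlamaForCausalLM(config)
```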

The NII scientists tuned the model with 13 types of Japanese instruction data, along with English translations of some of them. The tuned model performed well, scoring higher than GPT-3.5 in both evaluations. However, the NII team admits that ensuring completely safe responses is challenging and that it has done what it can with current technology. In a safety test of 181 items, the model mostly passed, but seven responses did not meet the safety standards.

Accelerating innovation in AI

The model's data and tools are available online; see the website of the LLM-jp consortium and the llm-jp-3-172b-instruct3 repository on Hugging Face.
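For readers who want to try the checkpoint, here is a minimal sketch of loading it with the Hugging Face transformers library. It assumes the repository id is "llm-jp/llm-jp-3-172b-instruct3" and that the instruction-tuned model ships a chat template; at 172 billion parameters, actually running it requires several high-memory GPUs.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repository id, inferred from the model name in the article.
model_id = "llm-jp/llm-jp-3-172b-instruct3"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # halve memory versus float32
    device_map="auto",           # shard the weights across available GPUs
)

# Instruction-tuned chat models typically define a chat template.
messages = [{"role": "user", "content": "自然言語処理とは何ですか?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```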

The NII team plans to keep developing these models to make them transparent and reliable, focusing on safety as they grow. The team has also kept intermediate training data, which it may share later. This work is part of a broader effort by NII to advance language model research in Japan.

This development by NII pushes forward the boundaries of AI language understanding, particularly in Japanese. By making the model's parameters and training data public, NII not only fosters transparency but also empowers researchers worldwide to further refine and study language models. This openness can accelerate innovation in AI.

#NeuralNetworks


