A new study shows that artificial intelligence (AI) agents, like ChatGPT, can create shared social rules just by interacting. Researchers from City St George’s, University of London, and the IT University of Copenhagen found that large language models (LLMs) don’t just follow instructions. When grouped together, these agents form their own linguistic norms, much as humans build community rules. The study appears in Science Advances.
The researchers used a method called the “naming game” to test how LLMs behave in groups. They paired agents at random and asked each to pick a “name,” such as a letter or a short string of characters, from a shared list. If both chose the same name, they received a reward; if not, they incurred a penalty and were shown each other’s choices. Agents remembered only their own recent interactions, not the behavior of the whole group. Over time, the population settled on a shared naming convention without any central control, much as human cultures form norms naturally.
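The paper runs this protocol with actual LLM agents exchanging prompts; the sketch below is only a simplified, non-LLM illustration of the same dynamic. The name pool, memory size, and choice heuristic (pick whichever name appears most often in an agent’s recent memory) are assumptions for illustration, not the study’s prompts or reward scheme.

```python
import random
from collections import Counter, deque

NAMES = ["A", "B", "C", "D", "E"]   # illustrative shared pool of candidate names
MEMORY_SIZE = 5                      # each agent recalls only its last few interactions
N_AGENTS = 24
N_ROUNDS = 5000

class Agent:
    """A simple stand-in for an LLM agent: it favors the name that has
    appeared most often in its own recent memory, breaking ties at random."""
    def __init__(self):
        self.memory = deque(maxlen=MEMORY_SIZE)  # (my_name, partner_name, success)

    def choose(self):
        if not self.memory:
            return random.choice(NAMES)
        counts = Counter()
        for mine, theirs, _ in self.memory:
            counts[mine] += 1
            counts[theirs] += 1
        best = max(counts.values())
        return random.choice([n for n, c in counts.items() if c == best])

    def record(self, mine, theirs, success):
        self.memory.append((mine, theirs, success))

def play(agents, rounds):
    for _ in range(rounds):
        a, b = random.sample(agents, 2)   # random pairing each round
        na, nb = a.choose(), b.choose()
        success = (na == nb)              # reward only when both pick the same name
        a.record(na, nb, success)         # even on failure, each sees the other's choice
        b.record(nb, na, success)
    return Counter(agent.choose() for agent in agents)

if __name__ == "__main__":
    population = [Agent() for _ in range(N_AGENTS)]
    print(play(population, N_ROUNDS))     # typically one name comes to dominate
```

Even with purely local memory and no coordinator, repeated random pairings like these tend to drive the whole population toward a single name, which is the qualitative effect the study reports for LLM agents.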
Collective Behaviors and Biases
The study revealed surprising group behaviors. Collective biases appeared that did not come from any individual agent; they emerged from the interactions alone, showing that group dynamics can create new patterns. This finding challenges current AI safety research, which often focuses on single models rather than groups. The researchers also found that small, committed minorities of agents could tip the whole population to a new naming convention, echoing how critical mass works in human societies.
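To make the critical-mass idea concrete, the variant below extends the earlier sketch: a fixed fraction of “committed” agents always insists on a new name while the rest have already settled on an old one. The specific fractions, memory size, and tipping behavior here are my assumptions for illustration; the paper demonstrates the effect with real LLM agents and does not necessarily follow this dynamic.

```python
import random
from collections import Counter, deque

NEW_NAME = "B"       # the name the committed minority pushes
OLD_NAME = "A"       # the convention the majority starts with
MEMORY_SIZE = 5

class Agent:
    def __init__(self, committed=False):
        self.committed = committed
        # Non-committed agents start out fully converged on the old name.
        self.memory = deque([(OLD_NAME, OLD_NAME, True)] * MEMORY_SIZE, maxlen=MEMORY_SIZE)

    def choose(self):
        if self.committed:
            return NEW_NAME                     # committed agents never budge
        counts = Counter()
        for mine, theirs, _ in self.memory:
            counts[mine] += 1
            counts[theirs] += 1
        best = max(counts.values())
        return random.choice([n for n, c in counts.items() if c == best])

    def record(self, mine, theirs, success):
        if not self.committed:
            self.memory.append((mine, theirs, success))

def tipped(n_agents=100, minority_fraction=0.25, rounds=20000):
    agents = [Agent(committed=(i < minority_fraction * n_agents)) for i in range(n_agents)]
    for _ in range(rounds):
        a, b = random.sample(agents, 2)
        na, nb = a.choose(), b.choose()
        ok = (na == nb)
        a.record(na, nb, ok)
        b.record(nb, na, ok)
    majority = Counter(ag.choose() for ag in agents if not ag.committed)
    return majority.most_common(1)[0][0] == NEW_NAME

if __name__ == "__main__":
    for frac in (0.05, 0.15, 0.25, 0.35):
        print(frac, tipped(minority_fraction=frac))  # convention flips past some threshold
```

In this toy setting, small committed fractions fail to dislodge the established name, while larger ones flip the whole population, which mirrors the critical-mass behavior the researchers describe.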
The experiments tested groups of 24 to 200 agents and used four different LLMs: Llama-2-70b-Chat, Llama-3-70B-Instruct, Llama-3.1-70B-Instruct, and Claude-3.5-Sonnet. The results stayed consistent across all models. As LLMs appear in more real-world settings, from social media to self-driving cars, the study highlights the need to understand how AI and human reasoning align or differ. The researchers see their work as a step toward safer AI systems, helping humans coexist with AI that negotiates and shapes shared behaviors, just as people do.