Why Language and Thought Shouldn’t Be Equated: A Cognitive Perspective on Large Language Models
Feb. 20, 2023.
Researchers at the University of Texas at Austin and the Massachusetts Institute of Technology (MIT) have proposed a new way of thinking about large language models (LLMs). In their paper “Dissociating language and thought in large language models: a cognitive perspective,” they argue that to understand the capabilities and limitations of LLMs, we must distinguish between “formal” and “functional” linguistic competence. Formal linguistic competence refers to the capacities required to produce and comprehend a given language, whereas functional linguistic competence concerns using language to accomplish things in the world.
The researchers examine two common fallacies about language and thought. The first is that an entity proficient in language must also be proficient in thought. The second is that a language model cannot be a good model of human language unless it fully captures the richness and sophistication of human thought. Both fallacies, the researchers argue, stem from the same misconception: equating language with thought.
LLMs excel at formal linguistic competence, mastering not only linguistic rules but also statistical regularities that rules cannot capture. They still have a long way to go, however, on functional linguistic competence. To approach human-level performance, LLMs require unrealistic amounts of data, and they lack the pre-existing biases that help humans learn quickly from sparse and noisy input. A promising research direction is to identify inductive biases that would let LLMs learn faster from less data, along with architectures that can encode those biases, as the toy sketch below illustrates.
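The paper itself is a conceptual argument and contains no code, but the role of an inductive bias is easy to demonstrate with a toy example. The sketch below is a minimal illustration in plain Python/NumPy; the linear data-generating rule, the sample size, and the noise level are assumptions chosen for the demo, not anything taken from the paper. It compares a flexible learner (a high-degree polynomial that can fit almost anything) against a learner whose built-in bias happens to match the underlying rule (a straight line), both trained on the same eight noisy points.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sparse, noisy training data from an assumed underlying linear rule y = 2x + 1.
# (The rule, sample size, and noise level are illustrative assumptions.)
x_train = rng.uniform(-1, 1, size=8)
y_train = 2 * x_train + 1 + rng.normal(scale=0.3, size=8)

# Held-out test points, including mild extrapolation beyond the training range.
x_test = np.linspace(-1.5, 1.5, 50)
y_test = 2 * x_test + 1

# Weakly biased learner: a degree-7 polynomial that can fit almost anything,
# including the noise in the eight training points.
flexible = np.polyfit(x_train, y_train, deg=7)

# Strongly biased learner: assumes the rule is linear (degree 1), a prior
# that happens to match the data-generating process.
biased = np.polyfit(x_train, y_train, deg=1)

def test_mse(coeffs):
    """Mean squared error of a fitted polynomial on the held-out points."""
    return float(np.mean((np.polyval(coeffs, x_test) - y_test) ** 2))

print(f"flexible (weak bias) test MSE:  {test_mse(flexible):.3f}")
print(f"linear (strong bias) test MSE:  {test_mse(biased):.3f}")
```

With these settings, the flexible model interpolates the training noise and its error blows up outside the training range, while the linear model stays close to the true rule. That is the sense in which the right prior can substitute for data, and it is the intuition behind asking what language-appropriate inductive biases might let LLMs learn from far less input.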