Anthropic will include user interactions in Claude's training data

2025-09-15
5 min read.
By including interactions with users in training data, AI models could gain a clearer statistical picture of how people actually interact with them, a better ability to resolve imprecise phrasing, and coverage of shifting language patterns.
Credit: Tesfu Assefa

Anthropic, the developer of the Claude artificial intelligence (AI) chatbot, announced a significant policy shift on August 28, 2025, requiring users to decide by September 28, 2025, whether their conversations and coding sessions can be used to train future Claude models.

Previously, Anthropic did not use consumer chat data for training, automatically deleting prompts and outputs within 30 days unless it was legally required to keep them or they were flagged for policy violations, in which case retention could extend up to two years.

The new policy extends data retention to five years for users who opt in, applying to new or resumed chats and coding sessions on Claude Free, Pro, Max, and Claude Code accounts. Enterprise users (Claude Gov, Claude for Work, Claude for Education, or API users via Amazon Bedrock or Google Cloud’s Vertex AI) are exempt.

Users can opt out during signup (new users) or via a pop-up (existing users), with the option to toggle off data sharing. Changes apply only to future data, and Anthropic emphasizes using tools to filter sensitive information and not selling data to third parties. The company frames this as a way to enhance model safety, coding, and reasoning, but critics note it aligns with competitive pressures to leverage real-world data against rivals like OpenAI and Google. Users who fail to decide by the deadline must select a preference to continue using Claude.
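Anthropic has not published the mechanics of how the opt-in choice gates training data, but the policy as described above can be pictured as a simple eligibility check. The sketch below is purely illustrative; the field names, tier labels, and retention window are assumptions drawn from this article, not Anthropic's actual schema.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical five-year retention window for opted-in users, as described
# in the policy above. Field names and tier labels are illustrative only.
RETENTION_OPTED_IN = timedelta(days=5 * 365)

def eligible_for_training(conversation: dict, now: datetime) -> bool:
    """Return True if a conversation may enter the training corpus."""
    if conversation.get("plan") in {"gov", "work", "education", "api"}:
        return False                    # enterprise and API tiers are exempt
    if not conversation.get("training_opt_in", False):
        return False                    # only opted-in users contribute data
    age = now - conversation["started_at"]
    return age <= RETENTION_OPTED_IN    # within the five-year window

# Example: an opted-in Pro user's chat from last week qualifies.
chat = {
    "plan": "pro",
    "training_opt_in": True,
    "started_at": datetime.now(timezone.utc) - timedelta(days=7),
}
print(eligible_for_training(chat, datetime.now(timezone.utc)))  # True
```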

Reactions

OpenAI, a key competitor, has faced similar scrutiny over data retention, particularly due to a court order requiring indefinite retention of ChatGPT conversations amid a lawsuit by The New York Times and other publishers. OpenAI’s COO, Brad Lightcap, criticized this as conflicting with user privacy commitments. Anthropic’s move is seen as mirroring industry trends, as companies like OpenAI and Google increasingly rely on user data to refine models. However, OpenAI’s enterprise customers, like Anthropic’s, are protected from such policies, highlighting a common strategy to prioritize business clients.

TechCrunch noted that Anthropic’s shift reflects a broader industry need for vast conversational data to stay competitive, suggesting it’s less about user benefit and more about catching up with rivals.

Media outlets like The Verge criticized Anthropic’s opt-out design, noting that the pop-up’s prominent “Accept” button and smaller, default-on toggle risk users unknowingly consenting. Privacy advocates argued the policy lacks transparency and raises consent concerns, especially given Claude’s use for sensitive discussions (e.g., mental health).

Unlocking AI potential through user interactions

Anthropic’s decision to incorporate user chats into Claude’s training data could be a transformative step that unlocks Claude's potential, enabling the model to evolve dynamically through real-world interactions. By leveraging the rich, diverse dataset of user conversations, Claude could achieve significant improvements in performance, safety, and contextual understanding.

Claude’s users generate a vast and diverse array of conversations spanning coding, analysis, reasoning, and everyday queries. This diversity is critical for training models that excel in real-world applications. For instance, including user coding sessions on Claude Code in the model's training data could refine its ability to generate accurate, contextually relevant code.
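Anthropic has not said how opted-in conversations would be prepared for training, but one common approach is to pair each user turn with the assistant reply that follows it and use those pairs as supervised examples. The sketch below is a minimal, hypothetical illustration of that idea; the record layout is an assumption, not Anthropic's internal format.

```python
# Illustrative only: turning a chat or coding transcript into
# prompt/completion pairs for supervised fine-tuning.
def transcript_to_examples(transcript: list[dict]) -> list[dict]:
    """Pair each user turn with the assistant reply that follows it."""
    examples = []
    for i in range(len(transcript) - 1):
        turn, reply = transcript[i], transcript[i + 1]
        if turn["role"] == "user" and reply["role"] == "assistant":
            examples.append({"prompt": turn["content"],
                             "completion": reply["content"]})
    return examples

coding_session = [
    {"role": "user", "content": "Write a function that reverses a string."},
    {"role": "assistant", "content": "def reverse(s):\n    return s[::-1]"},
]
print(transcript_to_examples(coding_session))
```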

Anthropic emphasizes that such data will enhance skills like reasoning and analysis, making Claude more adept at handling complex tasks. Unlike controlled training environments, real-world data captures authentic language patterns, cultural nuances, and user intent, enabling Claude to better understand and respond to diverse queries. This aligns with industry trends, as competitors like OpenAI and Google rely on similar data to stay ahead, underscoring its necessity for cutting-edge AI.

Anthropic argues that, by integrating user chats, Claude gains a dynamic learning mechanism. This “unlocks” its potential to become more intelligent, safer, and contextually aware, fostering a virtuous cycle where user interactions drive innovation, and improved models enhance user experience.

Credit: Tesfu Assefa

Privacy and safety concerns

Extended data retention (five years for opted-in users) allows Anthropic to track long-term usage patterns, improving model safety and robustness. By analyzing conversations over time, Claude can better detect harmful patterns, such as spam or abuse.

This is particularly crucial given recent incidents, like a hacker using Claude Code for cyberattacks, highlighting the need for proactive safety measures. Training on user data enables Claude to refine its ability to identify and mitigate misuse, such as refusing harmful requests (e.g., generating malicious code or extremist content). Anthropic’s commitment to filtering sensitive data using automated tools is meant to promote responsible use, balancing innovation with privacy. This long-term data approach also supports smoother model upgrades.
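Anthropic has not detailed how its filtering tools work. As a rough illustration of the idea, the sketch below redacts a few obvious categories of sensitive data before a conversation would enter a training set; real systems rely on trained classifiers with far broader coverage, and these patterns are assumptions for the example only.

```python
import re

# Minimal, illustrative sensitive-data filter applied before a conversation
# is added to a training set. Regex coverage here is deliberately narrow.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def redact(text: str) -> str:
    """Replace matched sensitive spans with typed placeholders."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label.upper()}]", text)
    return text

print(redact("Reach me at jane.doe@example.com or +1 415 555 0100."))
# Reach me at [EMAIL] or [PHONE].
```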

Anthropic argues that user-driven training democratizes AI improvement, allowing Claude to evolve in alignment with user needs. By opting in, users contribute to a feedback loop where their interactions directly shape Claude’s capabilities, creating a collaborative ecosystem. Anthropic frames this as “user-powered safety,” where collective data enhances moderation and comprehension. For example, users discussing sensitive topics like mental health can help Claude develop guardrails, similar to OpenAI’s recent addition of mental health-focused responses in ChatGPT. This participatory model ensures Claude remains relevant and user-centric, adapting to real-world demands rather than static training sets.

Critics argue that the opt-out design risks uninformed consent and privacy breaches, especially for sensitive data. Anthropic aims to mitigate this with a transparent opt-out process and a stated commitment not to sell data or to use past chats unless a conversation is resumed.

The default-on toggle, though controversial, is a pattern often used in tech interfaces to encourage participation, and users can reverse their decision at any time. Moreover, the risk of data reproduction is a broader industry challenge, not unique to Anthropic, and its filtering tools aim to minimize this. The competitive pressure to match OpenAI and Google necessitates this shift, as Anthropic’s prior stance limited its data pool, potentially stunting Claude’s growth.

Other AI operators could follow

It seems likely that Anthropic's move could soon be imitated by other AI operators. By including interactions with users in training data, AI models could gain a clearer statistical picture of how people actually interact with them, a better ability to resolve imprecise phrasing, and coverage of shifting language patterns. As a user, I would like AI models to internalize their interactions with me, and I would like to think that I'm making small contributions to their training.

#LargeLanguageModels(LLMs)

#Learning


