# Anthropic Publishes Comprehensive New Constitution for Claude

> Coverage of lessw-blog

**Published:** January 21, 2026
**Author:** PSEEDR Editorial
**Category:** risk
**Content tier:** free
**Accessible for free:** true

**Word count:** 385

**Tags:** AI Safety, Anthropic, Constitutional AI, LLM Governance, Machine Learning, Open Source

**Canonical URL:** https://pseedr.com/risk/anthropic-publishes-comprehensive-new-constitution-for-claude

---

In a recent update, lessw-blog reports on Anthropic's release of a significantly expanded "constitution" for its AI model, Claude, marking a pivotal step in transparency regarding how Large Language Models (LLMs) are aligned with human values.

The **lessw-blog** post presents the new constitution as a significant evolution of "Constitutional AI," a method in which a model is trained to follow an explicit set of written principles rather than relying solely on reinforcement learning from human feedback (RLHF).

AI alignment work is often opaque: users and researchers rarely see the specific instructions that govern when a model refuses a prompt or what tone it adopts. That matters because, as models become more capable, ethical guardrails that operate as a "black box" become a safety risk in their own right. **lessw-blog** highlights that Anthropic is addressing this by publishing the text that shapes Claude's behavior, giving readers a direct view of the model's intended constraints.

According to the analysis, the new constitution is more than twice the length of the previous internal "soul document." It serves as a detailed statement of Anthropic's vision, explicitly describing the context Claude operates in and the kind of entity Anthropic wants it to be, and it outlines what it means for the AI to be helpful, safe, ethical, and cooperative. Crucially, this is not merely a policy document for human reviewers but a functional part of model training: the text is used to generate the feedback data that fine-tunes the model, so it directly influences Claude's outputs.
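To make the idea of principle-driven feedback concrete, here is a minimal sketch, assuming a pipeline in which a judge compares two candidate responses against each written principle and records which one it prefers. Every name in it (`PRINCIPLES`, `PreferencePair`, `toy_judge`, `build_feedback`) is hypothetical, and the principles are illustrative placeholders rather than text from the constitution. Neither the post nor this summary describes Anthropic's actual training code, so treat this as an illustration of the general pattern only.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical, heavily abbreviated stand-ins for written principles.
# These are NOT quotes from Anthropic's constitution.
PRINCIPLES = [
    "Prefer the response that is more helpful to the user.",
    "Prefer the response that avoids giving dangerous instructions.",
]


@dataclass(frozen=True)
class PreferencePair:
    """One unit of feedback data: which response was preferred, and why."""
    prompt: str
    chosen: str
    rejected: str
    principle: str


def toy_judge(principle: str, response_a: str, response_b: str) -> str:
    """Placeholder for a model that reads a principle and picks a response.

    Assumption: a real pipeline would query a language model here. This stub
    ignores the principle and simply prefers the longer response, so the
    example runs without any external service.
    """
    return response_a if len(response_a) >= len(response_b) else response_b


def build_feedback(
    prompt: str,
    response_a: str,
    response_b: str,
    principles: List[str],
    judge: Callable[[str, str, str], str] = toy_judge,
) -> List[PreferencePair]:
    """Produce one preference pair per principle for a single prompt."""
    pairs = []
    for principle in principles:
        chosen = judge(principle, response_a, response_b)
        rejected = response_b if chosen == response_a else response_a
        pairs.append(PreferencePair(prompt, chosen, rejected, principle))
    return pairs


if __name__ == "__main__":
    feedback = build_feedback(
        "How do I reset my router?",
        "Unplug it for ten seconds, then plug it back in.",
        "No.",
        PRINCIPLES,
    )
    for pair in feedback:
        print(pair.principle, "->", pair.chosen)
```

In a setup like this, the resulting preference pairs would be the kind of data a later fine-tuning stage consumes. The design point is that changing the written principles changes the feedback data without anyone relabeling examples by hand.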

The post also notes that Anthropic has released the document under the Creative Commons CC0 1.0 Universal public domain dedication. This places the constitution in the public domain, allowing other researchers and developers to use, critique, or adapt these alignment principles for their own models without restriction. The move suggests an effort to establish industry standards for AI governance by making the "rules of the road" transparent and accessible.

For readers interested in the mechanics of AI safety and the specific values being encoded into leading foundation models, the full post offers a useful breakdown of the new framework.

[Read the full post at LessWrong](https://www.lesswrong.com/posts/mLvxxoNjDqDHBAo6K/claude-s-new-constitution)

### Key Takeaways

*   Anthropic has released a detailed constitution that directly shapes Claude's behavior during training.
*   The new document is more than twice the length of the previous "soul document," with expanded definitions of values.
*   The constitution is released under a CC0 license, placing it in the public domain for unrestricted use.
*   The text explicitly defines core attributes such as helpfulness, safety, ethics, and cooperation.
*   This release promotes transparency in AI alignment, moving away from opaque safety filters.

---

## Sources

- https://www.lesswrong.com/posts/mLvxxoNjDqDHBAo6K/claude-s-new-constitution
