Constitutional AI

An AI training methodology developed by Anthropic that uses a set of principles to guide model self-improvement, generating synthetic preference data and critiques based on constitutional values.

Constitutional AI (CAI) is a training approach in which AI models critique and revise their own outputs against a written set of principles, or constitution. Rather than relying solely on human feedback, the model generates synthetic training data by evaluating responses against those principles. This methodology underpins Anthropic's approach to alignment; the term refers to the training method, not to the constitution document itself.
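
The critique-and-revise loop described above can be sketched in a few lines of Python. This is a minimal illustration, not Anthropic's implementation: the principles in CONSTITUTION, the prompt wording, and the names toy_model, critique_and_revise, and make_training_example are all hypothetical, and toy_model is a deterministic stand-in for a real language model call.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical principles for illustration; not taken from any real constitution.
CONSTITUTION: List[str] = [
    "Avoid insulting language.",
    "Do not present uncertain claims as definite.",
]


def toy_model(text: str) -> str:
    """Deterministic stand-in for a language model (assumption, not a real API)."""
    if text.startswith("Critique"):
        return "The response may violate the principle."
    if text.startswith("Rewrite"):
        # Pull out the response being revised and soften its wording.
        response = text.split("Response: ", 1)[1]
        return response.replace("stupid", "vague")
    return "That is a stupid question."


def critique_and_revise(
    prompt: str,
    model: Callable[[str], str],
    principles: List[str],
) -> Tuple[str, str]:
    """Run one critique pass and one revision pass per principle.

    Returns the initial response and the final revised response.
    """
    initial = model(prompt)
    revised = initial
    for principle in principles:
        critique = model(
            f"Critique this response against the principle '{principle}':\n{revised}"
        )
        revised = model(
            "Rewrite the response to address the critique.\n"
            f"Critique: {critique}\n"
            f"Response: {revised}"
        )
    return initial, revised


def make_training_example(prompt: str, revised: str) -> Dict[str, str]:
    """Package the revised response as a synthetic supervised training record."""
    return {"prompt": prompt, "completion": revised}


initial, revised = critique_and_revise("Why is the sky blue?", toy_model, CONSTITUTION)
example = make_training_example("Why is the sky blue?", revised)
```

In this sketch the revised response becomes the target of a synthetic supervised example. Producing preference data would need an additional step in which the model compares candidate responses against a principle; that step is not shown here.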

Also known as

CAI, constitutional training, RLAIF