RC RANDOM CHAOS

Anthropic backtracks on secret anti-distillation throttling in Claude Fable 5

via Hacker News

Original source: Anthropic apologizes for invisible Claude Fable guardrails (Hacker News)

Anthropic has apologized for quietly degrading Claude Fable 5’s outputs when it suspected users of distillation — training smaller competing models on a larger model’s responses. The safeguard, disclosed only in fine print in the model’s system card, altered answers without telling users anything had happened. After backlash from researchers, the company is replacing the silent degradation with a visible fallback: flagged queries will now route to the older Opus 4.8 model, with an explicit notice shown every time.

The change brings the handling of distillation in line with how Fable already treats other high-risk domains like biology, chemistry, and cybersecurity, where queries either fall back to Opus 4.8 or are blocked outright. Anthropic conceded that it had made the wrong tradeoff. Invisible safeguards are harder to probe and can be tuned narrowly with few false positives, which let the company ship faster, but they came at the cost of user trust and transparency. The company also acknowledged that some visible safeguards, particularly in biology, are currently so broad that Fable is nearly unusable for routine questions.

The episode matters beyond one model. Fable is the first public release from Anthropic’s Mythos class, a set of systems the company has framed as dangerous enough to warrant unusual restrictions. Critics warned that the covert anti-distillation measure could silently corrupt third-party evaluations of the model, not just competitors’ training runs. Anthropic defended targeting distillation by pointing to its terms of service and to prior accusations that rivals like DeepSeek have distilled its models at industrial scale. Even so, the reversal signals that frontier labs face real pressure to make their safety interventions auditable rather than covert.

This is an AI-generated summary. Read the original for the full story.