Tuesday, 24 February 2026

What are "Distillation Attacks"?

"Distillation attacks" are a form of intellectual property theft whereby an attacker repeatedly queries a proprietary AI model and uses the resulting input-output pairs to train a cheaper model that mimics it.

This removes the need to access or acquire the underlying training data or model weights, which can be extremely expensive to produce.
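As a minimal sketch of the idea, the toy example below treats a "proprietary" model as a black box the attacker can only query, then fits a student model purely from the collected input-output pairs. The teacher here is a hypothetical hidden linear function and the student is fit by least squares; a real attack would instead fine-tune a neural network on collected completions, but the structure (query, record, imitate) is the same.

```python
import random

# Hypothetical "proprietary" teacher: a black box the attacker can only
# query. Its internal weights are unknown to the attacker.
def teacher(x):
    return 3.0 * x + 1.5  # hidden behaviour

# Step 1: repeatedly query the teacher and record input-output pairs.
queries = [random.uniform(-10, 10) for _ in range(1000)]
pairs = [(x, teacher(x)) for x in queries]

# Step 2: train a student to mimic the teacher using only those pairs
# (here, ordinary least squares on a linear student).
n = len(pairs)
mean_x = sum(x for x, _ in pairs) / n
mean_y = sum(y for _, y in pairs) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in pairs)
         / sum((x - mean_x) ** 2 for x, _ in pairs))
intercept = mean_y - slope * mean_x

def student(x):
    return slope * x + intercept

# The student now approximates the teacher without ever having had
# access to its weights or training data.
print(abs(student(4.0) - teacher(4.0)) < 1e-6)
```

The point of the sketch is that nothing about the teacher is needed beyond the ability to query it, which is why API access alone is enough to mount the attack.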

Anthropic, for example, has accused several developers in China, namely DeepSeek, MiniMax Group and Moonshot AI, of running large-scale distillation "campaigns".
