Jexen

@jexCooking

Jexen reposted

New research from Anthropic Fellows Program: Selective GradienT Masking (SGTM). We study how to train models so that high-risk knowledge (e.g. about dangerous weapons) is isolated in a small, separate set of parameters that can be removed without broadly affecting the model.
