Adversarial ML Attack that Secretly Gives a Language Model a Point of View
Schneier on Security
OCTOBER 21, 2022
Machine learning security is extraordinarily difficult because the attacks are so varied, and it seems that each new one is weirder than the last. Here’s the latest: a training-time attack that forces the model to exhibit a point of view, described in “Spinning Language Models: Risks of Propaganda-As-A-Service and Countermeasures.”