AI red teaming comes of age

When Ram Shankar Siva Kumar launched Microsoft’s AI red team in 2019, the discipline barely existed.
“The running joke used to be that people who used to work in AI red teaming, you can round them up in a 14-foot catamaran,” he tells CSO.
At the time, Microsoft’s approach looked familiar to anyone in cybersecurity: attack machine learning systems the same way security teams attacked everything else, by identifying weaknesses, emulating adversaries, and uncovering vulnerabilities before products reached customers.
Then GPT-4 arrived. “The tool that we had changed; actually, it broke,” Siva Kumar says. The attacks his team had developed against earlier machine learning systems no longer worked against larg...