Databite No. 161: Red Teaming Generative AI Harm

What exactly is generative AI (genAI) red-teaming? What strategies and standards should guide its implementation? And how can it protect the public interest? In this conversation, Lama Ahmad, Camille François, Tarleton Gillespie, Briana Vecchione, and Borhane Blili-Hamelin examined red-teaming’s place in the evolving landscape of genAI evaluation and governance.

Our discussion drew on a new report by Data & Society (D&S) and AI Risk and Vulnerability Alliance (ARVA), a nonprofit that aims to empower communities to recognize, diagnose, and manage harmful flaws in AI. The report, Red-Teaming in the Public Interest, investigates how red-teaming methods are being adapted to confront uncertainty about flaws in systems and to encourage public engagement with the evaluation and oversight of genAI systems. Red-teaming offers a flexible approach to uncovering a wide range of problems with genAI models. It also offers new opportunities for incorporating diverse communities into AI governance practices. Ultimately, we hope this report and discussion present a vision of red-teaming as an area of public interest sociotechnical experimentation.

00:00 Opening
00:12 Welcome and Framing
04:48 Panel Introductions
09:34 Discussion Overview
10:23 Lama Ahmad on The Value of Human Red-Teaming
17:37 Tarleton Gillespie on Labor and Content Moderation Antecedents
25:03 Briana Vecchione on Participation & Accountability
28:25 Camille François on Global Policy and Open-source Infrastructure
35:09 Questions and Answers
56:39 Final Takeaways