Ethan Perez - Discovering Language Model Behaviors with Model Written Evaluations

Center for AI Safety (CAIS)

This discussion is part of a series of guest speaker events held for the CAIS Philosophy Fellowship. For more information, visit https://philosophy.safe.ai