Development and Evaluation: New Frontiers for the Social Sector

This session is the second of a two-part series entitled “Evaluating AI for Early-Career STEM Support: Trust, Use, and Perceived Outcomes from a Chatbot Intervention,” organized by Claremont Graduate University for Glocal Evaluation Week 2026. In this session, the panel examines the development of TrueNorth, an AI-powered chatbot designed to support early-career STEM professionals, as a case study in integrating evaluation into AI system design. Developed through a collaboration between computer science and evaluation students, TrueNorth uses retrieval-augmented generation (RAG) and is grounded in a positive psychology framework. The session explores how evaluators engaged during the design process, rather than only assessing the tool after completion. It looks at how concepts such as trust, support, and professional agency were translated into chatbot features, and what was gained or lost in that process. Drawing on experiences from a hackathon, iterative prototyping, and a supporting white paper, the session reflects on the role of evaluation in shaping AI tools from the start.