By: Nana Appiah Acquaye
The Policy Innovation Lab at
Stellenbosch University, in collaboration with AI Safety South Africa, has
hosted an AI Safety and Cooperative AI Workshop at the SU School for Data
Science and Computational Thinking. The event brought together researchers to
discuss the latest developments in AI safety, cooperative AI, and AI
governance, strengthening South Africa’s role in responsible artificial
intelligence research.

Dr Gray Manicom, a
researcher at the Policy Innovation Lab, presented his work on mechanistic
interpretability for AI safety, highlighting its focus on causal insights
rather than traditional correlation-based methods. His research examines how
these techniques can support the localisation of AI models within African
contexts.

Following this, Prof Willem
Fourie, Chair of the Policy Innovation Lab, and analyst Isabel Ray introduced
ASIF, a theoretical framework designed to address value under-specification by
advancing more explicit and auditable approaches to AI alignment.

The workshop also explored
challenges in cooperative AI. Presentations included Omer Ebead on adversarial
dynamics in multi-agent systems, Yves Bicker on pro-social behaviour in
reinforcement learning, Joseph Low and Oscar Duys on the safety implications of
delegating complex tasks to AI agents, and Akash Kundu on how perceived
similarity may influence cooperation between models. Discussions emphasised the
importance of developing AI systems that are transparent, auditable, and
ethically grounded.

The Policy Innovation Lab
and AI Safety South Africa thanked all participants and presenters for their
contributions to advancing responsible and collaborative AI research.