Giles Howdle - Auto-Intentional Agency and AI Risk - PIBBSS Symposium '23
PIBBSS Fellowship

Published on Sep 25, 2023

This is one of the talks given by a PIBBSS Fellow at the PIBBSS '23 Symposium.

Abstract: The dynamic we identify, ‘auto-intentional agency’, is found in systems which create abstract explanations of their own behaviour; in other words, systems which apply the intentional stance (or something functionally similar) to themselves. We argue that auto-intentional agents acquire planning capacities distinct from those of goal-directed systems which, while amenable to being understood via the intentional stance, are not fruitfully understood as applying the intentional stance to themselves. We unpack this notion of auto-intentional agency with reference to hierarchically deep self-models in the active inference framework. We also show how auto-intentional agency dovetails with insights from the philosophy of action and moral psychology. We then draw out the implications of this distinct form of agency for AI safety. In particular, we argue that auto-intentional agents, in modelling themselves as temporally extended, are likely to have more sophisticated planning capacities and to be more prone to explicit self-preservation and power-seeking than other artificially intelligent systems.

Watch more videos like this on our channel, and subscribe for similar content. Apply to work on such problems on our website: www.pibbss.ai
