Eleni Angelou - Overview of Problems in the Study of Language Model Behavior - PIBBSS Symposium '23
PIBBSS Fellowship

Published on Sep 28, 2023

This is one of the talks given by a PIBBSS Fellow at the PIBBSS '23 Symposium.

Abstract: There are at least two distinct ways to approach Language Model (LM) cognition. The first is the LM equivalent of behavioral psychology; the second is the LM equivalent of neuroscience, i.e., interpretability. I focus on the behavioral study of LMs and discuss some key problems that arise in attempts to interpret LM outputs. A broader question is what the study of model behavior can contribute to solving the AI alignment problem. It is unclear to what extent LM behavior is indicative of a system's internal workings and, consequently, of the degree of danger a model may pose. Even so, further work in LM behavioral psychology seems likely to provide at least some tools for evaluating novel behaviors and informing governance regimes.

Watch more videos like this on our channel, and subscribe for similar content. Apply to work on such problems on our website: www.pibbss.ai
