Building a GENERAL AI agent with reinforcement learning

Published on Mar 20, 2024

Dr. Minqi Jiang and Dr. Marc Rigter explain a new method for making agents more general-purpose: the agent first learns to model many worlds, and only then goes through the usual goal-directed training, which we call "reinforcement learning".

Their new paper is called "Reward-free curricula for training robust world models" https://arxiv.org/pdf/2306.09205.pdf

  / minqijiang  
  / marcrigter  

Interviewer: Dr. Tim Scarfe

Please support us on Patreon. Tim is now doing MLST full-time and taking a massive financial hit. If you love MLST and want this to continue, please show your support! In return, you get very early access to shows, plus access to our private Discord and networking.   / mlst  

We are also looking for show sponsors. If you are interested, please get in touch: mlstreettalk at gmail.

MLST Discord:   / discord  

00:00:00 - Intro
00:01:05 - Model-based Setting
00:02:41 - Similar to POET Paper
00:05:27 - Minimax Regret
00:07:21 - Why Explicitly Model the World?
00:12:47 - Minimax Regret Continued
00:18:17 - Why Would It Converge
00:20:36 - Latent Dynamics Model
00:24:34 - MDPs
00:27:11 - Latent
00:29:53 - Intelligence is Specialised / Overfitting / Sim2real
00:39:39 - Openendedness
00:44:38 - Creativity
00:48:06 - Intrinsic Motivation
00:51:12 - Deception / Stanley
00:53:56 - Sutton / Reward is Enough
01:00:43 - Are LLMs Just Model Retrievers?
01:03:14 - Do LLMs Model the World?
01:09:49 - Dreamer and Plan to Explore
01:13:14 - Synthetic Data
01:15:21 - WAKER Paper Algorithm
01:21:24 - Emergent Curriculum
01:31:16 - Even Current AI is Externalised/Mimetic
01:36:39 - Brain Drain Academia
01:40:10 - Bitter Lesson / Do We Need Computation
01:44:31 - The Need for Modelling Dynamics
01:47:48 - Need for Memetic Systems
01:50:14 - Results of the Paper and OOD Motifs
01:55:47 - Interface Between Humans and ML
