Towards Generative Decision-Making Agents

Yuexiang (Simon) Zhai, Final year PhD candidate at Berkeley EECS
Seminar

Yuexiang (Simon) Zhai is a final-year PhD candidate at Berkeley EECS, advised by Prof. Sergey Levine and Prof. Yi Ma. His research focuses on understanding the limitations of foundation models and developing algorithms that train foundation models for decision-making problems. He also has prior experience in statistical machine learning (Signal Processing with Adaptive Sparse Structured Representations (SPARS) 2019) and reinforcement learning. He interned at Cruise AI research in summer 2023 and received a Berkeley EECS fellowship in 2020.

Abstract: Recent developments in foundation models have demonstrated remarkable performance in various tasks requiring sequential decision-making, such as solving IMO questions, intelligent search, and general AI agents. Solving such challenging tasks usually requires generative agents to interact with the environment through vision-language inputs and to use the environment's feedback to improve themselves. In this talk, I will first discuss some existing issues in vision-language models (VLMs), including catastrophic forgetting in supervised fine-tuning and shortcomings in visual capabilities. With these issues in mind, I will then introduce an end-to-end training framework for VLMs using reinforcement learning (RL4VLM), which directly trains VLMs as decision-making agents via RL across various tasks. The RL4VLM framework lays the foundation for post-training VLMs with only environment feedback, and could potentially be extended to other environments and other VLMs.

For more information about upcoming speakers, please visit the TPC Seminar Series webpage:

https://tpc.dev/tpc-seminar-series/