From Multimodal Agents to Swarm Intelligence

Abstract

The speaker explores pathways for building future agent-based societies, starting with the Natural Language Society of Mind (NLSOM) model. The talk reviews the early development and evolution of agent-based societies built on large language models, with a particular focus on GPTSwarm, a graph-based, optimizable agent swarm system that offers a novel approach to constructing autonomous agent societies. It also introduces a critical technology for 2024: the Agent-as-a-Judge framework, in which agents evaluate other agent systems, providing intermediate feedback on complex tasks and essential reward signals that support self-improvement. The presentation also highlights cutting-edge open-source projects such as MetaGPT, a multi-agent meta-programming framework, and OpenHands, which offers best practices for single-agent systems. The talk aims to provide an accessible overview for anyone interested in the latest advances in the field.

Speaker

Mingchen Zhuge is a third-year PhD candidate at the KAUST AI Initiative, supervised by Jürgen Schmidhuber. His research interests include vision-language pre-training, with a particular focus on video-based large language models (LLMs); post-training of LLMs; meta-learning, specifically recursive self-improvement; and LLM-based multi-agent systems and code generation. He is also exploring the concept of AI judges, examining the roles of agents and LLMs as judges. Mingchen is set to join Meta as a Research Scientist Intern in the summer of 2024 and is open to future collaboration opportunities.

Video

Extra Details

Speaker Website / Paper Link / Paper Code / Paper Project Page