What Foundation Models Can and Cannot Do for Bringing Helpful Robotic Assistants into Our Lives

Abstract: The past few years have seen remarkable advancements in AI. What began with the NLP revolution has sparked new ideas across many fields, including robotics, driving the search for a "RobotGPT." But is this all we need to finally have robots assist humans in everyday tasks? What challenges have been addressed, and which remain unsolved as we look ahead? In this talk, I will discuss recent ways we have integrated Foundation Models into robotic solutions, as well as the limitations of the current FM-based paradigm—particularly in learning from few human demonstrations and in seeking information for complex manipulation tasks.

Bio: Roberto Martin-Martin is an Assistant Professor of Computer Science at the University of Texas at Austin. His research connects robotics, computer vision, and machine learning. He studies and develops novel AI algorithms that enable robots to perform tasks in uncontrolled human environments such as homes and offices. In that endeavor, he creates novel decision-making solutions based on reinforcement learning, imitation learning, planning, and control. He also explores topics in robot perception, such as pose estimation and tracking, video prediction, and parsing. Martin-Martin received his Ph.D. from the Berlin Institute of Technology (TUB) before taking a postdoctoral position at the Stanford Vision and Learning Lab under the supervision of Fei-Fei Li and Silvio Savarese. His work has received the RSS Best Systems Paper Award, an RSS Pioneer selection, and an ICRA Best Paper Award; won the Amazon Picking Challenge; and been a finalist for Best Paper Awards at ICRA, RSS, and IROS. He is chair of the IEEE Technical Committee on Mobile Manipulation and co-founder of QueerInRobotics.