Karl Pertsch

I am a postdoc at UC Berkeley and Stanford University, where I work with Sergey Levine and Chelsea Finn on deep learning, reinforcement learning and robotics.

I completed my PhD at the University of Southern California (USC), working with Joseph Lim. During my PhD, I was fortunate to intern at Meta AI and spend time as a student researcher at Google Brain with Karol Hausman. Before my PhD, I spent one year as a Fulbright Scholar at the University of Pennsylvania, working with Kostas Daniilidis.

Email  /  Twitter  /  Google Scholar  /  CV  /  LinkedIn

Research

I'm interested in machine learning, reinforcement learning, and robotics. At the moment, I am working on methods that use large datasets to facilitate learning of complex, long-horizon robotic behaviors. Towards this goal, I explore approaches that learn world models, representations of the environment, or reusable skills from offline data and transfer them to new tasks.

Octo: An Open-Source Generalist Robot Policy
Dibya Ghosh*, Homer Walke*, Karl Pertsch*, Kevin Black*, Oier Mees*, ..., Dorsa Sadigh, Chelsea Finn, Sergey Levine
arXiv, 2023
project page / tech report / code

We introduce Octo, an open-source generalist policy, trained on 800k robot trajectories. Octo is a large, transformer-based diffusion policy that supports flexible task specification, observation and action spaces. It can control a diverse range of robots out of the box and supports efficient finetuning to new robot configurations. We release pre-trained checkpoints and our full training + finetuning pipelines.

Open X-Embodiment: Robotic Learning Datasets and RT-X Models
Open X-Embodiment Collaboration
arXiv, 2023
project page / arXiv / dataset

We introduce the Open X-Embodiment Dataset, the largest robot learning dataset to date with 1M+ real robot trajectories, spanning 22 robot embodiments. We train large, transformer-based policies on the dataset (RT-1-X, RT-2-X) and show that co-training with our diverse dataset substantially improves performance.

Cross-Domain Transfer via Semantic Skill Imitation
Karl Pertsch, Ruta Desai, Vikash Kumar, Franziska Meier, Joseph J. Lim, Dhruv Batra, Akshara Rai
Conference on Robot Learning (CoRL), 2022
project page / arXiv / code

We learn a semantic skill policy that enables cross-domain imitation: between robots in different environments, and even from human video to robot. We show that we can learn long-horizon robotic manipulation tasks in a simulated kitchen environment using only three minutes of human video, recorded in my kitchen with a GoPro strapped to my head.

Assisted Teleoperation for Scalable Robot Data Collection
Shivin Dass*, Karl Pertsch*, Hejia Zhang, Youngwoon Lee, Joseph J. Lim, Stefanos Nikolaidis
project page / arXiv / code

We enable scalable robot data collection by assisting human teleoperators with a learned policy. Our approach estimates its uncertainty over future actions to determine when to request user input. In real-world user studies, we demonstrate that our system enables more efficient teleoperation with reduced mental load and allows teleoperation of up to four robots in parallel.

Task-Induced Representation Learning
Jun Yamada, Karl Pertsch, Anisha Gunjal, Joseph J. Lim
International Conference on Learning Representations (ICLR), 2022
project page / arXiv / code

We evaluate the effectiveness of representation learning approaches in visually complex environments with substantial distractors. We compare common unsupervised representation learning approaches to task-induced representations, which leverage task information from prior tasks to learn which parts of the scene are important to model and which can be ignored.

Skill-based Meta-Reinforcement Learning
Taewook Nam, Shao-Hua Sun, Karl Pertsch, Sung Ju Hwang, Joseph J. Lim
International Conference on Learning Representations (ICLR), 2022
project page / arXiv / code

We perform meta-RL on top of skills extracted from large, task-agnostic offline datasets. By combining meta-training tasks with offline data, we can meta-learn policies that quickly learn new long-horizon, sparse-reward tasks.

Demonstration-Guided Reinforcement Learning with Learned Skills
Karl Pertsch, Youngwoon Lee, Yue Wu, Joseph J. Lim
Conference on Robot Learning (CoRL), 2021
project page / arXiv / code

We follow long-horizon demonstrations by imitating the demonstrated skills instead of the primitive actions. By using skills learned from large, task-agnostic experience datasets for imitation, our approach SkiLD can seamlessly integrate task-agnostic data and demonstrations via a skill-based learning framework.

Accelerating Reinforcement Learning with Learned Skill Priors
Karl Pertsch, Youngwoon Lee, Joseph J. Lim
Conference on Robot Learning (CoRL), 2020 (Plenary Talk, top 4%)
Workshop on Robot Learning @ NeurIPS, 2020 (Best Paper Runner-up Award)
Deep RL Workshop @ NeurIPS, 2020 (Oral)
project page / arXiv / code

We jointly learn an embedding space of skills and a prior over skills. This skill prior tells us when to use which skill and guides learning on new tasks for effective skill transfer from large offline datasets.

Motion Planner Augmented Reinforcement Learning for Robot Manipulation in Obstructed Environments
Jun Yamada*, Youngwoon Lee*, Gautam Salhotra, Karl Pertsch, Max Pflueger, Gaurav S. Sukhatme, Joseph J. Lim, Peter Englert
Conference on Robot Learning (CoRL), 2020
project page / arXiv / code

Our approach augments model-free RL agents with motion planning capabilities, enabling them to solve long-horizon manipulation tasks in cluttered environments.

Long-Horizon Visual Planning with Goal-Conditioned Hierarchical Predictors
Karl Pertsch*, Oleh Rybkin*, Frederik Ebert, Chelsea Finn, Dinesh Jayaraman, Sergey Levine
Conference on Neural Information Processing Systems (NeurIPS), 2020
project page / arXiv / video / code

We propose a hierarchical prediction model that generates sequences by recursive infilling. We use this model to devise a hierarchical planning approach that scales visual MPC to long-horizon tasks with hundreds of time steps.

Keyframing the Future: Keyframe Discovery for Visual Prediction and Planning
Karl Pertsch*, Oleh Rybkin*, Jingyun Yang, Shenghao Zhou, Kosta Derpanis, Joseph Lim, Kostas Daniilidis, Andrew Jaegle
Conference on Learning for Dynamics and Control, 2020
project page / arXiv / video / poster

We propose a keyframe-based video prediction model that discovers, without supervision, the moments of interesting change in the data: the keyframes. We show that using the predicted keyframes as subgoals for planning improves performance on a simulated pushing task.

Learning what you can do before doing anything
Oleh Rybkin*, Karl Pertsch*, Kosta Derpanis, Kostas Daniilidis, Andrew Jaegle
International Conference on Learning Representations (ICLR), 2019
project page / arXiv / poster

We learn an agent's action space, along with a predictive model, from purely visual observations. The learned model can then be used for model predictive control, requiring orders of magnitude fewer action-annotated videos.

iPose: Instance-Aware 6D Pose Estimation of Partly Occluded Objects
Omid Hosseini Jafari*, Siva Karthik Mustikovela*, Karl Pertsch, Eric Brachmann, Carsten Rother
Asian Conference on Computer Vision (ACCV), 2018

We combine CNN-based regression of dense on-object surface labels with RANSAC-based pose fitting for accurate 6DoF pose estimation of texture-less objects under heavy occlusion.


I borrowed this website layout from here!