By Aaron Aupperlee
A new method developed by Carnegie Mellon University researchers allows robots to learn "in the wild"
The robot watched Shikhar Bahl open the refrigerator door. It recorded his movements, the swinging of the door, the location of the refrigerator and more, analyzing this data and preparing to imitate what Bahl had done.
It failed at first, completely missing the handle at times, grabbing it in the wrong place, or pulling it incorrectly. But after a few hours of practice, the robot succeeded and opened the door.
“Imitation is a great way to learn,” says Bahl, a doctoral student at the Robotics Institute (RI) in Carnegie Mellon University’s School of Computer Science. “Getting robots to actually learn by looking directly at humans remains an unsolved problem in the field, but this work takes an important step to enable that capability.”
Bahl worked with Deepak Pathak and Abhinav Gupta, both RI faculty members, to develop a new learning method for robots called WHIRL, short for In-the-Wild Human Imitating Robot Learning. WHIRL is an efficient algorithm for one-shot visual imitation. It can learn directly from videos of human interaction and generalize that information to new tasks, making it well suited for teaching robots household chores.
People constantly perform various tasks at home. With WHIRL, a robot can observe these tasks and gather the video data it needs to eventually figure out how to perform the job itself.
The team added a camera and their software to an off-the-shelf robot, and it learned to perform more than 20 tasks, from opening and closing appliances, cabinet doors and drawers to putting a lid on a pan, pushing a chair and even pulling a trash bag out of the trash can. Each time, the robot watched a human complete the task once, then practiced and learned to complete the task on its own. The team presented their research this month at the Robotics: Science and Systems conference in New York.
“This work presents a way to bring robots into the home,” says Pathak, an assistant professor at RI and a member of the team. “Instead of waiting for robots to be programmed or trained to successfully perform different tasks before deploying them in people’s homes, this technology allows us to deploy the robots and teach them how to perform tasks, while adapting to their surroundings and improving just by looking.”
Current methods of teaching a task to a robot are generally based on learning by imitation or reinforcement. In imitation learning, humans manually operate a robot to teach it to perform a task. This process must be repeated several times for a single task before the robot learns. In reinforcement learning, the robot is typically trained on millions of examples in simulation and then asked to adapt that training to the real world.
Both learning models work well when teaching a robot a single task in a structured environment, but they are difficult to scale and deploy. WHIRL can learn from any video of a human performing a task. It is easily scalable, not limited to a specific task, and can work in realistic home environments. The team is even working on a version of WHIRL trained by watching human interaction videos on YouTube and Flickr.
Advances in computer vision have made the work possible. Using models trained on internet data, computers can now understand and model motion in 3D. The team used these models to understand human movement, facilitating WHIRL training.
With WHIRL, a robot can perform tasks in their natural environments. The appliances, doors, drawers, lids, chairs and trash bags were not modified or manipulated to suit the robot. The robot’s first attempts at a task ended in failure, but once it had a few successes, it quickly figured out how to accomplish the task and mastered it.
Although the robot may not accomplish the task with the same movements as a human, that is not the goal. Humans and robots have different parts, and they move differently. What matters is that the end result is the same. The door is open. The switch is off. The faucet is on.
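The idea of practicing until the end result matches, rather than copying the human's exact motion, can be illustrated with a toy sketch. This is not the WHIRL algorithm. It is a plain random-search hill climber, and the names `practice` and `door_error`, the one-dimensional "pull angle," and every number in it are hypothetical stand-ins chosen for illustration.

```python
import random

def practice(human_prior, outcome_error, attempts=200, step=5.0, seed=0):
    """Toy trial-and-error loop (random-search hill climbing).

    Starts from a guess derived from watching a human, then keeps any
    variation whose END RESULT is closer to the goal.
    """
    rng = random.Random(seed)
    best = human_prior
    best_err = outcome_error(best)
    for _ in range(attempts):
        # Try a small random variation of the best action found so far.
        trial = best + rng.gauss(0.0, step)
        err = outcome_error(trial)
        # Keep the variation only if the outcome improved; the motion
        # itself is never compared with the human's.
        if err < best_err:
            best, best_err = trial, err
    return best

# Hypothetical task: the door counts as open at a pull angle of 90 degrees,
# while the first guess taken from the human demonstration is 60 degrees.
def door_error(angle):
    return abs(angle - 90.0)

learned = practice(human_prior=60.0, outcome_error=door_error)
print(f"remaining error after practice: {door_error(learned):.2f} degrees")
```

Because a variation is kept only when it improves the outcome, the early attempts are mostly discarded and the error never gets worse, which loosely mirrors the robot failing at first and then improving with practice.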
“To scale robotics in the wild, data must be reliable and stable, and robots must improve in their environment by practicing on their own,” says Pathak.