In his Robot Learning Lab, Ashutosh Saxena, assistant professor of computer science at Cornell University, has developed a robot, equipped with a 3D camera, that scans its environment and identifies the objects in it, using computer vision software previously developed in Saxena's lab.
The robot has been trained to associate objects with their capabilities: a pan can be poured into or poured from; stoves can have other objects set on them, and can heat things. The robot can identify the pan, locate the water faucet and stove and incorporate that information into its procedure.
If you tell it to "heat water," it can use the stove or the microwave, depending on which is available. And it can carry out the same actions tomorrow if you've moved the pan, or even moved the robot to a different kitchen, researchers said.
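The idea of tying objects to capabilities, then picking whichever suitable object is on hand, can be illustrated with a small Python sketch. This is not the lab's software: the table entries, the capability names and the functions `find_objects` and `plan_heat_water` are all hypothetical, made up here to show the general idea under simplified assumptions.

```python
# Hypothetical capability table: object name -> what the object can do.
# The entries are illustrative, loosely based on the examples in the article.
AFFORDANCES = {
    "pan": {"pour_into", "pour_from"},
    "stove": {"support", "heat"},
    "microwave": {"heat"},
    "faucet": {"dispense_water"},
}


def find_objects(capability, visible_objects):
    """Return the visible objects that offer a capability, in the order seen."""
    return [name for name in visible_objects
            if capability in AFFORDANCES.get(name, set())]


def plan_heat_water(visible_objects):
    """Build a step list for "heat water" from whatever is in the scene.

    Returns None when the scene lacks a container, a water source or a heater.
    """
    containers = find_objects("pour_into", visible_objects)
    sources = find_objects("dispense_water", visible_objects)
    heaters = find_objects("heat", visible_objects)
    if not (containers and sources and heaters):
        return None
    container, source, heater = containers[0], sources[0], heaters[0]
    return [
        f"fill {container} at {source}",
        f"move {container} to {heater}",
        f"turn on {heater}",
    ]


if __name__ == "__main__":
    # Same command, two different kitchens.
    print(plan_heat_water(["pan", "faucet", "stove"]))
    print(plan_heat_water(["faucet", "microwave", "pan"]))
```

Because the plan is built from capabilities rather than from fixed object names or positions, the same command yields a workable procedure whether the kitchen has a stove or only a microwave.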
Saxena's research group uses techniques computer scientists call "machine learning" to train the robot's computer brain to associate entire commands with flexibly defined actions.
The computer is fed animated video simulations of each action, which humans create in a process similar to playing a video game, along with recorded voice commands from several different speakers.
To test the robot, researchers gave instructions for preparing ramen noodles and for making affogato, an Italian dessert combining coffee and ice cream: "Take some coffee in a cup. Add ice cream of your choice. Finally, add raspberry syrup to the mixture."
The robot performed correctly up to 64 percent of the time even when the commands were varied or the environment was changed, and it was able to fill in missing steps.
That was three to four times better than previous methods, the researchers said.
Saxena and graduate students Dipendra K. Misra and Jaeyong Sung will describe their methods at the Robotics: Science and Systems conference at the University of California, Berkeley, July 12-16.