I was talking with my wife one day while I was at work, and she told me that she had worked on cleaning the living room that day. In the cleaning process, she collected stuff belonging to each member of the family and placed it in bags. So I knew that when I got home, I'd have a bag waiting for me. (Naturally, since she was the one doing the cleaning, she could remove the evidence of HER bag before we got home. But hey, that's the advantage of doing the cleaning in the first place.)
But wouldn't it be neat if my wife had a robot that could have done the work?
[R]esearchers at Carnegie Mellon University presented work on an object-recognition system that lets a robot sort through a pile of recyclables by hand.
Technology Review noted that this is difficult:
Being able to pick out items from a cluttered, disordered environment is no easy feat. And, while other robots are now dexterous enough to grasp an egg without breaking it or pick up unfamiliar objects, these systems generally only work if the object in question has been positioned carefully.
In addition to being able to pick things up, the robot needs to "know" what the objects are. This requires an object-recognition algorithm. And, since these are researchers, that in turn requires a paper:
Object Recognition and Full Pose Registration from a Single Image for Robotic Manipulation
Alvaro Collet Romea, Dmitry Berenson, Siddhartha Srinivasa, and David Ferguson
IEEE International Conference on Robotics and Automation (ICRA '09), May, 2009.
Abstract
Robust perception is a vital capability for robotic manipulation in unstructured scenes. In this context, full pose estimation of relevant objects in a scene is a critical step towards the introduction of robots into household environments. In this paper, we present an approach for building metric 3D models of objects using local descriptors from several images. Each model is optimized to fit a set of calibrated training images, thus obtaining the best possible alignment between the 3D model and the real object. Given a new test image, we match the local descriptors to our stored models online, using a novel combination of the RANSAC and Mean Shift algorithms to register multiple instances of each object. A robust initialization step allows for arbitrary rotation, translation and scaling of objects in the test images. The resulting system provides markerless 6-DOF pose estimation for complex objects in cluttered scenes. We provide experimental results demonstrating orientation and translation accuracy, as well as a physical implementation of the pose output being used by an autonomous robot to perform grasping in highly cluttered scenes.
The paper itself can be found at the link. Look for the PDF item.
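For the curious, here's a rough idea of what this kind of pose estimation looks like in code. This is a minimal sketch built on OpenCV's SIFT features and RANSAC-based PnP, not the authors' actual system (the paper combines RANSAC with Mean Shift to register multiple instances of each object; this sketch handles just one), and all the function and variable names are mine:

```python
# Sketch of feature-based 6-DOF pose estimation: match local descriptors
# from a test image against a stored object model, then use RANSAC to
# robustly recover the object's rotation and translation.
import numpy as np
import cv2

def estimate_pose(test_image, model_points_3d, model_descriptors, camera_matrix):
    """Estimate an object's 6-DOF pose from a single grayscale image.

    model_points_3d:   Nx3 array of 3D points on the stored object model
    model_descriptors: NxD array of SIFT descriptors for those points
    camera_matrix:     3x3 intrinsic calibration matrix
    (Illustrative names, not from the paper.)
    """
    sift = cv2.SIFT_create()
    keypoints, descriptors = sift.detectAndCompute(test_image, None)
    if descriptors is None:
        return None  # no features found in the image

    # Match test-image descriptors against the stored model descriptors.
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    matches = matcher.knnMatch(descriptors, model_descriptors, k=2)

    # Lowe's ratio test: keep only matches clearly better than the runner-up.
    good = [pair[0] for pair in matches
            if len(pair) == 2 and pair[0].distance < 0.75 * pair[1].distance]
    if len(good) < 6:
        return None  # too few correspondences for a reliable pose

    image_points = np.float32([keypoints[m.queryIdx].pt for m in good])
    object_points = np.float32([model_points_3d[m.trainIdx] for m in good])

    # RANSAC-based PnP tolerates the outlier matches a cluttered scene produces.
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        object_points, image_points, camera_matrix, None)
    return (rvec, tvec) if ok else None
```

The RANSAC step is what makes the cluttered-scene part work: a pile of junk produces lots of false feature matches, and RANSAC finds the pose consistent with the largest set of matches while ignoring the rest.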
So, how much more time until the robot can identify the items, associate them with a person ("Sports Illustrated - that belongs to the husband"), and bag them accordingly?