This December at SIGGRAPH Asia 2011, researchers at Carnegie Mellon University (CMU) plan to present a paper that illustrates just how far so-called "cloud robotics" has come.
Their eventual goal, they say, is to enable robots to perform data-intensive functions—like image processing and voice recognition—by offloading the vast amounts of data those tasks require to remote servers. Previously, robots were limited in what they could do by the comparatively small amount of information that can be stored onboard the robot itself.
"A robot can, for example, easily navigate in an environment, avoiding obstacles in its path," explains Alexei Efros, associate professor of computer science and robotics at CMU. "But where it needs help is recognizing the obstacles and identifying them, which requires huge amounts of data to let them understand what they are looking at."
Three years ago, the researchers unveiled IM2GPS, an object-recognition platform that estimates geographic information from a single image. It utilizes six million geo-tagged images that the team downloaded from the Flickr Web site.
"It was a relatively simple process that enabled us to take a photograph of, say, a building in Paris, scan it, and then match it against our cloud-based image database," explains Efros. When we found a match, the picture’s geo-tag told the program its global location."
In the soon-to-be-published paper, the CMU team took the process a step further with what it calls "data-driven visual similarity for cross-domain image matching." The more intelligent image-search process permits researchers to query for similar objects or scenes.
For example, says Efros, "we can use a human-drawn sketch of a car and do a search for cars that look like the sketch. What this requires is a system with a cross-domain understanding of the visual content. In other words, a system that can understand the sketch and not be sidetracked by the idea that it now needs to find a photo—not a sketch—of the same object."
Applied to, say, a household robot, the technique would let a person show the robot a sketch of lost glasses and instruct it to find them. Currently that is a difficult task for a robot, which would have no way to identify a pair of glasses bearing no distinctive markings, branding, or logo.
The paper on image retrieval will be preceded by a paper on object detection and object understanding that the researchers plan to present in November at ICCV 2011.
Meanwhile, at Google, a former member of the CMU team, James Kuffner, is focusing on making smartphones and tablets a gateway to the cloud for what he calls "remote-brain robots."
Indeed, in May, at Google I/O 2011, the company unveiled its open-source Android Open Accessory API and Development Kit (ADK), whose I/O board connects to the USB port of a phone or tablet. The board is now available off the shelf or online at $30 to $150, depending on functionality.
"Hobbyists, researchers, developers, and students can now buy a little board that allows them to hook up cameras, sensors, lasers, any devices they want, and have them controlled by an app running on an Android smartphone," he says. "The robot revolution hasn’t happened mostly because of cost. We’re trying to enable that robot revolution by cost-reducing it!
"Cloud robotics is still an experimental, part-time research effort at Google and there are no official product plans at this time," says Kuffner. "However, the ADK board has been getting lots of interest, and one of Google’s goal is to help stimulate the robotics ecosystem by providing to the community Android-compatible, open-source software and hardware that enable low-cost, easy development."
Paul Hyman was editor-in-chief of several hi-tech publications at CMP Media, including Electronic Buyers’ News.