New code libraries allow Windows apps to use Webcams as pointing devices

So where are they — those futuristic computer interfaces, the kind they have in the movies? Most recently, Iron Man's Tony Stark designed CAD models for an entire suit of high-tech combat armor using only his bare hands. So how come we're all still using keyboards and ordinary old mice?

InfoWorld's ace reporter Paul Krill was the first to spot a new project from Microsoft Office Labs and Microsoft developer Mike Wasserman that could bring those next-generation UIs one step closer. Begun by Wasserman as a college project, the Touchless SDK is a set of .Net components that can be used to simulate the gestural interfaces of devices like the iPhone in thin air — using nothing fancier than an ordinary USB Webcam.

Magic markers

If you've ever seen a digital camera that can automatically focus on the faces in the picture, you already have an idea of how the Touchless SDK works. Touchless-enabled apps can be trained to recognize "markers" — objects that are easy to spot in the video feed from the Webcam. The motion of the markers in front of the camera can then be mapped to onscreen UI elements.

To see it in action, I installed the SDK's 156KB demo application on an Asus Eee PC 901 with a built-in Webcam. It brought up a video feed immediately. I held my makeshift marker in front of the screen and drew a circle around it with the mouse to identify it, as instructed, then pushed the button to start the demo. A few moments later, I was controlling my PC with a Roma tomato.

The demo applications were a far cry from Iron Man, but they were effective in demonstrating the potential power of a hands-free UI. Waving the tomato in thin air, I drew on the screen, scrolled an image back and forth, and played a rudimentary game. The SDK even supports multiple markers, enabling iPod-like gestural controls.

Needs a touch-up

Not everything went smoothly. The demo apps seemed to expect the video feed from my Webcam to work like a mirror, but my Webcam doesn't flip the image along the vertical axis. When I moved my hand to the right, the marker in the onscreen video moved left — and so did the Touchless cursor.

The current Touchless SDK still has some kinks to work out, too. For starters, its marker-location algorithm is very much keyed to color. That's probably an efficient way to identify contrasting shapes, but color response varies by camera and is heavily influenced by ambient light conditions. I also found that if someone wearing a bright red scarf strolls past the camera, the scarf is likely to be mistakenly identified as a red marker.

The detection routine also seems to require a lot of juice. The demo program managed to soak up 64 percent of my Eee PC's 1.6GHz Atom CPU, and the video from the Webcam soon developed a disconcerting few seconds' lag that made controlling the onscreen cursors difficult at best.
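To get a feel for why color-keyed tracking behaves this way, here is a rough sketch of the technique in Python using OpenCV. To be clear, this is not Wasserman's code, just an illustration of the general approach; the HSV color range and the camera index are assumptions you would tune for your own marker, camera, and lighting.

import cv2
import numpy as np

cap = cv2.VideoCapture(0)  # assumed: the default webcam is device 0

# Hypothetical HSV range for a red marker such as a tomato. Red hue
# actually wraps around 0, so a real tracker would combine two ranges.
lower = np.array([0, 120, 70])
upper = np.array([10, 255, 255])

while True:
    ok, frame = cap.read()
    if not ok:
        break

    # Mirror the frame so moving your hand right moves the marker right,
    # which is the behavior the Touchless demos seemed to expect.
    frame = cv2.flip(frame, 1)

    # Keep only the pixels whose color falls inside the marker's range.
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, lower, upper)

    # The centroid of the matching pixels stands in for a cursor position.
    m = cv2.moments(mask)
    if m["m00"] > 0:
        cx, cy = int(m["m10"] / m["m00"]), int(m["m01"] / m["m00"])
        cv2.circle(frame, (cx, cy), 10, (0, 255, 0), 2)

    cv2.imshow("marker", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()

Note that the mask keys on color alone: any red object in the frame, scarf or tomato, produces matching pixels, which is exactly the false-positive problem I ran into. The flip call also shows how cheaply the mirror-image behavior could be corrected.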
Gestures and beyond

Doubtless these bugs will be ironed out in future releases of the SDK. But if they're showstoppers for you, Wasserman isn't actually the first to experiment with this type of UI. ARToolkit is a cross-platform library that can achieve many of the same things as the Touchless SDK. Its focus is a little different, however. ARToolkit is billed as a means of enabling "augmented reality," in which real-life objects are enhanced with computer-generated elements onscreen — the first-down marker on Monday Night Football, for example.

Because of this shift in emphasis, ARToolkit is more limited than Touchless in some important ways. It supports only square objects as markers, for example. On the plus side, ARToolkit's code base seems much more mature than the Touchless SDK's. Its marker-recognition algorithm seems both more accurate and more efficient than Wasserman's version, and it registered fewer false positives. It's also portable to more environments than Touchless, which requires .Net 3.0.

Of course, neither of these toolkits can yet hold a candle to the magical gesture-based interfaces of Iron Man or Minority Report. Neither algorithm is advanced enough to register gestures made with bare hands, for starters — both require some form of easily recognized marker. But they're a start. Better yet, they're both open source, which means anyone can help to make them better. And even in their present forms, these libraries hold exciting possibilities for desktop application developers, particularly in the area of UIs for computer users with limited mobility. Check 'em out and let me know what you think.
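If you do check them out and want to tinker first, a few lines of OpenCV give a feel for the square-marker approach ARToolkit takes. This is only an approximation of the idea, not ARToolkit's actual pipeline; among other things, the real library also decodes the pattern printed inside each square to tell markers apart, and the minimum-area cutoff below is an arbitrary guess.

import cv2

cap = cv2.VideoCapture(0)

while True:
    ok, frame = cap.read()
    if not ok:
        break

    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Binarize the frame; ARToolkit markers are high-contrast black squares.
    _, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

    # OpenCV 4.x returns (contours, hierarchy).
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    for c in contours:
        # Simplify each outline; a square marker reduces to four corners.
        approx = cv2.approxPolyDP(c, 0.03 * cv2.arcLength(c, True), True)
        if (len(approx) == 4 and cv2.isContourConvex(approx)
                and cv2.contourArea(approx) > 1000):
            cv2.drawContours(frame, [approx], -1, (0, 255, 0), 2)

    cv2.imshow("squares", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()

Requiring a convex, four-cornered outline is a far stronger constraint than a simple color match, which goes some way toward explaining why ARToolkit registered fewer false positives in my testing than the color-keyed approach did.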