OASIS (Object Aware Situated Interactive System) is a flexible software architecture that enables us to prototype applications that use RGB-D cameras (color video + depth) and underlying computer vision algorithms to recognize and track objects and gestures, combined with interactive projection. It is structured in a way that allows us to plug in new recognition code, to use different output display methods, and to flexibly associate actions, animations or graphics with detected input events. Example input events include the appearance of new objects, relocation of objects, gestures, and multi-touch surface interaction. Using this software framework we can create dramatically different styles of interactive physical-virtual systems that showcase the underlying computer vision and perception technologies. OASIS does not need special bar codes, tags, or instrumentation on any objects or on the user's hands. It uses everyday surfaces (countertops, tables, walls, floors) and turns these into interactive touch screens.
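To make the idea of associating actions with detected input events concrete, here is a minimal Python sketch of one way such a binding layer could look. It is not the OASIS code: the names `InputEvent`, `EventBus`, `on`, and `emit`, the event-kind strings, and the coordinate units are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class InputEvent:
    """A detected input event (hypothetical structure, not the OASIS type)."""
    kind: str                      # e.g. "object_appeared", "object_moved", "gesture", "touch"
    label: str                     # recognized object or gesture name
    position: Tuple[float, float]  # location on the surface; units are an assumption


class EventBus:
    """Maps event kinds to any number of output actions."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[InputEvent], str]]] = defaultdict(list)

    def on(self, kind: str, handler: Callable[[InputEvent], str]) -> None:
        # Register an action (animation, graphic, sound, ...) for one event kind.
        self._handlers[kind].append(handler)

    def emit(self, event: InputEvent) -> List[str]:
        # Run every action bound to this event kind and collect the results.
        return [handler(event) for handler in self._handlers[event.kind]]


bus = EventBus()
bus.on("object_appeared", lambda e: f"project label '{e.label}' at {e.position}")
bus.on("touch", lambda e: f"highlight touch at {e.position}")

# A recognizer plugged into the system would emit events like this one.
results = bus.emit(InputEvent("object_appeared", "apple", (12.0, 8.5)))
print(results)
```

In this sketch the recognition code only emits events and the display code only registers handlers, so either side can be swapped without touching the other.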
Underlying Technology for OASIS:
Our current Kitchen OASIS prototype runs all software (calibration, computer vision algorithms, GUI, and rendering) on a Dell Inspiron laptop with a 2 GHz dual-core processor. Our Interactive LEGO OASIS currently runs on an Intel Sandy Bridge workstation. Both applications run on Ubuntu Linux and Windows.
Output is provided by a 170-lumen SVGA LED projector or a laser pico-projector. In our current configuration, the projector’s display area is slightly smaller than the camera’s field of view and covers approximately 36” x 24”.
OASIS input is based on novel RGB-D methods. Sensing is provided by an advanced prototype depth camera mounted 30” above the counter surface. The camera captures aligned RGB and depth channels at 320x240 resolution, and depth is computed using structured infrared light. This single camera supplies all system input: foreground segmentation, object recognition and tracking, and finger input detection.
Video processing pipeline (shown above): left - RGB scene view, center - depth data view of same scene, and right - foreground object view of same scene after background is subtracted (with object labels as recognized and finger tracking).
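The background-subtraction step in the pipeline can be illustrated with a small Python sketch that works on depth alone. This is an assumed approach, not the method OASIS actually uses: the function name `foreground_mask`, the millimeter units, the 10 mm height threshold, and the use of 0 as an "invalid depth" marker are all hypothetical. The background value of 762 mm corresponds to the 30” camera height mentioned above.

```python
def foreground_mask(depth, background, min_height_mm=10, invalid=0):
    """Mark pixels that rise at least min_height_mm above the empty surface.

    depth and background are 2D lists of distances from an overhead camera,
    in millimeters (an assumption). Pixels with no valid reading in either
    frame are treated as background.
    """
    mask = []
    for depth_row, background_row in zip(depth, background):
        row = []
        for d, b in zip(depth_row, background_row):
            valid = d != invalid and b != invalid
            # An object on the surface is closer to the camera than the surface,
            # so its height is the background distance minus the current distance.
            row.append(valid and (b - d) >= min_height_mm)
        mask.append(row)
    return mask


# Empty counter seen from 30 inches (762 mm) above, on a tiny 4x3 frame.
background = [[762] * 4 for _ in range(3)]

# Same view with a 50 mm tall object on the right, one noisy pixel (760)
# and one pixel with no depth reading (0).
depth = [
    [762, 760, 712, 712],
    [762, 0, 712, 712],
    [762, 762, 762, 762],
]

mask = foreground_mask(depth, background)
foreground_pixels = sum(cell for row in mask for cell in row)
print(foreground_pixels)
```

A threshold above the sensor's noise level keeps small depth fluctuations on the bare surface from being labeled as objects. In a full system, the resulting mask would then be passed to object recognition and finger tracking.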