
Vision-Based Interaction

Definition: Vision-based human-computer interaction provides a wider and more expressive range of input capabilities by using computer vision techniques to process sensor data from one or more cameras in real time, in order to reliably estimate relevant visual information about the user.

Human-computer interaction involves information flow in both directions between computers and humans, which may be referred to as input (human to computer) and output (computer to human). Traditional computer interfaces have very limited input capabilities, typically restricted to keyboard typing and mouse manipulations (pointing, selecting, dragging, etc.). The area of vision-based interaction seeks to provide a wider and more expressive range of input capabilities by using computer vision techniques to process sensor data from one or more cameras in real time, in order to reliably estimate relevant visual information about the user; that is, to use vision as a passive, non-intrusive, non-contact input modality for human-computer interaction.

In human-to-human interaction, vision is used to instantly determine a number of salient facts and features about one another, such as location, identity, age, facial expression, focus of attention, posture, gestures, and general activity. These visual cues affect the content and flow of conversation, and they impart contextual information that is different from, but related to, other interaction modalities. For example, a gesture or facial expression may be intended as a signal of understanding, or the direction of gaze may disambiguate the object referred to in speech as “this” or the direction “over there.” The visual channel is thus both co-expressive with and complementary to other communication channels such as speech. Visual information integrated with other input modalities (including keyboard and mouse) can enable a rich user experience and a more effective and efficient interaction. Vision-based interaction may be useful in a wide range of computing scenarios in addition to standard desktop computing, especially mobile, immersive, and ubiquitous computing environments. A good example of simple vision technology used effectively in an interactive environment was the KidsRoom project at the MIT Media Lab. Another example is HandVu, which allows users of mobile augmented reality systems to use their hands to drive the interface, by robustly tracking hands and looking for a few known hand gestures/postures. Figure 1 shows HandVu at work.
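The gaze example above, in which the direction of gaze disambiguates the object referred to in speech as “this,” can be illustrated with a minimal sketch. It assumes that a vision system has already estimated the user's head position and gaze direction as 3D vectors, and that the positions of candidate objects are known; the function names, the object table, and the 15-degree tolerance are hypothetical choices made for illustration, not part of any system described in this article.

```python
import math


def angle_between(u, v):
    """Return the angle in radians between two non-zero 3D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cos_theta)


def resolve_deictic(head_pos, gaze_dir, objects, max_angle_deg=15.0):
    """Pick the object that a spoken "this" most plausibly refers to.

    head_pos, gaze_dir: assumed outputs of a (hypothetical) gaze tracker;
    gaze_dir must be non-zero.
    objects: dict mapping object name -> 3D position.
    Returns the name of the object whose direction from the head is
    closest to the gaze direction, or None if no object lies within
    max_angle_deg (an assumed tolerance) of the gaze ray.
    """
    best_name = None
    best_angle = math.radians(max_angle_deg)
    for name, pos in objects.items():
        to_obj = tuple(p - h for p, h in zip(pos, head_pos))
        if not any(to_obj):
            continue  # object coincides with the head position; skip it
        angle = angle_between(gaze_dir, to_obj)
        if angle <= best_angle:
            best_name, best_angle = name, angle
    return best_name


# Hypothetical scene: two objects in front of the user.
scene = {"lamp": (1.0, 0.0, 2.0), "book": (-1.0, 0.0, 2.0)}
referent = resolve_deictic((0.0, 0.0, 0.0), (0.5, 0.0, 1.0), scene)
```

In this hypothetical scene the gaze ray points directly at the lamp, so the lamp is chosen as the referent; when no object falls within the tolerance, the function returns `None` and the application would need to fall back on other cues, such as a pointing gesture or a clarifying question.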

In order to provide this kind of input about users, many researchers in the field of computer vision have focused on modeling, recognizing, and interpreting human behavior. Among the most studied sub-areas are face detection and location, face recognition, head and face tracking, facial expression analysis, eye gaze tracking, articulated body tracking, hand tracking, and the recognition of postures, gaits, gestures, and specific activities. Several of these have applications in areas such as security and surveillance, biometrics, and multimedia databases, as well as in human-computer interaction. Although many significant technical challenges remain, there has been notable progress in these areas during the past decade, and some commercial systems have begun to appear. In general, further research needs to improve the robustness and speed of these systems, and there needs to be a deeper understanding of how visual information is best utilized in human-computer interaction.
