
Context-Aware Multimedia - Introduction, Context, Context-Awareness, Context-Aware Multimedia, Modeling, Retrieval, Authoring and Presentation, Summary


Wolfgang Klas
University of Vienna, Vienna, Austria and
Research Studio Digital Memory Engineering, Vienna, Austria

Ross King
Research Studio Digital Memory Engineering, Vienna, Austria

Definition: Context-aware multimedia refers to a specific subset of context-aware applications related to multiple media types.


Introduction

Multimedia applications face a variety of media types, from single media such as audio, video, images, and text to compositions of these single media that form new multimedia objects. Furthermore, both single and composed media must very often be constructed, retrieved, and interpreted according to the particular context of an application setting.

In this article, we will first establish the meaning of the concept context-aware multimedia by defining the terms context and context-awareness. We will then explore four key research aspects of context-aware multimedia, namely: modeling, retrieval, authoring, and presentation.


Context

A survey of the literature reveals several variations in the definition of context. We infer from the common thread of these definitions that context is a very broad concept, including more or less anything that can directly or indirectly affect an application, and consider the following to be the most comprehensive:

“Context is any information that can be used to characterize the situation of an entity. An entity is a person, place, or object that is considered relevant to the interaction between a user and an application, including the user and applications themselves.”

In other words, any information that characterizes the situation of a user can be called context. This includes the number of people in the area, the time of day, and any devices the user may employ. One can, however, distinguish between contextual characteristics that are critical to the application and those that may be relevant but are not critical.

Within computing applications, there are three major context categories of interest: user context, computing resources, and environmental aspects. Orthogonal to this view, context can be explicit (that is, information provided directly by the user) or implicit (derived on the one hand from sensors, on the other from an analysis of user behavior). As we learn more about human-computer interaction, it is becoming clear that users are not prepared to deliver a large volume of explicit information; as a result, information gathered implicitly (for example, from sensor networks) is becoming more and more relevant.
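The three context categories and the explicit/implicit distinction can be sketched as a simple data structure. This is purely illustrative; all class and field names below are our own assumptions, not part of any standard or framework.

```python
from dataclasses import dataclass, field

@dataclass
class UserContext:            # largely explicit: provided by the user
    interests: list = field(default_factory=list)
    task: str = ""

@dataclass
class ComputingContext:       # client, server, and network characteristics
    device: str = "desktop"
    bandwidth_kbps: int = 10_000

@dataclass
class EnvironmentContext:     # largely implicit: derived from sensors
    location: str = ""
    time_of_day: str = ""
    nearby_people: int = 0

@dataclass
class Context:
    user: UserContext
    computing: ComputingContext
    environment: EnvironmentContext

# A hypothetical snapshot of a mobile user's situation:
ctx = Context(UserContext(["music"], "commuting"),
              ComputingContext("phone", 384),
              EnvironmentContext("tram 43", "evening", 12))
print(ctx.computing.device)   # phone
```

An application would typically populate the user fields from explicit input and the environment fields from sensor feeds, which is exactly the explicit/implicit split described above.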

In previous decades, a narrow aspect of user context – user preferences in terms of search and retrieval of data – was denoted in the database community by the concept of views. Today, user context is considered far more broadly and includes user interests, goals, social situation, prior knowledge, and history of interaction with the system. Thus, the dimension of time may also be included in the user’s context. “Context information may be utilized in an immediate, just in time, way or may be processed from a historical perspective”.

Computing resources include characteristics of the client, server, and network, as well as additional available resources such as printers, displays, and storage systems. This type of context has become particularly relevant for the mobile computing community, in which the range of client capabilities is enormous and often severely limited in comparison with standard computer workstations.

Environmental contextual aspects include (but are not limited to) location, time, temperature, lighting conditions, and other persons present. Note that there is some ambiguity in the literature in the use of the term environment, which can refer to the computing environment as well as the actual physical environment of the user. Here of course we refer to the latter.

The assessment of environmental context faces a number of research challenges. Context information is often acquired from unconventional, heterogeneous sources, such as motion detectors or GPS receivers. Such sources are likely to be distributed and inhomogeneous. The information from these sensors must often be abstracted before it can be used by an application; for example, GPS data may need to be converted to street addresses. Finally, environmental context information must be detected in real time, and applications must adapt to changes dynamically.
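The abstraction step can be sketched as follows. The toy "reverse geocoder" below maps raw coordinates to a symbolic place name; the place names, bounding boxes, and coordinates are all hypothetical.

```python
# Toy place registry: symbolic name -> (lat_min, lat_max, lon_min, lon_max).
PLACES = {
    "lecture_hall_1": (48.2125, 48.2135, 16.3550, 16.3560),
    "cafeteria":      (48.2100, 48.2110, 16.3570, 16.3580),
}

def abstract_location(lat, lon):
    """Abstract raw GPS readings into a symbolic location an application can use."""
    for name, (lat0, lat1, lon0, lon1) in PLACES.items():
        if lat0 <= lat <= lat1 and lon0 <= lon <= lon1:
            return name
    return "unknown"

print(abstract_location(48.2130, 16.3555))  # lecture_hall_1
```

A real system would of course query a geocoding service or indoor positioning infrastructure, but the principle is the same: raw sensor values in, application-level context out.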

It should be noted that an enormous amount of research is presently being dedicated to the location context, most frequently under the heading of location-based services, as this is considered one of the potential “killer applications” of next-generation wireless networks. However, one should not overlook the fact that the question “what are you doing?” can be equally or more important than “where are you?”. It is all well and good to know that a professor is in a lecture hall, but it is even more important to know whether she is attending a lecture or giving it herself. This point leads us from the concept of context in itself to the concept of context-awareness.


Context-Awareness

The phrase context-aware begins to appear with regularity in relatively recent literature, although its introduction is largely attributed to a 1994 paper by Schilit & Theimer, in which they define context-aware computing as any system that “adapts according to its location of use, the collection of nearby people and objects, as well as changes to those over time.”

However, given the extended considerations of context reviewed in the previous section, it is clear that the concept of context-awareness has also evolved to meet these considerations, for example: “A system is context-aware if it uses context to provide relevant information and/or services to the user, where relevancy depends on the user’s task.”

Implicit in the concept of context-awareness is that something must be context-aware, for example, the “system” quoted above. We will follow the literature and assume that this system in question is a context-aware (computing) application.

“Context-aware applications” is a phrase that appears most often in the literature of ubiquitous computing and pervasive computing environments. Context-aware environments refer to fixed locations, either in the workplace or at home, with ubiquitous sensors that can determine the location, actions, and intent of the location inhabitants (be they employees or family members).

Other examples of context-aware applications include the so-called tour-guide applications (the canonical example of location-based services) and personalized online shopping services, in which the analysis of user behavior over time, as well as the aggregate behavior of users, plays a major role.

Context-Aware Multimedia

The term multimedia often simply refers to media content other than text; however, more properly, the term multimedia should be applied in situations involving the combination of two or more media types, including text, images, video, audio, animations, three-dimensional models, and so on. We can therefore conclude that context-aware multimedia refers to a specific subset of context-aware applications related to multiple media types.

It is interesting to observe that the term context-aware multimedia is most often employed within the mobile computing and applications community – exactly those for whom multimedia content is least suitable, given the constraints on bandwidth and display properties. However, although significant research effort is dedicated to adaptive multimedia delivery to end devices such as PDAs and mobile telephones, user mobility need not equate to mobile devices; research can instead concentrate on the mobility of the user.

The line between multimedia content and multimedia applications is gradually becoming blurred. Take for example Flash™ presentations, which can be considered traditional multimedia content in the form of graphical animations of images and text, but can also be seen as applications when methods for user interaction and response are packaged and delivered with the presentation. In our opinion, multimedia of the future will continue to blur this distinction, as more and more functionality, including context-awareness, is encapsulated within the medium itself. With this in mind, we will continue by using the terms context-aware multimedia and context-aware multimedia applications interchangeably.

Given this definition, and following analyses in the literature, we infer that there are at least two features that a context-aware multimedia application should provide:

  • assimilation of new context information within the media metadata in order to support later retrieval
  • presentation of multimedia information and services to a user, taking at least some of the discussed aspects of context (user preferences, location, etc.) into account

In order to support the first feature, methods for modeling and querying are required. In order to support the second, tools for authoring and presentation are required. We now briefly discuss the research challenges in these areas:


Modeling

For the sake of completeness, we note one obvious fact: most multimedia content is not inherently self-describing, and therefore metadata must play a central role in any context-aware multimedia application. Of course, metadata is also required to describe other aspects of context, including the computing environment and user preferences (e.g., device and user profiles). In addition to such requirements, context-aware modeling should also incorporate domain knowledge into the document model; that is, one must describe not only the media but also the relationships between media, in order to create documents and/or presentations that are valuable to the user.

Currently there are a few promising standardized metadata frameworks that address the need for machine-processable and context-aware content descriptions: for example, the Multimedia Content Description Interface MPEG-7 [ISO/IEC 15938], and the Resource Description Framework (RDF) and Web Ontology Language (OWL), both standards developed within the W3C Semantic Web Activity.

Another important W3C recommendation, Composite Capability/Preference Profiles (CC/PP), proposes a standard for modeling the computing resource context. Research is already underway to extend this work in the direction of more general context modeling, for example the universal profiling schema (UPS) model, which is built on top of CC/PP and RDF.
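To give a flavor of such profiles, the sketch below mimics the CC/PP convention of grouping attributes into components (hardware platform, software platform, network) using a plain dictionary rather than RDF. The component and attribute names follow common CC/PP usage, but the concrete keys and values are illustrative assumptions, not the normative vocabulary.

```python
# Simplified, dictionary-based stand-in for a CC/PP-style device profile.
profile = {
    "HardwarePlatform": {"displayWidth": 320, "displayHeight": 240,
                         "colorCapable": True},
    "SoftwarePlatform": {"acceptedMediaTypes": ["image/jpeg", "text/html"]},
    "NetworkCharacteristics": {"maxBandwidthKbps": 384},
}

def supports(profile, media_type):
    """Check whether the profiled client accepts a given media type."""
    return media_type in profile["SoftwarePlatform"]["acceptedMediaTypes"]

print(supports(profile, "video/mp4"))  # False
```

A server holding such a profile can decide, before transmission, whether to transcode or drop a media element entirely, which is precisely the computing-resource context discussed above.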

Models based on the standards described above tend to include only concrete, measurable quantities; however, it is far from clear that these are sufficient to model context optimally. More complex models, allowing for example the application of fuzzy logic and Bayesian reasoning, are likely to drive future research efforts. The problem of context modeling may also be approached at a more abstract level.


Retrieval

Multimedia content, consisting principally of unstructured data, is clearly a case for the established methods of information retrieval (IR) and information filtering (IF). The concept of retrieval has also evolved to include not only active retrieval, or searching, but also passive retrieval, or notification. Multimedia retrieval and filtering are broad research topics, and various specific aspects are covered in separate entries of this encyclopedia. We note that most work tends to concentrate on retrieval of a single type of media, be they images, videos, or others, due to the divergent spectrum of techniques required by those types.

Owing to its status as a standard, a number of proposed retrieval frameworks are based on MPEG-7. Recent research is directed towards extensions of such systems that allow for reasoning and for semantic and context-based searches.

Adding context-awareness to the concept of multimedia retrieval results in context-aware retrieval, or CAR, a term proposed by Brown and Jones. They categorize CAR methods as either user-driven, which is essentially equivalent to traditional IR where the query is automatically derived or modified by context information, or author-driven, which is nearly identical to information filtering, with the exception that the filters are attached to the media themselves rather than to the user profiles.
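The user-driven variant can be sketched as a query-augmentation step in front of an ordinary IR engine: the user's keyword query is extended with terms derived from the current context. The context keys and the term mapping below are hypothetical, chosen only to illustrate the idea.

```python
def augment_query(query_terms, context):
    """Extend a keyword query with terms derived from the current context."""
    extra = []
    if "location" in context:
        extra.append(context["location"])      # bias results toward nearby items
    if context.get("time_of_day") == "evening":
        extra.append("indoor")                  # a toy time-of-day heuristic
    return query_terms + extra

q = augment_query(["concert", "jazz"],
                  {"location": "vienna", "time_of_day": "evening"})
print(q)  # ['concert', 'jazz', 'vienna', 'indoor']
```

The augmented query is then submitted to a conventional retrieval engine unchanged, which is why Brown and Jones describe user-driven CAR as essentially traditional IR with an automatically modified query.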

Authoring and Presentation

The main question regarding context-aware multimedia presentation is how content can be flexibly delivered to users, while at the same time accounting for the various types of context information discussed above. There are three possible approaches: 1) provide different documents for different contexts; 2) store single media elements and allow the application to select and merge them into a coherent presentation adapted to the context; or 3) employ a flexible document model that inherently includes context information with the multimedia. Even if the number of possible end devices could be enumerated, the first approach is clearly insufficiently dynamic to account for more rapidly changing contextual characteristics such as user activity or environmental factors, leaving the second and third approaches as viable strategies.

Bulterman and Hardman provide a recent (as of this writing) overview of multimedia presentation authoring paradigms and tools, a rich topic that is beyond the scope of this article. They conclude that the future of multimedia authoring lies in two non-exclusive possibilities: first, advances in the authoring interfaces that will enable more efficient and elegant authoring of multimedia presentations, and second, the development of methods for automatic presentation creation, based on user requests. These possibilities correspond to the second (presentation-based) and third (application-based) approaches mentioned in the previous paragraph.

In the first case, the transformation of content into a context-sensitive presentation is often referred to as adaptation. Adaptation occurs in at least two phases: first, in the selection of multimedia material based on user preferences, and second, in the further adjustment of the selected material based on the computing environment context. Which display device is currently in use and how much network bandwidth is available are primary filters for determining what media will be sent and in what format. The groundwork for much of today’s research in this area was established in the field of adaptive hypermedia. More recent work takes a very general approach that integrates the consideration of bandwidth, latency, rendering time, quality of service, and utility-cost ratios.
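The two adaptation phases can be sketched as follows: a preference-based selection pass, then a variant choice constrained by the computing context. The media catalogue, topic tags, variant bitrates, and bandwidth threshold are all made up for illustration.

```python
# Toy catalogue: each item has a topic and variants with bitrates in kbps.
MEDIA = [
    {"id": "clip1", "topic": "jazz", "variants": {"hi": 2000, "lo": 300}},
    {"id": "img7",  "topic": "rock", "variants": {"hi": 500,  "lo": 50}},
]

def adapt(media, preferences, bandwidth_kbps):
    # Phase 1: selection based on user preferences (user context).
    selected = [m for m in media if m["topic"] in preferences]
    # Phase 2: adjust to the computing context by picking a fitting variant.
    plan = []
    for m in selected:
        variant = "hi" if m["variants"]["hi"] <= bandwidth_kbps else "lo"
        plan.append((m["id"], variant))
    return plan

print(adapt(MEDIA, {"jazz"}, 384))  # [('clip1', 'lo')]
```

On a well-connected workstation the same call with a higher bandwidth figure would select the high-quality variant, illustrating how one content base serves many computing contexts.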

In the second case, that of context-aware multimedia applications, the question of presentation authoring is eliminated. Instead of tools for authoring multimedia presentations, one requires tools for authoring context-aware applications. A number of frameworks and middleware systems have been reported in the literature over the past decade:

One of the earlier approaches, the Context Toolkit, provides “context widgets” – encapsulated software components that give applications access to context information from their operating environment. The QoSDREAM middleware integrates distributed sensor location data with a service to configure and manage real-time multimedia streams, with an emphasis, as the name suggests, on quality of service. The M-Studio authoring tool was produced to help mobile story creators design, simulate, and adjust mobile narratives. The CAPNET architecture provides a similar tool, combining aspects of both mobile and ubiquitous computing, including a context-based storage system (CBS) as the underlying persistence mechanism.

And as previously mentioned, more recent frameworks take context-aware multimedia to the next level by integrating inference mechanisms and Bayesian reasoning.

A final area of research related to authoring is contextual-metadata capture: including as much metadata as possible during the media production process. Metadata and semantic information should be captured in the earliest phases of production, when the producers who can best describe and disambiguate the content are still available. Without this early integration, valuable contextual information can be lost. Given the ever-growing importance of metadata, which is presently hindered by the expense (human and computational) of producing it, it is clear that this concept of “conservation of meta-information” will play an essential role in the future of multimedia content production.


Summary

This article introduces the concept of context-aware multimedia by first defining the terms context and context-awareness. Based on these definitions, four key aspects are addressed: modeling, retrieval, authoring, and presentation. Providing for context-awareness in multimedia applications requires reconsidering the approaches to modeling, retrieving, authoring, and presenting multimedia content, due to the interdependencies with the concepts of context and context-awareness.


Acknowledgments

We would like to acknowledge contributions to the series of articles linked to this article from our group members (in alphabetical order): Bernhard Haslhofer, Wolfgang Jochum, Bernhard Schandl, Karin Schellner, Maia Zaharieva, and Sonja Zillner.

