
Multimedia Conferencing Standards


Multimedia conferencing, or video conferencing, is an important application of real-time media. In video conferencing, people at different sites are brought together for a meeting by transmitting real-time audio, video and collaboration data over communication channels, as illustrated in Figure 1. Video conferencing is widely used in telecommuting, remote collaboration, telemedicine, distance learning, career services and so on. As the technology develops, video conferencing is expected to have a broader impact and a large potential market.

As with any advanced communication tool, interconnectivity and interoperability require that the video conferencing devices involved in the same conference can talk to each other, i.e., that they comply with a common standard. A video conferencing standard is an umbrella set of standards because it has to specify not only audio and video coding but also call control, conference management, and media packetization and delivery. There are two major categories of video conferencing standards: the H.32x series from the telecommunication world, standardized by the ITU (International Telecommunication Union), and the SIP (Session Initiation Protocol) based video conferencing standards from the Internet world, recommended by the IETF (Internet Engineering Task Force).

ITU-T H.32x video conferencing standards

H.320 [2], the first international standard for video conferencing, was released by the ITU in the early 1990s. The standard was designed for narrowband switched ISDN (Integrated Services Digital Network). Variants for other network infrastructures were standardized in the mid and late 1990s, including H.321 [3] (for broadband ISDN), H.322 [4] (for guaranteed-bandwidth packet-switched networks), H.324 [6] (for the Public Switched Telephone Network) and H.323 [5] (for non-guaranteed-bandwidth packet-switched networks). Owing to the dominance of IP networks, H.323 has become the most popular video conferencing standard in recent years. The H.323 standard has been updated to version 5 [7] to improve reliability, scalability, flexibility and extensibility. We use H.323 as an example to illustrate the H.32x video conferencing standards.

H.323 defines four major components for a network-based communication system: Terminals, Gateways, Gatekeepers, and Multipoint Control Units (MCUs), as shown in Figure 2. Terminals are the client endpoints that provide real-time, two-way communications. A Gatekeeper is the most important component of an H.323-enabled network. It acts as the central point for all calls within its zone and provides call control services to registered endpoints. Gateways, which are optional in H.323, translate between H.323 conferencing endpoints and other terminal types such as H.324 and H.320. The MCU supports conferences between three or more endpoints.

Figure 3 describes the protocol stack of H.323 systems. All terminals must support voice communications; video and data are optional. H.323 specifies the modes of operation required for different audio, video, and/or data terminals to work together. All H.323 terminals must also support H.245, which is used to negotiate channel usage and capabilities. Three other components are required: Q.931 for call signaling and call setup; Registration/Admission/Status (RAS), a protocol used to communicate with a Gatekeeper; and RTP/RTCP for sequencing audio and video packets. Optional components in an H.323 terminal are video codecs, T.120 data conferencing protocols, and MCU capabilities. The real-time media data (audio and video) is carried over RTP (Real-time Transport Protocol)/RTCP (RTP Control Protocol), which is discussed further in the following section.
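To make the role of RTP in sequencing media packets concrete, the following is a minimal sketch in Python that packs and parses the 12-byte fixed RTP header (version, marker, payload type, sequence number, timestamp, SSRC). The function and class names are hypothetical and chosen for illustration only; header extensions, CSRC lists and sequence-number wraparound are deliberately ignored.

```python
import struct
from typing import NamedTuple


class RtpHeader(NamedTuple):
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int


def pack_rtp_header(payload_type, sequence_number, timestamp, ssrc, marker=False):
    """Build the 12-byte fixed header: version 2, no padding, no extension, no CSRCs."""
    byte0 = 2 << 6  # V=2, P=0, X=0, CC=0
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    return struct.pack("!BBHII", byte0, byte1,
                       sequence_number & 0xFFFF,
                       timestamp & 0xFFFFFFFF,
                       ssrc & 0xFFFFFFFF)


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Decode the fixed header from the first 12 bytes of an RTP packet."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the 12-byte fixed header")
    byte0, byte1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    return RtpHeader(
        version=byte0 >> 6,
        padding=bool(byte0 & 0x20),
        extension=bool(byte0 & 0x10),
        csrc_count=byte0 & 0x0F,
        marker=bool(byte1 & 0x80),
        payload_type=byte1 & 0x7F,
        sequence_number=seq,
        timestamp=ts,
        ssrc=ssrc,
    )


def reorder(packets):
    """Restore sending order by sequence number (wraparound ignored for brevity)."""
    return sorted(packets, key=lambda p: parse_rtp_header(p).sequence_number)
```

A receiver would typically use the sequence number to detect loss and reordering, and the timestamp to schedule playout; `reorder` shows only the first of these in its simplest form.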

SIP based video conferencing standards

In parallel with the ITU, the IETF has also released standards for multimedia teleconferencing (MMTC) over IP networks, including the Session Initiation Protocol (SIP) [8] and the Session Announcement Protocol (SAP) [10]. Developed by the IETF Multiparty Multimedia Session Control Working Group (MMUSIC WG) to support Internet teleconferencing and multimedia communications, SIP is a lightweight, text-based signaling protocol used for establishing sessions in an IP network. SIP deals generically with sessions, which can include audio, video, chat, interactive games, and virtual reality. The sessions are described using a separate protocol called the Session Description Protocol (SDP) [9]. SDP is transported in the message body of a SIP message. SIP is an application-independent protocol; it simply initiates, terminates and modifies sessions without knowing any details of the sessions. This simplicity reflects the fact that SIP was designed at the outset to be extremely flexible, scalable and extensible. SIP is a request-response protocol that closely resembles two other Internet protocols, HTTP and SMTP (the protocols that power the World Wide Web and email). Consequently, with SIP, conferencing becomes another web application and can be integrated easily into other Internet services. In addition, IP streaming techniques are used for media delivery, and IP multicast is adopted as the foundation for building bandwidth-efficient and scalable multipoint-to-multipoint communication applications over IP. SAP is the protocol used to announce MMTC sessions. Figure 4 shows the protocol stack used in SIP based video conferencing systems. As in H.323, audio and video data is carried on RTP/RTCP, but call management is done using SIP.
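As an illustration of how SDP travels in the body of a SIP message, the sketch below assembles a text INVITE request carrying an SDP description with one audio and one video stream. It is a minimal example under stated assumptions: the helper names, addresses, ports, tag and branch values are all hypothetical, and a real user agent would also handle transactions, retransmission and authentication.

```python
def build_sdp(session_ip, audio_port, video_port):
    """Describe one audio (PCMU) and one video (H.263) stream carried over RTP."""
    lines = [
        "v=0",
        f"o=alice 2890844526 2890844526 IN IP4 {session_ip}",
        "s=Video conference",
        f"c=IN IP4 {session_ip}",
        "t=0 0",
        f"m=audio {audio_port} RTP/AVP 0",
        "a=rtpmap:0 PCMU/8000",
        f"m=video {video_port} RTP/AVP 34",
        "a=rtpmap:34 H263/90000",
    ]
    return "\r\n".join(lines) + "\r\n"


def build_invite(caller, callee, via_host, call_id, sdp):
    """Assemble a SIP INVITE whose message body is the given SDP text."""
    body_length = len(sdp.encode("utf-8"))
    head = [
        f"INVITE sip:{callee} SIP/2.0",
        f"Via: SIP/2.0/UDP {via_host};branch=z9hG4bK776asdhds",  # illustrative branch
        "Max-Forwards: 70",
        f"To: <sip:{callee}>",
        f"From: <sip:{caller}>;tag=1928301774",  # illustrative tag
        f"Call-ID: {call_id}",
        "CSeq: 1 INVITE",
        f"Contact: <sip:{caller}>",
        "Content-Type: application/sdp",
        f"Content-Length: {body_length}",
    ]
    # A blank line separates the headers from the body, as in HTTP.
    return "\r\n".join(head) + "\r\n\r\n" + sdp


if __name__ == "__main__":
    sdp = build_sdp("192.0.2.10", 49170, 51372)
    print(build_invite("alice@example.com", "bob@example.org",
                       "pc33.example.com", "a84b4c76e66710", sdp))
```

SIP itself only carries the description; the RTP ports and codecs named in the `m=` and `a=rtpmap` lines are what the endpoints then use for the media streams.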

H.323 vs. SIP

H.323 originated in the telecommunication world and its protocols use binary encoding, while SIP is an HTTP-like, text-based protocol. According to principles drawn from the Internet community, a text-based protocol is easy to extend, process and debug. SIP has lower complexity, richer extensibility and better scalability according to the early comparison done by the authors of SIP [11], although this comparison is outdated now that H.323 has been updated to version 5. On the other hand, H.323 has better interoperability with legacy PSTN networks and other H.32x conferencing systems. H.323 based products have been widely deployed. More than 90 percent of VoIP (Voice over IP) traffic is carried using H.323, and it is supported on 80 percent of new videoconferencing systems, according to the H.323 Forum [12]. SIP supports instant messaging and can be easily integrated with other web-based applications. SIP was adopted by 3GPP (the 3rd Generation Partnership Project), and more and more SIP based conferencing products have been released, such as Cisco’s IP phone and Microsoft’s XP messenger. Which of the two will gain more popularity in the future remains a topic of debate, but it is worth noting that both H.323 and SIP are improving by learning from each other, and the differences between them are decreasing with each new version. For example, in version 5 of H.323, SIP is adopted as a supported protocol, and a gateway can advertise itself to a Gatekeeper as both an H.323 gateway and a SIP gateway.
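The claim that a text-based protocol is easy to process and debug can be illustrated with a few lines of Python. The sketch below splits a SIP message into its start line, headers and body using plain string operations; the function name and the sample response are hypothetical, and header folding and repeated headers are not handled.

```python
def parse_sip_message(raw: str):
    """Split a SIP message into (start_line, headers, body)."""
    head, _, body = raw.partition("\r\n\r\n")
    start_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        # Split on the first colon only, so values such as "host:5060" stay intact.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


if __name__ == "__main__":
    sample = (
        "SIP/2.0 200 OK\r\n"
        "Via: SIP/2.0/UDP pc33.example.com:5060\r\n"
        "Call-ID: a84b4c76e66710\r\n"
        "CSeq: 1 INVITE\r\n"
        "Content-Length: 0\r\n"
        "\r\n"
    )
    start_line, headers, body = parse_sip_message(sample)
    print(start_line)
    print(headers["cseq"])
```

A binary-encoded H.323 message, by contrast, generally needs a dedicated decoder before it can be inspected, which is the practical difference the comparison above refers to.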

 
