Peer-to-Peer Streaming - Streaming P2P Architectures, Streaming Process, Peer-to-Peer System Operation

Roger Zimmermann, Leslie S. Liu
Computer Science Department
University of Southern California, Los Angeles, USA

Definition: Peer-to-peer (P2P) architectures for multimedia streaming have emerged in recent years and can eliminate the need for the costly dedicated video servers of the traditional client-server approach.

The basic concept of peer-to-peer (P2P) computing is not new, and some techniques date back to when the Internet was first designed. However, the key phrase “peer-to-peer” became widely and publicly recognized mostly after the pioneering Napster file sharing network emerged in the late 1990s. Peer-to-peer is a very general term and people associate different concepts with it. Various forms of P2P techniques have been used in the fields of computing, networking, distributed file systems, and others. In this chapter we focus on how P2P techniques are being used for streaming media distribution.

P2P systems have some key characteristics that distinguish them from the traditional and widely used client-server model. The most prominent feature is that a P2P system is composed of a number of member nodes, each of which combines the functionality that is traditionally associated with both the server and the client. As such, multiple P2P nodes can form a collective that aggregates their resources and functionality into a distributed system. Node A may act as a client to Node B while at the same time functioning as a server to Node C. Beyond this fundamental characteristic, there are a number of features that are often associated with P2P systems. Note, however, that usually only a subset of the following characteristics holds true for any practical system.

  • Reduced central control. Many P2P systems work in a fully decentralized fashion where all the nodes have equal functionality. The members are connected based on a system-specific construction policy and form a distributed topology. Exceptions to this model exist. For example, the original Napster file sharing network used a centralized index to locate files; subsequently the data was exchanged directly between individual peers.
  • Heterogeneity. Members of a P2P system are usually heterogeneous in terms of their computing and storage capacity, network bandwidth, etc. A system may include high performance nodes on a university network and computers owned by residential users with broadband or modem connections.
  • Flat topology. Members of the P2P network are often treated equally, which results in a flat connection topology. However, hierarchical systems exist that introduce the concept of “super-peers.”
  • Autonomy. The time and resources that a member node can or will contribute to the system are dynamic and unpredictable. Often, nodes are under different administrative control. Hence the enforcement of global policies is a challenge.
  • Fault resilience. P2P members may join or leave the topology at any time. Therefore, not only is the formed community very dynamic, but no assumptions should be made about the availability of resources or network paths. A P2P system must be able to recover from the unexpected and ungraceful departure of any of its members at any time.

Members of a P2P system are also referred to as nodes because they are often represented as network nodes in a topology graph.

Streaming P2P Architectures

Streaming is the process of generating and delivering a steady, isochronous flow of data packets over a networking medium, e.g., the Internet, from a source to a destination. The rendering of the content starts as soon as a small fraction of the data stream has been received. Streaming media usually denotes digital audio and video data; however, haptic or other data may be streamed as well. One of the main resource bottlenecks that afflicts large client-server distribution architectures is the massive bandwidth that must be available from the server into the core of the network. This network connection is often very costly (compared to the server and client hardware) and may render a technically feasible solution economically unviable. Peer-to-peer streaming is an alternative that alleviates the bandwidth cost problem by offering a service to deliver continuous media streams directly between peer nodes. However, the previously listed characteristics of P2P systems influence the design of such decentralized streaming solutions.
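The bandwidth argument can be made concrete with a small back-of-the-envelope sketch. The audience size, stream bit rate and source fan-out below are illustrative assumptions, not figures from any real deployment, and the function names are hypothetical.

```python
def server_uplink_mbps(viewers: int, stream_mbps: float) -> float:
    """Client-server: the server sends one copy of the stream to every viewer."""
    return viewers * stream_mbps


def p2p_source_uplink_mbps(stream_mbps: float, source_fanout: int) -> float:
    """P2P: the source feeds only `source_fanout` peers, which forward the stream."""
    return source_fanout * stream_mbps


viewers = 10_000      # assumed audience size
stream_mbps = 2.0     # assumed stream bit rate in Mbit/s
source_fanout = 4     # assumed number of peers served directly by the source

print(server_uplink_mbps(viewers, stream_mbps))            # uplink needed at a central server
print(p2p_source_uplink_mbps(stream_mbps, source_fanout))  # uplink needed at a P2P source
```

Under these assumptions the remaining load does not disappear; it is spread across the uplinks of the forwarding peers, which is why peer heterogeneity and autonomy matter for the design.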

Theoretically, a P2P architecture can be built over any networking medium and at potentially different layers of the network. However, most of the existing P2P implementations and their associated research have focused on application-level overlay networks. The Internet, as the dominant networking medium for research, business and entertainment, is also the preferred choice for P2P network substrates.

One of the virtues of today’s P2P systems is their scalable nature. Peer-to-peer technologies were first widely used and accepted as file-sharing platforms in systems such as Napster, Gnutella and KaZaA. Subsequently, the P2P architecture evolved and was adapted for store-and-forward streaming. Examples of streaming systems that may be used to distribute previously stored content are Narada, HMTP, and Pastry. One distinguishing characteristic among these proposals is the shape of the streaming topology they construct, which will be described later in this chapter. Even though these designs promise good performance in terms of network link stress and control overhead, only a few of them have been implemented in real systems. Next, P2P technology was adapted for live streaming. In this scenario, media streams are generated by live sources (e.g., cameras and microphones) and the data is forwarded to other nodes in real-time. We distinguish two types of live streaming: one-way and two-way. The requirements for the two are quite different and more details follow below.

Streaming Process

A streaming process can be separated into three stages that overlap in time (Figure 1): data acquisition, data delivery and data presentation. Data acquisition is the stage that determines how the streaming content is acquired, packetized and distributed for streaming. Data delivery is the stage in which the stream data is transported from the source to the destination. Data presentation is the stage in which the received data is buffered, assembled and rendered. The source, the destination and all the intermediate nodes in a streaming system participate in a topology that is constructed based on the specific system’s protocol. In a P2P streaming system, this network architecture exhibits peer-to-peer characteristics.

Data Acquisition and Presentation

At the streaming source, the content is prepared for distribution. If the data was prerecorded and is available as files, we categorize this as on-demand streaming (Figure 2). On the other hand, if the data is acquired in real time from a device, we term this live streaming (Figure 3). Content for on-demand streaming is prerecorded and made available at source nodes, usually long before the first delivery requests are initiated. This prerecorded content can be distributed onto one or multiple source nodes. Compared with a live streaming system, on-demand streaming can usually utilize a more sophisticated distribution process, which may mean encoding the content into a processing-intensive, high-quality format and pre-loading it onto multiple source nodes. The efficiency and scalability of on-demand streaming are improved by caching copies of the content at the intermediate peers. With this approach, popular content is automatically replicated many times within the network and a streaming request can often be satisfied by peers in close proximity.

One-way live applications have requirements similar to those of their on-demand cousins. One obvious difference is that the source data is generated in real time by a source device such as a camera, a microphone or some other sensor. One application is the broadcasting of live events such as sports games. Data may be cached for later on-demand viewing. Two-way live applications have very different requirements. Here, a low end-to-end latency is crucial to enable interactive communications. Note that P2P topologies have a disadvantage in terms of minimizing the latency among participants because application-level processing is often required at every node. Skype was probably the first successful Internet telephony system built on a P2P streaming architecture. It demonstrated that the latency problem can be solved and that P2P technology, with its many advantages, can indeed be used for live streaming purposes. AudioPeer, which is built on top of the ACTIVE architecture, is another multiparty audio conferencing tool. It is designed specifically for large user groups. Its design distinguishes active users from passive users and provides low-latency audio service to active users.

Data Delivery

The transfer of one or multiple copies of the content from a source node to a destination node is called a streaming session. A streaming session starts when a streaming request is made and ends when all associated destination nodes have received the last byte of the content. Depending on the number of source and destination nodes involved in a streaming session, we can distinguish three types of streaming systems: one-to-many, many-to-one and many-to-many (see Figures 4, 5, 6). All three types apply to both live and on-demand streaming. One-to-many streaming is also called broadcasting. It delivers content from a single source to multiple destination nodes. Much research has focused on how to make the delivery process fast and efficient for one-to-many streaming. P2P systems naturally produce a multicast distribution tree since any peer that receives a stream can forward it to multiple other nodes. Many-to-one streaming delivers data from multiple sources to a single destination. A good example is an on-demand movie viewer who simultaneously downloads fragments of the movie clip from multiple peers. Many-to-many streaming combines the features of the previous two designs and usually requires a more complicated delivery network, which we will discuss in detail in the following sections.
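The many-to-one case can be illustrated with a minimal sketch in which a viewer splits a clip into numbered fragments and requests them from several peers. The round-robin policy and the function names are assumptions made for illustration; real systems typically also weigh peer bandwidth and fragment deadlines.

```python
from collections import defaultdict


def assign_fragments(num_fragments: int, peers: list[str]) -> dict[str, list[int]]:
    """Spread fragment indices over the source peers in round-robin order."""
    if not peers:
        raise ValueError("at least one source peer is required")
    plan: dict[str, list[int]] = defaultdict(list)
    for index in range(num_fragments):
        plan[peers[index % len(peers)]].append(index)
    return dict(plan)


def reassemble(received: dict[int, bytes]) -> bytes:
    """Join fragments in playback order, regardless of their arrival order."""
    return b"".join(received[index] for index in sorted(received))


plan = assign_fragments(5, ["peer-a", "peer-b"])
clip = reassemble({1: b"world", 0: b"hello "})
```

Because fragments arrive from different peers at different times, the destination has to reorder them before presentation, which is what `reassemble` stands in for here.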

The P2P network architecture is the topology that describes how the nodes are interconnected in a P2P system. The P2P streaming architecture is the data path over which the streaming content is delivered from source to destination nodes. For a P2P streaming system, the network architecture is not necessarily the same as the streaming architecture. For example, Pastry is a P2P network protocol that constructs a ring-shaped network architecture, and Scribe is the streaming architecture built on top of Pastry. For most P2P systems, however, these two architectures are identical and can be represented in a single topology graph.

P2P streaming topologies, including the network architecture and the streaming architecture, can be categorized into four types: tree, mesh, ring and hybrid (see Figure 7). Tree structures start with a root node and add new nodes in a parent/children fashion. Many systems are built as tree topologies, e.g., AudioPeer, Yoid and HMTP. A mesh-based topology builds a full interconnect from each node to every other node and constructs a fully-connected map. For example, Narada builds a mesh structure among all the peers and then for each peer constructs a single-source multicast tree from the mesh structure. Due to its centralized nature, Narada does not scale well. A ring-shaped topology links every node in the graph sequentially. This is usually done by assigning each node a unique node ID, which is generated by specific algorithms such as a distributed hash table (DHT). Finally, a hybrid approach combines two or more of the previous designs into their topology graph. Hybrid systems are usually divided into multiple hierarchical layers and different topologies are built at each layer. For example, NICE was developed as a hierarchical architecture that combines nodes into clusters. It then selects representative parents among these clusters to form the next higher level of clusters, which then is represented as a tree topology.

Peer-to-Peer System Operation

From the perspective of a peer, the life-cycle of a P2P streaming session can be decomposed into a series of four major processes: finding the service, searching for specific content, joining or leaving the service, and failure recovery when there is an error.

Service Discovery and Content Search

In most P2P systems, service discovery is accomplished through a bootstrap mechanism that allows new nodes to join the P2P substrate. It may be accomplished through dedicated “super-peers” that act as well-known servers and help new peers find other member nodes. These “super-peers” are called Rendezvous Point (RP) servers and are sometimes under the control of the administrator of the P2P system. A new peer locates a running RP server from its pre-loaded RP server list. The list can be updated once a peer is connected to one of the RP servers. RP servers can also be used to collect statistical data, and in some systems these “super-peers” are connected to form a backbone streaming platform that makes the system more stable.
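The bootstrap step can be sketched as a loop over the pre-loaded RP server list. The `contact` callback, its return shape and the example host names are hypothetical; the network call is replaced by a stand-in function so that the sketch runs without a network.

```python
from typing import Callable

# contact(address) is assumed to return (member_peers, updated_rp_list)
# or to raise ConnectionError when the RP server cannot be reached.
Contact = Callable[[str], tuple[list[str], list[str]]]


def bootstrap(rp_list: list[str], contact: Contact) -> tuple[list[str], list[str]]:
    """Try each known RP server in turn and return the first successful answer."""
    for address in rp_list:
        try:
            members, updated_list = contact(address)
        except ConnectionError:
            continue  # this RP server is down; try the next one
        # Keep the old list if the server did not send an updated one.
        return members, (updated_list or rp_list)
    raise RuntimeError("no rendezvous point server reachable")


def fake_contact(address: str) -> tuple[list[str], list[str]]:
    """Stand-in for a real network request (illustration only)."""
    if address == "rp1.example.org":
        raise ConnectionError("unreachable")
    return ["peer-a", "peer-b"], ["rp2.example.org", "rp3.example.org"]


members, rp_list = bootstrap(["rp1.example.org", "rp2.example.org"], fake_contact)
```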

The next step for a peer, after joining the collective, is to locate a stream or session. The availability of specific content can be discovered in two distinct ways. In an unstructured design, streams and files are located by flooding the P2P network with search messages. This technique is wasteful and may result in significant network traffic. The second approach, called structured, is to index the content such that search messages can be forwarded efficiently to specific nodes that have a high probability of managing the desired content. In keeping with the distributed theme of P2P systems, indexing is often achieved by hashing a content key and assigning that key to nodes with a distributed hash table (DHT) mechanism.

JOIN: After retrieving the necessary information from the RP server, or gaining enough information from the P2P system through a method such as flood-based search, a new peer can join an existing session by establishing the necessary connections to peers that have already joined. After the join operation is done, a peer is considered to be a legitimate member.

LEAVE: Every member of a P2P system is usually also serving some other peers as part of the duty to share the load of the whole system. An unexpected departure of a peer can cause disruptions or loss of service for other peers in the system. Ideally a peer should help to reconcile the disconnect in the streaming network caused by its departure. If a system protocol is well designed, this process can be very fast and almost unnoticeable to the end user application.

RECOVERY: In the dynamic environment of a P2P system where peers are under different administrative control, the unexpected departure of peers is unavoidable. A P2P streaming system must cope with these failures and include a robust and efficient recovery mechanism to repair the streaming topology. On the positive side, since a robust recovery mechanism is an integral part of the design, P2P systems are naturally very tolerant of faults.

 
