# Multimedia Authentication

**Jiying Zhao and Abdulmotaleb El Saddik
School of Information Technology and Engineering
University of Ottawa, Ontario, Canada**

**Definition:** Multimedia authentication deals with confirming the genuineness or truth of the structure and/or content of multimedia.

Multimedia signals can be easily reproduced and manipulated. Even when we cannot perceive any change, what we are seeing or listening to may have been altered maliciously. Multimedia authentication confirms the genuineness or truth of the structure and/or content of multimedia. It answers the following questions: 1) Is the multimedia signal from its alleged source? 2) Has it been changed in any way? 3) If it has been changed, where and to what degree? There are mainly two approaches that can answer these questions. The first is cryptography; the second is digital watermarking. In addition, cryptography can be integrated into digital watermarking to provide more desirable authentication. It is worth mentioning that multimedia authentication is different from user authentication.

## Authentication using cryptography

When a multimedia signal is treated as ordinary text (a sequence of bytes), cryptographic techniques and hash functions can be used to authenticate multimedia content. There are three common authentication modes:

1) The sender uses a secure one-way hash function to generate a hash value *H*(*m*) and appends it to the plain message *m*, without further encryption, to form *m* + *H*(*m*) for sending. The receiver calculates the hash value *H*(*m′*) of the received, possibly attacked message *m′* and compares *H*(*m′*) with the received, possibly damaged hash value (*H*(*m*))′. If *H*(*m′*) and (*H*(*m*))′ are the same, the received message *m′* is authentic (*m′* = *m*); otherwise it is not.

2) The sender uses a one-way hash function (secure or not) to generate a hash value *H*(*m*), encrypts the hash value, and appends it to the plain message to form *m* + *E*(*H*(*m*)) for sending. The receiver decrypts the received encrypted hash value to obtain the possibly damaged hash value (*H*(*m*))′, calculates the hash value *H*(*m′*) of the received, possibly attacked message *m′*, and compares the calculated hash value *H*(*m′*) with the decrypted hash value (*H*(*m*))′. If *H*(*m′*) and (*H*(*m*))′ are the same, the received message *m′* is authentic (*m′* = *m*); otherwise it is not.

3) The sender uses a one-way hash function (secure or not) to generate a hash value *H*(*m*), appends it to the plain message, and encrypts the resulting combination of plain message and hash value to form *E*(*m* + *H*(*m*)) for sending. The receiver decrypts the received combination to obtain the possibly attacked or damaged combination *m′* + (*H*(*m*))′, calculates the hash value *H*(*m′*), and compares the calculated hash value *H*(*m′*) with the received hash value (*H*(*m*))′. If *H*(*m′*) and (*H*(*m*))′ are the same, the received message *m′* is authentic (*m′* = *m*); otherwise it is not.

In the above three modes, + stands for the appending (concatenation) operation, *m′* stands for the possibly attacked version of the transmitted message *m*, and (*H*(*m*))′ stands for the possibly damaged version of *H*(*m*).
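The first mode can be sketched in a few lines of Python. This is only an illustration under stated assumptions: SHA-256 stands in for the secure one-way hash function, and the names `send_mode1` and `verify_mode1` are hypothetical helpers, not part of any standard.

```python
import hashlib
import hmac

DIGEST_LEN = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def send_mode1(m: bytes) -> bytes:
    """Form m + H(m): the plain message followed by its hash value."""
    return m + hashlib.sha256(m).digest()


def verify_mode1(received: bytes) -> bool:
    """Split the received bytes into m' and (H(m))', then compare H(m') with (H(m))'."""
    m_prime, h_received = received[:-DIGEST_LEN], received[-DIGEST_LEN:]
    h_computed = hashlib.sha256(m_prime).digest()
    return hmac.compare_digest(h_computed, h_received)


packet = send_mode1(b"frame 0001 pixel data")
tampered = b"frame 0002" + packet[10:]  # message changed, hash left as sent

print(verify_mode1(packet))    # untouched packet
print(verify_mode1(tampered))  # altered message with the old hash
```

Modes 2 and 3 follow the same comparison step; they differ only in whether the hash value alone or the whole combination is encrypted before sending.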

In the above authentication modes, both cryptography and hash functions are indispensable. In the following, we introduce symmetric-key cryptography (sometimes called private-key cryptography), asymmetric-key cryptography (sometimes called public-key cryptography), and secure one-way hash functions.

## Symmetric-key cryptography

In symmetric-key cryptography, two parties share a single cipher key, *K*. There is an encryption function, *E*(·), and a decryption function, *D*(·). A plaintext, *m*, is converted to encrypted ciphertext, *m_c*, by *m_c* = *E*(*m*, *K*). Similarly, the ciphertext, *m_c*, is decrypted back to the plaintext *m* by *m* = *D*(*m_c*, *K*). The two most important symmetric algorithms are the Data Encryption Standard (DES) and the Advanced Encryption Standard (AES). See the short article: Data Encryption Standard (DES) and Advanced Encryption Standard (AES).

In addition to encryption, a symmetric-key cryptographic algorithm can be used for the three authentication modes mentioned above. It can authenticate both the multimedia content and its source. Assuming that only the sender and receiver share the secret key *K*, only the genuine sender would be able to successfully encrypt a message for the receiver. If the message includes a hash value, *H*(*m*), the receiver is assured that no alterations have been made if the recalculated hash value is equal to the sent hash value.

Symmetric cryptography is more efficient than asymmetric cryptography in a session between two peers, but it does not scale well when more parties are involved.
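The data flow of the third mode, *E*(*m* + *H*(*m*)) with a shared key *K*, can be sketched as follows. The cipher here is an assumption made for illustration only: a toy XOR keystream derived from SHA-256 stands in for DES or AES, which are not in the Python standard library, and it must not be used to protect real data. The helper names `keystream`, `send`, and `receive` are hypothetical.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key (toy stand-in for DES/AES)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def E(m: bytes, K: bytes) -> bytes:
    """m_c = E(m, K): XOR the plaintext with the key-derived stream."""
    return bytes(a ^ b for a, b in zip(m, keystream(K, len(m))))


def D(m_c: bytes, K: bytes) -> bytes:
    """m = D(m_c, K): XOR is its own inverse, so decryption repeats encryption."""
    return E(m_c, K)


def send(m: bytes, K: bytes) -> bytes:
    """Mode 3: encrypt the message together with its hash value."""
    return E(m + hashlib.sha256(m).digest(), K)


def receive(m_c: bytes, K: bytes):
    """Decrypt, split into m' and (H(m))', and compare with the recomputed H(m')."""
    plain = D(m_c, K)
    m_prime, h_received = plain[:-32], plain[-32:]
    return m_prime, hashlib.sha256(m_prime).digest() == h_received


K = b"shared secret key"
ciphertext = send(b"audio block 17", K)
message, authentic = receive(ciphertext, K)

# Flip one bit of the ciphertext in transit.
damaged = bytes([ciphertext[0] ^ 0x01]) + ciphertext[1:]
_, damaged_ok = receive(damaged, K)
```

A receiver holding a different key also fails the comparison, which is what ties the check to the source as well as to the content.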

## Hash functions

A hash function maps a variable-length string into a fixed-length hash value. The simplest hash functions are error detection codes such as a parity check, a checksum, or a Cyclic Redundancy Code (CRC). However, these error detection codes cannot detect malicious alterations, since an attacker can change the content in a way that does not change the error detection result. For example, the attacker can make the alteration divisible by the CRC generator polynomial in order to defeat the CRC. Therefore, these insecure error detection codes should be used together with a symmetric-key or asymmetric-key cryptographic algorithm to authenticate multimedia content. Without cryptography, a secure hash function is needed to defeat malicious attacks.
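The CRC weakness can be made concrete with a small sketch. It assumes a plain CRC (no initial value and no final XOR) with the 8-bit generator polynomial `0x107`, chosen here purely for illustration; the message is treated as one large polynomial over GF(2), and the attacker XORs in a multiple of the generator.

```python
def gf2_mod(value: int, poly: int) -> int:
    """Remainder of value divided by poly, both read as polynomials over GF(2)."""
    plen = poly.bit_length()
    while value.bit_length() >= plen:
        value ^= poly << (value.bit_length() - plen)
    return value


POLY = 0x107  # x^8 + x^2 + x + 1 (illustrative generator polynomial)


def crc8(data: bytes) -> int:
    """Plain CRC-8: remainder of (message * x^8) divided by the generator."""
    return gf2_mod(int.from_bytes(data, "big") << 8, POLY)


original = b"PAY 100"

# The alteration is a multiple of the generator, so it adds nothing to the remainder.
delta = POLY << 16
forged = (int.from_bytes(original, "big") ^ delta).to_bytes(len(original), "big")

print(forged != original)               # the content really changed
print(crc8(forged) == crc8(original))   # yet the CRC is identical
```

A secure hash function offers no comparable algebraic shortcut, which is why it is preferred when the change may be deliberate rather than accidental.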

A secure hash function accepts a variable-size message *m* as input and produces a fixed-size message digest *H*(*m*) as output. A secret key can be appended to the end of the message to generate a more secure hash value. A one-way hash function is a hash function that is reasonably cheap to calculate but prohibitively expensive to invert. Common one-way secure hash functions are the MD5 Message-Digest Algorithm and the Secure Hash Algorithm (SHA). MD5 produces a 128-bit digest. SHA is considered to be the successor to MD5. SHA has four members: SHA-1, SHA-256, SHA-384, and SHA-512, which produce 160-, 256-, 384-, and 512-bit digests, respectively. The longer the digest a hash function produces, the more difficult it is to break, and therefore the more secure it is. SHA is considered more secure than MD5 because it can produce longer digests. See the short article: MD5 Message-Digest Algorithm and Secure Hash Algorithm (SHA).

Hash functions are indispensable in cryptography-based multimedia content authentication, and they are also used in watermarking-based authentication algorithms.
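As a quick check of the fixed-size property, the following sketch uses Python's standard `hashlib` module to report the digest length of each function named above. The sample messages are arbitrary.

```python
import hashlib

short_message = b"x"
long_message = b"multimedia content as a sequence of bytes" * 1000

names = ("md5", "sha1", "sha256", "sha384", "sha512")

# Digest length in bits for each function, measured on the long message.
digest_bits = {name: hashlib.new(name, long_message).digest_size * 8 for name in names}

# The digest length does not depend on the length of the input.
same_length = all(
    len(hashlib.new(name, short_message).digest())
    == len(hashlib.new(name, long_message).digest())
    for name in names
)

for name, bits in digest_bits.items():
    print(f"{name}: {bits}-bit digest")
```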

## Asymmetric-key cryptography

Asymmetric-key cryptography uses a different key for encryption than is used for decryption. The most widely used public-key algorithm is RSA. In RSA encryption, a pair of keys is used: the receiver's public key *K_RE* and the receiver's private key *K_RD*. The message is encrypted using the receiver's public key *K_RE*, as *m_c* = *E*(*m*, *K_RE*), and is decrypted using the receiver's private key *K_RD*, as *m* = *D*(*m_c*, *K_RD*). See the short article: the RSA Public-Key Encryption Algorithm.

Asymmetric-key cryptography can be used to authenticate both the source and the integrity of the data. To do this, the sender's private key *K_SD* and the sender's public key *K_SE* are used. For authentication, the message is encrypted using the sender's private key *K_SD*, as *m_c* = *E*(*m*, *K_SD*), and is decrypted using the sender's public key *K_SE*, as *m* = *D*(*m_c*, *K_SE*). No one except the sender can encrypt a message that can be decrypted using the sender's public key. This feature can be used to ensure that the message is from the right sender and has not been altered. Sometimes it is difficult to judge whether the message has been altered by using only the encryption. Therefore, a secure hash function is indispensable no matter whether symmetric or asymmetric cryptography is used.
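A minimal sketch of this sign-then-verify idea follows, using textbook RSA on a hash value. Everything here is an assumption for illustration: the key pair is built from two small Mersenne primes, there is no padding, and the hash is simply reduced modulo *n*, so the code shows only the roles of *K_SD* and *K_SE* and is not secure. It needs Python 3.8 or later for the modular inverse.

```python
import hashlib

# Toy key pair from two known Mersenne primes; far too small for real use.
p, q = 2**61 - 1, 2**89 - 1
n = p * q
phi = (p - 1) * (q - 1)
K_SE = 65537               # sender's public exponent
K_SD = pow(K_SE, -1, phi)  # sender's private exponent (modular inverse)


def H(m: bytes) -> int:
    """SHA-256 hash value reduced modulo n so that it fits the toy modulus."""
    return int.from_bytes(hashlib.sha256(m).digest(), "big") % n


def sign(m: bytes) -> int:
    """Encrypt the hash value with the sender's private key."""
    return pow(H(m), K_SD, n)


def verify(m_prime: bytes, signature: int) -> bool:
    """Decrypt with the sender's public key and compare with the recomputed hash."""
    return pow(signature, K_SE, n) == H(m_prime)


image_bytes = b"raw image data from the camera"
signature = sign(image_bytes)

print(verify(image_bytes, signature))                 # unmodified content
print(verify(image_bytes + b" (edited)", signature))  # modified content
```

Anyone holding *K_SE* can run `verify`, but only the holder of *K_SD* can produce a signature that passes it.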

Asymmetric cryptography scales better than symmetric cryptography, but it is slower.

Authentication based on cryptography and hash functions gives a binary judgment as to whether the message has been changed or not. A message with even a 1-bit change will be judged as inauthentic. This is not always desirable for multimedia content authentication, since for some applications a multimedia signal with imperceptible changes may still be considered authentic. Digital watermarking is a good candidate for authenticating multimedia content in such cases. Cryptography and digital watermarking can work together to provide better solutions.

## Authentication using digital watermarking

Digital watermarking can be used to authenticate the multimedia content. The basic idea is that the embedded watermark will become undetectable if the multimedia content is changed. Depending on the robustness of the watermark, the watermarking based authentication can be classified into exact authentication, selective authentication, and localization. It is worth pointing out that the classification is fuzzy since some watermarking algorithms belong to more than one class.

## Exact authentication

Exact authentication is to check whether the multimedia content has undergone any change at all. If even a single bit, pixel, or sample has been changed, the multimedia content will be declared as inauthentic. Exact authentication can be fulfilled by using fragile watermarking.

A fragile watermark becomes undetectable after the multimedia content is modified in any way. A simple example of a fragile watermark is the least-significant-bit (LSB) watermark, in which the LSB of each signal sample is replaced (overwritten) by a payload data bit, embedding one bit of data per input sample. If a predefined bit sequence embedded in the least significant bits can be detected, it is implied that the content has not undergone changes. The least significant bit is the most vulnerable and will be changed by the lightest global processing. However, malicious attacks can defeat this approach if the watermark is independent of the multimedia content: attackers can simply replace the LSBs of a piece of multimedia content with the LSBs of the authentic one. In addition, the method cannot detect changes in the more significant bits.
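The LSB scheme and both of its weaknesses can be sketched with plain integer samples. The sample values and payload below are arbitrary, and `embed_lsb` and `extract_lsb` are hypothetical helper names.

```python
def embed_lsb(samples, bits):
    """Overwrite the least significant bit of each sample with one payload bit."""
    return [(s & ~1) | b for s, b in zip(samples, bits)]


def extract_lsb(samples):
    """Read back the least significant bit of each sample."""
    return [s & 1 for s in samples]


samples = [200, 13, 77, 154, 91, 36, 250, 8]
payload = [1, 0, 1, 1, 0, 0, 1, 0]  # predefined bit sequence

marked = embed_lsb(samples, payload)

# Light global processing (a brightness shift of +1) destroys the watermark.
brightened = [s + 1 for s in marked]

# A malicious change to the significant bits that keeps every LSB goes unnoticed.
forged = [(255 & ~1) | (s & 1) for s in marked]

print(extract_lsb(marked) == payload)      # watermark detected
print(extract_lsb(brightened) == payload)  # watermark lost
print(extract_lsb(forged) == payload)      # forgery passes the check
```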

Yeung and Mintzer proposed a pixel-wise image authentication system. The extraction process employs a pseudo-random mapping, *m*(·), from pixel intensities to binary values. To embed a binary bit at pixel *X*(*i*, *j*), the embedder enforces the equation *W*(*i*, *j*) = *m*(*X_w*(*i*, *j*)), where *W*(*i*, *j*) is the watermark bit and *X_w*(*i*, *j*) is the watermarked pixel. The embedder compares the mapped binary value *m*(*X*(*i*, *j*)) with the watermark bit *W*(*i*, *j*). If *W*(*i*, *j*) = *m*(*X*(*i*, *j*)), no action is needed (*X_w*(*i*, *j*) = *X*(*i*, *j*)). If *W*(*i*, *j*) ≠ *m*(*X*(*i*, *j*)), *X*(*i*, *j*) is replaced by the value *X_w*(*i*, *j*) that is closest to *X*(*i*, *j*) and satisfies *W*(*i*, *j*) = *m*(*X_w*(*i*, *j*)). The detector extracts the watermark from each pixel *X′_w*(*i*, *j*) of the possibly attacked watermarked image using the mapping *W′*(*i*, *j*) = *m*(*X′_w*(*i*, *j*)). If the watermarked image has not been modified, the extracted value *W′*(*i*, *j*) exactly matches the watermark bit *W*(*i*, *j*). However, if a region has been modified, the extracted values *W′*(*i*, *j*) in that region will generally not match *W*(*i*, *j*).
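The embedding rule described above can be sketched for 8-bit grayscale pixels. This is an illustration only: the mapping *m*(·) is modeled as a 256-entry table of key-seeded pseudo-random bits, and the key value, image, and watermark are made up for the example.

```python
import random


def make_mapping(key: int) -> list:
    """Pseudo-random mapping m(.) from the 256 pixel intensities to bits."""
    rng = random.Random(key)
    return [rng.randint(0, 1) for _ in range(256)]


def embed_pixel(x: int, w: int, m: list) -> int:
    """Return the intensity closest to x whose mapped bit equals w."""
    for d in range(256):
        for candidate in (x - d, x + d):
            if 0 <= candidate <= 255 and m[candidate] == w:
                return candidate
    raise ValueError("mapping never produces the requested bit")


def embed(image, watermark, m):
    """Enforce W(i, j) = m(X_w(i, j)) at every pixel."""
    return [
        [embed_pixel(x, w, m) for x, w in zip(row_x, row_w)]
        for row_x, row_w in zip(image, watermark)
    ]


def extract(image, m):
    """Detector: W'(i, j) = m(X'_w(i, j))."""
    return [[m[x] for x in row] for row in image]


image = [[12, 200, 37], [90, 91, 255]]
watermark = [[1, 0, 1], [0, 0, 1]]
m = make_mapping(key=2024)

marked = embed(image, watermark, m)
print(extract(marked, m) == watermark)
```

Because a modified pixel still has roughly an even chance of mapping to the right bit, a detector in practice looks at regions of mismatches rather than at a single pixel.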

The LSB-based algorithms and Yeung and Mintzer's algorithm are not secure. An authentication signature can be created to solve the problem. Friedman described a "trustworthy digital camera" in which a digital camera image is passed through a hash function and the result is then encrypted using the photographer's private key to produce a piece of authentication data separate from the image. These data are used in conjunction with the image to ensure that no tampering has occurred. Specifically, the photographer's public key is used to decrypt the hash of the original image, and the result is compared with the hash of the received image to ensure authenticity. The resulting signature can be embedded into the least significant bits of the image to enhance security. The authentication signature can guarantee the integrity of the content and therefore provides exact authentication. Walton proposed a technique in which a separate piece of data is not required for authentication. The method requires the calculation of checksums of the seven most significant bits of the image, so that they may be embedded into randomly selected least significant bits. These two techniques focus on detecting whether or not an image has been tampered with. However, they do not clearly specify how and where the image has been changed.

A well-known block-based algorithm is Wong's scheme, in which the hash value of a block with its LSBs zeroed out is XORed with the corresponding block of a binary logo image, encrypted, and inserted into the LSBs of the block. To verify an image, the LSBs are extracted, decrypted, and XORed with the hash value calculated from the possibly attacked image. If the result is the original binary logo, then the image is authentic. Any tampering with a block will generate a very different binary output for that block, owing to the properties of the hash function.
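The per-block steps of such a scheme can be sketched as below for one 8×8 block stored as a flat list of 64 pixels. Several simplifications are assumptions of this sketch and not part of the published scheme: SHA-256 truncated to 64 bits serves as the hash, and the encryption step is replaced by a toy XOR with a key-derived pad, which only marks where a real cipher would go.

```python
import hashlib

BLOCK = 64  # pixels per 8x8 block, one LSB available per pixel


def bits_from_bytes(data: bytes, count: int) -> list:
    """First `count` bits of data, most significant bit first."""
    return [(data[i // 8] >> (7 - i % 8)) & 1 for i in range(count)]


def block_hash_bits(pixels: list) -> list:
    """Hash of the block with every LSB zeroed out, truncated to 64 bits."""
    cleared = bytes(p & 0xFE for p in pixels)
    return bits_from_bytes(hashlib.sha256(cleared).digest(), BLOCK)


def pad_bits(key: bytes) -> list:
    """Toy stand-in for the encryption step: a key-derived XOR pad."""
    return bits_from_bytes(hashlib.sha256(b"pad" + key).digest(), BLOCK)


def embed_block(pixels: list, logo_bits: list, key: bytes) -> list:
    """XOR hash with logo, 'encrypt', and write the result into the LSBs."""
    h, pad = block_hash_bits(pixels), pad_bits(key)
    payload = [hb ^ lb ^ kb for hb, lb, kb in zip(h, logo_bits, pad)]
    return [(p & 0xFE) | b for p, b in zip(pixels, payload)]


def recover_logo(pixels: list, key: bytes) -> list:
    """Extract LSBs, 'decrypt', and XOR with the recomputed hash."""
    h, pad = block_hash_bits(pixels), pad_bits(key)
    return [(p & 1) ^ kb ^ hb for p, kb, hb in zip(pixels, pad, h)]


key = b"block key"
pixels = list(range(0, 256, 4))         # 64 sample intensities
logo = [i % 2 for i in range(BLOCK)]    # 64 logo bits for this block

marked = embed_block(pixels, logo, key)
tampered = marked.copy()
tampered[10] ^= 0x80                    # change one pixel's most significant bit

print(recover_logo(marked, key) == logo)
print(recover_logo(tampered, key) == logo)
```

Running the same check block by block over a whole image shows which blocks still reproduce the logo and which do not.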

## Selective authentication

Image, video, and audio are different from text messages. A small change may not be noticeable to human eyes or ears, which is also why such signals can be compressed lossily. Some distortions, such as light compression and noise addition, are acceptable. Other changes, such as content modification and cropping, are considered malicious. This distinction can be implemented via semi-fragile watermarking. A semi-fragile watermark is robust to light changes and fragile to significant changes. The watermark should be embedded in such a way that it remains detectable when the distortion is light and becomes undetectable when the distortion is significant.

Most semi-fragile watermarking algorithms are based on quantization. Here only a few typical algorithms are discussed. Wu and Liu proposed a watermarking scheme for image authentication based on a look-up table in the frequency domain. The table maps every possible value of a JPEG DCT coefficient randomly to 1 or 0, with the constraint that runs of 1s and 0s are limited in length to minimize the distortion to the watermarked image. To embed a 1 in a coefficient, the coefficient is left unchanged if the table entry corresponding to that coefficient is also a 1. If the table entry is a 0, the coefficient is changed to its nearest neighboring value for which the entry is 1. Embedding a 0 is done in the same way.
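The effect of the run-length constraint can be seen in a small sketch. The table size, the offset that maps signed coefficients to table indices, the maximum run of 2, and the key are all assumptions chosen for the example; the published scheme may fix these differently.

```python
import random

OFFSET = 128  # table index = quantized coefficient + OFFSET (illustrative range)


def make_lut(key: int, size: int, max_run: int = 2) -> list:
    """Random 0/1 table in which no run of equal entries exceeds max_run."""
    rng = random.Random(key)
    lut = []
    for _ in range(size):
        bit = rng.randint(0, 1)
        if len(lut) >= max_run and all(b == bit for b in lut[-max_run:]):
            bit ^= 1  # break the run before it gets too long
        lut.append(bit)
    return lut


def embed_coeff(c: int, bit: int, lut: list, offset: int = OFFSET) -> int:
    """Return the coefficient value nearest to c whose table entry equals bit."""
    for d in range(len(lut)):
        for candidate in (c - d, c + d):
            index = candidate + offset
            if 0 <= index < len(lut) and lut[index] == bit:
                return candidate
    raise ValueError("table never produces the requested bit")


def extract_coeff(c: int, lut: list, offset: int = OFFSET) -> int:
    """Read the embedded bit back from the table."""
    return lut[c + offset]


lut = make_lut(key=7, size=256)
marked = embed_coeff(-23, 1, lut)
print(extract_coeff(marked, lut), abs(marked - (-23)))
```

With runs limited to 2, the required entry is never more than two steps away, which bounds the change made to any coefficient.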

Lin and Chang proposed a semi-fragile watermarking algorithm that survives light distortions. A signature is extracted from the low-frequency coefficients of pairs of DCT blocks of an image and is embedded into high-frequency coefficients in the DCT domain. The authentication signature is based on the invariance of the relationship between DCT coefficients at the same position in separate blocks of an image. According to a property of quantization, if two values are quantized with the same step size, the resulting values maintain the same relation to one another. That is, if *a* > *b*, then *a* ◊ *q* ≥ *b* ◊ *q*, where ◊ is the quantization operator defined in the short article: Quantization. The relationship is preserved when these coefficients are quantized during JPEG compression with a quantization step size that is smaller than or equal to the step size used for watermark embedding. Other distortions, such as cropping, destroy the relationship between DCT blocks and therefore destroy the authentication watermark. See the short articles: Discrete Cosine Transform (DCT) and Quantization.
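The order-preserving property of quantization is easy to check numerically. In this sketch the quantizer is assumed to be rounding to the nearest multiple of the step size *q*; the coefficient values are arbitrary.

```python
import random


def quantize(x: float, q: float) -> float:
    """Quantize x to the nearest multiple of the step size q."""
    return round(x / q) * q


q = 8.0

# Two coefficients from the same position in two different blocks.
a, b = 13.0, 11.0
print(quantize(a, q), quantize(b, q))  # the order a > b survives

# The relation may collapse to equality, but it never reverses.
print(quantize(13.0, q), quantize(12.0, q))

# Check the property on many random pairs.
rng = random.Random(0)
pairs = [(rng.uniform(-100, 100), rng.uniform(-100, 100)) for _ in range(1000)]
preserved = all(quantize(x, q) >= quantize(y, q) for x, y in pairs if x > y)
print(preserved)
```

This is why a signature bit that records which of two coefficients is larger can survive requantization: the comparison can become a tie, but the larger value never ends up below the smaller one.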

Kundur and Hatzinakos presented a fragile watermarking approach that embeds a watermark in the discrete wavelet domain of an image. In this approach, the discrete wavelet decomposition of a host image is first computed. A user-defined coefficient selection key is then employed. Each binary watermark bit is embedded into a selected coefficient through an appropriate quantization procedure. Finally, the corresponding inverse wavelet transform is computed to form the tamper-proofed image. In the extraction procedure, a quantization function is applied to each of the selected coefficients to extract the watermark values. If authentication fails, a tamper assessment is employed to determine the credibility of the modified content. The quantization technique used is similar to the one used in . This approach is also referred to as "tell-tale" watermarking. Watermarks at different wavelet levels can give different authentication precision. For the discrete wavelet transform, see the short article: Discrete Wavelet Transform (DWT).

It is desirable to know not only whether the content has been changed, but also where and even how it has been changed. This can be achieved by using a tell-tale watermark.
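The general idea of quantization-based embedding in a wavelet domain can be sketched on a one-dimensional signal. This is a simplified illustration in the spirit of the approach, not the published algorithm: it assumes a single-level unnormalized Haar transform, embeds one bit in every detail coefficient rather than in key-selected ones, uses the parity of the quantization bin as the extracted bit, and moves each coefficient to the center of its bin.

```python
import math

DELTA = 4.0  # quantization step size (illustrative)


def haar_forward(x):
    """Single-level Haar transform: pairwise averages and differences."""
    approx = [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / 2 for i in range(0, len(x), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Rebuild the signal from averages and differences."""
    x = []
    for a, d in zip(approx, detail):
        x += [a + d, a - d]
    return x


def embed(x, bits):
    """Move each detail coefficient to the center of a bin whose parity is the bit."""
    approx, detail = haar_forward(x)
    marked = []
    for d, b in zip(detail, bits):
        k = math.floor(d / DELTA)
        if k % 2 != b:
            k += 1
        marked.append((k + 0.5) * DELTA)
    return haar_inverse(approx, marked)


def extract(x):
    """Extracted bit = parity of the quantization bin of each detail coefficient."""
    _, detail = haar_forward(x)
    return [math.floor(d / DELTA) % 2 for d in detail]


def tampered_positions(x, bits):
    """Indices of coefficients whose extracted bit no longer matches."""
    return [i for i, (e, b) in enumerate(zip(extract(x), bits)) if e != b]


signal = [52, 55, 61, 59, 79, 61, 76, 61]
bits = [1, 0, 0, 1]

marked = embed(signal, bits)
attacked = marked.copy()
attacked[4] += 8  # local change inside the third sample pair

print(tampered_positions(marked, bits))
print(tampered_positions(attacked, bits))
```

Because each coefficient covers a known stretch of samples, the list of mismatching positions points to where the signal was changed.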

## Localization

Some authentication algorithms can identify not only whether the content has been changed, but also which parts have not been changed (and are therefore still usable). This is referred to as localization. Watermarking algorithms with a localization function must be able to provide temporal or spatial information. Block-wise and sample-wise authentication can locate changes to the content. The algorithm of Wu and Liu embeds 1 bit into each DCT block, and the block position provides spatial information about the image. If the embedded bit cannot be detected, then it is clear that the corresponding 8×8 block has been changed. The algorithm of Kundur and Hatzinakos embeds the watermark into wavelet coefficients. The discrete wavelet transform carries both frequency and spatial (or temporal) information. Embedding the watermark in the discrete wavelet domain allows changes in the image to be detected in localized spatial and frequency regions.

Audio, video, and image all allow lossy compression, that is, slight changes will not be noticeable. Therefore, watermarking based authentication methods can be used. In addition, most ideas used for image authentication normally can be extended to audio or video authentication. For example, discrete wavelet transform based watermarking algorithms are applicable to audio, video and image.

In addition to multimedia content authentication, some research is on authentication of multimedia structure. Sakr and Georganas first proposed an algorithm for authentication and protection of MPEG-4 XMT structured scenes. They generate a unique data signature about the XMT-A structured scene using MPEG-4 BIFS and pseudorandom encoding sequence. By doing so, the algorithm can detect the structure change of an MPEG-4 stream.

Related to multimedia authentication, digital watermarking can be used for quality evaluation. The basis for this is that a carefully embedded watermark will suffer the same distortions as the host signal does. Cai and Zhao proposed an algorithm that can evaluate the speech quality under the effect of MP3 compression, noise addition, low-pass filtering, and packet loss. Watermarking also can be used for the quality evaluation of image and video. For example, a watermark can be designed to evaluate the effect of JPEG or MPEG compression, i.e., what quantization factor has been used for the compression.

In conclusion, there are currently two approaches to multimedia authentication. One is cryptography, and the other is digital watermarking. Cryptography is a relatively well-studied topic, while digital watermarking is still in its infancy. A strong and seamless integration of cryptography and digital watermarking will provide us with advanced means of authenticating multimedia content.
