Digital Image Processing  Introduction, Digital representation of images, Histogram, Arithmetical operations, Geometrical transformations, Neighborhood operations (convolution)
LUKAS ROSENTHALER, Ph.D.
University of Basel
Introduction
Digital images do not have a physical existence; rather, they are digital data that represent an image. Only through screen display or hard copy do the data become a real physical image. This fundamental fact should always be kept in mind when dealing with digital images. Digital data representing an image must always be converted by some technical means (such as an LCD screen or a projector) into an analog distribution of light for humans to perceive it as an image. The digital representation of an image (the “digital image”) can be created, manipulated, and analyzed by computer algorithms. These three functions also define the three basic families of computer algorithms that deal with digital images, which can be described as follows:
Data → image: image creation, computer graphics
Within this family, data is converted into images. Computer graphics create images out of geometric, three-dimensional data that can describe a scene, including surface characteristics and illumination of the scene. Photorealistic rendering uses mathematical methods, such as ray tracing and radiosity, which rely on mathematical models of how light propagates and interacts with material, to generate images with a very high degree of “reality.” These methods are often used in the motion picture industry to generate special effects.
If scientific data are to be presented and summarized for a human observer, visualization is often a very suitable way to accomplish this. Visualization methods rely heavily on computer graphics to present “virtual objects” that illustrate scientific findings to the human observer.
Image → image: image processing
Image processing is used to manipulate and modify digital images. Applying image processing routines to a digital image will again result in a digital image, which differs from the original image. This family of methods will be the main focus of this section.
Image → data: image analysis, pattern recognition
Image analysis deals with the problem of extracting data from a digital image. From this point of view, photogrammetry is the inverse of computer graphics: It attempts to extract geometrical data out of images. Image analysis is still a very active field of research with many open problems.
Digital representation of images
To be processed, a digital image has to be represented in computer memory. Computer memory can be considered a labeled, linear array of memory cells, where each cell holds a value (a number between 0 and 255 for an 8-bit memory cell and between 0 and 65535 for a 16-bit memory cell). The label of a cell is also called the “address” of the cell and is used to reference the cell for accessing its value (or “content”). Usually the labels are numbers. In effect, computer memory resembles a vast cabinet with a lot of drawers, each drawer containing a sheet of paper with a value written on it. All the drawers are consecutively numbered from 0 to n − 1 (n being the number of drawers).
A digital image is a subset of the computer memory, in which each memory cell holds a numerical value that represents the brightness, luminosity, or optical density at a predefined position of the image, usually called a pixel. Normally the memory cells are ordered so that the first cell represents the value at the upper left corner of the image, the next cell represents the next value to the right, etc. In the case of color or multispectral images, several memory cells are used for each location. The way the memory cells are ordered with respect to their position in the image can differ and may be adjusted to the type of processing required. The following example shows the geometrical arrangement of image data stored at addresses 100 to 124 (representing an image of 5 × 5 pixels), where the lowest address represents the upper left corner and the highest address represents the lower right corner of the image:
100 101 102 103 104
105 106 107 108 109
110 111 112 113 114
115 116 117 118 119
120 121 122 123 124
For all of the following examples, this arrangement of pixel values is assumed. Figure 39 shows a gray value image where in each memory cell the brightness value (0 = black, 255 = white) is stored.
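The mapping between a pixel position and its memory address in this arrangement can be sketched in a few lines of Python. The base address 100 and the 5 × 5 size come from the example above; the helper name `pixel_address` is an assumption made for illustration only.

```python
def pixel_address(i, k, nx, base=0):
    """Return the memory address of pixel (i, k) in a row-by-row layout.

    i is the column (0 = left), k is the row (0 = top), nx is the image width.
    """
    return base + k * nx + i


# 5 x 5 image stored at addresses 100 to 124
nx, ny, base = 5, 5, 100
for k in range(ny):
    print(" ".join(str(pixel_address(i, k, nx, base)) for i in range(nx)))
```

The upper left pixel maps to the lowest address, and each new row starts `nx` cells after the previous one.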
Usually, a digital image is treated as a two-dimensional matrix with $n_x$ pixels in the horizontal and $n_y$ pixels in the vertical direction. The mathematical notation is then as follows:
$$I = f(i,k), \quad i = 0, \dots, n_x - 1, \quad k = 0, \dots, n_y - 1$$
Image processing now applies algorithms to this matrix of numerical values, which modify these values.
Point operations
Point operations are modifications of pixel values, where the new pixel value depends only on the previous value at the same pixel. These operations set the new brightness at a given position as a function of the previous brightness at that position. For example, if each pixel’s brightness value is divided by 2, the image appears much darker. Figure 40 is an example of a point operation.
Mathematically, this operation (division of each brightness value by 2) would be noted as follows:
$$f'(i,k) = \frac{f(i,k)}{2}$$
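A minimal Python sketch of this operation, assuming the image is stored as a flat list of 8-bit gray values in the row-by-row order described earlier. The function name and the sample values are illustrative, not taken from the original text.

```python
def halve_brightness(pixels, nx, ny):
    """Divide every pixel value by two (integer division), in place."""
    for k in range(ny):          # rows, top to bottom
        for i in range(nx):      # columns, left to right
            pixels[k * nx + i] = pixels[k * nx + i] // 2
    return pixels


image = [0, 64, 128, 255,
         10, 20, 30, 40]         # a 4 x 2 gray value image
halve_brightness(image, nx=4, ny=2)
print(image)
```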
Please note the rather complex addressing of the pixel when this is implemented in program code. The term k*nx + i is required to translate the two-dimensional matrix of the pixels to the one-dimensional layout of computer memory. Program code implementing the operation applies the transformation (division by two) to each pixel of the image. Point operations are used to adjust brightness, contrast, and gamma.
Point operations may be visualized by a simple graph, where the x axis denotes the input values (from 0 to 255) and the y axis denotes the output values. The above transformation would be described by the following graph (Figure 41).
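Because an 8-bit image has only 256 possible input values, the graph of a point operation can also be stored as a lookup table with 256 entries. The sketch below illustrates this idea. The gamma value of 2.2 and all function names are hypothetical choices, not taken from the text.

```python
def make_gamma_table(gamma):
    """Build a 256-entry table mapping input brightness to output brightness."""
    return [round(255 * (v / 255) ** (1.0 / gamma)) for v in range(256)]


def apply_table(pixels, table):
    """Apply a point operation, given as a lookup table, to a flat pixel list."""
    return [table[v] for v in pixels]


halve_table = [v // 2 for v in range(256)]   # the division by two as a table
gamma_table = make_gamma_table(2.2)          # a brightening gamma curve

print(apply_table([0, 128, 255], halve_table))
print(apply_table([0, 128, 255], gamma_table))
```

Plotting the table index against the table entry gives exactly the kind of input/output graph described above.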
Histogram
A common tool for the analysis of the brightness values of an image is the histogram, which plots the frequency of each brightness value. See the shapes of the histograms in Figure 42 for the images in Figure 40.
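Computing a histogram amounts to counting how often each brightness value occurs. A minimal sketch, assuming a flat list of 8-bit gray values (the function name and sample data are illustrative):

```python
def histogram(pixels, levels=256):
    """Count how often each brightness value occurs in a flat pixel list."""
    counts = [0] * levels
    for v in pixels:
        counts[v] += 1
    return counts


image = [0, 0, 10, 10, 10, 255]
h = histogram(image)
print(h[0], h[10], h[255])
```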
Arithmetical operations
Arithmetical operations are point operations that involve more than one image. To perform this kind of image manipulation, all involved images must have the same dimensions in x and y; that is, they must have the same width and height in pixels. Some of the most prominent arithmetical operations are:
 Difference of two images
 Weighted addition
Figure 43 shows an example of an image difference. The first image is an uncompressed TIFF image; the second image is the same image, but in a heavily compressed JPEG format. The difference image (contrast-enhanced) clearly shows the artifacts introduced by the lossy compression of the JPEG algorithm.
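A difference image of this kind might be computed as in the following sketch. It assumes that the absolute difference is taken and that the contrast enhancement is a simple multiplication by a gain factor. Both are illustrative assumptions; the text does not say how Figure 43 was produced.

```python
def difference_image(a, b, gain=1):
    """Absolute per-pixel difference of two equally sized images.

    A gain above 1 stretches small differences so they become visible;
    results are clamped to the 8-bit range.
    """
    if len(a) != len(b):
        raise ValueError("images must have the same dimensions")
    return [min(255, gain * abs(pa - pb)) for pa, pb in zip(a, b)]


original = [100, 100, 200, 200]
compressed = [101, 98, 200, 230]     # made-up values standing in for JPEG output
print(difference_image(original, compressed, gain=8))
```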
Figure 44 shows how two images can be merged with a third image supplying the coefficients. The mathematical formula is as follows:
$$I(i,k) = K(i,k)\, I_1(i,k) + \bigl(1 - K(i,k)\bigr)\, I_2(i,k)$$
$I_1$ and $I_2$ are the two input images; $K$ is the image containing the weights. It is important to note that the image $K$ has to be normalized to a value range between 0.0 and 1.0.
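The weighted addition can be sketched as follows, assuming that the weight K applies to the first image and 1 − K to the second, and that the weights are delivered as an 8-bit image that must first be normalized. The names `blend` and `mask8` are hypothetical.

```python
def blend(i1, i2, k):
    """Per-pixel weighted addition: k * i1 + (1 - k) * i2.

    i1 and i2 are flat lists of 8-bit values; k holds weights in 0.0 .. 1.0.
    """
    if not (len(i1) == len(i2) == len(k)):
        raise ValueError("all three images must have the same dimensions")
    return [round(w * p1 + (1.0 - w) * p2) for p1, p2, w in zip(i1, i2, k)]


i1 = [200, 200, 200]
i2 = [100, 100, 100]
mask8 = [0, 128, 255]                  # weights stored as an 8-bit image
k = [m / 255 for m in mask8]           # normalize to 0.0 .. 1.0
print(blend(i1, i2, k))
```

Where the weight is 0 only the second image shows, where it is 1 only the first, and intermediate weights give a mixture.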
Geometrical transformations
Often, digital images have to be geometrically transformed. Common examples are image rotation, correction of optical distortions, and rectification of aerial images. Geometrical transformations create a new image in which the pixels of the original image are at different locations as given by the transformation:
$$i' = T_x(i,k), \qquad k' = T_y(i,k)$$
where $(i,k)$ is the pixel position in the original image and $(i',k')$ is its position in the new image.
An example of such a transformation is the affine transformation, which is quite common (image rotation, scaling, shearing, etc.):
$$i' = a_{11}\, i + a_{12}\, k + b_1, \qquad k' = a_{21}\, i + a_{22}\, k + b_2$$
The problem with this kind of geometrical transformation is that the new pixel location may fall in between the pixels. In general, the procedure is as follows: For each pixel in the new image, the position of this pixel in the original image is calculated. If the position is not exactly at the center of a pixel in the original image, the value is interpolated. The simplest interpolation is to take just the value of the nearest pixel (Nearest Neighbor interpolation). A better way is to perform a bilinear interpolation, as shown in Figure 46.
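The procedure can be sketched as follows, using a rotation about the image center as the example transformation. This is a simplified illustration: the function names, the handling of positions outside the original image (they are left black), and the small tolerance for rounding errors are assumptions.

```python
import math


def bilinear(pixels, nx, ny, x, y):
    """Interpolate the gray value at the non-integer position (x, y)."""
    # Clamp so that the 2 x 2 neighborhood stays inside the image.
    x = min(max(x, 0.0), nx - 1.0)
    y = min(max(y, 0.0), ny - 1.0)
    i0, k0 = int(math.floor(x)), int(math.floor(y))
    i1, k1 = min(i0 + 1, nx - 1), min(k0 + 1, ny - 1)
    dx, dy = x - i0, y - k0
    top = (1 - dx) * pixels[k0 * nx + i0] + dx * pixels[k0 * nx + i1]
    bottom = (1 - dx) * pixels[k1 * nx + i0] + dx * pixels[k1 * nx + i1]
    return (1 - dy) * top + dy * bottom


def rotate(pixels, nx, ny, angle_deg):
    """Rotate about the image center; each new pixel is looked up in the original."""
    a = math.radians(angle_deg)
    cx, cy = (nx - 1) / 2.0, (ny - 1) / 2.0
    eps = 1e-9                              # tolerance for rounding errors
    out = [0] * (nx * ny)
    for k in range(ny):
        for i in range(nx):
            # Position of the new pixel (i, k) in the original image.
            x = math.cos(a) * (i - cx) + math.sin(a) * (k - cy) + cx
            y = -math.sin(a) * (i - cx) + math.cos(a) * (k - cy) + cy
            if -eps <= x <= nx - 1 + eps and -eps <= y <= ny - 1 + eps:
                out[k * nx + i] = round(bilinear(pixels, nx, ny, x, y))
    return out


image = [0, 100,
         100, 200]                          # 2 x 2 image
print(bilinear(image, 2, 2, 0.5, 0.5))      # value midway between the four pixels
print(rotate(image, 2, 2, 180))
```

Iterating over the new image rather than the original guarantees that every output pixel receives a value.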
Neighborhood operations (convolution)
Neighborhood operations compute the new value of a pixel from its current value and the values of the surrounding pixels.
Linear neighborhood operations
Linear neighborhood operations calculate the weighted sum of the neighborhood of each pixel. The resulting pixel values are inserted into a new image so that the original pixel values are still available for calculating the next weighted sum. This kind of operation is also called a convolution; the matrix with the weights is called the kernel of the convolution. The kernel usually has odd dimensions (3 × 3 or 5 × 5 but also 3 × 5, 7 × 11, etc.), because this makes it easy to identify the “central pixel.” The weights in the kernel are also allowed to be negative. Most often, the sum of the weights is either 1.0 or 0.0. The kernel is first positioned at the upper left position of the image. Then each pixel is calculated by moving the kernel from left to right along every line (top to bottom) of the image.
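A minimal sketch of such a weighted sum over a flat list of gray values. The kernel is applied without mirroring, as in the description above. The border policy (border pixels are copied unchanged) and the 3 × 3 averaging kernel used as an example are assumptions made for brevity.

```python
def convolve(pixels, nx, ny, kernel):
    """Weighted sum of each pixel's neighborhood, written to a new image.

    kernel is a list of rows with odd width and height. Border pixels whose
    neighborhood would leave the image are copied unchanged.
    """
    kh, kw = len(kernel), len(kernel[0])
    ry, rx = kh // 2, kw // 2
    out = list(pixels)                       # results go into a new image
    for k in range(ry, ny - ry):
        for i in range(rx, nx - rx):
            acc = 0.0
            for dk in range(kh):
                for di in range(kw):
                    src = (k + dk - ry) * nx + (i + di - rx)
                    acc += kernel[dk][di] * pixels[src]
            out[k * nx + i] = acc
    return out


smooth = [[1 / 9] * 3 for _ in range(3)]     # averaging kernel, weights sum to 1.0
image = [0, 0, 0,
         0, 90, 0,
         0, 0, 0]
result = convolve(image, 3, 3, smooth)
print(result[4])                             # smoothed center pixel
```

Because the results are written to `out`, every weighted sum is computed from the unmodified original values.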
Linear neighborhood operations (also called linear filters) can be used for eliminating noise and for sharpening, for example, but they can also be used for the enhancement of features such as vertical or horizontal edges. The following examples will illustrate some of the common uses of neighborhood filters.
Figure 48 displays the effect of a smoothing filter, which averages the pixel values within the neighborhood.
In color images the convolution is usually applied separately to each color (red, green, and blue).
Figure 49 shows the effect of a sharpening filter with the following kernel:
This convolution kernel enhances the edges in the image and adds the result to the original image. Please note that the coefficients sum to 1.0. As soon as there are negative coefficients in a convolution kernel, the resulting pixel value may become negative. In such cases, there are two possibilities: the negative value is simply set to 0, or an offset is added to all resulting pixel values to avoid negative values. In the example of Figure 49, any negative pixel values were set to 0.
Figure 50 shows the effect of a filter where the sum of the kernel coefficients is 0. In this case, negative values have to be expected. Therefore, an offset of 127 is added to all resulting pixels. The following kernels have been used:
Kernel a is an isotropic filter, which enhances edges independently of their orientation. It is called a Laplace filter. Kernel b selectively enhances vertical edges; kernel c enhances horizontal edges. These latter two filters are called Sobel filters. Since the sum of each kernel is 0, homogeneous areas will result in the pixel value of 127 (given by the offset) and appear in a neutral gray.
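The effect of the offset can be illustrated with the following sketch. The kernels shown are commonly used textbook forms of the Laplace and Sobel filters and are not necessarily the exact kernels behind Figure 50. The function name and the policy of leaving border pixels at the neutral value are also assumptions.

```python
LAPLACE = [[0, 1, 0],
           [1, -4, 1],
           [0, 1, 0]]
SOBEL_VERTICAL_EDGES = [[-1, 0, 1],
                        [-2, 0, 2],
                        [-1, 0, 1]]
SOBEL_HORIZONTAL_EDGES = [[-1, -2, -1],
                          [0, 0, 0],
                          [1, 2, 1]]


def filter_with_offset(pixels, nx, ny, kernel, offset=127):
    """Apply a 3 x 3 zero-sum kernel, add an offset, clamp to 0..255."""
    out = [offset] * (nx * ny)               # border pixels stay neutral gray
    for k in range(1, ny - 1):
        for i in range(1, nx - 1):
            acc = 0
            for dk in range(3):
                for di in range(3):
                    acc += kernel[dk][di] * pixels[(k + dk - 1) * nx + (i + di - 1)]
            out[k * nx + i] = max(0, min(255, acc + offset))
    return out


flat = [50] * 9                              # homogeneous 3 x 3 area
edge = [0, 0, 20,
        0, 0, 20,
        0, 0, 20]                            # a vertical edge

print(filter_with_offset(flat, 3, 3, LAPLACE)[4])
print(filter_with_offset(edge, 3, 3, SOBEL_VERTICAL_EDGES)[4])
print(filter_with_offset(edge, 3, 3, SOBEL_HORIZONTAL_EDGES)[4])
```

The homogeneous area and the edge seen by the "wrong" Sobel kernel both come out at the neutral value 127, while the matching kernel produces a value clearly above it.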
Nonlinear neighborhood operations
There are neighborhood operations that cannot be described by a simple kernel in which the kernel coefficients are the weights of a summation. In nonlinear neighborhood operations, some sort of thresholding or decision processing is often performed, depending on the pixel values in the neighborhood of a pixel. One of the most prominent nonlinear filters is the median filter: the pixel values of the neighborhood are sorted according to their value, and the new pixel value is the median value (that is, the “middle” value) of the sorted list of pixel values. The median filter has the advantage of usually not smoothing edges too much but still eliminating shot noise very well (see Figure 51).
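A median filter over a 3 × 3 neighborhood can be sketched as follows. The neighborhood size and the choice to copy border pixels unchanged are assumptions made for this illustration.

```python
def median_filter(pixels, nx, ny):
    """3 x 3 median filter; border pixels are copied unchanged."""
    out = list(pixels)
    for k in range(1, ny - 1):
        for i in range(1, nx - 1):
            neighborhood = sorted(
                pixels[(k + dk) * nx + (i + di)]
                for dk in (-1, 0, 1)
                for di in (-1, 0, 1)
            )
            out[k * nx + i] = neighborhood[4]    # middle of 9 sorted values
    return out


noisy = [10, 10, 10,
         10, 255, 10,
         10, 10, 10]                 # a single shot-noise pixel
step = [0, 0, 100, 100,
        0, 0, 100, 100,
        0, 0, 100, 100]              # a sharp vertical edge

print(median_filter(noisy, 3, 3)[4])
print(median_filter(step, 4, 3) == step)
```

The isolated bright pixel is removed, while the sharp edge passes through unchanged.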
There are many other nonlinear filters, but using them approaches black magic in its complexity and requires a lot of experience.
Compression algorithms
Compression algorithms are used to reduce the size of the digital image file. There are two fundamentally different compression schemes: lossy and lossless.
Lossless compression
Lossless compression schemes try to reduce the redundancy that is usually found within digital data. Lossless compression schemes do not depend on specific image properties but can be applied to all kinds of digital data. Since images usually contain some redundancy (a pixel value at a certain position is often similar to the neighboring pixel values), a slight reduction of the amount of data can usually be achieved (about 30 to 50 percent). The reduction will be greater if the image contains large homogeneous areas or areas with repetitive patterns. For images that contain large areas of irregular patterns (random noise), the reduction will be very small. In some cases, lossless compression may even inflate the size of the image file.
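The dependence on image content can be demonstrated with a general-purpose lossless compressor. The sketch below uses zlib (the DEFLATE algorithm) from the Python standard library as a stand-in; the sample data and the function name are illustrative assumptions.

```python
import random
import zlib


def compression_ratio(data):
    """Return compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(data, 9)) / len(data)


homogeneous = bytes([128]) * 10000                            # one flat gray area
random.seed(0)
noise = bytes(random.randrange(256) for _ in range(10000))    # random noise

print(compression_ratio(homogeneous))    # far below 1.0
print(compression_ratio(noise))          # close to 1.0; random data barely compresses

# Lossless means the original data can be recovered exactly.
assert zlib.decompress(zlib.compress(noise)) == noise
```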
Lossy compression
Lossy compression eliminates information that the compression algorithm decides is of little interest. Lossy compression algorithms therefore depend directly on the properties of images and on the particularities of the human visual system, and they modify the image in an irreversible way. They usually eliminate only information that is considered of little importance to the viewer. However, if the compression factor is too high or, more importantly, the compressed image is manipulated with image processing methods at a later stage (such as contrast enhancement), artifacts may become visible. Therefore, lossy compression should be applied only to images that are used only for viewing. Images that are to be archived or are meant to be manipulated at a later stage should never be stored in a lossy format!
Common compression algorithms (and corresponding file formats) are:
JPEG
JPEG stands for Joint Photographic Experts Group and is a well-established lossy compression scheme that uses the Discrete Cosine Transform (DCT) as the basis for the compression. The image is divided into blocks of 8 × 8 pixels, and the DCT is calculated for each block. Depending on the compression level, only the major coefficients of the DCT are kept. The JPEG algorithm may lead to very particular artifacts that make the block structure of the algorithm visible. The compression ratio should usually be in the range of 5 to 25; higher compression ratios will often lead to visible artifacts.
JPEG2000
JPEG2000 is a successor of the JPEG algorithm but uses a totally different approach. It relies on the wavelet transform, which builds up a resolution pyramid of the image. It creates fewer visible artifacts, but the compressed image may give the impression of less sharpness or crispness. There is also a lossless variant of the JPEG2000 algorithm. Another very interesting feature of the JPEG2000 algorithm is that the whole image file does not always have to be read. If only the first part is read, the whole image can be displayed as if it had been compressed at a higher compression ratio. For example, to display a thumbnail image, only about the first 5 to 10 percent of a JPEG2000 (lossless variant) image has to be read.
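To illustrate the block transform on which JPEG (not JPEG2000) is based, the following sketch computes the two-dimensional DCT of a single 8 × 8 block and keeps only the largest coefficients. This is a strong simplification: real JPEG encoders use quantization tables and entropy coding, which are omitted here, and the function names are hypothetical.

```python
import math

N = 8   # JPEG block size


def dct_2d(block):
    """Two-dimensional DCT-II of an N x N block (list of rows)."""
    def scale(u):
        return math.sqrt(1.0 / N) if u == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for v in range(N):
        for u in range(N):
            s = 0.0
            for y in range(N):
                for x in range(N):
                    s += (block[y][x]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[v][u] = scale(u) * scale(v) * s
    return out


def keep_largest(coeffs, count):
    """Zero all but the `count` largest-magnitude coefficients (ties are kept)."""
    magnitudes = sorted((abs(c) for row in coeffs for c in row), reverse=True)
    threshold = magnitudes[count - 1]
    return [[c if abs(c) >= threshold else 0.0 for c in row] for row in coeffs]


flat_block = [[100] * N for _ in range(N)]   # a homogeneous block
coeffs = dct_2d(flat_block)
reduced = keep_largest(coeffs, 1)
print(round(coeffs[0][0], 6))                # the only significant coefficient
```

For a homogeneous block all of the information ends up in the first coefficient, which is why such areas compress so well.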