Image sampling and quantization

How do we capture natural images, which are \(2D\) continuous functions, in digital form? Sampling and quantization are the two concepts that answer this question.

The \(\sin()\) function is sampled at two points, \(x_1\) and \(x_2\). The quantized values corresponding to these two points are \(y_1\) and \(y_2\). By taking a sufficient number of samples, we can reconstruct the original \(\sin()\) function.

For ease of exposition, we first consider a \(1D\) continuous analog signal. An analog-to-digital converter samples and quantizes the signal. Sampling refers to measuring the amplitude of the signal at discrete time points. Quantization refers to how precisely the amplitude values are distinguished from each other. For example, if we use one bit for storing amplitude values, the amplitude can be assigned only one of two (\(2^1\)) values. With two bits, the amplitude can be assigned one of four (\(2^2\)) values. In general, with \(n\) bits an amplitude can take any one of \(2^n\) values. The number of bits used for quantization is referred to as the intensity resolution.
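The two steps above can be sketched in a few lines of Python. This is a minimal illustration, assuming a sine signal with amplitudes in \([-1, 1]\) and evenly spaced sample times; the helper names `sample` and `quantize` are hypothetical and chosen only for this example.

```python
import math

def sample(signal, duration, rate):
    """Measure signal(t) at `rate` evenly spaced time points per unit time."""
    count = int(duration * rate)
    return [signal(k / rate) for k in range(count)]

def quantize(value, n_bits, lo=-1.0, hi=1.0):
    """Map an amplitude in [lo, hi] to one of 2**n_bits integer levels."""
    levels = 2 ** n_bits
    # Scale the amplitude to [0, levels - 1] and round to the nearest level.
    scaled = (value - lo) / (hi - lo) * (levels - 1)
    return min(levels - 1, max(0, round(scaled)))

# Assumed test signal: one period of a sine wave, sampled 8 times.
samples = sample(lambda t: math.sin(2 * math.pi * t), duration=1.0, rate=8)
one_bit = [quantize(s, 1) for s in samples]  # each value is 0 or 1
two_bit = [quantize(s, 2) for s in samples]  # each value is 0, 1, 2 or 3
```

Raising `rate` adds more samples, while raising `n_bits` lets each sample be stored with finer amplitude steps.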

A periodic function \(f\) is resolved into a linear combination of \(\sin\) and \(\cos\) functions, its Fourier series. The frequencies of these \(\sin\) and \(\cos\) functions make up the frequency spectrum of \(f\) and are represented as peaks on the frequency axis.

Any sufficiently well-behaved periodic function \(f\) (taken here to have period \(2\pi\)) can be expressed as a linear combination of sinusoidal functions of different frequencies and amplitudes. That is, \(f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} \left[a_n\cos(nx) + b_n\sin(nx)\right]\). The representation of \(f\) in the frequency domain is denoted by \(\hat{f}\).
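As a concrete case, the sketch below sums the leading terms of the Fourier series of a square wave. It assumes a square wave of amplitude 1 and period \(2\pi\), for which the standard result is that only odd sine harmonics appear, with \(b_n = 4/(\pi n)\); the function name `square_wave_partial_sum` is made up for this example.

```python
import math

def square_wave_partial_sum(x, n_terms):
    """Sum the first n_terms non-zero harmonics of a unit square wave."""
    total = 0.0
    for k in range(n_terms):
        n = 2 * k + 1  # only odd harmonics contribute
        total += (4 / math.pi) * math.sin(n * x) / n
    return total

# Away from the jumps, more terms bring the sum closer to +1 or -1.
coarse = square_wave_partial_sum(math.pi / 2, 3)
fine = square_wave_partial_sum(math.pi / 2, 200)
```

Each term corresponds to one peak in the frequency domain: its position is the harmonic number \(n\) and its height is the coefficient \(b_n\).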

Fourier series animation of a square wave periodic function \(f\). The first frame of the animation shows how \(f\) is resolved into a linear combination of sinusoidal functions of different amplitudes and frequencies, its Fourier series. The component frequencies are represented as a collection of peaks in the frequency domain (last frame of the animation). This frequency domain representation of \(f\) is denoted by \(\hat{f}\). Source: Wikipedia: Fourier series (https://en.wikipedia.org/wiki/Fourier_series)

The number of samples taken in a unit time interval is referred to as the sampling rate; for images, where samples are taken over space rather than time, the corresponding quantity is the spatial resolution. Increasing the sampling rate lets us reconstruct the original analog signal more accurately from the digitized samples. However, the storage required for the digital signal grows with both the sampling rate and the intensity resolution.
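The storage cost can be estimated with simple arithmetic: number of samples times bits per sample. The sketch below assumes uncompressed storage with no header or metadata, and the function name `storage_bytes` is hypothetical.

```python
import math

def storage_bytes(sampling_rate, duration, bits_per_sample):
    """Estimate uncompressed storage for a sampled, quantized 1D signal."""
    total_bits = sampling_rate * duration * bits_per_sample
    return math.ceil(total_bits / 8)  # 8 bits per byte, rounded up

base = storage_bytes(sampling_rate=1000, duration=2, bits_per_sample=8)
# Doubling both the sampling rate and the intensity resolution
# quadruples the storage requirement.
finer = storage_bytes(sampling_rate=2000, duration=2, bits_per_sample=16)
```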

Extending the sampling and quantization concepts to \(2D\) continuous functions is straightforward. We sample in the \(2D\) space. Assume that we have \(M\) samples along the \(x\)-axis and \(N\) samples along the \(y\)-axis. The resulting digital image, denoted \(f(x,y)\), is represented as:

\[f(x,y) = \begin{bmatrix} f(1,1) & f(1,2) & \cdots & f(1, N) \\ f(2,1) & f(2,2) & \cdots & f(2, N) \\ \vdots & \vdots & \ddots & \vdots \\ f(M,1) & f(M,2) & \cdots & f(M, N) \\ \end{bmatrix}\]

There are \(N\) samples in each row of \(f(x,y)\). We have \(M\) rows, and the total number of pixels is \(M \times N\). In other words, the \(2D\) image is sampled at \(M \times N\) spatial locations.
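A minimal sketch of building such an \(M \times N\) array follows. It assumes the continuous image is available as a function `scene(u, v)` on the unit square with intensities in \([0, 1]\); both `scene` and `sample_image` are hypothetical names, and the indices are 0-based (as is usual in Python) rather than the 1-based indices of the matrix above.

```python
import math

def sample_image(scene, M, N, n_bits=8):
    """Sample scene(u, v) on an M x N grid and quantize to n_bits per pixel."""
    levels = 2 ** n_bits
    image = []
    for x in range(M):        # x selects the row
        row = []
        for y in range(N):    # y selects the column
            value = scene(x / M, y / N)  # intensity assumed in [0, 1]
            level = int(value * levels)
            row.append(min(levels - 1, max(0, level)))
        image.append(row)
    return image

def scene(u, v):
    """Assumed smooth test pattern with values in [0, 1]."""
    return 0.5 + 0.5 * math.sin(2 * math.pi * u) * math.cos(2 * math.pi * v)

image = sample_image(scene, M=4, N=6)  # 4 rows of 6 pixels: 24 samples
```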

A \(5 \times 5\) binary image that represents the decimal digit 7. The \(2D\) continuous function of the image is sampled at 25 locations. The quantized values are stored using one bit. This gives us two distinct values, 0 and 1, for quantization.

A \(5 \times 5\) grayscale image. The \(2D\) continuous function of the image is sampled at 25 locations. The quantized values are stored using eight bits. This gives us 256 distinct values for quantization in the range \(0, \ldots, 255\).
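To connect the two figures, the sketch below reduces the intensity resolution of a \(5 \times 5\) grayscale image from eight bits to one bit. The pixel values are invented for illustration (they are not read from the figures), and `requantize` is a hypothetical helper that simply keeps the most significant bits of each pixel.

```python
def requantize(image, from_bits, to_bits):
    """Reduce intensity resolution by keeping the top `to_bits` bits."""
    shift = from_bits - to_bits
    return [[pixel >> shift for pixel in row] for row in image]

# Made-up 8-bit image: bright pixels trace the shape of the digit 7.
gray = [
    [200, 210, 220, 230, 240],
    [ 10,  20,  30,  40, 250],
    [ 10,  20,  30, 245,  40],
    [ 10,  20, 235,  30,  40],
    [ 10, 225,  20,  30,  40],
]

binary = requantize(gray, from_bits=8, to_bits=1)
# binary == [[1, 1, 1, 1, 1],
#            [0, 0, 0, 0, 1],
#            [0, 0, 0, 1, 0],
#            [0, 0, 1, 0, 0],
#            [0, 1, 0, 0, 0]]
```

The number of samples stays at 25 in both images; only the number of distinct values each sample can take changes, from 256 to 2.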

When the number of samples is insufficient to reconstruct the original image, we have undersampled the image. On the other hand, if we have more samples than the minimum needed to reconstruct the original image, we have oversampled the image.
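A small \(1D\) experiment illustrates what undersampling does. It relies on a standard result not stated above, the Nyquist criterion, which says a sinusoid must be sampled at more than twice its frequency to be recoverable. The frequencies, the rate, and the name `sample_sine` are assumptions made for this sketch.

```python
import math

def sample_sine(frequency, rate, count):
    """Return `count` samples of sin(2*pi*frequency*t) at `rate` samples per unit time."""
    return [math.sin(2 * math.pi * frequency * k / rate) for k in range(count)]

rate = 4                        # samples per unit time
fast = sample_sine(3, rate, 8)  # 3 cycles per unit time: undersampled (needs rate > 6)
slow = sample_sine(1, rate, 8)  # 1 cycle per unit time: adequately sampled (needs rate > 2)

# At this rate the fast signal's samples coincide with those of an
# inverted slow signal, so the two cannot be told apart from samples alone.
indistinguishable = all(abs(f + s) < 1e-9 for f, s in zip(fast, slow))
```

The same effect occurs in images: fine detail sampled too coarsely shows up as a false, coarser pattern.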

