Digital image processing and manipulation
Figure 26.3 Some spatial convolution filters. (a) and (b) Averaging filters. (c) x-differentiation. (d) y-differentiation. (e) Laplacian edge detector. (f) Crispening operator based on the Laplacian. K is a positive number greater than 4 whose value determines the strength of the effect
Figure 26.1 Digital convolution
A more rigorous approach is to assume the original image is periodic, as illustrated in Figure 26.2. When the filter array overlaps the edge of the original image, values from the opposite edge are included in the multiplication. This approach will clearly result in meaningless edge values in the processed image unless such a periodic relationship does indeed exist, or there is a degree of uniformity of the image at its boundary and beyond. The advantage of this approach is that the convolution produces results identical to the equivalent frequency-domain method implemented by the ubiquitous Fourier transform.
Some sample convolution filters are shown in
Figure 26.3, and examples of their implementation
are included in Figure 26.4.
Linear filters have two important properties:

(1) Two or more different filters can be applied sequentially, in any order, to an image. The total effect is the same.

(2) Two or more filters can be combined by convolution to yield a single filter. Applying this filter to an image will give the same result as the separate filters applied sequentially.
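The two properties can be checked numerically. The following is a minimal sketch using NumPy and SciPy; the test image is random, and the two kernels stand in for the averaging and Laplacian filters of Figure 26.3 (the exact kernels are not reproduced here):

```python
import numpy as np
from scipy.signal import convolve2d

rng = np.random.default_rng(0)
image = rng.random((16, 16))

h1 = np.full((3, 3), 1 / 9)                  # an averaging filter
h2 = np.array([[ 0, -1,  0],
               [-1,  4, -1],
               [ 0, -1,  0]], dtype=float)   # a Laplacian-style edge detector

# Property (1): order of application does not matter.
a = convolve2d(convolve2d(image, h1), h2)
b = convolve2d(convolve2d(image, h2), h1)
assert np.allclose(a, b)

# Property (2): the two filters combine, by convolution,
# into a single equivalent 5 x 5 filter.
h12 = convolve2d(h1, h2)
c = convolve2d(image, h12)
assert np.allclose(a, c)
```

Both assertions hold because discrete convolution is commutative and associative.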
Frequency domain filtering
The convolution theorem implies that linear spatial
filtering can be implemented by multiplication in the
spatial frequency domain. In other words, the convolution of equation (2) can be implemented by the
relationship:
G = H × F    (3)

Figure 26.2 For many filtering operations, a digital image is assumed to be periodic
F is the Fourier transform of the original image f, H
is the Fourier transform of the filter h and G is the
Fourier transform of the processed image g. Most
Fourier transform routines will require that the image
is square, with the number of pixels on a side being a
power of 2 (256, 512, 1024 etc.). This is a
requirement of the fast Fourier transform (FFT)
algorithm used.

Figure 26.4 Examples of the use of some of the convolution filters in Figure 26.3. (a) Original image. (b) x-differentiation. (c) y-differentiation. (d) The Laplacian crispening operator with K = 5

An image is processed by taking its
Fourier transform, multiplying that transform by a
frequency space filter, and then taking the inverse
Fourier transform. The rationale for such a procedure is often developed from a consideration of the frequency content of the image and the required modification to that frequency content, rather than as an alternative route to a real-space convolution operation. Figure 26.5 illustrates such a procedure. The
image in (a) fades into the distance at the top of the
frame.

Figure 26.5 (a) Original image. (b) The Fourier transform of (a). (c) The Fourier transform from (b) with frequencies in a selected direction attenuated, as indicated by the pale line. (d) The processed image. Note the removal of the very low frequency component responsible for the fade at the top of the original image. The motorway bridge is also reduced in prominence

This indicates a very strong low frequency
component varying in the vertical, y-direction. There
is also a prominent motorway bridge. This feature
contains a wide range of spatial frequencies in a
direction perpendicular to the alignment of the bridge.
The modulus of the Fourier transform of the image is
displayed as a grey-scale distribution in (b). The
spatial frequencies contained by the bridge and by the
fade feature are distributed along an almost vertical
line through the centre (the origin). If these are
removed (c) and the inverse Fourier transform taken, we arrive at the processed image in (d). The bridge is still visible, but it is far less obvious. The fading has been completely removed. Other features aligned in the same direction as the bridge will also be reduced in contrast, but they are less significant. The same effect could be achieved by operating directly on the appropriate pixels in the original image, but this would require considerable care to produce a 'seamless' result.

Figure 26.6 (a) Original image. (b) The Fourier transform of (a). (c) The Fourier transform in (b) multiplied by a low-pass filter of the form of equation (4). (d) The inverse Fourier transform of (c), showing severe 'ringing' due to the abrupt filter cut-off
Low-pass filtering
A filter that removes all frequencies above a selected
limit is known as a low-pass filter. It can be
defined:
H(u, v) = 1 for u² + v² ≤ ω0²
H(u, v) = 0 otherwise    (4)
u and v are spatial frequencies measured in two
orthogonal directions (usually the horizontal and
vertical). ω0 is the selected limit. Figure 26.6 shows
the effect of such a ‘top-hat’ filter. As well as
‘blurring’ the image and smoothing the noise, this
simple filter tends to produce ‘ringing’ artefacts at
boundaries. This can be understood by considering the equivalent operation in real space. The Fourier transform (or inverse Fourier transform) of the top-hat function is a Bessel function (a kind of two-dimensional sinc function). When this is convolved with an edge, the edge profile is made less abrupt (blurred) and the oscillating wings of the Bessel function create the ripples in the image alongside the edge. To avoid these unwanted ripple artefacts, the
filter is usually modified by replacing the abrupt
transition at ω0 with a gradual transition from 1 to 0
over a small range of frequencies centred at ω0 .
Gaussian, Butterworth and trapezoidal are example
filters of this type.
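The 'top-hat' filter of equation (4) can be sketched directly from its definition; the image size and cut-off frequency below are illustrative:

```python
import numpy as np

def tophat_lowpass(N, omega0):
    """Equation (4): H(u, v) = 1 where u^2 + v^2 <= omega0^2, else 0."""
    u = np.fft.fftfreq(N)             # spatial frequencies in cycles per pixel
    uu, vv = np.meshgrid(u, u)
    return (uu**2 + vv**2 <= omega0**2).astype(float)

H = tophat_lowpass(256, 0.1)          # abrupt cut-off: expect ringing when applied
```

Multiplying an image's Fourier transform by this H and inverting reproduces the severe ringing of Figure 26.6(d); replacing the abrupt edge with a Gaussian or Butterworth roll-off suppresses it.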
Figure 26.7 The result of applying a high-pass Gaussian filter to the image of Figure 26.6(a)

High-pass filtering

If a low-pass filter is reversed, we obtain a high-pass filter:

H(u, v) = 0 for u² + v² ≤ ω0²
H(u, v) = 1 otherwise    (5)

This filter removes all spatial frequencies below ω0 and passes all spatial frequencies above ω0. With this filter, ringing is so severe that it is of little practical use. Instead a gradual-transition version (Gaussian or Butterworth) is used. Figure 26.7 shows the result of a Gaussian-type high-pass filter.

Non-linear filtering

This process differs from the linear convolution illustrated in Figure 26.1 by the inclusion of a decision-making step (an operator), in place of the addition of the nine products, to form the output pixel value. Figure 26.8 illustrates this difference. Unlike linear filters, two or more non-linear filters cannot be combined into a single filter, and the result of two or more filters will depend on the order in which they are applied. Because the decision-making step can be arbitrarily complex, a vast range of effects is possible. Some simple examples are illustrated in the following sections.
Shrink filter
The filter elements are set to unity and the operator is
‘use minimum value’. The minimum pixel value in a
local neighbourhood is therefore selected as the
output pixel at that position. This is sometimes called
a minimum filter. The effect is to cause dark areas to expand by the size of the filter and to remove all light detail smaller than the filter.
Expand filter
This is the same as the shrink filter except the
operator is ‘use maximum value’. It causes light areas
to expand. It may be referred to as a maximum
filter.
Shrink and expand filters are commonly used
sequentially on images. Shrink, followed by expand,
will cause bright regions smaller than the filter size to
be completely removed. Larger regions are virtually
unaltered. Figure 26.9 illustrates the effect of shrink
and expand filters, used individually and in
sequence.
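The shrink and expand filters can be sketched with SciPy's ndimage module (they correspond to its minimum and maximum filters); the 3 x 3 neighbourhood and the single-bright-pixel test image are illustrative:

```python
import numpy as np
from scipy.ndimage import minimum_filter, maximum_filter

image = np.array([[0, 0, 0, 0, 0],
                  [0, 9, 0, 0, 0],    # one bright pixel, smaller than the filter
                  [0, 0, 0, 0, 0],
                  [0, 0, 0, 0, 0],
                  [0, 0, 0, 0, 0]])

shrunk   = minimum_filter(image, size=3)   # dark areas expand
expanded = maximum_filter(shrunk, size=3)  # light areas expand again

# Shrink followed by expand removes bright regions smaller than the filter.
assert expanded.max() == 0
```

Larger bright regions would survive the sequence virtually unaltered, as Figure 26.9 shows.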
Figure 26.8 (a) The linear spatial convolution process. (b) The non-linear spatial process
Threshold average filter
The filter values are set to produce the local average about, but not including, the central pixel, i.e.

        1/8  1/8  1/8
h  =    1/8   0   1/8
        1/8  1/8  1/8
This average is then compared with the central pixel
value, and if it differs by more than a pre-set
threshold, the central pixel is replaced by the local
average. This is a useful process for dealing with
corrupted pixels of a random nature (so-called ‘data
drop-outs’).
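The threshold average filter described above can be sketched as follows; the threshold value and the test image (a uniform field with one 'data drop-out') are illustrative:

```python
import numpy as np
from scipy.ndimage import convolve

# Local average excluding the central pixel (the kernel h above).
h = np.array([[1, 1, 1],
              [1, 0, 1],
              [1, 1, 1]]) / 8.0

def threshold_average(image, threshold):
    local_avg = convolve(image.astype(float), h, mode='nearest')
    out = image.astype(float).copy()
    # Replace only pixels differing from their local average by more
    # than the pre-set threshold.
    corrupted = np.abs(image - local_avg) > threshold
    out[corrupted] = local_avg[corrupted]
    return out

img = np.full((5, 5), 100.0)
img[2, 2] = 0.0                       # a corrupted pixel
fixed = threshold_average(img, threshold=50)
assert fixed[2, 2] == 100.0           # drop-out repaired; other pixels untouched
```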
Median filter

For this important filter, the operator is 'select the median'. Normally the filter values are set to unity, in which case the output pixel is just the median (middle value) of the neighbourhood defined by the filter size. The effect of the median filter is to preserve the edges of larger objects while removing all small features. In other words it smoothes the image whilst retaining the significant edges as sharply defined luminance discontinuities. It is useful as a noise reduction filter for certain types of image noise.

Statistical operations (point, grey-level operations)

Given an 8-bit deep image f(i, j), it is useful to consider the pixel values f as a random variable taking values between 0 and 255. For randomly selected pixels we can define the probability distribution function, P(f), as:

P(f) = probability that pixel value is less than or equal to f.

It follows that 0 ≤ P(f) ≤ 1 and P(255) = 1.

The first derivative of the probability distribution function is known as the probability density function (PDF). If we denote this as p(f) we have:

p(f) = dP(f)/df    (6)

For a digital image the value of the PDF for f = fk can be estimated from the number of image pixels taking the value fk, i.e.
p(fk) ≈ Nk / N

where Nk = the number of pixels of value fk, and N = the total number of pixels in the image. The PDF is therefore the normalized grey-level histogram. If the histogram is denoted hg(f), we have:

p(f) ≈ hg(f) / N    (7)

Figure 26.9 Various non-linear filters applied to the image of Figure 26.6(a). (a) Shrink filter. (b) Expand filter. (c) Shrink followed by expand. (d) Expand followed by shrink
The mean and variance of a digital image
Using the usual definitions of mean and variance, we
have:
μ = (1/N) Σf f·h(f)    (8)

σ² = (1/N) Σf (f − μ)²·h(f)    (9)

where the sums run over f = 0 to 255.
These are the simplest measures of image statistics.
They are important in many areas of image processing, including image restoration and image classification. Note that any image processing operation that
changes the image histogram also changes the image
statistics.
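Equations (8) and (9) can be sketched directly from the grey-level histogram and checked against a direct computation; the random test image is illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
image = rng.integers(0, 256, size=(64, 64))

h, _ = np.histogram(image, bins=256, range=(0, 256))  # grey-level histogram
N = image.size
f = np.arange(256)

mu = (f * h).sum() / N                       # equation (8)
var = (((f - mu) ** 2) * h).sum() / N        # equation (9)

# The histogram-based statistics agree with NumPy's direct values.
assert np.isclose(mu, image.mean())
assert np.isclose(var, image.var())
```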
Histogram modification techniques
These are operations affecting the image grey-levels on a point-by-point basis. They can be represented by the 1-d transform:

g = T(f)    (10)

where f is the input grey-level and g is the output grey-level.

Figure 26.10 Example transformations. (a) Negative. (b) Selective grey-level stretch. (c) Binary segmentation. (d) Sawtooth

Negative transform

The transform is given by:

T(f) = 255 − f    (11)

and is illustrated in Figures 26.10(a) and 26.11.

Selective grey-level stretch

The transform is given by:

T(f) = 0                           f < t0
T(f) = 255 (f − t0)/(t1 − t0)      t0 ≤ f ≤ t1
T(f) = 255                         f > t1    (12)

It is plotted in Figure 26.10(b). Pixels less than t0 are set to zero. Pixels greater than t1 are set to 255. The remaining pixel values are stretched linearly between zero and 255. Note that information is generally lost with this transformation (because some pixels are set to zero or 255), although the resulting image may be visually enhanced.

Binary segmentation

T(f) = 0      f < t0
T(f) = 255    t0 ≤ f ≤ t1
T(f) = 0      f > t1    (13)

This is illustrated in Figures 26.10(c) and 26.11.

Saw-tooth grey-level transformation

This transformation, shown in Figure 26.10(d), is useful for displaying images with a large dynamic range, e.g. astronomical images where the scene may be quantized to 16 bits. To see the detail in these images on a simple 256 grey-level display the saw-tooth transformation is employed.
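Point transformations of this kind are naturally implemented as 256-entry look-up tables applied to every pixel. A minimal sketch, with illustrative threshold values t0 and t1:

```python
import numpy as np

f = np.arange(256)                    # all possible input grey-levels
t0, t1 = 50, 200                      # illustrative thresholds

negative = (255 - f).astype(np.uint8)                     # equation (11)
stretch = np.clip(255 * (f - t0) / (t1 - t0),
                  0, 255).astype(np.uint8)                # equation (12)
binary_seg = np.where((f >= t0) & (f <= t1),
                      255, 0).astype(np.uint8)            # equation (13)

def apply_lut(image, lut):
    """Apply a grey-level transform g = T(f) point by point."""
    return lut[image]

img = np.array([[10, 100], [150, 250]], dtype=np.uint8)
out = apply_lut(img, negative)        # out[0, 0] is 255 - 10 = 245
```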
Figure 26.11 The negative of (a) is shown in (b). The image in (c) is a binary segmentation in which the upper threshold, t1, is set to 255. The process is essentially binary thresholding. The image in (d) arises from binary segmentation using the transformation 26.10(c)

Histogram equalization

If we assume that the important information in an image is contained in the grey-level values that occur most frequently, then it makes sense to apply a grey-level transformation that stretches the contrast in regions of high PDF and compresses the contrast in regions of low PDF. The derivation of the required transformation can be understood by referring to Figure 26.12.

Figure 26.12 The principle of histogram equalization. The top diagram shows a hypothetical PDF for the input image. The lower diagram shows the constant PDF achieved by appropriate stretching and compression of the input contrast

The number of pixels in the grey-level interval Δf (input) is the same as the number of pixels in the grey-level interval Δg (output), i.e.:

pf(f)·Δf = pg(g)·Δg

so that

pg(g) = pf(f)·(Δf/Δg)

Letting Δf and Δg tend to zero yields:

pg(g) = pf(f)·(df/dg) = constant    (14)

so that

dg/df = k·pf(f)    (15)

If we integrate both sides of this expression with respect to f, we obtain the required relationship between g and f:

g = k ∫0f pf(x) dx = k·Pf(f)    (16)

Hence the required grey-level transformation, T(f) = k·Pf(f), is a scaled version of the probability distribution function of the input image. The scaling factor k is just gmax (usually 255).

In practice, pixel values are integers and T(f) is estimated from the running sum of the histogram. Since the output image must also be quantized into integers, the histogram of the output image is not flat as might be expected. The histogram is stretched and compressed sideways so as to maintain a constant local integral. As a result, some output grey-levels may not be used.

It is crucial to understand that if the most frequently occurring pixel values do not contain the important information, then histogram equalization will probably have an adverse effect on the usefulness of the image. For the process to be controllable, most image processing packages offer an option to construct the input histogram from a selected sub-region of the image. An example of histogram equalization is shown in Figure 26.13.

Image restoration

The process of recovering an image that has been degraded, using some a priori knowledge of the degradation phenomenon, is known as image restoration. In contrast, those image processing techniques chosen to manipulate an image to improve it for some subsequent visual or machine-based decision are usually termed enhancement procedures. Image restoration requires that we have a model of the degradation, which can be reversed and applied as a filter. The procedure is illustrated in Figure 26.14.

We assume linearity, so that the degradation can be regarded as a convolution with a point spread function together with the addition of noise, i.e.

f(x, y) = g(x, y) ∗ h(x, y) + n(x, y)    (17)

where f(x, y) is the recorded image, g(x, y) is the original ('ideal') image, h(x, y) is the system point spread function and n(x, y) is the noise.

We need to find a correction process C{ } to apply to f(x, y) with the intention of recovering g(x, y): i.e. C{f(x, y)} → g(x, y) (or, at least, something close). Ideally the correction process should be simple, such as a convolution.

Inverse filtering
This is the simplest form of image restoration. It attempts to remove the effect of the point spread function by applying an inverse filter. Writing the Fourier space equivalent of equation (17) we obtain:

F(u, v) = G(u, v)·H(u, v) + N(u, v)    (18)

where F, G, H and N represent the Fourier transforms of f, g, h and n respectively. u and v are the spatial frequency variables in the x- and y-directions.

An estimate for the recovered image, Gest(u, v), can be obtained using:

Gest(u, v) = F(u, v)/H(u, v) = G(u, v) + N(u, v)/H(u, v)    (19)

If there is no noise, N(u, v) = 0, and if H(u, v) contains no zero values we get a perfect reconstruction. Generally noise is present and N(u, v) ≠ 0. Often it has a constant value for all significant frequencies. Since, for most degradations, H(u, v) tends to zero for large values of u and v, the term N(u, v)/H(u, v)
Figure 26.13 Histogram equalization. (a) Original image. (b) Equalized image. (c) Histogram of original image. (d) Histogram of equalized image
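The discrete equalization procedure, with T(f) estimated from the running sum of the histogram, can be sketched as follows; the low-contrast random test image is illustrative:

```python
import numpy as np

def equalize(image):
    """Histogram equalization: T(f) = k.Pf(f) with k = 255."""
    h, _ = np.histogram(image, bins=256, range=(0, 256))
    P = np.cumsum(h) / image.size          # running sum -> distribution function
    T = np.round(255 * P).astype(np.uint8) # scaled to gmax = 255
    return T[image]                        # apply as a look-up table

rng = np.random.default_rng(3)
img = rng.integers(0, 64, size=(32, 32), dtype=np.uint8)  # low-contrast image
eq = equalize(img)
assert eq.max() == 255                 # contrast stretched to the full range
```

Note that, as discussed above, the output histogram is not exactly flat: quantization means some output grey-levels are simply never used.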