利用离散余弦变换进行图像压缩毕业论文外文翻译.doc

资源描述

《利用离散余弦变换进行图像压缩毕业论文外文翻译.doc》由会员分享，可在线阅读，更多相关《利用离散余弦变换进行图像压缩毕业论文外文翻译.doc（24页珍藏版）》请在三一办公上搜索。

1、Image Compression Using the Discrete Cosine TransformAbstractThe discrete cosine transform (DCT) is a technique for converting a signal into elementary frequency components. It is widely used in image compression. Here we develop some simple functions to compute the DCT and to compress images. These

2、 functions illustrate the power of Mathematica in the prototyping of image processing algorithms.The rapid growth of digital imaging applications, including desktop publishing, multimedia, teleconferencing, and high-definition television (HDTV) has increased the need for effective and standardized i

3、mage compression techniques. Among the emerging standards are JPEG, for compression of still images Wallace 1991; MPEG, for compression of motion video Puri 1992; and CCITT H.261 (also known as Px64), for compression of video telephony and teleconferencing.All three of these standards employ a basic

4、 technique known as the discrete cosine transform (DCT). Developed by Ahmed, Natarajan, and Rao 1974, the DCT is a close relative of the discrete Fourier transform (DFT). Its application to image compression was pioneered by Chen and Pratt 1984. In this article, I will develop some simple functions

5、to compute the DCT and show how it is used for image compression. We have used these functions in our laboratory to explore methods of optimizing image compression for the human viewer, using information about the human visual system Watson 1993. The goal of this paper is to illustrate the use of Ma

6、thematica in image processing and to provide the reader with the basic tools for further exploration of this subject.The One-Dimensional Discrete Cosine TransformThe discrete cosine transform of a list of n real numbers s(x), x = 0, ., n-1, is the list of length n given by:s(u)= C(u) u=0,nwhere for

7、u=0 =1 otherwiseEach element of the transformed list S(u) is the inner (dot) product of the input list s(x) and basis vector. The constant factors are chosen so that the basis vectors are orthogonal and normalized. The eight basis vectors for n = 8 are shown in Figure 1. The DCT can be writthe produ

8、ct of a vector (the input list) and the n x n orthogonal matrix whose rows are the vectors. This matrix, for n = 8, can be computed as follows:DCTMatrix =Table If k=0,Sqrt1/8,Sqrt2/8 CosPi (2j+1) k/16 ,k,0,7,j,0,7 / N;We can check that the matrix is orthogonal:DCTMatrix . TransposeDCTMatrix / Chop /

9、 MatrixForm1. 0 0 0 0 0 0 00 1. 0 0 0 0 0 00 0 1. 0 0 0 0 00 0 0 1. 0 0 0 00 0 0 0 1. 0 0 00 0 0 0 0 1. 0 00 0 0 0 0 0 1. 00 0 0 0 0 0 0 1.Each basis vector corresponds to a sinusoid of a certain frequency:ShowGraphicsArrayPartitionListPlot#, PlotRange - -.5, .5, PlotJoined - True, DisplayFunction -

10、 Identity& / DCTMatrix, 2 ;Figure 1. The eight basis vectors for the discrete cosine transform of length eight.The list s(x) can be recovered from its transform S(u) by applying the inverse cosine transform (IDCT):= x=0,nwhere for u=0 =1 otherwiseThis equation expresses s as a linear combination of

11、the basis vectors. The coefficients are the elements of the transform S, which may be regarded as reflecting the amount of each frequency present in the inputs.We generate a list of random numbers to serve as a test input:input1 = TableRandomReal, -1, 1, 80.203056, 0.980407, 0.35312, -0.106651, 0.03

12、99382, 0.871475, -0.648355, 0.501067The DCT is computed by matrix multiplication:output1 = DCTMatrix . input10.775716, 0.3727, 0.185299, 0.0121461, -0.325, -0.993021, 0.559794, -0.625127As noted above, the DCT is closely related to the discrete Fourier transform (DFT). In fact, it is possible to com

13、pute the DCT via the DFT (see Jain 1989, p. 152): First create a new list by extracting the even elements, followed by the reversed odd elements. Then multiply the DFT of this re-ordered list by so-called twiddle factors and take the real part. We can carry out this process for n = 8 using Mathemati

14、cas DFT function.DCTTwiddleFactors = N Join1, TableSqrt2 Exp-I Pi k /16, k, 71., 1.38704 - 0.275899 I, 1.30656 - 0.541196 I, 1.17588 - 0.785695 I, 1. - 1. I, 0.785695 - 1.17588 I, 0.541196 - 1.30656 I, 0.275899 - 1.38704 IThe function to compute the DCT of a list of length n = 8 is then:DCTlist_ :=

15、Re DCTTwiddleFactors *InverseFourierNlist1, 3, 5, 7, 8, 6, 4, 2Note that we use the function InverseFourier to implement what is usually in engineering called the forward DFT. Likewise, we use Fourier to implement what is usually called the inverse DFT. The function N is used to convert integers to

16、reals because (in Version 2.2) Fourier and InverseFourier are not evaluated numerically when their arguments are all integers. The special case of a list of zeros needs to be handled separately by overloading the functions, since N of the integer 0 is an integer and not a real. UnprotectFourier, Inv

17、erseFourier;Fourierx:0 .:= x;InverseFourierx:0 .:= x;ProtectFourier, InverseFourier;We apply DCT to our test input and compare it to the earlier result computed by matrix multiplication. To compare the results, we subtract them and apply the Chop function to suppress values very close to zero:DCTinp

18、ut10.775716, 0.3727, 0.185299, 0.0121461, -0.325, -0.993021, 0.559794, -0.625127% - output1 / Chop0, 0, 0, 0, 0, 0, 0, 0The inverse DCT can be computed by multiplication with the inverse of the DCT matrix. We illustrate this with our previous example:InverseDCTMatrix . output10.203056, 0.980407, 0.3

19、5312, -0.106651, 0.0399382, 0.871475, -0.648355, 0.501067% - input1 / Chop0, 0, 0, 0, 0, 0, 0, 0As you might expect, the IDCT can also be computed via the inverse DFT. The twiddle factors are the complex conjugates of the DCT factors and the reordering is applied at the end rather than the beginning

20、:IDCTTwiddleFactors = ConjugateDCTTwiddleFactors1., 1.38704 + 0.275899 I, 1.30656 + 0.541196 I, 1.17588 + 0.785695 I, 1. + 1. I, 0.785695 + 1.17588 I, 0.541196 + 1.30656 I, 0.275899 + 1.38704 IIDCTlist_ := ReFourierIDCTTwiddleFactors list 1, 8, 2, 7, 3, 6, 4, 5For example:IDCTDCTinput1 - input1 / Ch

21、op0, 0, 0, 0, 0, 0, 0, 0The Two-Dimensional DCTThe one-dimensional DCT is useful in processing one-dimensional signals such as speech waveforms. For analysis of two-dimensional (2D) signals such as images, we need a 2D version of the DCT. For an n x m matrix s, the 2D DCT is computed in a simple way

22、: The 1D DCT is applied to each row of s and then to each column of the result. Thus, the transform of s is given by u=0,n v=0,mwhere for u=0 =1 otherwiseSince the 2D DCT can be computed by applying 1D transforms separately to the rows and columns, we say that the 2D DCT is separable in the two dime

23、nsions.As in the one-dimensional case, each element S(u, v) of the transform is the inner product of the input and a basis function, but in this case, the basis functions are n x m matrices. Each two-dimensional basis matrix is the outer product of two of the one-dimensional basis vectors. For n = m

24、 = 8, the following expression creates an 8 x 8 array of the 8 x 8 basis matrices, a tensor with dimensions 8, 8, 8, 8:DCTTensor = ArrayOuterTimes, DCTMatrix#1, DCTMatrix#2&, 8, 8;Each basis matrix can be thought of as an image. The 64 basis images in the array are shown in Figure 2. The package Gra

25、phicsImage.m, included in the electronic supplement, contains the functions GraphicsImage and ShowImage to create and display a graphics object from a given matrix. GraphicsImage uses the built-in function Raster to translate a matrix into an array of gray cells. The matrix elements are scaled so th

26、at the image spans the full range of graylevels. An optional second argument specifies a range of values to occupy the full grayscale; values outside the range are clipped. The function ShowImage displays the graphics object using Show. GraphicsImage.mShowGraphicsArrayMapGraphicsImage#, -.25, .25&,

27、ReverseDCTTensor, 2 ;Figure 2. The 8 x 8 array of basis images for the two-dimensional discrete cosine transform.Each basis matrix is characterized by a horizontal and a vertical spatial frequency. The matrices shown here are arranged left to right and bottom to top in order of increasing frequencie

28、s.To illustrate the 2D transform, we apply it to an 8 x 8 image of the letter A:ShowImage input2 =0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0,0, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0,0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0,0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0As in the 1D ca

29、se, it is possible to express the 2D DCT as an array of inner products (a tensor contraction):output2 = Array(Plus FlattenDCTTensor#1, #2 input2)&,8, 8;ShowImageoutput2The pixels in this DCT image describe the proportion of each two-dimensional basis function present in the input image. The pixels a

30、re arranged as above, with horizontal and vertical frequency increasing from left to right and bottom to top, respectively. The brightest pixel in the lower left corner is known as the DC term, with frequency 0, 0. It is the average of the pixels in the input, and is typically the largest coefficien

31、t in the DCT of natural images.An inverse 2D IDCT can also be computed in terms of DCT Tensor; we leave this as an exercise for the reader. Since the two-dimensional DCT is separable, we can extend our function DCT to the case of two-dimensional input as follows:DCTarray_?MatrixQ :=TransposeDCT / Tr

32、ansposeDCT / array This function assumes that its input is an 8 x 8 matrix. It takes the 1D DCT of each row, transposes the result, takes the DCT of each new row, and transposes again. This function is more efficient than computing the tensor contraction shown above, since it exploits the built-in f

33、unction InverseFourier.We compare the result of this function to that obtained using contraction of tensors :DCTinput2 - output2 / Chop / Abs / Max 0The definition of the inverse 2D DCT is straightforward:IDCTarray_?MatrixQ :=TransposeIDCT / TransposeIDCT / array As an example, we invert the transfo

34、rm of the letter A:ShowImageChopIDCToutput2;As noted earlier, the components of the DCT output indicate the magnitude of image components at various 2D spatial frequencies. To illustrate, we can set the last row and column of the DCT of the letter A equal to zero:output28 = Table0, 8;Dooutput2i, 8 =

35、 0, i, 8;Now take the inverse transform:ShowImageChopIDCToutput2;The result is a blurred letter A: the highest horizontal and vertical frequencies have been removed. This is easiest to see when the image is reduced in size so that individual pixels are not as visible.QuantizationDCT-based image comp

36、ression relies on two techniques to reduce the data required to represent the image. The first is quantization of the images DCT coefficients; the second is entropy coding of the quantized coefficients. Quantization is the process of reducing the number of possible values of a quantity, thereby redu

37、cing the number of bits needed to represent it. Entropy coding is a technique for representing the quantized data as compactly as possible. We will develop functions to quantize images and to calculate the level of compression provided by different degrees of quantization. We will not implement the

38、entropy coding required to create a compressed image file.A simple example of quantization is the rounding of reals into integers. To represent a real number between 0 and 7 to some specified precision takes many bits. Rounding the number to the nearest integer gives a quantity that can be represent

39、ed by just three bits.x = RandomReal, 0, 7 2.78452Roundx 3In this process, we reduce the number of possible values of the quantity (and thus the number of bits needed to represent it) at the cost of losing information. A finer quantization, that allows more values and loses less information, can be

40、obtained by dividing the number by a weight factor before rounding:w = 1/4;Roundx/w 11Taking a larger value for the weight gives a coarser quantization.Dequantization, which maps the quantized value back into its original range (but not its original precision) is acheived by multiplying the value by

41、 the weight:w * % / N 2.75The quantization error is the change in a quantity after quantization and dequantization. The largest possible quantization error is half the value of the quantization weight.In the JPEG image compression standard, each DCT coefficient is quantized using a weight that depen

42、ds on the frequencies for that coefficient. The coefficients in each 8 x 8 block are divided by a corresponding entry of an 8 x 8 quantization matrix, and the result is rounded to the nearest integer.In general, higher spatial frequencies are less visible to the human eye than low frequencies. There

43、fore, the quantization factors are usually chosen to be larger for the higher frequencies. The following quantization matrix is widely used for monochrome images and for the luminance component of a color image. It is given in the JPEG standards documents, yet is not part of the standard, so we call

44、 it the de facto matrix: qLum = 16, 11, 10, 16, 24, 40, 51, 61, 12, 12, 14, 19, 26, 58, 60, 55, 14, 13, 16, 24, 40, 57, 69, 56, 14, 17, 22, 29, 51, 87, 80, 62, 18, 22, 37, 56, 68,109,103, 77, 24, 35, 55, 64, 81,104,113, 92, 49, 64, 78, 87,103,121,120,101, 72, 92, 95, 98,112,100,103, 99;Displaying th

45、e matrix as a grayscale image shows the dependence of the quantization factors on the frequencies: ShowImageqLum;To implement the quantization process, we must partition the transformed image into 8 x 8 blocks:BlockImageimage_, blocksize_:8, 8 :=Partitionimage, blocksize /; And IntegerQ / (Dimension

46、simage/blocksize)The function UnBlockImage reassembles the blocks into a single image:UnBlockImageblocks_ :=Partition FlattenTransposeblocks, 1, 3, 2, Times Dimensionsblocks2, 4For example:Tablei + 8(j-1), j, 4, i, 6 / MatrixForm1 2 3 4 5 69 10 11 12 13 1417 18 19 20 21 2225 26 27 28 29 30BlockImage%, 2, 3 / MatrixForm1 2 3 4 5 69 10 11 12 13 1417 18 19 20 21 2225 26 27 28 29 30UnBlockImage% / MatrixForm1 2 3 4 5 69 10 11 12 13 1417 18 19 20 21 2

展开阅读全文