Foreign-Language Literature: An Introduction to Digital Image Compression Technology


Appendix

Appendix I

Digital image compression

Digital image compression, also known as image compression or image coding, is divided into still-image compression and moving-image compression (video compression). Image data are highly correlated, so a large amount of redundant information exists both within a single image and between the images of a video sequence. This redundancy takes five main forms:

(1) Temporal redundancy: adjacent frames of an image sequence differ very little. This time-related redundancy is called temporal redundancy.

(2) Spatial redundancy: an image may contain uniformly colored regions or regular patterns. This space-related redundancy is called spatial redundancy.

(3) Structural redundancy: some regions of an image have a strong texture structure, or the parts of the image are related to one another in some way, for example through self-similarity. This is called structural redundancy.

(4) Knowledge redundancy: the information in an image is related to certain a priori knowledge. In an ordinary face image, for example, the relative positions of the head, eyes, nose and mouth are common knowledge. This is called knowledge redundancy.

(5) Visual redundancy: in most cases the final recipient of the reconstructed image is the human eye, so a higher compression ratio can be reached by exploiting the characteristics of the human visual system. For example, the eye's ability to distinguish colors varies from color to color, and its sensitivity varies with direction. A coding scheme that exploits such characteristics can further improve both the compression ratio and the so-called subjective quality of the image.

Image coding aims to remove as much of this redundancy as possible in order to reduce the number of bits needed to represent the image. The commonly used image compression methods are the following.

1. Run-length encoding (RLE)

Run-length encoding is one of the simplest ways to compress a file. It replaces a run of repeated values (for example, the gray values of image pixels) with a single value plus a count. For example, the run-length encoding of the letter sequence aabbbccccccccdddddd is 2a3b8c6d. The method is very easy to implement and is very effective for data that contain long runs of repeated values, such as images with large areas of continuous shading or uniform color. Many bitmap file formats use run-length encoding, for example TIFF, PCX and GEM.

2. LZW coding

LZW is an abbreviation of the names of its three inventors (Lempel, Ziv and Welch). Its principle is to pair each byte value with the value of the next byte to form a character pair and to assign a code to each pair. When the same pair occurs again, it is replaced by its code, and that code is in turn paired with the next character. An important property of LZW is that a code can replace not only a string of identical values but also a string of different values. If certain strings of different values recur often in the image data, a code can be found to replace them. In this respect the LZW principle is superior to RLE.

3. Huffman coding

Huffman coding replaces the original data with codes of variable length. It was originally devised to compress text files, and many variants now exist. Its basic idea is that the more frequently a value occurs, the shorter its code, and the less frequently a value occurs, the longer its code. Huffman coding rarely achieves a compression ratio of 8:1, and it has two further drawbacks.

First, it needs accurate statistics on how often each value occurs in the original file. Without them the compression gain is greatly reduced, and the data may not be compressed at all. Huffman coding therefore usually takes two passes, the first to gather the statistics and the second to generate the code, so encoding is relatively slow. Decoding codes of varying length is also relatively complex, so decompression is relatively slow as well.

Second, it is sensitive to inserted or deleted bits. All the bits of a Huffman-coded stream run together without regard to byte boundaries, so adding or dropping a single bit makes the decoded result unrecognizable.

4. Predictive and interpolative coding

Pixels within a local region of an image are usually highly correlated, so the gray level of the current pixel can be estimated from the gray levels of preceding pixels. This is prediction. Interpolation infers the gray level of the current pixel from both preceding and following pixels. If the prediction or interpolation is accurate, there is no need to code the gray level of every pixel. Instead, the difference between the predicted value and the actual pixel value is entropy coded and sent to the receiver, which reconstructs the original pixel by adding the difference signal to the predicted value.

Predictive coding gives relatively high coding quality and is relatively simple to implement, so it is widely used in image compression systems. Its compression ratio is not high, however, and accurate prediction depends on extensive a priori knowledge of the image characteristics and requires a large number of nonlinear operations. It is therefore generally not used alone but combined with other methods. In JPEG, for example, predictive coding is applied to the DC coefficients of the DCT, while the AC coefficients are coded by quantization, run-length coding and Huffman coding.

5. Vector quantization coding

Vector quantization exploits the high correlation between adjacent image data. The input data sequence is divided into groups, and each group of m values forms an m-dimensional vector that is coded as a unit, so that several samples are quantized at once. According to Shannon's rate-distortion theory, vector quantization always performs at least as well as scalar quantization, even for memoryless sources.

Before coding, a series of standard image patterns is obtained by training or learning on a large number of samples, or by means of a self-organizing feature map neural network. Each pattern is called a codeword or code vector, and the codewords together form the codebook, which is in effect a database. Each input image block is arranged in a fixed way into an input vector. The encoder computes the distance between the input vector and every codeword in the codebook, finds the nearest codeword (that is, the best-matching image block), and outputs its index (address) as the coding result. Decoding is the reverse process: the index is used to look up the corresponding codeword in the codebook, which must be identical to the codebook used for encoding, and the codeword forms the decoded result. Vector quantization is therefore a lossy coding method.

The vector quantization schemes in common use today are mainly of the random type, including transform-domain vector quantization, finite-state vector quantization, address vector quantization, gain-shape vector quantization, classified vector quantization and predictive vector quantization.

6. Transform coding

Transform coding transforms the image intensity matrix (the time-domain signal or, for a still image, the spatial-domain signal) into a coefficient space (the frequency-domain signal) and processes it there. A signal that is strongly correlated in space shows up in the frequency domain with its energy concentrated in certain regions, or with coefficients that follow some regular distribution. These regularities can be used to reduce the number of quantization bits in the frequency domain and so achieve compression. The matrix of an orthogonal transform is invertible, and its inverse equals its transpose, which guarantees that decoding has a solution and keeps the computation simple. For this reason an orthogonal transform is always chosen.

The commonly used transform coding methods are K-L (Karhunen-Loève) transform coding and DCT coding. K-L transform coding gives a better compression ratio than DCT coding, but it is computationally expensive and has no fast algorithm, so DCT coding is the one widely used in practice.

7. Model-based coding

Predictive coding, vector quantization coding and transform coding are all forms of waveform coding. Their theoretical basis is signal theory and information theory. They treat the image signal as an irregular statistical signal, and the encoder is designed from a statistical model of the image signal, namely the correlation between pixels.

Model-based coding instead uses knowledge from computer vision and computer graphics to analyze and synthesize the image signal. It regards the image as the projection of objects and scenes in the three-dimensional world onto a two-dimensional plane, and the quality of that projection is judged by the characteristics of the human visual system. The key step is to build a model for a particular image and to use the model to determine characteristic parameters of the scene, such as motion parameters and shape parameters. The decoder then reconstructs the image from these parameters and the known model by image synthesis. Because the characteristic parameters are coded rather than the original image, a relatively large compression ratio is possible. The errors introduced by model-based coding are mainly geometric distortions, to which human vision is not very sensitive, so the reconstructed image looks natural and realistic.

In addition, fractal coding and wavelet transform techniques have been applied increasingly to image compression in recent years, but most of this work is still at the research stage, and the methods described above remain the mainstream. In practical applications, several image compression methods are usually used in combination, as in JPEG.
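To make the run-length encoding of method 1 concrete, the following is a minimal sketch that reproduces the 2a3b8c6d example. The function names rle_encode and rle_decode are illustrative, and the decoder assumes that the symbols themselves are never digits, which holds for the letter sequence in the text but not for raw pixel data.

```python
from itertools import groupby


def rle_encode(text):
    """Replace each run of a repeated symbol with '<count><symbol>'."""
    return "".join(f"{len(list(group))}{symbol}" for symbol, group in groupby(text))


def rle_decode(encoded):
    """Invert rle_encode. Assumes the symbols are never digits."""
    pieces = []
    count = ""
    for ch in encoded:
        if ch.isdigit():
            count += ch  # counts may have several digits
        else:
            pieces.append(ch * int(count))
            count = ""
    return "".join(pieces)


# The letter sequence from the text: 2 a's, 3 b's, 8 c's, 6 d's.
sample = "aa" + "bbb" + "c" * 8 + "d" * 6
print(rle_encode(sample))  # 2a3b8c6d
```

Practical image formats usually store the count and the value as binary fields rather than as text, but the principle is the same.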
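The pairing idea behind LZW (method 2) can be illustrated with the textbook dictionary formulation, in which the longest string already in the table is paired with the next character and the pair receives a new code. This is a sketch under assumptions rather than any particular file format's variant: the names are illustrative, the input is assumed to contain only characters with code points below 256, and codes are kept as plain integers instead of being packed into bits.

```python
def lzw_encode(text):
    """Return the list of integer codes for text (code points < 256 assumed)."""
    table = {chr(i): i for i in range(256)}  # single characters are predefined
    next_code = 256
    current = ""
    codes = []
    for ch in text:
        candidate = current + ch
        if candidate in table:
            current = candidate              # keep extending the known string
        else:
            codes.append(table[current])     # emit the code of the known string
            table[candidate] = next_code     # register string + next character
            next_code += 1
            current = ch
    if current:
        codes.append(table[current])
    return codes


def lzw_decode(codes):
    """Rebuild the text; the table is reconstructed on the fly."""
    if not codes:
        return ""
    table = {i: chr(i) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    pieces = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:              # code defined by the current step
            entry = previous + previous[0]
        else:
            raise ValueError("invalid LZW code")
        pieces.append(entry)
        table[next_code] = previous + entry[0]
        next_code += 1
        previous = entry
    return "".join(pieces)


codes = lzw_encode("abababababab")
print(len(codes), lzw_decode(codes))
```

The repeated string here consists of different values (a and b), which run-length encoding could not shorten. This is the advantage over RLE that the text points out.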
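The two passes of Huffman coding (method 3) can be sketched as follows: the first pass counts how often each value occurs, and the second builds the codes by repeatedly merging the two least frequent groups. The names huffman_codes and huffman_encode are illustrative, and the output is a string of '0' and '1' characters rather than packed bits, an assumption made for readability.

```python
import heapq
from collections import Counter


def huffman_codes(data):
    """Return a dict mapping each symbol to its variable-length bit string."""
    freq = Counter(data)                      # first pass: gather statistics
    if not freq:
        return {}
    if len(freq) == 1:                        # a lone symbol still needs one bit
        return {next(iter(freq)): "0"}
    # Heap entries are (count, tie_breaker, partial code table).
    heap = [(count, i, {symbol: ""}) for i, (symbol, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        c1, _, t1 = heapq.heappop(heap)       # the two least frequent groups
        c2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in t1.items()}
        merged.update({s: "1" + code for s, code in t2.items()})
        heapq.heappush(heap, (c1 + c2, tie, merged))
        tie += 1
    return heap[0][2]


def huffman_encode(data, codes):
    """Second pass: replace every symbol with its code."""
    return "".join(codes[symbol] for symbol in data)


sample = "aa" + "bbb" + "c" * 8 + "d" * 6
codes = huffman_codes(sample)
bits = huffman_encode(sample, codes)
print(codes, len(bits))
```

For this sample the most frequent symbol, c, receives the shortest code. Because the codes have different lengths and are concatenated without separators, a single inserted or dropped bit shifts every later code boundary, which is the sensitivity the text describes.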
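For predictive coding (method 4), the simplest predictor takes the previous pixel as the estimate of the current one and transmits only the difference. The sketch below assumes that predictor, an initial prediction of 0, and no quantization, so it is lossless. The entropy-coding stage that would follow is omitted, and the function names are illustrative.

```python
def dpcm_encode(row):
    """Return the prediction errors for one row of gray levels."""
    residuals = []
    prediction = 0                  # assumed starting prediction
    for pixel in row:
        residuals.append(pixel - prediction)
        prediction = pixel          # the previous pixel predicts the next one
    return residuals


def dpcm_decode(residuals):
    """Rebuild the row by adding each difference to the running prediction."""
    row = []
    prediction = 0
    for residual in residuals:
        pixel = prediction + residual
        row.append(pixel)
        prediction = pixel
    return row


row = [100, 101, 103, 103, 102, 104]
residuals = dpcm_encode(row)
print(residuals)  # [100, 1, 2, 0, -1, 2]
```

After the first sample the differences cluster near zero, so an entropy coder such as Huffman coding can represent them with fewer bits than the raw gray levels.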
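The encoding and decoding steps of vector quantization (method 5) can be sketched with a nearest-codeword search. The tiny three-entry codebook below is invented for illustration. In practice it would come from training on many sample images, which is not shown here. Squared Euclidean distance is assumed as the distance measure.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def vq_encode(vectors, codebook):
    """Map every input vector to the index of its nearest codeword."""
    return [
        min(range(len(codebook)), key=lambda i: squared_distance(vector, codebook[i]))
        for vector in vectors
    ]


def vq_decode(indices, codebook):
    """Look each index up in the same codebook the encoder used."""
    return [codebook[i] for i in indices]


# Hypothetical codebook of 4-dimensional vectors (for example, 2x2 pixel blocks).
codebook = [(0, 0, 0, 0), (128, 128, 128, 128), (255, 255, 255, 255)]
blocks = [(3, 1, 0, 2), (120, 130, 127, 131), (250, 255, 251, 254)]

indices = vq_encode(blocks, codebook)
print(indices)                       # [0, 1, 2]
print(vq_decode(indices, codebook))  # codewords, not the original blocks
```

Only the indices are transmitted. The decoder returns the codewords rather than the original blocks, which is why the method is lossy.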
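The two properties that transform coding (method 6) relies on, namely that the inverse of an orthogonal transform is its transpose and that the energy of a correlated signal concentrates in a few coefficients, can be checked numerically. The sketch below uses a one-dimensional orthonormal DCT-II on eight samples for brevity, whereas image coders such as JPEG apply the DCT in two dimensions to 8x8 blocks. The helper names are illustrative.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append(
            [scale * math.cos(math.pi * (2 * j + 1) * k / (2 * n)) for j in range(n)]
        )
    return rows


def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


samples = [100, 102, 104, 106, 108, 110, 112, 114]   # smooth, highly correlated
C = dct_matrix(len(samples))

coeffs = mat_vec(C, samples)                # forward transform
restored = mat_vec(transpose(C), coeffs)    # inverse transform = transpose

print([round(c, 2) for c in coeffs])
print([round(x, 6) for x in restored])
```

For this smooth input almost all of the energy lands in the first (DC) coefficient, so the remaining coefficients can be quantized with few bits or dropped at little cost in quality.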

