Steganalysis Based on JPEG Compatibility

Jessica Fridrich (a,*), Miroslav Goljan (b), Rui Du (b)
(a) Center for Intelligent Systems, (b) Department of Electrical Engineering, SUNY Binghamton, Binghamton, NY 13902-6000

(*) Correspondence: Email: fridrich@binghamton.edu; WWW: http://www.ssie.binghamton.edu/fridrich; Telephone: 607 777 2577; Fax: 607 777 2577

ABSTRACT

In this paper, we introduce a new forensic tool that can reliably detect modifications in digital images, such as distortion due to steganography and watermarking, in images that were originally stored in the JPEG format. JPEG compression leaves unique fingerprints and serves as a “fragile watermark”, enabling us to detect changes as small as modifying the LSB of one randomly chosen pixel. The detection of changes is based on investigating the compatibility of 8×8 blocks of pixels with JPEG compression with a given quantization matrix. The proposed steganalytic method is applicable to virtually all steganographic and watermarking algorithms with the exception of those that embed message bits into the quantized JPEG DCT coefficients. The method can also be used to estimate the size of the secret message and identify the pixels that carry message
bits. As a consequence of our steganalysis, we strongly recommend against using images that were originally stored in the JPEG format as cover-images for spatial-domain steganography.

Keywords: Steganography, steganalysis, JPEG

1. INTRODUCTION

Steganography is the art of secret communication. Its purpose is to hide the very presence of communication, as opposed to cryptography, whose goal is to make communication unintelligible to those who do not possess the right keys [1]. Digital images, videos, sound files, and other computer files that contain perceptually irrelevant or redundant information can be used as “covers” or carriers to hide secret messages. After embedding a secret message into the cover-image, a so-called stego-image is obtained. It is important that the stego-image does not contain any easily detectable artifacts due to message embedding. A third party could use such artifacts as an indication that a secret message is present. Once this message detection can be reliably achieved, the steganographic tool becomes useless.

Obviously, the less information is embedded into the cover-image, the smaller the probability of introducing detectable artifacts by the embedding process. Another important factor is the choice of the cover-image. The selection is at the discretion of the person who sends the message. The sender should avoid using cover-images that would be easy to analyze for the presence of secret messages. For example, one should not use computer art, charts, images with large areas of uniform color, images with only a few colors, and images with a unique semantic content, such as fonts. Although computer-generated fractal images may seem to be good covers [6] because of their complexity and irregularity, they are generated by strict deterministic rules that may
be easily violated by message embedding [3]. Scans of photographs or images obtained with a digital camera contain a high number of colors and are usually recommended and considered safe for steganography. Some steganographic experts recommend grayscale images as the best cover-images [2].

There are essentially three types of image formats: raw, uncompressed formats (BMP, PCX), palette formats (GIF), and lossy compressed formats (JPEG, Wavelet, JPEG2000). Only a few current steganographic programs offer the capability to embed messages directly in the JPEG stream. It is a difficult problem to devise
a steganographic method that would hide messages in the JPEG stream in a secure manner while keeping the capacity practical. Far more programs use the BMP, PCX, or the GIF palette-based format. The GIF format is a difficult environment for secure steganography with reasonable capacity [3,7]. Also, most
steganographic techniques for GIFs implemented in current software products prioritize capacity over security and are thus relatively easy to detect [4,5]. The raw formats, such as BMP, offer the highest capacity and best overall security.

In this paper, we demonstrate that even 24-bit images or grayscale 8-bit images may actually be extremely poor candidates for cover-images if they were initially acquired as JPEG images and later decompressed to a lossless format. In fact, it is quite reasonable to expect that most casual users of steganographic programs will use scanned images or images from
a digital camera that were originally stored in the JPEG format due to its efficiency in data storage.

All steganographic methods strive to achieve the minimal amount of distortion in order to minimize the likelihood of introducing any visible artifacts. Consequently, if the cover-image was initially stored in the JPEG format, the act of message embedding will not erase the characteristic structure created by the JPEG compression, and one can still easily determine whether or not a given image has been stored as JPEG in the past. Actually, unless the image is too small, one can reliably recover
even the values of the JPEG quantization table by carefully analyzing the values of DCT coefficients in all 8×8 blocks. After message embedding, however, the cover-image will become (with a high probability) incompatible with the JPEG format in the sense that it may be possible to prove that a particular 8×8 block of pixels could not have been produced by JPEG decompression of any block of quantized coefficients. This finding provides strong evidence that the block has been modified. It is highly suspicious to find an image stored in a lossless format that bears a strong fingerprint of JPEG compression, yet is not fully compatible with any JPEG compressed image. This can be interpreted as evidence for steganography.

By checking the JPEG compatibility of every block, we can potentially detect messages as short as one bit. Moreover, the steganalytic method will work for virtually any steganographic or watermarking method, not just LSB embedding. Indeed, in our experiments, we have found that even one randomly selected pixel whose gray level has been modified by one can be detected with very high probability. For longer messages, one can even attempt to estimate the message length and
its position in the image by determining which 8×8 blocks are incompatible with JPEG compression. It is even possible to analyze the image and estimate the likely candidate for the cover-image or its blocks (the closest JPEG compatible image/block). This way, we may be able to identify individual pixels that have been modified. All this indicates that an extremely serious information leakage from the steganographic method can occur and thus completely compromise the steganographic channel.

In this paper, we elaborate on the idea presented in the previous paragraph. In Section 2, we describe an
algorithm that can decide if a given 8×8 block of pixels is compatible with JPEG compression with a given quantization matrix. Appendix A contains details on how the quantization matrix can be estimated from the image. Throughout the paper, we point out some limitations of the proposed steganalytic technique and finally, in Section 3, we conclude the paper and outline future research directions. As a consequence of this research, we strongly urge users of steganographic programs to avoid images previously stored in the JPEG format as cover-images.

2. STEGANALYSIS BASED ON JPEG COMPATIBILITY

Although in this paper we explain the technique on grayscale images, it can be extended to color images in a straightforward manner. We start with a short description of the JPEG compression algorithm. In JPEG compression, the image is first divided into disjoint blocks of 8×8 pixels. For each block $B_{orig}$
(with integer pixel values in the range 0-255), the discrete cosine transform (DCT) is calculated, producing 64 DCT coefficients. Let us denote the i-th DCT coefficient of the k-th block as $d_k(i)$, $1 \le i \le 64$, $k = 1, \ldots, T$, where T is the total number of blocks in the image. In each block, all 64 coefficients are further quantized to integers $D_k(i)$ using the JPEG quantization matrix Q:

$$D_k(i) = \mathrm{integer\_round}\left(\frac{d_k(i)}{Q(i)}\right).$$

The quantized coefficients $D_k(i)$ are arranged in a zig-zag manner and compressed using the Huffman coder. The resulting compressed stream together with a header forms the final JPEG file.

The decompression works in the
opposite order. The JPEG bit-stream is decompressed using the Huffman decoder and the quantized DCT coefficients $D_k(i)$ are multiplied by $Q(i)$ to obtain the DCT coefficients $QD_k$, $QD_k(i) = Q(i) D_k(i)$ for all k and i. Then, the inverse DCT is applied to $QD_k$ and the result is rounded to integers in the range 0-255,

$$B = [B_{raw}], \qquad B_{raw} = \mathrm{DCT}^{-1}(QD), \tag{1}$$

where $[x] = \mathrm{integer\_round}(x)$ for $0 \le x \le 255$, $[x] = 0$ for $x < 0$, and $[x] = 255$ for $x > 255$. In the last expression, we dropped the block index k to simplify the notation. We note that because JPEG compression is lossy, in general $B_{orig}$ may not be equal to B. If the block B has no pixels saturated at 0 or 255, we can write in the $L_2$ norm

$$\|B_{raw} - B\|^2 \le 16, \tag{2}$$

because $|B_{raw}(i) - B(i)| \le 1/2$ for all $i = 1, \ldots, 64$ due to rounding, so each of the 64 squared terms is at most 1/4.

Suppose that we know the quantization matrix Q (see Appendix A). Our steganalytic technique is based on the following question: Given an arbitrary 8×8 block of pixel values B, could this block have arisen through the process of JPEG decompression with the quantization matrix Q? Denoting $QD' = \mathrm{DCT}(B)$, we can write, using Parseval's equality (the DCT is orthonormal),

$$\|B_{raw} - B\|^2 = \|\mathrm{DCT}(B_{raw}) - \mathrm{DCT}(B)\|^2 = \|QD - QD'\|^2 \le 16.$$

On the other hand, we can find a lower estimate for the expression $\|QD - QD'\|^2$ by substituting for $QD(i)$ the closest integer multiple of $Q(i)$:

$$\|QD - QD'\|^2 \ge \sum_{i=1}^{64} \left| QD'(i) - Q(i)\,\mathrm{integer\_round}\left(\frac{QD'(i)}{Q(i)}\right) \right|^2 = S. \tag{3}$$

The quantity S can be calculated from the block B provided the quantization matrix Q is known. If S is larger than 16, we can conclude that the image block B is not compatible with JPEG compression with the quantization matrix Q. We reiterate that this is true only for blocks that do not have pixels that are saturated at 0 or 255. Indeed, the estimate (2) may not hold for blocks that have saturated pixels because the truncation error at 0 and 255 can be much larger than 1/2.

If, for a given block B with unsaturated pixels, $S \le 16$, the block B may or may not be JPEG compatible. Let $q_p(i)$, $p = 1, 2, \ldots$, be the integer multiples of $Q(i)$ that are closest to $QD'(i)$, ordered by their distance from $QD'(i)$ (the closest multiple is $q_1(i)$). In order to decide whether or not a given block B is compatible with JPEG compression with quantization table Q, we need to inspect all 64-tuples of indices $\{p(1), \ldots, p(64)\}$ for which

$$\sum_{i=1}^{64} \left| QD'(i) - q_{p(i)}(i) \right|^2 \le 16 \tag{4}$$

and check if
$$B = [\mathrm{DCT}^{-1}(QD)], \qquad \text{where } QD(i) = q_{p(i)}(i). \tag{5}$$

If, for at least one set of indices $\{p(1), \ldots, p(64)\}$, the expression (5) is satisfied, the block B is JPEG compatible, otherwise it is not.

The number of 64-tuples $\{p(1), \ldots, p(64)\}$ satisfying expression (4) is always finite but it rapidly increases with increasing JPEG quality factor. For quality factors higher than 95, a large number of quantization factors $Q(i)$ become 1 or 2, and the total number of combinations of all 64 indices becomes too large to handle. We performed our experiments in Matlab on a Pentium II computer with 128MB memory. Once the quality factor exceeded 95, the running time became too long because Matlab ran out of memory and had to access the hard disk. We acknowledge this complexity increase as a limitation of our approach. In the future, we would like to develop a better and faster algorithm for testing JPEG compatibility for high quality compression factors.

Description of the algorithm:
1. Divide the image into a grid of 8×8 blocks, skipping the last few rows or columns if the image dimensions are not multiples of 8.
2. Arrange the blocks in a list and remove all saturated blocks from the list (a block is saturated if it has at
least one pixel with a gray value of 0 or 255). Denote the number of blocks remaining in the list as T.
3. Extract the quantization matrix Q from all T blocks as described in Appendix A. If all the elements of Q are ones, the image was not previously stored as JPEG and our steganalytic method does not apply (exit this
algorithm). If more than one plausible candidate exists for Q, steps 4-6 need to be carried out for all candidates, and the candidate that gives the highest number of JPEG compatible blocks is accepted as the result of this algorithm.
4. For each block B, calculate the quantity S (see equation (3)).
5. If $S > 16$, the block B is not compatible with JPEG compression with quantization matrix Q. If $S \le 16$, for each DCT coefficient $QD'(i)$ calculate the closest multiples of $Q(i)$, order them by their distance from $QD'(i)$, and denote them $q_p(i)$, $p = 1, 2, \ldots$. For those combinations for which the inequality (4) is satisfied, check if expression (5) holds. If, for at least one set of indices $\{p(1), \ldots, p(64)\}$, the expression (5) is satisfied, the block B is JPEG compatible, otherwise it is not.
6. After going through all T blocks, if no incompatible JPEG blocks are found, the conclusion is that our steganalytic method did not find any evidence for the presence of secret messages. If, on the other hand, there are some JPEG incompatible blocks, we can attempt to estimate the size of the secret message, locate the message-bearing pixels, and even attempt to obtain the original cover image before secret message embedding
started.
7. If all blocks are identified as JPEG incompatible or if the image does not appear to be previously stored as JPEG, we should repeat the algorithm for different 8×8 divisions of the image (shifted by 0 to 7 pixels in the x and y directions). This step may be necessary if the cover image has been cropped prior to message embedding.

3. CONCLUSIONS AND FUTURE EFFORT

In this paper, we describe a new steganalytic technique that can reliably detect modifications in digital images, such as those due to steganography and watermarking, as long as the original image (the cover-image) has been
previously stored in the JPEG format. The steganalytic technique starts with extracting the JPEG quantization matrix by carefully inspecting the clusters of DCT coefficients in all 8×8 blocks. Then, each block is analyzed to determine whether its pixel values are truncated values of an inverse DCT transform of a set of coefficients quantized with the extracted quantization matrix. A simple necessary condition is derived that makes the analysis computationally feasible. If the corresponding quantized DCT coefficients are found, the block is termed compatible, otherwise it is not. The steganalytic technique will work for all steganographic methods, except the methods that embed information directly into the compressed JPEG stream. It has the potential to detect changes as small as one pixel. By inspecting the closest compatible JPEG block, we can even attempt to locate the pixels that have been modified.
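
The block compatibility test summarized above (the lower bound S, followed by a search over nearby multiples of the quantization steps) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' Matlab implementation: the quantization matrix is assumed to be known and is taken here to be the example luminance table from Annex K of the JPEG standard, the DCT is the orthonormal 8×8 DCT applied to raw pixel values without the level shift of 128 that practical codecs use (as in the simplified description in Section 2), and all function names (`dct2`, `idct2`, `clamp_round`, `s_statistic`, `is_jpeg_compatible`) are hypothetical.

```python
import math

N = 8
# Assumption: Q is known; this is the example luminance table of JPEG Annex K.
Q_LUMA = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]
# COS[u][x] = cos((2x + 1) * u * pi / 16), the 1-D DCT-II basis.
COS = [[math.cos((2 * x + 1) * u * math.pi / 16) for x in range(N)] for u in range(N)]
SCALE = [math.sqrt(0.5) if u == 0 else 1.0 for u in range(N)]

def dct2(block):
    """Orthonormal 8x8 DCT of a pixel block (no level shift, as in the text)."""
    return [[0.25 * SCALE[u] * SCALE[v] * sum(block[x][y] * COS[u][x] * COS[v][y]
                                              for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]

def idct2(coef):
    """Inverse of dct2."""
    return [[0.25 * sum(SCALE[u] * SCALE[v] * coef[u][v] * COS[u][x] * COS[v][y]
                        for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]

def clamp_round(x):
    """The [x] operator of equation (1): round to nearest, clip to 0..255."""
    return max(0, min(255, math.floor(x + 0.5)))

def s_statistic(block, Q):
    """Lower bound S of equation (3): distance of DCT(B) to the nearest multiples of Q."""
    qd = dct2(block)
    total = 0.0
    for u in range(N):
        for v in range(N):
            nearest = Q[u][v] * math.floor(qd[u][v] / Q[u][v] + 0.5)
            total += (qd[u][v] - nearest) ** 2
    return total

def is_jpeg_compatible(block, Q, bound=16.0):
    """Enumerate the tuples allowed by inequality (4) and test condition (5)."""
    if any(p in (0, 255) for row in block for p in row):
        raise ValueError("saturated block: estimate (2) does not apply")
    qd = dct2(block)
    radius = math.sqrt(bound)
    cands = []  # per coefficient: (squared distance, multiple of Q), closest first
    for u in range(N):
        for v in range(N):
            value, q = qd[u][v], Q[u][v]
            lo, hi = math.ceil((value - radius) / q), math.floor((value + radius) / q)
            opts = sorted(((value - m * q) ** 2, m * q) for m in range(lo, hi + 1))
            if not opts:
                return False  # this coefficient alone already violates (4)
            cands.append(opts)
    # rest[i] = cheapest possible cost of coefficients i..63, used for pruning.
    rest = [0.0] * (N * N + 1)
    for i in range(N * N - 1, -1, -1):
        rest[i] = rest[i + 1] + cands[i][0][0]
    chosen = [0.0] * (N * N)

    def search(i, spent):
        if spent + rest[i] > bound:
            return False
        if i == N * N:
            recon = idct2([chosen[N * u:N * u + N] for u in range(N)])
            return all(clamp_round(recon[x][y]) == block[x][y]
                       for x in range(N) for y in range(N))
        for cost, multiple in cands[i]:
            chosen[i] = multiple
            if search(i + 1, spent + cost):
                return True
        return False

    return search(0, 0.0)

flat = [[128] * N for _ in range(N)]   # decompresses from DC = 1024, all AC = 0
stego = [row[:] for row in flat]
stego[3][5] += 1                       # one pixel changed by one gray level
print(is_jpeg_compatible(flat, Q_LUMA))
print(s_statistic(stego, Q_LUMA), is_jpeg_compatible(stego, Q_LUMA))
```

In this toy case the flat block is reported as compatible. After a single pixel is changed by one gray level, S is approximately 1, well below 16, so the necessary condition alone is inconclusive, yet the search finds no set of quantized coefficients that decompresses to the modified block. The search is exponential in the worst case, which mirrors the limitation noted in Section 2 for quality factors above 95, where many entries of Q become 1 or 2.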