
JPEG Compression

Emre Yalcinkaya


Introduction

Almost everyone has seen those low-quality images with strange blocky artifacts while surfing the vast oceans of the World Wide Web. But why do they exist, where do those blocks come from, and why does the quality get worse every time a picture gets reposted?

Well, as the title suggests, the culprit is called JPEG compression. In this blog entry I want to answer all of the questions above and give you an in-depth look at what JPEG is and why it is so widely used.

JPEG

JPEG is an image compression standard created in 1992 by the Joint Photographic Experts Group. That is also where the acronym JPEG comes from. 

The purpose of image compression is to reduce the size of an image file to save storage space and speed up transmission. Compression can be lossless or lossy, and as space-efficient as the use case demands. Losslessly compressed files can be reconstructed exactly to their original state, while the more space-efficient lossy methods like JPEG lose a certain amount of information during the process and cannot be reconstructed exactly. It is therefore important for lossy compression methods to choose which information to retain and to process it so that the compressed image is as indistinguishable as possible from the original to human perception.

The human eye is better at perceiving differences in brightness than differences in color. It is also less sensitive to high-frequency content in an image. The JPEG compression algorithm takes advantage of these facts by keeping the brightness information while reducing the amount of color information and filtering out high-frequency content.

The compression algorithm

Let us look in detail at how the JPEG algorithm works. These are the common steps:

  1. Color space transformation
  2. Chroma subsampling
  3. Block splitting
  4. Discrete cosine transform
  5. Quantization
  6. Entropy coding

1. Color space transformation 

In this first step we transform the RGB color space into the YCbCr color space. While RGB gives us the ability to adjust the red, green and blue components of an image, the YCbCr color space gives us more control over the luminance, or brightness, by separating it into the Y component, which will be important in the next step. The chrominance, or color, components Cb and Cr describe the blue difference and the red difference of the image, respectively.

The formula for the color space conversion looks as follows:
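In the JFIF flavor used by virtually all JPEG files, the conversion is (with R, G and B in the range 0–255):

Y  =  0.299·R + 0.587·G + 0.114·B
Cb = −0.1687·R − 0.3313·G + 0.5·B + 128
Cr =  0.5·R − 0.4187·G − 0.0813·B + 128

A minimal NumPy sketch of this conversion might look like this:

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """Convert an 8-bit RGB image (H x W x 3) to YCbCr using the JFIF coefficients."""
    r = rgb[..., 0].astype(float)
    g = rgb[..., 1].astype(float)
    b = rgb[..., 2].astype(float)
    y  =  0.299    * r + 0.587    * g + 0.114    * b
    cb = -0.168736 * r - 0.331264 * g + 0.5      * b + 128.0
    cr =  0.5      * r - 0.418688 * g - 0.081312 * b + 128.0
    return np.stack([y, cb, cr], axis=-1)
```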

2. Chroma subsampling

Now that we have separated the luminance and chrominance values, we can easily downsample the chrominance channels while retaining the luminance, to which the human eye is more sensitive.

These are the most common ratios: 4:4:4 (no subsampling), 4:2:2 and 4:2:0.

In this J:a:b notation, the first value describes the width of the horizontal reference region in samples, the second gives us the number of chroma samples in the first row of that region, and the third value gives us the number of chroma samples that change in the second row.
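For example, 4:2:0 keeps one chroma sample for every 2×2 block of pixels. A minimal sketch of that subsampling, assuming the plane dimensions are even (real encoders handle odd sizes as well), could look like this:

```python
import numpy as np

def subsample_420(chroma):
    """4:2:0 subsampling: average each 2x2 block of a chroma plane (height and width assumed even)."""
    h, w = chroma.shape
    blocks = chroma.reshape(h // 2, 2, w // 2, 2)
    return blocks.mean(axis=(1, 3))
```

This shrinks each chroma plane to a quarter of its original size while the Y plane stays untouched.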

3. Block splitting

For each YCbCr component we divide the image into blocks of 8×8 or 16×16 pixels. These are called the Minimum Coded Units (MCUs).

As you might already guess, the further processing of the image in separate blocks is where the blocky artifacts in JPEG images come from.
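A minimal sketch of this splitting for a single plane, assuming its dimensions are exact multiples of the block size (real encoders pad the edges), might look like this:

```python
import numpy as np

def split_into_blocks(plane, size=8):
    """Split a 2-D plane into size x size blocks; dimensions assumed to be multiples of size."""
    h, w = plane.shape
    return (plane.reshape(h // size, size, w // size, size)
                 .swapaxes(1, 2)              # -> (h/size, w/size, size, size)
                 .reshape(-1, size, size))    # -> one 8x8 block per entry
```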

4. Discrete cosine transform

To prepare our blocks for quantization we perform the discrete cosine transform (DCT) on each block.

This way each block is represented as a linear combination of 64 cosine basis patterns of increasing horizontal and vertical frequency.

If you zoom in on a JPEG-compressed image you might recognize these patterns in the blocky artifacts.
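To make this step concrete, here is a minimal sketch that computes the 2-D DCT-II of a single 8×8 block straight from the textbook formula (pixel values are first shifted from 0–255 to −128–127; real encoders use much faster factored implementations):

```python
import numpy as np

def dct2_8x8(block):
    """2-D DCT-II of an 8x8 block, following the formula used by JPEG."""
    n = 8
    shifted = block.astype(float) - 128.0            # center the pixel values around zero
    coeffs = np.zeros((n, n))
    for u in range(n):
        for v in range(n):
            cu = 1 / np.sqrt(2) if u == 0 else 1.0   # normalization for the DC terms
            cv = 1 / np.sqrt(2) if v == 0 else 1.0
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (shifted[x, y]
                              * np.cos((2 * x + 1) * u * np.pi / (2 * n))
                              * np.cos((2 * y + 1) * v * np.pi / (2 * n)))
            coeffs[u, v] = 0.25 * cu * cv * total
    return coeffs
```

The coefficient at position (0, 0) is the DC value (the average brightness of the block); coefficients further toward the bottom right correspond to higher frequencies.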

5. Quantization

This is the step where most of the information is lost. We divide the DCT coefficients by the values of a quantization table and round the result; higher frequencies get larger quantization values because the human eye is not as sensitive to them, so many of those coefficients become zero. By adjusting these quantization tables the quality of the resulting image can also be controlled.
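As a sketch, quantizing one block boils down to an element-wise division and rounding. The table below is the example luminance table from the JPEG specification; encoders typically scale such tables up or down to implement their quality setting:

```python
import numpy as np

# Example luminance quantization table from the JPEG specification (Annex K).
# Note the larger divisors toward the bottom right, i.e. the higher frequencies.
QUANT_LUMA = np.array([
    [16, 11, 10, 16,  24,  40,  51,  61],
    [12, 12, 14, 19,  26,  58,  60,  55],
    [14, 13, 16, 24,  40,  57,  69,  56],
    [14, 17, 22, 29,  51,  87,  80,  62],
    [18, 22, 37, 56,  68, 109, 103,  77],
    [24, 35, 55, 64,  81, 104, 113,  92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103,  99],
])

def quantize(coeffs, table=QUANT_LUMA):
    """Divide the DCT coefficients by the quantization table and round to the nearest integer."""
    return np.round(coeffs / table).astype(int)
```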

6. Entropy coding

Lastly we read the DCT coefficients in order of increasing frequency, which results in a zigzag pattern, apply run-length encoding (RLE) to the long runs of zeros, and use Huffman coding on what is left. This step is lossless.
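A simplified sketch of the zigzag scan and the run-length pass might look like the following (a real encoder codes the DC coefficient separately and feeds (run, size) symbols into the Huffman coder, which is omitted here):

```python
import numpy as np

def zigzag_indices(n=8):
    """Return the (row, col) pairs of an n x n block in JPEG zigzag order."""
    return sorted(((i, j) for i in range(n) for j in range(n)),
                  key=lambda p: (p[0] + p[1],
                                 p[1] if (p[0] + p[1]) % 2 == 0 else p[0]))

def run_length_encode(quantized):
    """Flatten a quantized block in zigzag order and run-length encode the zeros."""
    flat = [quantized[i, j] for i, j in zigzag_indices(quantized.shape[0])]
    encoded, zero_run = [], 0
    for value in flat:
        if value == 0:
            zero_run += 1
        else:
            encoded.append((zero_run, int(value)))  # (number of zeros before this value, value)
            zero_run = 0
    encoded.append((0, 0))                          # end-of-block marker
    return encoded
```

Because quantization turns most high-frequency coefficients into zeros, the zigzag order groups them into long runs that compress very well.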

Decoding

Now, to decode a JPEG file, the inverse operations need to be applied in reverse order.

Conclusion

As we saw, there is quite a complicated process behind the curtains of JPEG, and the loss of information is often worth it for the reduction in size it provides. Because it has been around for such a long time, it is also one of the most widely supported image formats. JPEG performs best when there are few sharp edges in an image, such as in photographs depicting natural environments. It performs worse on images with typography or vector graphics. JPEG provides control over the quality of the compressed image, but if size is not an issue and image quality is preferred, PNG provides a lossless alternative.

 
