
JPEG compression level setting

Doug Kerr

Well-known member
We recognize that when using the JPEG scheme for compression of image data, we often have the opportunity to set what we can speak of (imprecisely) as the "compression level".

Sometimes (as in a typical digital camera) we have only two or three choices, often with names ("normal", "fine", "superfine").

In image manipulation software, we are often able to set the desired compression level numerically, perhaps with an index that runs from 1-12 and sometimes one that runs from 0-100.

Does such an index set a clearly-defined parameter of the compression process?

Well, yes and no.

First let us note that there are at least two wholly-different aspects of the overall JPEG compression process that we can vary.

Chroma subsampling pattern

We do not necessarily store the definition of the pixel chroma (which, in a certain way, defines the hue and saturation of the pixel's color) separately for each pixel. We may store a pair of chroma values for each pixel, or for only one out of each two pixels, or for only one out of each four pixels. This of course influences the amount of data in the file for chroma information (a significant influence on file size), and as a trade-off influences the resolution of chromaticity information in the reconstructed image.
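As a concrete illustration, here is a minimal sketch in Python using the Pillow library (my choice for illustration; nothing in JPEG itself prescribes this API), showing how the subsampling pattern alone changes file size at a fixed quality setting:

```python
import os
from PIL import Image

# Build a small test image of colored noise, so that the chroma
# channels actually carry information (a flat-color image would
# compress to almost nothing regardless of the subsampling pattern).
bands = [Image.effect_noise((256, 256), 64) for _ in range(3)]
img = Image.merge("RGB", bands)

# Pillow's JPEG writer lets the user set the subsampling directly:
# 0 = 4:4:4 (a chroma pair for every pixel), 1 = 4:2:2 (one pair per
# two pixels), 2 = 4:2:0 (one pair per four pixels).
for code, name in [(0, "4:4:4"), (1, "4:2:2"), (2, "4:2:0")]:
    path = f"noise_{code}.jpg"
    img.save(path, quality=90, subsampling=code)
    print(name, os.path.getsize(path), "bytes")
```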

Quantization tables

Quantization refers to taking information known to a certain precision and restating it to a lesser precision. ("Rounding" is a familiar example).

Simplistically, the more "coarsely" we quantize a certain body of data, the fewer bits are needed to represent it.

In JPEG, the luma and chroma information (derived from the RGB representation of each pixel's color) is represented in a scheme that is called the discrete cosine transform (DCT), somewhat like using the Fourier transform of an electrical signal to represent it.

The DCT coefficients (initially reckoned to a fairly high precision) are then quantized (so they are represented in fewer bits). The quantizing "coarseness" varies over the coefficients representing different spatial frequencies in the image, which takes into account the perceptual properties of the eye.

Thus, there is a table that tells, for each combination of vertical and horizontal frequency, how the DCT coefficient for that frequency combination is to be quantized (as if that table told, for each frequency pair, to how many significant figures the coefficient will be stated).

Thus, to quantize "more finely" or "more closely" overall, we would use a different table.

By the way, there are separate tables for the luma and the chroma data.
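To make the mechanics concrete, here is a small sketch in Python (NumPy and SciPy are my choices for illustration, and the quantization table here is a toy one of my own, not either of the standard tables):

```python
import numpy as np
from scipy.fft import dctn, idctn

# A hypothetical 8x8 block of luma samples (0-255), level-shifted by
# 128 as the JPEG standard prescribes before the forward DCT.
rng = np.random.default_rng(0)
block = rng.integers(0, 256, (8, 8)) - 128

# Forward 2-D DCT (type II, orthonormal scaling).
coeffs = dctn(block, norm="ortho")

# A toy quantization table: q grows with spatial frequency, so the
# higher-frequency coefficients are quantized more coarsely.
qtable = 1 + 4 * np.add.outer(np.arange(8), np.arange(8))

# Quantize: divide each coefficient by its q and round to an integer.
# These small integers are what (after entropy coding) go in the file.
quantized = np.round(coeffs / qtable).astype(int)

# The decoder, having read the table from the file "header", multiplies
# back by q and inverts the DCT to get the (approximated) block.
restored = idctn(quantized * qtable, norm="ortho") + 128
```

The coarser the table, the more of the quantized coefficients collapse to zero or near zero, and it is this that the later (lossless) entropy coding stage exploits to shrink the file.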
Is there a standard "repertoire" of quantizing tables, running from "very fine" to "very coarse", from which we choose when encoding, depending on the compression level chosen by the user?

Not exactly. For one thing, the designer of the JPEG encoder is free to use such designs of quantizing tables as he finds suitable. This does not baffle the receiving decoder, since both quantization tables used to encode an image are carried in the JPEG file "header", and the way they are to be "worked" is standardized.

But in fact there is a "suggested" repertoire given by an industry group (which in fact publishes a library of routines that can be used to build a JPEG encoder).

And the way that works is that there are "model" tables that can then be scaled up or down (to be "more coarse" or "more fine") by an algorithm.

The parameter for this scaling is related to a "quality index" which runs from 0-100, with 100 leading to the "finest" quantization that is possible given various aspects of the system structure.

So if in fact the JPEG encoder used in a particular application follows this "suggested" quantization table building doctrine (which I understand few serious encoders do), then the program may offer the user a "JPEG quality setting" control, whose setting will then lead to a specific pair of quantization tables, and thus a specific (standard) result in terms of compression.

And of course, if the application uses its own quantization table building scheme, it may offer what looks like the same user opportunity: to set the "quality", perhaps on a scale of 0-100. But the implications of a certain value may not be the same as for another application. (Still, since the "result" can't really be very well quantified anyway, there may not be much of a difference that we can tell.)

Now does this mean that in such an application there are 101 different pairs of quantizing tables that might be used? Not really. The entries in the tables are small integers, so we cannot scale the tables "that finely". What is often done is that the user-input range of 0-100 (if that's the way the user interface works) is divided into perhaps 10-12 ranges, and for each range a certain value of the "scaling parameter" is used.

So if the user selects a quality of 71, or 73, the result may be exactly the same.

In other cases, the user is only able to select a "quality" from a list, perhaps 0-10-20-30-40 etc.

Back to subsampling

But the subsampling pattern is also an aspect of the overall compression scheme, and its result. How is that controlled?

In some applications, the user just sets that.
My favorite image editor, Picture Publisher, does that. Sadly, it bungles the rather peculiar notation by which the various subsampling patterns are often stated.

In some applications, the same pattern is always used.

In some applications, the pattern that is used is preordained for each of two or three ranges of the user-set "quality".

Best regards,

Doug
 

Doug Kerr

Well-known member
It is interesting to see how the quantization works.

The quantization itself

The quantizing table provides, for each coefficient of the DCT representation (each coefficient tells the amplitude of a pair of cosines at certain frequencies, one pertaining to the horizontal direction across the pixel block and one to the vertical), how "coarsely" it is to be quantized. This is controlled by a quantizing parameter, q.

We can understand this by using a decimal example. We will work only with integer original values, which is the situation in our actual case of interest.

Suppose that we have a number, 4231, and we want to round it to "hundreds" precision. So we divide it by 100:

4231/100 = 42.31

We drop the fractional part (if we are being fastidious, we bump the units value by 1 if the fractional part was 0.5 or over) and get:

42

Now we multiply by 100 and get:

4200

But if our reason for doing this was to decrease the amount of data that we had to "send to a distant place", we just send:

42

and tell the distant end that this was "in hundreds". So the distant end multiplies by 100 to get:

4200

In this example, q, the quantizing parameter is 100.
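The same arithmetic, as a trivial sketch in Python (the variable names are mine):

```python
q = 100                        # the quantizing parameter
value = 4231
sent = (value + q // 2) // q   # 42 -- divide by q with the "0.5 bump", drop the fraction
restored = sent * q            # 4200 -- what the distant end reconstructs
```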

When we quantize the DCT coefficients, we use this same scheme (in the form where we do not multiply by q before "sending" the value).

Because the distant end received (in the JPEG file "header") the table with all the values of q, it knows what to multiply each received value by to get the original (but rounded) value.

The larger the value of q, the more "coarse" is the rounding. If q is 1 (given that we are dealing with integer values to begin with), there is no effect of quantization.

Scaling a base quantization table

In many cases in JPEG encoders, different overall quantization doctrines are attained by merely scaling all the q entries in a basic quantization table.

This is done basically by multiplying all the q values by a scaling factor, which is the value S divided by 100 (so S sort of works in percent).

But S is derived from the "user quality", Q, which runs from 1 to 100.

Here we see the scaling factor (S/100) as a function of Q (the algorithm is a bit peculiar, but you need not be concerned with it):

Q      S/100
1      50     (largest possible scaling factor; the most coarse quantization possible for that base table entry)
2      25
50     1      (leaves the base table unchanged)
80     0.4
90     0.2
100    0      (but there is a "0.5 bump" in the scaling math, so this makes all the table q values become 1: a table that does no quantization)

So the "base table" is that for "quality = 50".

All very interesting.

Best regards,

Doug
 

Doug Kerr

Well-known member
Here we see a "base" JPEG luma quantization table suggested by the IJG (Independent JPEG Group, which provides a library of functions for use in JPEG encoders and decoders):

[Image: IJG base luma quantization table]
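For readers without the image: this base luma table is (as far as I know) the example luminance quantization table given in Annex K of the JPEG standard:

16  11  10  16  24  40  51  61
12  12  14  19  26  58  60  55
14  13  16  24  40  57  69  56
14  17  22  29  51  87  80  62
18  22  37  56  68 109 103  77
24  35  55  64  81 104 113  92
49  64  78  87 103 121 120 101
72  92  95  98 112 100 103  99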

The value in the far upper-left corner is the zero-frequency (V and H) coefficient. It basically tells the average value of the underlying property being encoded (Y, Cb, or Cr) over the entire block being encoded.

The value in the far lower-right corner is the highest-frequency (V and H) coefficient. It tells the amplitude of two cosine functions, both at the highest frequency that is used here, that are components (in the V and H directions) of the variation in the underlying property being encoded (Y, Cb, or Cr).

We note that for the higher-frequency components (generally, to the right and down) the q values are higher, leading to more coarse quantization. This is of course to respect the decreasing sensitivity of the eye to higher spatial frequency components.

I note that the table is not symmetrical about its diagonal. Thus there is a difference between the vertical and horizontal aspects, perhaps a creature of how humans interpret vertical and horizontal detail. I could imagine a sophisticated camera using different tables when in portrait and landscape orientation!

Here we see the table scaled according to the user quality setting Q = 80 (which means all q values in the table are scaled to 0.40 of their values in the base table; S = 40):

[Image: IJG base luma quantization table scaled to Q = 80]

This is "0.4 × as coarse as" ("2.5 × finer than") the quantization we would get with the base table (Q=50).

Best regards,

Doug
 