
On the terms "lossless" and "lossy" compression

Doug Kerr

Well-known member
I will be speaking here of data compression in the sense of taking a suite of data that describes a thing and replacing it with a smaller suite of data that describes the same thing (perhaps "exactly", perhaps not).

Compression schemes can be divided into two classes, which can meaningfully be called "reversible" and "non-reversible".

In a reversible scheme, one can take the "smaller" suite of data and from it reconstruct exactly the original suite of data.

In a non-reversible scheme, one cannot do that.

Sadly, the term "lossless" has come into widespread use to denote reversible schemes, and the term "lossy" then was back-formed to denote non-reversible schemes.

Although there is an outlook that makes these terms sort of apt (I'll present that later), the usual concepts and explanations surrounding them are often misleading.

The direct connotation is that, in the "compressed" suite of data (under a "lossy" scheme), some of the original data has been lost. (It would hardly seem wise to say that all the original data had been lost, or else it would seem that the "compressed" suite of data would be useless.)

But, if that is the case, we should be able to say (if we have all the details of a specific case) "how much of the original data" has been lost, or just what "pieces" of it have been lost.

Let's consider the application of the concept to a familiar topic - the JPEG scheme for compressing digital image data.

We start with an image comprising some number of pixels, each of which has a color described with three 8-bit values (24 bits altogether). We encode it in JPEG form and save it, the file having far fewer bits than the original representation. Later, the file is "decoded" with a JPEG decoder, giving a "reconstructed" image of the same number of pixels, each again described by three 8-bit numbers.

How many of those 24-bit color descriptions are identical to the original colors of their pixels? I have not seen any studies of this, but my guess is that it is a very small fraction - perhaps less than 1% for a random image pattern. (I may be way off.)
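For anyone inclined to test that guess, here is a minimal sketch, assuming Pillow and NumPy are available; the file name and the quality setting are just placeholders:

    # Sketch: what fraction of pixels survive a JPEG round trip bit-for-bit?
    # Assumes Pillow and NumPy; "original.png" and quality=90 are placeholders.
    import io

    import numpy as np
    from PIL import Image

    original = Image.open("original.png").convert("RGB")

    # Encode to JPEG in memory, then decode again.
    buffer = io.BytesIO()
    original.save(buffer, format="JPEG", quality=90)
    buffer.seek(0)
    decoded = Image.open(buffer).convert("RGB")

    a = np.asarray(original)
    b = np.asarray(decoded)

    # A pixel counts as unchanged only if all three 8-bit values match exactly.
    unchanged = np.all(a == b, axis=-1)
    print(f"pixels identical in all 24 bits: {unchanged.mean():.2%}")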

So why is this image useful to us? Because, generally, it "looks very much like" the original image.

Now, have we lost data? In a sense yes - probably almost all the original color values are "gone", replaced by values that differ perhaps in one bit, or perhaps in all 24 (and of course how far the color represented diverges from the original color depends on which bits are different).

So in that sense, the moniker "lossy" is apt.

But where we get into trouble is with descriptions that say, "in lossy compression, the data suite is made smaller by losing some of the original data." That just gives no insight at all into what is going on.

In reality, in "lossy" compression, we make the data suite smaller, not by "discarding" data, but by exploiting the redundancy in the data (in particular, in a way that exploits our perceptual tolerances of certain kinds of differences between the original image and the reconstructed one). And as a result, in the case of a "lossy" algorithm, in the reconstructed data suite, probably almost all the original data is gone.

Trying to keep consistent with that explanation in the case of a reversible ("lossless") scheme gets us quickly into trouble. "In lossless compression, the data suite is made smaller by not losing some of the original data". That should be a hint that we are in trouble with this whole notion.

I can further illuminate my point by a quite different example. Suppose we have a temperature sensor which delivers a high-precision digital representation of the air temperature. We compress this data to facilitate its transmission over a low bit rate channel to an indicator station. At the indicator station, the compressed data is decoded and the result (which we would hope would accurately mimic the original data) is displayed.

In this example, for each "reading", the first number is the temperature as digitized at the sensor end, and the second the value displayed by the indicator (assume degrees Fahrenheit):

72.33 72.22
74.61 74.61
70.01 69.99
72.23 72.92
75.17 75.21

Now, as a result of this non-reversible compression, have we lost data? And if so, how much?

Possible answers:

• No, we have a credible display for every reading - no data has been lost. Some seems to have been "corrupted".

• Yes, some of it - 4 of the original 5 readings were not delivered (that is, as they were) in the reconstructed readings. We have lost 80% of them.

• Yes, some of it - 10 of the original 20 decimal digits were not delivered (that is, as they were) in the reconstructed readings. We have lost 50% of them.

In fact, it is not the data that has been lost - it is its correctness. And we might use various measures of that. We might for example decide that a good measure of the overall amount of incorrectness for this suite of temperatures is the RMS error. In this case, that would be 0.31 degree Fahrenheit.
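For the curious, that figure can be reproduced with a few lines; this is only a sketch, treating the quoted 0.31 as the root-mean-square of the five errors:

    # Sketch: error statistics for the five temperature readings above.
    import math

    readings = [  # (digitized at the sensor, displayed at the indicator), degrees F
        (72.33, 72.22),
        (74.61, 74.61),
        (70.01, 69.99),
        (72.23, 72.92),
        (75.17, 75.21),
    ]

    errors = [shown - sensed for sensed, shown in readings]
    rms = math.sqrt(sum(e * e for e in errors) / len(errors))
    print(f"RMS error: {rms:.2f} degrees F")  # prints about 0.31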

My point here is not to suggest that authors (other than myself) give up use of the ill-chosen monikers "lossless" and "lossy" to describe reversible and non-reversible data compression. I just hope for us to recognize that these are metaphorical terms, and don't really relate to any property that we could "quantify" ("Well, how lossy is it? What data did we lose?")

Best regards,

Doug
 

Cem_Usakligil

Well-known member
Well formulated Doug, as usual. Another important point you have touched upon here. Thanks for your continued efforts in clarifying such common misconceptions and the associated fuzzy (or wrong) terminology. I like the example you have provided and I would love to have a survey containing the multiple choice answers. It would be very informative I expect. :)
 

Johan Nyberg

New member
Loss of data or of information?

Doug, it was stimulating to read your discussion. I was stimulated into mild opposition:

In your reasoning the term "data" is used to cover all forms of representation.
Nothing wrong with that, and as far as I know all is correct, but I think there is another way of looking at it that may make sense.

If we introduce the term, and concept of, "information", it becomes meaningful to say that one kind of data compression causes loss of information, is "lossy", while another doesn't. In both cases data is lost, of course. The correctness of data you mention seems closely related to what I call information.

In systems theory, as taught at present, the student (I used to be one) is supposed to make clear distinctions between "data", "information", and "knowledge". Data is the collection of symbols used to store or transfer a message of some kind. Information is the content, the meaning of the message, and knowledge is what the message becomes when it is used or understood somehow.

In digital photography, the raw bit stream from the sensor/ADC is data, it is interpreted by some system (a raw converter) and becomes information, and by looking at some reproduction of this information a person or AI can turn the information into knowledge, for example by recognising the subject and perhaps learn something about it.

I suppose that in some cases loss of "distracting" data or information can lead to gain in information or knowledge, for example by making interpretation easier.

BTW, information is a quantifiable property, even though I can't explain that without preparing myself.

Just my two pennies.

My best wishes to all.

Johan Nyberg
 

Bob Rogers

New member
I suppose that in some cases loss of "distracting" data or information can lead to gain in information or knowledge, for example by making interpretation easier.

I've never seen it expressed that way, but it's definitely true. A rolling average does exactly that!
 

Doug Kerr

Well-known member
Hi, Johan,

Doug, it was stimulating to read your discussion. I was stimulated into mild opposition:
There are many paradoxes in this area, and I vary as to how strongly I feel about my exact position myself!

In your reasoning the term "data" is used to cover all forms of representation.
Nothing wrong with that, and as far as I know all is correct, but I think there is another way of looking at it that may make sense.

If we introduce the term, and concept of, "information", it becomes meaningful to say that one kind of data compression causes loss of information, is "lossy", while another doesn't. In both cases data is lost, of course. The correctness of data you mention seems closely related to what I call information.

In systems theory, as taught at present, the student (I used to be one) is supposed to make clear distinctions between "data", "information", and "knowledge". Data is the collection of symbols used to store or transfer a message of some kind. Information is the content, the meaning of the message, and knowledge is what the message becomes when it is used or understood somehow.
Indeed. I shied away from that subtlety in my presentation!

In digital photography, the raw bit stream from the sensor/ADC is data, it is interpreted by some system (a raw converter) and becomes information, and by looking at some reproduction of this information a person or AI can turn the information into knowledge, for example by recognising the subject and perhaps learn something about it.

I suppose that in some cases loss of "distracting" data or information can lead to gain in information or knowledge, for example by making interpretation easier.

BTW, information is a quantifiable property, even though I can't explain that without preparing myself.
Indeed it is, and that leads to one of my problems with the notion that we "lose" information during non-reversible encoding-decoding.

For example, if we take the handy case of a set of 5-digit decimal numbers, and all 10^5 possible values are equiprobable, then the information content of each is about 16.6 bits [log2(10^5)].
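As a quick check of that arithmetic (and of the 16,610-bit figure used further down), in Python:

    # Information content of one 5-digit decimal value, all 10**5 values equiprobable.
    import math

    bits_per_value = math.log2(10 ** 5)
    print(f"{bits_per_value:.2f} bits per value")               # about 16.61
    print(f"{1000 * bits_per_value:.0f} bits for 1000 values")  # about 16610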

Now suppose we encode these in some "compressed" way, so that when decoded there is some distribution of error, but all recovered 5-digit values are still equiprobable.

Now the information content, from an information-theoretic standpoint, of each of those 5-digit numbers is 16.6 bits. But is the information of any value? Well, to slide back to our photographic situation, all that imperfect information is certainly "valuable" - it describes images that we enjoy, learn from, sell, and distribute.

But is it worth "less" than the unerrored data? Almost certainly so, qualitatively. For example, an image with visible JPEG compression artifacts is certainly "worth less" than one free of them. Were that not the case, we certainly wouldn't undergo the cost of storage or transmission to use a "less aggressive" JPEG compression for more demanding uses.

Now the real basic philosophical question here is "what is it that we have less of in the delivered image data than we had in the original"? Is there less information in the information-theoretic sense? No, because that measure doesn't deal with the "accuracy" or "inerrancy" or "value" of the information.

OK, so there must be some other measure. We have many measures of the degree of error in a transformed data set (RMS error, to mention one often-cited metric).

But suppose I have a set of numeric values, with a total information content of 300,000 bits, and pass them through some route that results in error, and I find out that the RMS error of the "delivered" values is 965. We understand that. Now I say, OK, knowing that, how much less of something does the delivered data have, and what is that something?

This is where I hit a dead end. And thus, I seem to have no alternative but to say that the errors caused by non-reversible compression-decompression do not constitute a loss of data or information.

Perhaps a new metric is needed that "discounts" information content based on "accuracy", like the way errors are considered in scoring a typist's word rate on a test.

In fact, we really do something sort of like this in modern transmission theory, where coding systems consume a certain amount of the channel capacity in the interest of reducing errors. The relationships there are often used to assign a "throughput degradation" to the occurrence of error. (I used to teach that stuff, but it has somewhat gotten away from me now!).

But that's not exactly the same as saying how much data (or information) is lost as a result of error.

So, let's think of a suite of data comprising 1000 5-digit decimal numbers (all from an equiprobable "dictionary", to make it simple). That's 16,610 bits of information.

I transmit them through a non-reversible compression system. Suppose that the result in one trial is that:

• 925 of the 5-digit numbers have no errors

• The RMS error of the entire suite (1000 values) is 12,500.

• The information content of the delivered data is 16,610 bits.

Now:

• what is it that we have less of than originally, and

• by how much?

Now there is an outlook that says that if we have 12,000,000 pixels in an image, and after compression and decompression 1,546,312 of them do not have their original values, that we have "lost" 1,546,312 pixels worth of data. But of course they are not just "gone" (perhaps replaced by black); their values just differ in varying degree from the original values (hopefully with a fairly small RMS error).

So for those who say "indeed data has been lost here", I can't really fully refute it. I may just say, "maybe so - how much?"

I just wish we described what really happens: that we don't get the original data back.

Best regards,

Doug
 

StuartRae

New member
Hi Doug,

While I understand the mechanics of JPEG compression, I've always thought of it (simplistically and possibly quite erroneously) as 'rounding' colour values so that a range of similar colours are represented by one 'average' colour.
It may be useful to consider the number of unique colours in the JPEG and TIFF representations of the same image.
For example I have chosen a random landscape and counted the numbers of colours using FSIV. The results are:

8-bit TIFF 147,256
JPEG (CS5 Quality 10) 102,863
JPEG (CS5 Quality 5) 93,830

In other words the Quality 10 JPEG has 30% fewer unique colours than the TIFF and the Quality 5 has 36% fewer.
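For anyone who wants to repeat the experiment without FSIV, here is a minimal sketch, assuming Pillow and NumPy; the file names are placeholders for the exported TIFF and JPEGs:

    # Sketch: count unique 24-bit colours in an image file.
    import numpy as np
    from PIL import Image

    def unique_colours(path: str) -> int:
        pixels = np.asarray(Image.open(path).convert("RGB")).reshape(-1, 3)
        return len(np.unique(pixels, axis=0))

    for name in ("landscape.tif", "landscape_q10.jpg", "landscape_q5.jpg"):
        print(name, unique_colours(name))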

Regards,

Stuart
 
I just wish we described what really happens: that we don't get the original data back.

Hi Doug,

I'd call that lossy ;-)

What's next is that we could try and quantify how significant the changes are, and ultimately if that changes the information content significantly (e.g. by distraction due to artifacts, or wrong enough colors to make a difference). Different uses may require different metrics; there is not a single universal number that describes a two-dimensional dataset very well.

Perhaps a Delta E metric (comparing before and after compression pixel colors/brightness) is somewhat more relevant for photographic quality than e.g. the file size, or an image's entropy.
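As a rough illustration of what such a metric could look like, here is a sketch that computes a per-pixel CIE76 Delta E between an original and its compressed version; it assumes Pillow, NumPy and scikit-image are available, that both files have identical pixel dimensions, and the file names are placeholders:

    # Sketch: per-pixel Delta E (CIE76) between an original and its JPEG version.
    import numpy as np
    from PIL import Image
    from skimage.color import rgb2lab, deltaE_cie76

    def load_lab(path: str) -> np.ndarray:
        rgb = np.asarray(Image.open(path).convert("RGB")) / 255.0
        return rgb2lab(rgb)

    before = load_lab("original.png")
    after = load_lab("compressed.jpg")     # must have the same pixel dimensions

    delta_e = deltaE_cie76(before, after)  # one value per pixel
    print(f"mean  Delta E: {delta_e.mean():.2f}")
    print(f"worst Delta E: {delta_e.max():.2f}")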

JPEG uses a system of tossing out precision, based on spatial frequencies and on subsampling of nearby colors, and does so in a block-wise (therefore adaptive) fashion. Therefore colors, especially of very small details, will be compromised. Whether that is important depends on how the image is used (e.g. forensic versus vacation snapshot).

Here is a tool to estimate the amount of compression based on an existing JPEG image and a certain subsampling method. And here is a tool that allows one to tweak the compression parameters of, among others, JPEGs, and gives a visual comparison between the before and after images and their sizes. It allows one to balance the size of websites against the quality of the images used.

Cheers,
Bart
 
It may be useful to consider the number of unique colours in the JPEG and TIFF representations of the same image.
For example I have chosen a random landscape and counted the numbers of colours using FSIV. The results are:

8-bit TIFF 147,256
JPEG (CS5 Quality 10) 102,863
JPEG (CS5 Quality 5) 93,830

In other words the Quality 10 JPEG has 30% fewer unique colours than the TIFF and the Quality 5 has 36% fewer.

Hi Stuart,

Yes, counting unique colors gives an idea of how much data is changed/lost, but it can't suggest how relevant these losses are. All we could say is that less is probably better, but it doesn't specify how many colors are unchanged compared to the original. However, Photoshop not only changes the amount of compression, but also changes the JPEG subsampling parameters when lower 'qualities' are selected. Therefore the loss of data is not continuous, and it may affect different parts of images differently at the different settings.

Cheers,
Bart
 

Doug Kerr

Well-known member
Hi, Stuart,

While I understand the mechanics of JPEG compression, I've always thought of it (simplistically and possibly quite erroneously) as 'rounding' colour values so that a range of similar colours are represented by one 'average' colour.
That's not a good metaphor at all. The compression scheme is not based on that principle.
It may be useful to consider the number of unique colours in the JPEG and TIFF representations of the same image.
For example I have chosen a random landscape and counted the numbers of colours using FSIV. The results are:

8-bit TIFF 147,256
JPEG (CS5 Quality 10) 102,863
JPEG (CS5 Quality 5) 93,830

In other words the Quality 10 JPEG has 30% fewer unique colours than the TIFF and the Quality 5 has 36% fewer.
That is interesting. Thanks for the analysis.

Best regards,

Doug
 

Doug Kerr

Well-known member
Hi, Bart,
I'd call that lossy ;-)
I understand the attraction to saying that.

My problem is that this phraseology is so completely different from what we have in many other cases. For example, just consider the somewhat simpler issue of bit errors in data transmission. Although we have many metrics for quantifying the impact of such errors (and, as you aptly point out, the development of meaningful metrics is no trivial task), there is no history of speaking of this affliction as a "loss of data". If I were pressed to answer, in that case, "what have we lost", I would have to say "accuracy". (Where we do speak of "loss of data" is when the channel just goes dead for a period of time - or develops such a high error rate that it is "as good as dead".)

But in fact none of the metrics try to tell us "how much accuracy we still have", and thus "how much accuracy we have lost". Rather, they characterize the error (perhaps in very complex, multidimensional ways). In rare cases (generally marketing information), we may hear of "this instrument being 99.9% accurate". But not usually in true technical work.

What's next is that we could try and quantify how significant the changes are, and ultimately if that changes the information content significantly (e.g. by distraction due to artifacts, or wrong enough colors to make a difference).
Yes, such metrics could be very useful. But note that even in your description, you spoke of "how significant the changes are", not of the loss of any describable thing.

My guess is that any system of metrics for quantifying the impact of compression error will not have the "shape" of a measure of the loss of something. They will have the shape of a measure of a "defect".

Now, going back to the comparison with data transmission, but considering again digital images, suppose that in a certain instance the image was handled, for encoding purposes, as 8x8 pixel blocks, and that there were 10,000 of them in the image. Suppose that as a result of something that happened to the data set (not necessarily JPEG compression/decompression error), 250 of the blocks were so corrupted as to be essentially useless as components of a perceived image (they "looked nothing like the original block"). I would have no problem describing that as a "loss of data" (or a "loss of information").

Thanks for your inputs to this fascinating matter!

Best regards,

Doug
 

StuartRae

New member
Hi, Stuart,
That's not a good metaphor at all. The compression scheme is not based on that principle.
Doug

Thanks Doug. I said I was probably wrong, and I was right. It's an attractive misconception though, when one sees a smooth, uncompressed blue sky replaced by blocks of the same colour.

Regards,

Stuart
 

Doug Kerr

Well-known member
Hi, Stuart,

Thanks Doug. I said I was probably wrong, and I was right. It's an attractive misconception though, when one sees a smooth, uncompressed blue sky replaced by blocks of the same colour.

Is that a common manifestation of compression error? It is certainly a manifestation of what we might call "insufficient bit depth" (in that regard we often speak of "banding").

Do you have any examples of the same scene as a TIFF file and then as a JPEG file that might illustrate this?

Thanks.

Best regards,

Doug
 

Doug Kerr

Well-known member
Hi, Bart,
In my mind I translate 'changes' to a loss of the ability to exactly reconstruct the original to how it was before compression.
Makes perfect sense to me.

Now if reconstruction is perfect, then the amount of "ability" we have to exactly reconstruct the original is perhaps describable as "100%".

And if reconstruction is imperfect, the amount of "ability" we have to exactly reconstruct the original is "0%".

So if the "thing" whose loss is being discussed is "that ability", then indeed a non-reversible compression scheme "loses all of it".

So perhaps that is what is lost in a "lossy" compression scheme - not data or information.

I can go with that.

And there could be no metric for that! ("How much not a virgin are you, Marie?")

Thanks.

Best regards,

Doug
 

StuartRae

New member
Hi, Stuart,

Is that a common manifestation of compression error? It is certainly a manifestation of what we might call "insufficient bit depth" (in that regard we often speak of "banding").
Do you have any examples of the same scene as a TIFF file and then as a JPEG file that might illustrate this?

Thanks.

Best regards,

Doug

Hi Doug,

Yes, it's a common artifact seen in over-compressed JPEGs. It's not a result of insufficient bit-depth or banding, but (I believe) adjacent quantisation blocks being reduced to the same single colour.
Here's an example with the TIFF substituted by a 'lossless' PNG.

(Attachments: IMG_0435.png and IMG_0435.jpg)


Regards,

Stuart
 

Bob Rogers

New member
It seems like perhaps a good analogy is found in audio.

It seems like jpg compression is similar to distortion. When an audio signal is distorted by an amplifier (or other audio component) the resulting sounds are different from the original, in ways that usually cannot be reversed.

When you read distortion specs for an amplifier, they too give you an idea of how different the signal is from the original, but not how important the differences are. And some people, through training or perhaps for other reasons, are more able to hear distortion than others.
 

Doug Kerr

Well-known member
Hi, Stuart,

Hi Doug,

Yes, it's a common artifact seen in over-compressed JPEGs. It's not a result of insufficient bit-depth or banding, but (I believe) adjacent quantisation blocks being reduced to the same single colour.

Sure. I'll have to think a little about how that happens.
Here's an example with the TIFF substituted by a 'lossless' PNG.
Great! Thanks so very much.

Best regards,

Doug
 

Doug Kerr

Well-known member
Stuart's nice images give me an opportunity to talk about "amount of information".

For those not familiar with the concept of quantifying the "amount of information" in a set of data, I plan to post a brief(!) tutorial on the subject shortly.

However, we could easily demonstrate that the amount of information (from an information theory standpoint) in Stuart's second image (encoded in a "lossy" encoding system - JPEG - and then decoded and turned into an image) is less than in his first image (encoded in a "lossless" encoding system - PNG - and then decoded and turned into an image).

Now consider a different source image, a bare tree almost in silhouette against a rather uniform sky. (Buzzards optional). We will handle it in the same two ways.

In the image decoded from the JPEG encoding, we will see the familiar compression artifacts - little ripples surrounding the branches, reminiscent of a cartoonist's convention for the branches shaking.

In this case, the amount of information (in the information theory sense) in the delivered image conveyed in JPEG form is greater than the amount of information of the delivered image conveyed in PNG form.
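For what it's worth, the "amount of information" in this zeroth-order, information-theoretic sense can be put to a crude number as the Shannon entropy of the decoded image's sample histogram; the sketch below assumes Pillow and NumPy, the file names are placeholders, and it says nothing at all about the value of that information - which is exactly the point:

    # Sketch: zeroth-order Shannon entropy (bits per 8-bit sample) of a decoded image.
    import numpy as np
    from PIL import Image

    def entropy_bits_per_sample(path: str) -> float:
        samples = np.asarray(Image.open(path).convert("RGB")).ravel()
        counts = np.bincount(samples, minlength=256)
        p = counts[counts > 0] / counts.sum()
        return float(-(p * np.log2(p)).sum())

    for name in ("tree.png", "tree_decoded_from_jpeg.png"):
        print(name, f"{entropy_bits_per_sample(name):.3f} bits/sample")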

Now that's not to say that this proves that there is no "loss" of information in the JPEG case. As an analogy, I can order a set of 48 excellent watches, and along the way, the package is rifled and 20 of them replaced by 25 cheap watches. Now, there are more watches when the shipment arrives than when it was sent, but I have in fact lost 20 good watches.

Still, I think the point is made that we need to be very careful when we speak of the loss of information to know just what we intend to mean by that.

Best regards,

Doug
 

Johan Nyberg

New member
Well, this thread certainly woke up!

Hello all.
This became very interesting, though sad to say I don't know enough information theory to follow all of you. I will try to grasp the general flow. I realise I mixed up information in the "Shannon" sense with information in the more or less common sense of meaning. I will read up on it in Doug's text, and perhaps do some more thinking. And picture taking too, I hope.

Yours
Johan.
 

Johan Nyberg

New member
How about loss of certainty?

Hello all.

After some, but not much, reading and thinking, I want to suggest that what is lost with irreversible data reduction is certainty. I guess this is some of my half-baked statistical thinking laced with systems theory, equally half baked. If you don't want to look at the mess in my head, then skip the rest.

The data coming off the camera sensor is a sample of the luminosities of the subject. In the case of a sensor with a Bayer filter it is a stratified sample with certain proportions between red, green and blue data points. (In this stage we have an enormous loss of data, but the sample size is supposedly enough to be representative of the subject.)
This sample is perhaps cleaned of outliers (dead or stuck pixels), weighted (dark frame subtraction) and scaled to simulate a change in sensitivity or to compensate for amplification; what I am getting at here are the different tricks regarding ISO settings. The output is the raw data stream/file.

The raw data is input to a simulation model of what a picture is to a human being. This is the raw converter, where the operator is able to tweak some parameters.
The output here is not "the truth", it is a product of the model design, i.e. the demosaicing algorithm, and the operator's preferences. Let us say that this output is the "original" image, because this represents the photographer's intentions. More or less.

The output from the raw converter may be input to a "lossy" compression algorithm, based on some model of human visual perception. For example a JPEG encoder. The output from this is a collection of data, a JPEG file, that can be input to another simulation model, a JPEG decoder, to build a representation of the "original" image. This representation can have the same pixel dimensions and bit depth and thus contain the same amount of data (no data loss, right?), but because of the generalisations made in the model, the certainty that the image is representative of the "original" interpretation of the subject becomes less the more severe the generalisation, or compression, is.

With "original" images that fit the model assumptions/ generalisations well, I suppose the uncertainty is less. Think kittens, sunsets, smiling couples etc. :)

Can this supposed loss of certainty be quantified? I guess not, partly because I believe that some of the image processing is based on heuristics and not on theory that can be expressed mathematically. Perhaps some metric could be found, some kind of confidence interval based on statistical properties of the original image and the model (JPEG encoder-decoder system) parameters.

Just another 2 pennies.

Johan
 

Doug Kerr

Well-known member
Upon reflection, I think that my position is not so much that "lossless/lossy" is a bad characterization of the distinction between two classes of compression/restoration algorithms, but rather that "reversible/non-reversible" is a better characterization. It says exactly what the distinction is, and does not require any esoteric contemplation to explain or justify it - "just what is it that is lost, and how much?".

If the process, end-to-end, restores exactly (e.g., bit-for-bit) the original body of data (as we must have, for example, when compressing computer object code in a ZIP file) it is reversible; if it doesn't, it isn't (that is, is non-reversible). Period.
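In code, that distinction is a one-line check. A sketch, with Python's standard zlib (the DEFLATE scheme used in ZIP and PNG) standing in for any reversible scheme and a JPEG round trip as the non-reversible one; the file names are placeholders:

    # Sketch: reversible vs. non-reversible, checked bit for bit.
    import io
    import zlib

    import numpy as np
    from PIL import Image

    payload = open("object_code.bin", "rb").read()
    assert zlib.decompress(zlib.compress(payload)) == payload  # reversible: always holds

    image = Image.open("original.png").convert("RGB")
    buffer = io.BytesIO()
    image.save(buffer, format="JPEG", quality=95)
    buffer.seek(0)
    round_trip = Image.open(buffer).convert("RGB")

    # Non-reversible: this is False for essentially any real photograph.
    print(np.array_equal(np.asarray(image), np.asarray(round_trip)))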

In other words, just say what you mean. Metaphors, no matter how clever or appealing, rarely do as well. They can help to explain, but rarely to define.

I'm reminded of an early discussion I had with my mother after a life insurance agent left our home. I asked her what was 'life insurance', anyway? She said that it was an arrangement so that if anything happened to my father, we would get some money. I (evidently already wise to the issue I discuss above) said, "Wow! That's great! Something happens to Dad every day." She said, "No, I meant if he dies". I said, "Well, why didn't you just say so?"

In technology, say what you mean. Leave metaphors up to the poets.

Best regards,

Doug
 

Johan Nyberg

New member
Doug, thank you.
I must tell that I found my way to this forum from DPR, after getting fed up with the flaming and the nasty tone that is a bit too common there, and also the low S/N.
I am glad to be here. It seems to me that discussion and friendly instruction are both possible in this forum, which I like. Having one's opinions scrutinised in a positive way is a privilege. Feel free to tell me when I am absurd.

Johan
 