
The "offset" in Canon raw files

Doug Kerr

Well-known member
[This has been edited to revise what I say about one of the reference documents.]
I am a bit confused by the matter of the "offset" involved in the sensel value data in contemporary Canon raw files (.CR2).

While doing some study in this area, I have encountered two significant reference works:

• The "help" (User Guide) for the raw file analysis program Rawnalyze.

(I can provide this to anyone who is interested.)

• A paper entitled "Understanding What is stored in a Canon RAW .CR2 file, How and Why" by "clevy":

http://lclevy.free.fr/cr2

I have been led to believe that sensel values in the Canon raw files involve a defined offset, perhaps 256, or 1024, or 2048, depending on the bit depth of the file, the camera model, and perhaps even the ISO value in effect.

This is seemingly for the familiar purpose in integer representation schemes: to accommodate negative values. While of course a negative sensel value is meaningless from a photometric standpoint, negative values can certainly occur as a result of noise; thus we need to be certain that our number scale is not clipped prematurely.

The Rawnalyze documentation seems to suggest this:

In the raw file there is a "table" with an entry for each sensel position stating the "black level" for that sensel. This is the numerical value (DN) that should be taken as meaning a zero photometric exposure ("black") with regard to that individual sensel.

Rawnalyze, for example, reports the minimum, maximum, and average values of these black levels over the entire file being examined (presumably, over all sensels).

Rawnalyze also uses the lowest of these black level values as:

• The beginning of the DN scale of its basic histogram display.
• The initial ("default") value of its Black Point setting.

In a sample 40D raw file I have been looking at, these black level statistics are:

Minimum: 951; maximum: 1097; average: 1022

In a test file (all sensels "well blown"), the statistics are:

Minimum: 973; maximum: 1100; average: 1021

The Rawnalyze documentation indicates that, in the interpretation of the raw sensel data, first, from each value is subtracted the recorded black level for that particular sensel, giving a value on a new scale in which zero represents black. These "black corrected" values are used in the further interpretation of the raw data. There is some discussion of the fact that "negative" values are clipped by this process.
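As a minimal sketch of the per-sensel subtraction just described (not Rawnalyze's actual code, which I have not seen), the operation might look like this. The function name and the sample values are made up for illustration; the clipping of negative results to zero follows the documentation's remark.

```python
def black_correct(raw_values, black_levels):
    """Subtract each sensel's own black level; clip negative results to zero.

    Both arguments are flat lists of data numbers (DN), one entry per sensel.
    The name and signature are hypothetical, chosen for this illustration.
    """
    if len(raw_values) != len(black_levels):
        raise ValueError("need exactly one black level per sensel")
    return [max(0, dn - bl) for dn, bl in zip(raw_values, black_levels)]


# Made-up sample values in the general range seen for the 40D
raw = [1030, 1019, 1500, 960]
black = [1022, 1023, 1020, 975]
print(black_correct(raw, black))  # [8, 0, 480, 0]
```

Note that the second and fourth sensels would come out negative (-4 and -15) and are clipped to zero, which is the loss the documentation discusses.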

My initial thoughts on the information in the "clevy" paper were off the mark, and I am studying it further to see where it fits into this picture.

Is it possible that the "offset" model is just an attempt to simplistically describe the real situation? Or to force into the Canon situation a concept learned in another camera family? Or have I missed something important here?

I'd appreciate any thoughts on this from my colleagues here.

Best regards,

Doug
 

Doug Kerr

Well-known member
Perhaps there is not so much of a mystery here as it had seemed. It appears that the notion of a stable offset in the raw file (1024 for my 40D) is quite realistic.

I took several files that I had shot "as all black as was practical" and examined them with Rawnalyze. In them, for the entire image, or for any substantial-sized area within it, the sensel values (all three "flavors") had an average of 1024 (certainly consistent with a scale that ran from 1024; that is, an offset of 1024).

In the one file, the actual sensel values ran from 952 (a blue one) to someplace about 1075. (I have to say it that way since there were a handful of "hot" sensels that ran considerably higher.)

But now what about this "black level" matter? Rawnalyze reported that the black levels (which are implied to be per sensel) run from 975 to 1084 for this file. Where did it get these? Are they in fact stored in the file? We often hear of determining black levels from "masked" sensels in the array (usually getting a single value from that), and in fact that might well happen inside the camera, but we certainly don't see that in the Canon raw files, as far as I can tell.

Now "clevy", in his discussion of the Canon raw format, tells us that the file contains three black level values (one per "channel"). He is apparently unable to read these directly, but deduced them from the corresponding fields in DNG files blown from his Canon raw files. He gives the actual black levels, for all three channels (but global, not per sensel), as 1023 (for files from a 5D2). (We can easily understand a discrepancy of one unit based on how these things are defined.)

The DNG format, by the way, provides several ways to record black levels:

• For each sensel
• A global value for each "channel".
• A global baseline value for each "channel", with increments (applying to all channels) for the various rows, and the various columns, of the sensor (based on the notion that there would be a pattern in the black levels).

It sounds as if the DNG file for the Canon raw files utilizes the "global for each channel" approach, intimating that this is what we have in the Canon raw file itself.
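As I understand the DNG specification, these modes combine additively: the black level for a given sensel is the entry from a small repeating pattern (the BlackLevel tag), plus a per-column delta (BlackLevelDeltaH), plus a per-row delta (BlackLevelDeltaV). Here is a rough sketch of that arithmetic; the function name, argument layout, and sample numbers are my own assumptions, not taken from any real file.

```python
def dng_black_level(row, col, pattern, delta_h=None, delta_v=None):
    """Effective black level for the sensel at (row, col).

    pattern : 2-D list holding the repeating BlackLevel values
              (for example 2x2 for a Bayer CFA)
    delta_h : optional list with one delta per column (BlackLevelDeltaH)
    delta_v : optional list with one delta per row (BlackLevelDeltaV)
    """
    base = pattern[row % len(pattern)][col % len(pattern[0])]
    dh = delta_h[col] if delta_h is not None else 0.0
    dv = delta_v[row] if delta_v is not None else 0.0
    return base + dh + dv


# Hypothetical values: one global level per CFA position, plus row deltas
pattern = [[1023, 1023], [1023, 1023]]
delta_v = [1.0, -2.0, 0.5]
print(dng_black_level(1, 0, pattern, delta_v=delta_v))  # 1021.0
```

With no deltas supplied, this reduces to the "global value for each channel" case.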

I'll be doing some more probing of the soul of Rawnalyze for an insight into the residual mysteries regarding "black level correction".

Best regards,

Doug
 

Doug Kerr

Well-known member
It begins to look as if this issue of the "offset" vs. "black level" is a matter of how one chooses to look at it.

With regard to what we have looked at as an offset, a creature of the number scale used for the raw sensel data, I think Gabor Schorr (the developer of Rawnalyze; sadly, it appears that he died earlier this year) looked at it as "the normal value of the black level."

(With regard to the DNG file format, the documentation there takes the same outlook - there is no place to define an offset, nor any mention of the concept, but the black level can be either global or by sensel.)

It still looks as if the black level is stored in the Canon raw file on a per sensel basis. (lclevy's writings suggest not, but perhaps that is the case for the EOS 5D2 - the model he specifically mentions.)

I took a look at a 40D raw file with Rawnalyze, taking a number of specific sensels and noting their values both "as they arrived" and after the black level correction was applied. The apparent black level varied from sensel to sensel, but was always fairly close to 1024 (three particular sensels I examined apparently had black level values of 1019, 1023, and 1020, at least with this one file in place).

(Yes, this is worse than pixel peeping!)

Best regards,

Doug
 
I took a look at a 40D raw file with Rawnalyze, taking a number of specific sensels and noting their values both "as they arrived" and after the black level correction was applied. The apparent black level varied from sensel to sensel, but was always fairly close to 1024 (three particular sensels I examined apparently had black level values of 1019, 1023, and 1020, at least with this one file in place).

When you make a histogram of a black frame (body cap, viewfinder eyepiece blocked, no exposure, shortest shutter time), it will show as a nice Gaussian-like distribution around the offset value for that camera, with the Gaussian widening with ISO.

(Yes, this is worse than pixel peeping!)

Yes, they aren't even pixels yet, they're data numbers.

Cheers,
Bart
 

Mike Shimwell

New member
(Yes, this is worse than pixel peeping!)


Didn't all this start with the observation that Canon give us amplifier-gain-implemented ISO settings so that we can get lower noise and higher dynamic range for a given exposure than would be the case if we simply pushed the base ISO in raw conversion? I know it wasn't put like that, but that seems to be the discovery?

:)

Mike
 

Doug Kerr

Well-known member
Hi, Mike,
Didn't all this start with the observation that Canon give us amplifier-gain-implemented ISO settings so that we can get lower noise and higher dynamic range for a given exposure than would be the case if we simply pushed the base ISO in raw conversion?
I suspect that control of ISO sensitivity by analog amplifier gain was used very early in most digital camera design, but I'm not sure.

In any case, I don't think the issue here is related to that notion.

The issue in my inquiry here is, essentially, does the "offset" (black level) in Canon raw files vary sensel-to-sensel? It was prompted by the apparent indication, in the "help" for Rawnalyze, that such is so.

Best regards,

Doug
 

Doug Kerr

Well-known member
Hi, Bart,

When you make a histogram of a black frame (body cap, viewfinder eyepiece blocked, no exposure, shortest shutter time), it will show as a nice Gaussian-like distribution around the offset value for that camera, with the Gaussian widening with ISO.
Yes, I can see that in Rawnalyze.

If I make Rawnalyze apply "black level subtraction", we now see the right-hand half of a similar distribution about zero.

I might expect its standard deviation to be less, based on the presumption that the departures from the offset were the sum both of:

• variations in the black level sensel-by-sensel
• noise

We might expect that the black level subtraction would remove the first component and leave only the second, giving a more compact distribution (lower sigma).

I'm not able yet to analyze this numerically, but visual inspection suggests that this result does not obtain. That is, the "spread" of the data values is just as broad after subtracting the pixel-by-pixel black levels.
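To make the expectation concrete, here is a small simulation of the two-component model, under the assumption that the per-sensel black level variation and the noise are independent and roughly Gaussian. The sigma values are hypothetical, picked only so the arithmetic is easy to follow; they are not measurements from my camera. Under that model the variances add, so removing the fixed component should shrink sigma from about sqrt(6^2 + 8^2) = 10 to about 8.

```python
import random
import statistics

rng = random.Random(42)   # fixed seed so the run is repeatable
N = 20000                 # number of simulated sensels
OFFSET = 1024
SIGMA_FIXED = 6.0         # hypothetical spread of per-sensel black levels
SIGMA_NOISE = 8.0         # hypothetical random noise per exposure

# Each sensel gets its own "true" black level, then noise is added on top
black_levels = [OFFSET + rng.gauss(0, SIGMA_FIXED) for _ in range(N)]
raw = [bl + rng.gauss(0, SIGMA_NOISE) for bl in black_levels]

# Subtract the per-sensel black level; negatives are kept (no clipping)
# so the two spreads can be compared fairly
corrected = [dn - bl for dn, bl in zip(raw, black_levels)]

sigma_before = statistics.pstdev(raw)
sigma_after = statistics.pstdev(corrected)
print(f"sigma before: {sigma_before:.2f}, after: {sigma_after:.2f}")
```

If the real data do not show this narrowing, then either the subtracted values are not a true per-sensel fixed pattern, or that component is small compared with the noise.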

I have confirmed that the amount subtracted is not uniformly 1024 but rather varies sensel-by-sensel on any given shot. But it does not appear that this is a "fixed" table for the camera - it differs between shots (even for a consistent ISO setting). (I don't yet know if its overall "pattern" is consistent or not.)

Rawnalyze reports, for each shot, the range of the black levels. Typically the minimum is in the general area of 940-980, while the maximum is in the general range of 1070-1090. The average is often 1023 or nearby.

On the other hand, the average of the sensel values in the file (before black level subtraction) is generally 1024, and in almost any case, in the range 1023-1025.

I plan to contact some developers of raw development software and see what they can (and are willing to) tell me about this.

Do you have any suggestions in that regard?

Thanks so much for your input into this.

Best regards,

Doug
 

Doug Kerr

Well-known member
Laurent Clévy (author of the "lclevy" paper on Canon raw files) was kind enough to call my attention to a Canon patent application (US 2008/0291290 A1) which he said might have some bearing on this issue.

The actual techniques taught by the patent are ingenious, but most important were its implications for what needed to be done.

The topic was correction for black level variations across the sensor, which we can think of as correction for "fixed pattern noise".

The technique in the patent did this on a column-by-column basis, in a very sophisticated way, allowing quite precise measurement of the average black level for each column. This is done freshly for each exposure, and uses "masked pixels" at the top and/or bottom edges of the sensor.

Although this technique is extremely clever, it is not what I report here. Rather, the significant issue to me at this point is the intimation that the black level correction can be done by column (not by sensel). That suggests that the variation is systematic by column, and thus that it results from column-common components (such as column read amplifiers).
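Just to illustrate the general idea of a per-column correction (this is a plain column average, emphatically not the far more sophisticated method in the patent), one could estimate each column's black level from the masked rows and subtract it from every sensel in that column. The function names and the numbers are invented for the example.

```python
def column_black_levels(masked_rows):
    """Average the masked (light-shielded) sensels in each column."""
    n_rows = len(masked_rows)
    n_cols = len(masked_rows[0])
    return [sum(row[c] for row in masked_rows) / n_rows for c in range(n_cols)]


def subtract_column_black(image_rows, col_black):
    """Subtract the column's black level from every sensel in that column."""
    return [[dn - col_black[c] for c, dn in enumerate(row)] for row in image_rows]


# Two hypothetical masked rows and one image row, three columns wide
masked = [[1020, 1026, 1024],
          [1022, 1028, 1024]]
image = [[1100, 1100, 1100]]

levels = column_black_levels(masked)
print(levels)                                 # [1021.0, 1027.0, 1024.0]
print(subtract_column_black(image, levels))   # [[79.0, 73.0, 76.0]]
```

The attraction is the data load: one value per column rather than one per sensel.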

That in turn suggests that in the Canon Raw files the black level values are not per sensel but rather per column (a much smaller data load, of course).

I note that in the DNG file (and it turns out that this is paralleled for the Exif file), there is provision for black level information that is global, by column, by row, or by sensel.

I will do some poking around in Rawnalyze and see if in fact the black levels seem consistent by column (or by row - which direction is the "row" and which the "column" do not necessarily match our conventions for that).

All very interesting.

Best regards,

Doug
 

Doug Kerr

Well-known member
I have not been able to access with ExifTool (especially in its convenient form, via the ExifToolGUI front end) the Exif tags that I thought would carry the key to the matter of the black levels in a Canon CR2 file. (ExifToolGUI only displays certain tags, as chosen by the author, and those are seemingly too obscure to be included.)

Just for kicks, I took one of my sample CR2 files and had Photoshop CS5 blow a DNG file from it. Then I looked at that with ExifTool(GUI).

Aha! For this filetype, ExifToolGUI does display the critical tags. (They have the same names and ID's in Exif and DNG files.)

They show that the CR2 file uses the black level mode in which the values are given for each row! The tag in which these per-row values are carried (BlackLevelDeltaV) is shown as having 30591 bytes. That is curious, so there is some mystery still.

The per-row black level values are of the data type SRATIONAL (signed rational), which in this area of work is an 8-byte data type. The CR2/DNG file has 2600 rows; thus we might expect 20800 bytes of data. So I am missing something here yet. But we'll crack it! (There may be a problem in ExifTool; one of the other tags in this area seems to have the wrong data length as well.)
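The byte count I expected can be checked quickly. An SRATIONAL is a pair of signed 32-bit integers (numerator, then denominator), so 8 bytes per value. The little-endian byte order and the sample fraction below are assumptions made only for the illustration.

```python
import struct

# Two signed 32-bit integers: numerator, denominator (little-endian assumed)
SRATIONAL_FORMAT = "<ii"

size = struct.calcsize(SRATIONAL_FORMAT)
rows = 2600
print(size)         # 8
print(size * rows)  # 20800

# Round trip of one hypothetical value, 2047/2 = 1023.5
packed = struct.pack(SRATIONAL_FORMAT, 2047, 2)
numerator, denominator = struct.unpack(SRATIONAL_FORMAT, packed)
print(numerator / denominator)  # 1023.5
```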

I'm taking on some other metadata examining tools, and I may be able to get a better look inside that tag. (Actually, I should be able to have ExifTool (itself) deliver the entire content of the tag, but I am still having some trouble driving ExifTool from the command line, so I'm not there yet.)

Best regards,

Doug
 

Doug Kerr

Well-known member
Well, it turns out that when ExifTool(GUI) reports that there are 30591 bytes in the BlackLevelDeltaV tag, that means 30591 bytes (that is, characters) in the text listing of all the numerical values (which have varying numbers of characters, as trailing zeroes are not included).

But it turns out that there are in fact 2596 values - the number of rows in the raw data. (There are 4 more rows than pixels in the developed image - this is needed to provide for the working of the CFA demosaicing for pixels near the edge of the image.)
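The difference between the two ways of counting is easy to show. In this sketch I assume the values are listed as decimal text separated by single spaces, with trailing zeroes dropped; the exact formatting ExifTool uses may differ, and the three sample values are made up.

```python
def text_listing(values):
    """Render values as space-separated decimal text, dropping a trailing '.0'."""
    parts = []
    for v in values:
        s = repr(float(v))
        if s.endswith(".0"):
            s = s[:-2]
        parts.append(s)
    return " ".join(parts)


values = [1024.0, 1023.5, 1023.25]   # hypothetical per-row values
listing = text_listing(values)

print(listing)           # 1024 1023.5 1023.25
print(len(listing))      # 19 characters of text
print(len(values) * 8)   # 24 bytes as binary SRATIONAL values

# For the real tag: 30591 characters spread over 2596 values
print(round(30591 / 2596, 2))  # 11.78 characters per value, on average
```

So the character count of the listing tells us nothing directly about the size of the binary data; only the count of values does.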

Most of the values are near 1024.

There are still some mysteries, but I think I have the picture now.

Best regards,

Doug
 