Doug Kerr
If we delve into the processing of images in a photographic chain, we may well run into calculations done by matrix multiplication. An important example would be the transformation of colors represented by the coordinates of the CIE XYZ color space into their r,g,b coordinates (the linear coordinates that are the precursors of the R,G,B coordinates of the sRGB color space). Or we may be working with the transformation of the three outputs of a sensor into (an approximation of) r, g, and b, often done by simple matrix multiplication.
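As a concrete illustration of that first example, here is a minimal sketch of the XYZ-to-linear-r,g,b transformation in plain Python. The matrix values are the ones commonly published for sRGB (treat them as assumed here, not as a definitive reference):

```python
# Each row of M gives the weights for one output coordinate (r, g, b).
# Values are the commonly published sRGB XYZ-to-linear-RGB matrix.
M = [
    [ 3.2406, -1.5372, -0.4986],   # r
    [-0.9689,  1.8758,  0.0415],   # g
    [ 0.0557, -0.2040,  1.0570],   # b
]

def xyz_to_rgb(xyz):
    """Multiply the 3 x 3 matrix M by the column vector xyz."""
    return [sum(M[i][k] * xyz[k] for k in range(3)) for i in range(3)]

# The D65 white point (X, Y, Z) should come out near r = g = b = 1.
r, g, b = xyz_to_rgb([0.9505, 1.0000, 1.0890])
```

Each output coordinate is just a weighted sum of the three inputs, with one row of the matrix supplying the weights.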
It's very easy to get confused about matrix multiplication, as has regularly happened to me. In fact, just a few months ago, I needed the help of a colleague, the eminent color scientist Bruce Lindbloom, to straighten me out on just how it worked in a certain matter I was looking into.
So I thought I would pass on my new understanding of this so that, if you should have to deal with the matter, you won't have to ask Bruce how it works.
I am not here going to speak of how the matrix multiplication is actually calculated. Rather, I will limit this note to the often-baffling matter of just how a matrix multiplication is set up, and how we speak of the operation.
In work such as I mentioned above, we most often run into matrix multiplication involving a source matrix that is really a 3 × 1 "vector" of three values, a 3 × 3 matrix that does our "transformation", and a result matrix that is a 3 × 1 vector of three values, again as when transforming X, Y, Z to r, g, b.
But the symmetry of a 3 × 3 matrix can make it hard to clearly see how the "rules" of these operations work, so I will at first use an example with an asymmetrical matrix.
Figure 1a
Here the two matrixes we are multiplying are identified as A and B (never mind for now "which is multiplied by which", a matter that is rather tricky, and of which I will speak in due time.)
Matrix A is a "3 × 4" matrix (we always mention the number of rows first and then the number of columns; this is contrary to the notation for indexes and coordinates, just to keep us off-balance!) Matrix B is a "4 × 2" matrix, and matrix C is a "3 × 2" matrix. Why did I choose that particular set of sizes? Well, as you will see in a minute, once I got started there was little choice as to how to proceed.
Note that I have put the "result" (and the equals sign) on the left side of this equation. That is the way we most often find it done in mathematics. But it does not matter at all. If I were to move the equals sign to the right of A and B, and then put C to its right, nothing else would change. So I won't waste your time showing that variation in presentation.
There are some brackets and arrows that show how the dimensions of these three matrices are related. The relationship labeled "mandatory" is just that: unless the number of columns in the first factor is the same as the number of rows in the second factor, the multiplication is not defined. (In that statement and in similar ones to follow, "first" and "second" mean exactly that, and only that.) But the number of columns in the second factor can be whatever is needed to fit the overall calculation we need to make.
The other, thinner arrows show how the result, C, gets its dimensions. Its number of rows will be the same as the number of rows in the first factor; its number of columns will be the same as the number of columns in the second factor.
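Those two rules can be checked with a small sketch in plain Python (the numbers are made up purely for illustration). Multiplying a 3 × 4 matrix A by a 4 × 2 matrix B: the "mandatory" match is A's column count (4) with B's row count (4), and the result C inherits A's row count (3) and B's column count (2):

```python
def matmul(A, B):
    rows_a, cols_a = len(A), len(A[0])
    rows_b, cols_b = len(B), len(B[0])
    if cols_a != rows_b:   # the "mandatory" relationship
        raise ValueError("columns of first factor must equal rows of second")
    # Result: rows from the first factor, columns from the second.
    return [[sum(A[i][k] * B[k][j] for k in range(cols_a))
             for j in range(cols_b)] for i in range(rows_a)]

A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]   # 3 x 4
B = [[1, 0], [0, 1], [1, 1], [2, 2]]                # 4 x 2
C = matmul(A, B)                                    # 3 x 2
```

If you swap the factors here, `matmul(B, A)` raises the error: B has 2 columns but A has 3 rows, so that product is simply not defined.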
Got that?
Now that we've got those rules down pat (!), I'll move to examples involving a 3 × 3 matrix as one of the factors, since that is the case we will most often encounter, and in addition that case illuminates some further tricky aspects of this matter.
Figure 1b
This might be the situation in which B is the "vector" comprising the three outputs of a sensor, A is the transform matrix, and matrix C is the result (the r, g, and b values).
Now in normal "scalar" mathematics, if one way to set up a calculation is
C = A • B
we might (if only as a matter of "style") want to instead write it as:
C = B • A
The two are exactly equivalent, so which we do is indeed just a matter of "style".
But with matrix multiplication, we can't be nearly so sanguine. Here we see how we might, for some reason, rearrange the calculation seen just above:
Figure 1c
The first thing we see is that the source matrix, B, and the result matrix, C, are now "laid on their sides". This is an unavoidable consequence of the "rules" I discussed above regarding the dimensions of the various matrixes. The primes on their designations are just to recognize this seemingly trivial, but vital, difference.
Not visible, given the rather generic way I showed the matrixes, is that A, the transform matrix, must also be rearranged. In particular, it is flipped over its upper-left-to-lower-right diagonal axis. (Technically, it is said to be "the transpose" of the original A matrix. The double prime on its designation alerts us to this difference.)
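We can see this rearrangement at work in plain Python (with made-up numbers standing in for the sensor outputs and the transform). Laying B on its side (a row vector B') and flipping A over its diagonal (the transpose, A'') yields the same three numbers, now as a row:

```python
A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]   # the transform matrix (made up)
b = [10, 20, 30]                        # the source values, as a column

# Figure 1b arrangement: C = A . B  (3x3 times 3x1 gives 3x1)
C = [sum(A[i][k] * b[k] for k in range(3)) for i in range(3)]

# The transpose of A: flipped over the upper-left-to-lower-right diagonal.
At = [[A[i][j] for i in range(3)] for j in range(3)]

# Figure 1c arrangement: C' = B' . A''  (1x3 times 3x3 gives 1x3)
C_row = [sum(b[k] * At[k][j] for k in range(3)) for j in range(3)]
```

The two results hold the same three values; only the shape (column versus row) and the form of the transform matrix differ.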
We can often get into trouble when we speak of something being multiplied by something else.
Consider regular scalar multiplication. We might, in a narrative description of some calculation, say that "we then take the number of towers (10) and multiply it by the number of antennas on each tower (4) to get the total number of antennas needed (40)."
Now does that narrative description mean:
<number of antennas> = 4 • 10
or
<number of antennas> = 10 • 4
Well, that verbal description doesn't tell us. But that really doesn't matter, since the two formulations are exactly equivalent.
But as you can guess from my discussion above, with matrix multiplication, that ambiguity can't just be blown off. So if we say "we take the set of three outputs from the sensor, B, and multiply it by the transformation matrix, A", does that mean in particular what we see in figure 1b or what we see in figure 1c? As we've seen, those are not the same thing, so we have to understand which is meant.
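A quick check in plain Python (with two made-up 2 × 2 matrices, small enough that both orderings are defined) shows that the order genuinely changes the answer:

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]

AB = matmul(A, B)   # A times B
BA = matmul(B, A)   # B times A
# AB and BA come out different, so "multiply B by A" alone is ambiguous.
```

So, unlike the tower-and-antenna case, "multiply B by A" here names two different calculations with two different results.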
But fortunately there is terminology that can be used to describe exactly what we mean.
For example, if the specific calculation we have in mind is that seen in figure 1b, we can say, "we take the set of three outputs of the sensor, B, and left-multiply it by the transformation matrix, A."
Now do people often do that? No.
Best regards,
Doug