The DCI standard for digital distribution of motion pictures - Part 1
Over the years, various matters of technology and technique have flowed back and forth between the fields of "still photography" and "cinematography". The same is true for the fields of "motion picture sound" and "sound recording".
An interesting example of the adoption of technology generally associated with still photography into the new realm of digital cinematography (which term implies work that is different, although not in any clear way, from the fields of "television" and "video") is a set of technical standards for the distribution of motion pictures in digital form to theaters with digital projection facilities.
The centerpiece of this structure is an overall architectural standard developed by Digital Cinema Initiatives, LLC (DCI), a consortium of six major motion picture studios, each with exhibition arms. (Metro-Goldwyn-Mayer had been an original member, but withdrew shortly before the first standard was issued - I do not know the back story there.)
Although the DCI standard is quite detailed, there are nevertheless numerous further details covered by a substantial suite of standards from SMPTE, the Society of Motion Picture and Television Engineers, and other cognizant standards bodies. Here, I will only discuss the DCI standard itself, and only the most prominent portions of its "picture" aspect.
To me, the fascinating thing about the DCI standard is that it involves numerous technical concepts with which we, as digital still camera enthusiasts, are familiar, but many of them in wholly unexpected ways.
Here we go.
The basic DCI standard provides for two basic "format size families", whose maximum pixel dimensions are:
• 4096 x 2160 px (the "4k" family) [The family designation means that the maximum format width is in the vicinity of 4Kipx: 4096 pixels.]
• 2048 x 1080 px (the "2k" family)
Four important specific formats exactly mimic image aspect ratios of the two common contemporary film distribution formats:
• 3996 x 2160 px (4k version), 1998 x 1080 px (2k version), aspect ratio 1.85:1. These are called the "flat" formats, a term carried from film distribution, where it means (through some strange logic) that the image is on the film in its final aspect ratio, not "squeezed". This is the format used for a number of years for most "sort-of-wide screen" films.
• 4096 x 1716 px (4k version), 2048 x 858 px (2k version), aspect ratio 2.39:1. These are called the "scope" formats, a term carried from film distribution, where it means (by reference to "Cinemascope", an important format of this genre) that the image is on the film "squeezed" such that it must be horizontally expanded, by a factor of 2, in projection to attain the actual viewed aspect ratio. This is the format used for a number of years for most "seriously wide screen" films.
By the way, the pixels are "square" (same pitch along each axis). There is no "squeezing" implied by the digital "scope" formats.
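Since the pixels are square, the aspect ratios quoted above follow directly from the pixel dimensions. A minimal sketch to check the arithmetic (the dictionary and function names are my own, chosen for illustration, and are not from the DCI standard):

```python
# Pixel dimensions as quoted in the post; the names here are illustrative only.
DCI_FORMATS = {
    "4k flat": (3996, 2160),
    "2k flat": (1998, 1080),
    "4k scope": (4096, 1716),
    "2k scope": (2048, 858),
}


def aspect_ratio(width_px, height_px):
    """Width-to-height ratio of the image, assuming square pixels."""
    return width_px / height_px


for name, (w, h) in DCI_FORMATS.items():
    print(f"{name}: {w} x {h} px -> {aspect_ratio(w, h):.2f}:1")
```

The "flat" formats come out at 1.85:1 and the "scope" formats at a little under 2.39:1, which rounds to the quoted figure.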
There is no concept of "progressive" vs. "interlaced", a notion that should really never have entered into the matter of digital video at all. (But don't get me started on that here!) If one must draw the comparison, we could say that the DCI formats are all "progressive" formats.
Two frame rates are possible for these formats, 24 f/s and 48 f/s. The 48 f/s rate is only applicable to the 2k forms of the formats.
[To be continued in part 2.]
The DCI standard for digital distribution of motion pictures - Part 2
The color of each pixel is encoded under the CIE XYZ color model. This model represents color in an "absolute" way, with each trio of values (X,Y,Z) implying an absolute chromaticity and, ordinarily, a luminance normalized to some arbitrary maximum value.
In the DCI application of this model, however, the luminance implication is absolute; the set of three coordinates describing the white point describes light of a certain chromaticity (x=0.3140, y=0.3510 on the CIE x-y chromaticity plane) and an explicit luminance, Y, of 48.00 cd/m^2. Thus, the absolute luminance of each pixel, on the screen, is directly implied by the digital image encoding. (We assume that this may be tampered with by the projector operator.)
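To make the white point concrete, the chromaticity and luminance above can be turned into an (X, Y, Z) trio using the usual CIE relationships x = X/(X+Y+Z) and y = Y/(X+Y+Z). This is only a sketch of that general conversion; the function names are mine, and nothing here is taken from the text of the DCI standard beyond the three numbers quoted above.

```python
def xyY_to_XYZ(x, y, Y):
    """Convert CIE chromaticity (x, y) plus luminance Y to tristimulus X, Y, Z."""
    X = (x / y) * Y
    Z = ((1.0 - x - y) / y) * Y
    return X, Y, Z


def XYZ_to_xy(X, Y, Z):
    """Recover the chromaticity coordinates (x, y) from X, Y, Z."""
    total = X + Y + Z
    return X / total, Y / total


# White point quoted in the post: x = 0.3140, y = 0.3510, Y = 48.00 cd/m^2
X_w, Y_w, Z_w = xyY_to_XYZ(0.3140, 0.3510, 48.00)
print(f"X = {X_w:.2f}, Y = {Y_w:.2f}, Z = {Z_w:.2f}")
```

With those inputs, X works out to roughly 42.9 and Z to roughly 45.8, in the same units as Y.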
The XYZ model is a "tristimulus" model (in fact, in some cases it is called the tristimulus model). That means that it describes a color in terms of a recipe of the amounts of three "primaries" that, if combined, will produce the color being described.
In the DCI standard, the actual values recorded are "gamma precompensated" for a "gamma" of 2.6. That is, the values of X, Y, and Z for the pixel are each scaled to a certain reference value, the 2.6 root of the result taken, and that then multiplied by 4095 and digitized to give a 12-bit number. Those three 12-bit numbers describe the color of each pixel. The gamma-precompensated values are called X', Y', and Z'. (The 12-bit numbers have slightly different designations.)
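The gamma precompensation just described (scale, take the 2.6 root, multiply by 4095, digitize) can be sketched as follows. Several things here are my assumptions rather than statements from the post: the reference value of 52.37 (which is the figure I understand the DCI specification to use, but treat it as a placeholder), rounding to the nearest integer as the "digitizing" step, and clamping out-of-range inputs. The function names are hypothetical.

```python
GAMMA = 2.6
MAX_CODE = 4095          # largest 12-bit value
REFERENCE = 52.37        # ASSUMPTION: normalizing constant, not given in the post


def encode(value, reference=REFERENCE):
    """Gamma-precompensate one of X, Y, or Z into a 12-bit code value."""
    normalized = min(max(value / reference, 0.0), 1.0)  # clamp (my addition)
    return round(MAX_CODE * normalized ** (1.0 / GAMMA))


def decode(code, reference=REFERENCE):
    """Invert encode(): 12-bit code value back to a linear X, Y, or Z."""
    return reference * (code / MAX_CODE) ** GAMMA


# Luminance of the white point quoted earlier, Y = 48.00 cd/m^2
y_code = encode(48.00)
print(y_code, f"{decode(y_code):.2f}")
```

Under those assumptions the white-point luminance lands a little below the top of the 12-bit range, leaving some headroom, and decoding the code value returns the original luminance to within one quantizing step.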
The chromaticity gamut of the CIE XYZ color model itself embraces all chromaticities (that is, the chromaticities of all visible radiation), and in turn all colors (since any luminance may be involved, even if through choice of the "reference maximum" luminance).
However, legitimate colors in the DCI system are limited to a finite gamut. It is defined by the black and white points of the DCI color space and three specific color points, a "red", a "green", and a "blue"; each has a defined chromaticity and luminance. The implication is that the gamut of the projector must be at least as extensive as if it used those colors as its primaries in an RGB implementation.
This defined gamut may turn out to be a polyhedron of six faces (something like a double triangular pyramid) in CIE x-y-Y space, but I am not certain. I am at present not prepared to discuss how this relates to the gamuts with which we normally work in our image color spaces.
As this is not a "luminance-chrominance" color coordinate system (such as we typically find in JPG still camera image files), there is no concept of "chrominance subsampling". The color of each pixel is directly, fully, and absolutely described.
The picture information is data-compressed using the JPEG 2000 scheme. Each image frame (which we will see shortly will become a single file in the output data!) is processed independently.
The compression will not necessarily be strictly reversible (which is usually, inaptly, called "lossless"). There are limits on the compressed data rates that may impose compromises in that regard.
Let me note before we proceed that there is provision for considering the overall presentation as being divided, for "show management" purposes, into "reels" of arbitrary duration. (In film distribution, a "reel" typically consisted of at most about 22 minutes of content.)
Each frame of the presentation is prepared as a separate TIFF file, using the "RGB" form, except that the 12-bit forms of X', Y', and Z' are placed in the locations normally used for R, G, and B (which are really R', G', and B', by the way). Each value is padded to 16 bits with leading zeros.
Each frame's TIFF file has a filename of this form:
Bambi_on_Mars.Reel_1.00001.tif
where "Bambi_on_Mars" is the title, the reel identification is as earlier discussed, and the number (00001 in this example) is the sequential number of the frame. Presumably, the frame number field can be longer when required; there must be a consistent number of digits in each field (with leading zeros as required). (A three-hour presentation would have about 260,000 frames, possibly in a single reel.)
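The filename pattern and the frame count are easy to check with a few lines of code. This is only an illustration of the naming form shown above; the function names and the `digits` parameter are my own, and the rule for choosing the field width is a guess based on the "consistent number of digits" requirement.

```python
def frame_count(hours, frames_per_second=24):
    """Number of frames in a presentation of the given length."""
    return int(hours * 3600 * frames_per_second)


def frame_filename(title, reel, frame_number, digits=5):
    """Build a name of the form Title.Reel_N.00001.tif with a zero-padded frame number."""
    return f"{title}.Reel_{reel}.{frame_number:0{digits}d}.tif"


total = frame_count(3)                 # three hours at 24 f/s
digits_needed = len(str(total))        # width needed if it is all one reel

print(frame_filename("Bambi_on_Mars", 1, 1))
print(total, digits_needed)
print(frame_filename("Bambi_on_Mars", 1, total, digits=digits_needed))
```

Three hours at 24 f/s is 259,200 frames, so a single-reel presentation of that length would need a six-digit frame number field rather than the five digits in the example.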
A separate directory is used for the frames of each reel. An example would be:
Bambi_on_Mars.Reel_1
I do not know what file system is to be used.
The collection of all the frame files, plus the files carrying the sound and other ancillary information, none of it yet compressed or (where applicable) encrypted, for an entire presentation is called a DCDM (Digital Cinema Distribution Master). A DCDM, with the data compressed, and encryption (if applicable) applied, and carried by a physical medium, transmission data stream, or the like, is called a DCP (Digital Cinema Package). This is what is delivered to the exhibiting theaters, typically over a satellite link.
Whatever metadata is provided appears in the file for every frame. At a minimum, these metadata items are provided: Number of horizontal pixels, Number of vertical pixels, Frame rate, Number of frames in the "sequence" (not quite sure what that means).
There are elaborate arrangements for carrying the sound. These are beyond the scope of this note.
There are elaborate provisions for carrying subtitles and other kinds of "timed text". These are beyond the scope of this note.
Security and rights management
There are elaborate provisions (including the use of encryption) for security and rights management. These are beyond the scope of this note.
Total data load
The total data load per hour of picture plus sound depends on how the compression goes. Evidently, from estimates given in the DCI standard, we might expect a 3-hour feature film itself (picture and sound) to come up to something in the area of 150 GB. Not bad, by today's standards.
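To put the 150 GB estimate in perspective, here is the corresponding average data rate and average compressed size per frame. This is back-of-the-envelope arithmetic only: I am assuming "GB" means 10^9 bytes, a frame rate of 24 f/s, and I am lumping picture and sound together, so the per-frame figure is slightly overstated. The function names are mine.

```python
def average_bit_rate_mbps(total_gigabytes, hours):
    """Average data rate in Mbit/s, taking 1 GB = 10**9 bytes (assumption)."""
    total_bits = total_gigabytes * 1e9 * 8
    seconds = hours * 3600
    return total_bits / seconds / 1e6


def average_bytes_per_frame(total_gigabytes, hours, frames_per_second=24):
    """Average compressed size of one frame, in bytes (sound included)."""
    frames = hours * 3600 * frames_per_second
    return total_gigabytes * 1e9 / frames


rate = average_bit_rate_mbps(150, 3)
per_frame = average_bytes_per_frame(150, 3)
print(f"{rate:.0f} Mbit/s average, about {per_frame / 1e6:.2f} MB per frame")
```

That works out to an average of roughly 111 Mbit/s, or a bit under 0.6 MB per frame.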
Thanks for your contribution. I'll have to take some time to read it carefully before I can comment if necessary/useful. I expect to learn more than I can contribute.
If you do what you did, you'll get what you got.