Color can be mathematically represented in several different ways within a video system. The most obvious color-representation model is RGB, in which red, green, and blue are represented directly. Another is YCbCr, in which Y represents the black-and-white brightness or luminance, Cb represents color along the blue-yellow axis, and Cr represents color along the red-green axis. At the 2017 SMPTE tech conference, Dolby demonstrated the advantages of yet another color-representation model called ICtCp.
ICtCp is very similar to YCbCr. The letter “I” represents the black-and-white intensity, Ct represents color along the blue-yellow axis, and Cp represents color along the red-green axis. (The letter “t” stands for tritan, which refers to the blue-yellow axis of human vision, and “p” stands for protan, which refers to the red-green axis of human vision.) In both cases, the two “C”s (Cb/Cr or Ct/Cp) are called color-difference channels because the math used to define them relies on differences between the colors.
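To make the color-difference idea concrete, here is a minimal Python sketch of the BT.709 version of the math, assuming gamma-encoded R’G’B’ values in the 0-to-1 range. The scale factors 1.8556 and 1.5748 simply stretch the differences to fit a −0.5-to-0.5 range:

```python
def rgb_to_ycbcr_bt709(r, g, b):
    """Convert gamma-encoded BT.709 R'G'B' (each 0-1) to Y'CbCr.

    Y' is a weighted sum of the three channels; Cb and Cr are scaled
    *differences* between blue/red and Y', which is why they are
    called color-difference channels.
    """
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b   # luma (BT.709 weights)
    cb = (b - y) / 1.8556                       # blue-difference channel
    cr = (r - y) / 1.5748                       # red-difference channel
    return y, cb, cr

# A neutral gray has no color difference, so both chroma channels vanish.
print(rgb_to_ycbcr_bt709(0.5, 0.5, 0.5))
```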
You will sometimes see the term Y’CbCr. The prime marker indicates that the “Y” channel is nonlinearly encoded: scaled to the range from 0 to 1 by the inverse of the EOTF (electro-optical transfer function), which is gamma for standard dynamic range (SDR) and typically PQ or HLG for high dynamic range (HDR). This helps differentiate the nonlinear “Y’” (luma) in Y’CbCr from the linear luminance “Y” in the XYZ color-representation model.
The “I” in ICtCp is also scaled according to the EOTF. No prime marker is used in this case because there is no other “I” in video nomenclature with which it could be confused.
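For reference, here is a small Python sketch of the PQ encoding step, the inverse EOTF defined in SMPTE ST 2084, which maps absolute luminance in nits to a 0-to-1 code value:

```python
def pq_encode(nits):
    """Map absolute luminance (0-10,000 nits) to a PQ code value in 0-1.

    This is the inverse EOTF of SMPTE ST 2084; the constants come
    straight from the spec.
    """
    m1 = 2610 / 16384
    m2 = 2523 / 4096 * 128
    c1 = 3424 / 4096
    c2 = 2413 / 4096 * 32
    c3 = 2392 / 4096 * 32
    y = nits / 10000.0                 # normalize to the PQ reference peak
    y_m1 = y ** m1
    return ((c1 + c2 * y_m1) / (1 + c3 * y_m1)) ** m2

# 10,000 nits (the PQ ceiling) encodes to exactly 1.0, and 100 nits
# (a typical SDR peak white) lands near code value 0.51.
```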
Another bit of nomenclature you should know is that EOTF-encoded YCbCr using the BT.2020 color primaries is written Y’C’bC’r. The prime markers on Cb and Cr indicate that they represent BT.2020 colors, which use different equations than Cb and Cr for BT.709 colors. The two systems are not interchangeable, so prime markers were added to differentiate between the two sets of color primaries. Very confusing, I know, but that’s the way it is.
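A quick Python illustration of why the two sets of equations are not interchangeable: the same nonlinear R’G’B’ pixel yields different luma values under the BT.709 and BT.2020 weights, so a decoder must apply the same matrix the encoder used. (The pixel value here is arbitrary, chosen only to show the mismatch.)

```python
# Luma weights differ between the two standards.
BT709  = dict(kr=0.2126, kg=0.7152, kb=0.0722)
BT2020 = dict(kr=0.2627, kg=0.6780, kb=0.0593)

def luma(r, g, b, k):
    """Weighted sum of nonlinear R'G'B' using a standard's luma weights."""
    return k["kr"] * r + k["kg"] * g + k["kb"] * b

# Decoding with the wrong matrix shifts brightness and color.
pixel = (0.2, 0.5, 0.9)
print(luma(*pixel, BT709), luma(*pixel, BT2020))
```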
Why use these color models at all? Why not just use RGB? From a historical perspective, the luminance/color-difference approach behind YCbCr was developed in the 1950s (in analog form, as YIQ and YUV) so that the new color-television signals would be backward compatible with black-and-white TVs. Those older TVs responded only to the luminance channel of the signal and ignored the color-difference channels.
Another reason to use a color-representation model with color-difference channels is the ability to reduce the amount of information that must be represented. The human visual system is more sensitive to fine detail and variation in brightness than it is to fine detail and variation in color. So, if the brightness information is separated from the color information, the amount of color information can be reduced while maintaining all the brightness information. This results in smaller file sizes and lower transmission-bandwidth requirements with little degradation in the perceived image.
The process of removing information from the color-difference channels is called chroma subsampling and gives rise to the terms 4:4:4, 4:2:2, and 4:2:0. (There are other variations of chroma subsampling, but these are the most common for consumer video.) These terms indicate the relative resolution of the brightness channel and the two color-difference channels. With 4:4:4, all three channels are at full resolution. 4:2:2 indicates that the horizontal resolution of each color-difference channel is half that of the brightness channel, and 4:2:0 indicates that the horizontal and vertical resolution of each color-difference channel are half those of the brightness channel.
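Those ratios can be sketched in a few lines of Python with NumPy. This toy subsampler uses a simple 2×2 box average; real encoders use more sophisticated filters, but the bookkeeping is the same:

```python
import numpy as np

def subsample_420(chroma):
    """Average each 2x2 block of a chroma plane (4:2:0-style subsampling)."""
    h, w = chroma.shape
    return chroma.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample_420(chroma):
    """Restore full resolution by nearest-neighbor replication."""
    return chroma.repeat(2, axis=0).repeat(2, axis=1)

luma_plane = np.zeros((4, 4))            # brightness stays at full resolution
cb_plane = np.arange(16.0).reshape(4, 4)
cb_small = subsample_420(cb_plane)       # half resolution in both directions
cb_restored = upsample_420(cb_small)     # what the display actually gets
print(luma_plane.size, cb_small.size)    # 16 vs 4: a 4x saving per chroma plane
```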
A video camera captures color as RGB, which is then converted to YCbCr. RGB could also be converted to ICtCp.
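For the curious, here is a sketch of the RGB-to-ICtCp path as defined in BT.2100: linear BT.2020 RGB is first rotated into an LMS space modeled on the eye’s cone responses, each channel is PQ-encoded, and the encoded channels are then mixed into intensity plus two color differences. (This is an illustrative Python transcription of the published matrices, not production code.)

```python
def pq_encode(nits):
    """SMPTE ST 2084 inverse EOTF: absolute nits -> 0-1 code value."""
    m1, m2 = 2610 / 16384, 2523 / 4096 * 128
    c1, c2, c3 = 3424 / 4096, 2413 / 4096 * 32, 2392 / 4096 * 32
    y = (nits / 10000.0) ** m1
    return ((c1 + c2 * y) / (1 + c3 * y)) ** m2

def rgb_to_ictcp(r, g, b):
    """Convert linear BT.2020 RGB (in nits) to ICtCp per BT.2100.

    Step 1: rotate RGB into a cone-response-like LMS space.
    Step 2: PQ-encode each LMS channel (this is where "I" gets its
            EOTF scaling).
    Step 3: mix the encoded channels into intensity plus two
            color-difference channels.
    """
    l = (1688 * r + 2146 * g + 262 * b) / 4096
    m = (683 * r + 2951 * g + 462 * b) / 4096
    s = (99 * r + 309 * g + 3688 * b) / 4096
    lp, mp, sp = pq_encode(l), pq_encode(m), pq_encode(s)
    i = 0.5 * lp + 0.5 * mp
    ct = (6610 * lp - 13613 * mp + 7003 * sp) / 4096
    cp = (17933 * lp - 17390 * mp - 543 * sp) / 4096
    return i, ct, cp

# As with Cb/Cr, a neutral gray (here 100 nits) has zero color difference.
```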
Virtually all forms of consumer video—from broadcast to streaming to DVD, Blu-ray, and UHD Blu-ray—use Y’CbCr (or Y’C’bC’r for HDR) and 4:2:0 chroma subsampling in order to minimize the storage and bandwidth requirements. However, the color-difference channels must be upsampled back to full resolution before the signal can be displayed; the discarded detail can only be interpolated, not truly restored. This works surprisingly well for content in SDR, but Y’C’bC’r runs into trouble with content in HDR, which includes colors that are more saturated than those found in SDR.
When Y’C’bC’r 4:2:0 or 4:2:2 is converted to RGB for display, highly saturated color information “leaks” into the luminance channel, causing visible noise. This is inherent in the model itself, not an artifact of the processor that performs the conversion. By contrast, ICtCp exhibits much less crosstalk when converted to RGB. Why? Because “I” is a much more accurate representation of EOTF-encoded luminance than Y’.
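A toy Python example shows the mechanism. Average the chroma of two adjacent pixels (one saturated blue, one black), as 4:2:2 subsampling effectively does, then convert back to RGB: because Y’ is only an approximation of true luminance, the shared chroma visibly changes the brightness of both pixels. This uses BT.709 math for simplicity; it illustrates the effect, not Dolby’s measurement.

```python
def to_ycbcr(r, g, b):
    """Gamma-encoded BT.709 R'G'B' -> Y'CbCr."""
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b
    return y, (b - y) / 1.8556, (r - y) / 1.5748

def to_rgb(y, cb, cr):
    """Y'CbCr -> gamma-encoded BT.709 R'G'B' (exact inverse of to_ycbcr)."""
    r = y + 1.5748 * cr
    b = y + 1.8556 * cb
    g = (y - 0.2126 * r - 0.0722 * b) / 0.7152
    return r, g, b

# Two adjacent pixels: saturated blue next to black.
y1, cb1, cr1 = to_ycbcr(0.0, 0.0, 1.0)
y2, cb2, cr2 = to_ycbcr(0.0, 0.0, 0.0)

# 4:2:2-style subsampling stores a single chroma pair for both pixels.
cb, cr = (cb1 + cb2) / 2, (cr1 + cr2) / 2

# On reconstruction, the shared chroma drags both pixels' RGB values
# away from the originals, changing their displayed brightness.
blue_out = to_rgb(y1, cb, cr)
black_out = to_rgb(y2, cb, cr)
print(blue_out)   # the blue channel drops well below 1.0
print(black_out)  # the "black" pixel is no longer black
```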
Dolby demonstrated this in its booth at SMPTE 2017. Two short sequences were captured using a Grass Valley LDX86n camera (1080p, 60 frames per second, 10-bit PQ). Each sequence was shot twice—once with the camera set to output PQ-encoded Y’C’bC’r 4:2:2, and again with the camera set to output PQ-encoded ICtCp 4:2:2. (It would have been better to output 4:2:0 in order to show a greater difference between the two color models, but that’s not an option in the camera.)
One sequence was a scene with highly saturated colored blocks, and the other was a close shot of a highly saturated blue dot-matrix display. Both sequences were then converted to RGB and stored on a Video Clarity ClearView 4K recorder/player. Each version was sent simultaneously via dual-link SDI from the player to a Dolby Maui reference monitor (1080p, PQ, 2000 nits peak brightness).
I couldn’t see much difference between the two images in the sequence with the colored blocks, though there was a bit more noise in the deep-red blocks of the Y’C’bC’r version. However, there was a lot more noise in the Y’C’bC’r version of the blue dot-matrix display than there was in the ICtCp version.
It’s impossible to fully represent HDR images on a website, but the following images provide some idea of what the Dolby demo looked like:
The bright-red blocks above the doll have a bit more noise in the Y’C’bC’r version (left). The same scene in ICtCp (right) has less noise in the red blocks, but the difference was subtle. (Image courtesy Dolby Labs)
The upper image is Y’C’bC’r, and you can clearly see more noise in the blue display than there is in the lower image, which is ICtCp. (Image courtesy Dolby Labs)
As I mentioned earlier, YCbCr is used in virtually all consumer video at this point. The only exception I know of is Netflix, which uses ICtCp for its Dolby Vision titles. ICtCp has been proposed for ATSC 3.0, the next-generation standard for over-the-air broadcasting. However, it is not included in the UHD Blu-ray spec.
All Dolby Vision-capable TVs—and most other displays—can decode ICtCp directly. But until more content is delivered using this color-representation model, that capability will go largely unused. I hope that more content providers learn about the advantage of ICtCp and use it for future HDR content.
Many thanks to Jaclyn Pytlarz, Senior Engineer of Applied Vision Science at Dolby Labs, for her help with this article.