Category: Video

  • ProRes levels

    Why would someone want to encode ProRes with full range levels? This question comes up often. To make a long story short, don’t do it! Since its inception in 2007, ProRes has always been a legal range codec, designed to accept YUV input. In the early days Apple didn’t provide much detail, but the codec has matured and in 2015 it was formalized in SMPTE RDD-36:

    A ProRes compressed slice contains an entropy-coded array of scanned quantized DCT coefficients for each color component (Y′, Cb, and Cr). Quantized DC coefficients are encoded differentially, while quantized AC coefficients are run-length encoded.

    RDD 36:2015 “Apple ProRes Bitstream Syntax and Decoding Process”

    Note: I’m aware that YUV technically refers to analog encoding, and we are really talking about Y’CbCr. But the term is so ubiquitous that I still use it, and YCC just sounds weird.

    Based on the above quote, all ProRes, including 422 and 444, is encoded as YUV, and therefore must be legal range. DaVinci Resolve offers a checkbox to encode full range ProRes, and some people believe this equates to better quality. I will demonstrate why this is wrong. Here is an image created in Photoshop, with gradients from black to fully saturated primaries.

    This is what it looks like on an RGB parade:

    And this is what it looks like when rendered to full-range ProRes 444 using Resolve. The darkest and brightest parts are clipped, meaning detail is lost. If this were a real picture instead of gradients it would look terrible:

    Now that’s not really fair, because Resolve automatically decodes ProRes as legal range by default (that ought to be a sign). We can force it to decode as full range using Clip Attributes, and then it looks like this:

    This is better, but look closely at the highlights. The luma (green) channel is correct, while red and blue are both clipped around 98%. This means you are losing detail in highly saturated colors. The reason for this dates back over 30 years to the days of analog video, and a more detailed explanation can be found in ITU-R BT.601. In 10-bit code values, luminance was recorded using a range of 64-940, while chroma difference was recorded using 64-960. Since all video software works with full-range internally, the full-range ProRes file scales 940 up to 1023 and the last 20 code values are discarded.
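
To make the "clipped around 98%" figure concrete, here is a minimal Python sketch of the scaling described above. It assumes the 10-bit ranges quoted in the text (64-940 for luma, 64-960 for chroma difference) and a simple linear stretch that maps 940 to 1023. The function name `legal_to_full` is made up for illustration; this is not Resolve's actual code path, just the arithmetic.

```python
# Sketch of the legal-to-full stretch described above, in 10-bit code values.
# The ranges are the ones quoted in the text; the function is illustrative only.
LEGAL_MIN = 64      # black level shared by luma and chroma difference
LUMA_MAX = 940      # top of the legal luma range
CHROMA_MAX = 960    # top of the legal chroma-difference range
FULL_MAX = 1023     # top of the full 10-bit range


def legal_to_full(code: int) -> int:
    """Stretch so that 64 -> 0 and 940 -> 1023, then clip to 0..1023."""
    scaled = (code - LEGAL_MIN) * FULL_MAX / (LUMA_MAX - LEGAL_MIN)
    return max(0, min(FULL_MAX, round(scaled)))


# Chroma code values that land above 1023 after the stretch and get clipped.
clipped = [
    c for c in range(LEGAL_MIN, CHROMA_MAX + 1)
    if (c - LEGAL_MIN) * FULL_MAX / (LUMA_MAX - LEGAL_MIN) > FULL_MAX
]

# Share of the chroma range that survives the stretch without clipping.
surviving_fraction = (LUMA_MAX - LEGAL_MIN) / (CHROMA_MAX - LEGAL_MIN)

print(legal_to_full(940))            # 1023
print(legal_to_full(960))            # 1023 (clipped)
print(len(clipped))                  # 20
print(round(surviving_fraction, 3))  # 0.978
```

Under these assumptions the top 20 chroma code values (941 to 960) all collapse to 1023, and only about 98% of the chroma range survives, which lines up with the clipping visible on the parade.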

    There is one advantage to using full range: you gain a few extra code values, which may theoretically reduce quantization error and increase the perceived detail in gradients. However, this is not necessary, because ProRes 444 already has 12 bits of precision using legal range, as shown here. In my opinion the extra precision is not worth the risk of chroma clipping shown above. And every major application interprets ProRes as legal range by default, so if you choose full, you will have to explain to all your delivery partners why the image needs special treatment to avoid clipping.
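
As a rough check on how much those extra code values are worth, the sketch below counts the levels in each range. It assumes the 10-bit legal luma range of 64-940 quoted earlier, and it assumes the 12-bit legal range is simply that range multiplied by four (256-3760); neither figure comes from Apple's documentation.

```python
import math


def levels(lo: int, hi: int) -> int:
    """Number of distinct code values in an inclusive range."""
    return hi - lo + 1


legal_10 = levels(64, 940)     # 10-bit legal luma range from the text
full_10 = levels(0, 1023)      # 10-bit full range
legal_12 = levels(256, 3760)   # assumed: 10-bit legal range scaled by 4

# Extra precision from going full range, expressed in bits.
extra_bits = math.log2(full_10 / legal_10)

print(legal_10, full_10, legal_12)  # 877 1024 3505
print(round(extra_bits, 2))         # 0.22
```

By this estimate full range buys roughly a fifth of a bit, while 12-bit legal range already offers more than three times as many levels as 10-bit full range.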

  • ProRes bit depth

    There has been some confusion in the industry about the ProRes codec, especially regarding its maximum color precision, or bit depth.  One question I am often asked is, can ProRes really be 12 bit or 16 bit?  Apple tries to answer in their 2020 white paper:

    Apple ProRes 4444 XQ and Apple ProRes 4444 support image sources up to 12 bits and preserve alpha sample depths up to 16 bits. All Apple ProRes 422 codecs support up to 10-bit image sources […]
    Note: Like Apple ProRes 4444 XQ and Apple ProRes 4444, all Apple ProRes 422 codecs can in fact accept image samples even greater than 10 bits, although such high bit depths are rarely found among 4:2:2 or 4:2:0 video sources. 

    Unfortunately, this language is too vague. But we don’t need to take their word for it, as this can be proved empirically.  By the way, Apple’s note means you can send 16-bit buffers to the encoder, but those extra bits will be discarded.

    One of the hardest tasks for any codec is recording a low contrast image like LogC, and then later stretching it out to look nice in the grade. If the bit depth is too low, this process will make smooth gradients look like jagged steps, known as banding.

    To emulate this process, I created a linear gradient in Photoshop using a very narrow range of gray, from 12850 – 14006 in 16-bit code values. The values were chosen arbitrarily; all that matters is there are 1156 steps of luminance here. Since Photoshop’s 16 bit mode is actually 15 bit + 1, this equates to 401 – 437 in 10-bit code values using the formula \(\frac{L}{2^{15}} \times (2^{10} - 1)\)

    In a perfect world, these 1156 steps in 15 bit should equate to 144 steps in 12 bit, 36 steps in 10 bit, and 9 steps in 8 bit.
    I brought this into Resolve as a 16-bit TIFF and stretched the contrast using lift and gain controls until it looks like a ramp from black to white.  The waveform shows a nice diagonal line but I’m not going to bother counting it.
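
The arithmetic behind those numbers can be checked with a few lines of Python. This is only a sketch of the formula and the ideal step counts quoted above; the helper names `to_bits` and `expected_steps` are made up for illustration.

```python
def to_bits(code_15: int, bits: int) -> float:
    """Apply the formula L / 2^15 * (2^bits - 1) to a 15-bit code value."""
    return code_15 / 2**15 * (2**bits - 1)


def expected_steps(span_15: int, bits: int) -> int:
    """Ideal number of steps left when a 15-bit span is reduced to `bits` bits."""
    return span_15 >> (15 - bits)  # divide by 2^(15 - bits), rounding down


lo, hi = 12850, 14006
span = hi - lo

print(span)                                            # 1156
print(round(to_bits(lo, 10)), round(to_bits(hi, 10)))  # 401 437
for bits in (12, 10, 8):
    print(bits, expected_steps(span, bits))            # 12 144 / 10 36 / 8 9
```

These are ideal counts for a perfect conversion; the steps actually counted on a waveform can differ slightly from them.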

    I converted the original TIFF to a few different formats, and applied the same contrast stretch on each one. The ProRes 444 has approximately 146 steps in the waveform, which proves it can hold 12 bits of precision. ProRes XQ is very similar.
    Note: This does not mean every ProRes 444 file actually has this much detail. It’s quite possible for someone to convert an 8-bit DSLR video to ProRes, and you don’t magically gain anything in this case.

    All the 4:2:2 varieties show 31 steps, with some differences in sharpness and compression artifacts, therefore these are 10 bit. This image is ProRes Proxy, the worst looking one.

    Just for comparison, here is 8-bit HEVC showing 8 steps. By the way, I also tested HEVC with the main10 profile, and it shows the same 31 steps as ProRes, but with slightly different compression characteristics.

    Over the years, various applications have displayed this info in different ways. Today, Assimilate Scratch 9.3 identifies all ProRes as RGBa-FP16. This isn’t wrong; it just means this is the buffer format they chose to decode into (probably because it’s easier than 12 bit for a binary computer). It has nothing to do with the way the file was encoded.
    DaVinci Resolve used to do the same thing, but since version 16 they label 444 as 12-bit and 422 as 10-bit, which makes sense to me.

    In the past, FFmpeg decoded all ProRes to 10-bit, but this was fixed in 2018. Now it uses 12 bit for the 4:4:4 varieties, and 10 bit for the 4:2:2 varieties. You can see here the pixel format is yuv444p12le.

    Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'TKD ProRes 444.mov':
      Duration: 00:00:05.01, start: 0.000000, bitrate: 7021 kb/s
        Stream #0:0(eng): Video: prores (4444) (ap4h / 0x68347061), yuv444p12le(tv, bt709, progressive), 1920x1080, 7018 kb/s, SAR 1:1 DAR 16:9, 23.98 fps, 23.98 tbr, 24k tbn, 24k tbc (default)
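
If you want to pull the decode bit depth out of output like this in a script, one option is to read it from the pixel format name. The sketch below is a heuristic that assumes FFmpeg's usual naming for planar YUV formats (a trailing `p10le`, `p12le`, and so on, with no suffix meaning 8 bit); the function name is hypothetical. It reports what FFmpeg's decoder outputs, not anything stored in the file.

```python
import re


def bit_depth_from_pix_fmt(pix_fmt: str) -> int:
    """Guess bits per component from a planar YUV pixel format name.

    Heuristic only: looks for a trailing 'p<bits>le' or 'p<bits>be' and
    falls back to 8 when there is no such suffix (e.g. 'yuv420p').
    """
    match = re.search(r"p(\d+)(?:le|be)$", pix_fmt)
    return int(match.group(1)) if match else 8


print(bit_depth_from_pix_fmt("yuv444p12le"))  # 12
print(bit_depth_from_pix_fmt("yuv422p10le"))  # 10
print(bit_depth_from_pix_fmt("yuv420p"))      # 8
```

Packed formats such as `rgb48le` do not follow this pattern, so the fallback would be wrong for them; the sketch is only meant for the planar YUV names that appear with ProRes.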

    One of the best tools for inspecting media metadata, MediaInfo, correctly omits the bit depth indicator because it doesn’t exist as a metadata atom. It’s important to understand, if someone hands you a ProRes file, there is no way to identify the bit depth by reading some metadata.

    There is still one thorn in my side from Apple. In macOS Catalina, QuickTime Player identifies ProRes 422 as “up to 12-bit”, and I have never found any example of this in the real world. Regardless, the distinction is purely academic, because anyone concerned with 12-bit precision ought to be using 4:4:4 anyway.

    Footnote: All ProRes encoding was done using legal range. Using full range would gain a few more code values of precision, but it will cause other problems as shown here.