S-Log2/S-gamut 10-bit 422 XAVC to Linear ACES RGB EXR
Vincent Olivier <vin...@...>
I'm trying to convert the S-Log2/S-gamut 10-bit 422 XAVC footage coming out of my Sony F55 camera to a sequence of Linear ACES RGB OpenEXR files. I would like to validate the assumptions I make throughout the conversion process with you guys, if you be so kind.
My preliminary code is here: https://github.com/up4/fauxraw/blob/master/fauxraw.cpp
Little-endian S-log2/S-gamut 10-bit 422 YCbCr to little-endian S-log2/S-gamut 10-bit 444 YCbCr
First, I'm using FFMPEG to open the XAVC file (which it recognizes simply as a MXF-muxed XAVC Intra stream). The x264 decoding seems to work superbly as far as I can see. Then, I'm using FFMEG's Lanczos 422 to 444 upscaling algorithm, which is slow, but produces the best results, IMHO.
My only question here is: what does Poyton mean when, comparing 8-bit to 10-bit, he says "The extra two bits are appended as least-significant bits to provide increased precision."?
Because FFMPEG indicates that the stream is 10-bit little-endian, which calls for a 8-bit shift of the second byte (lines 168-184 in my code). Anyways, this seems to work just fine. I'm just checking if there is something I don't understand in Poyton's qualification of the 10-bit YCbCr bitstream endianness or maybe it's reformatted under the hood by FFMPEG from the raw XAVC output. Mystery…
S-log2/S-gamut YCbCr to S-log2/S-gamut R'G'B'
My assumptions here are that:
1) The camera uses the Rec 709 R'G'B' to YCbCr transform. And I use the reverse Rec 708 to get R'G'B' from YCbCr. Or is there something in SMPTE-ST-2048-1:2011 I should know about and take into consideration here?
2) The footroom provision is 0…64 for luma samples and 0…512 for chroma samples. See lines 187-191 in my code. I have adapted that from the 8-bit footroom values (16/128) because it seems to make sense according to basic signal statistics I've made on the samples from one frame. But I'm REALLY not sure about that…
3) Slog2 code uses "full-range" RGB (0…255 and not 0…219). See matrix at lines 131-136 in my code for the YCbCr to RGB "full-range". The headroom-preserving transform matrix is at 140-145 (I'm not using this one).
S-log2/S-gamut R'G'B' to Linear S-gamut RGB
This is where it gets interesting. I have adapted the "slog2.py" code, part of the OpenColorIO-Configs project on Github provided by Jeremy Selan (I sent him an email weeks ago and didn't hear from him).
Assumptions made in my code:
1) The rescale at slog2.py:17-23 is redundant if you provide "full-range" RGB to the S-Log2 linearization algorithm. I left it in my code (see lines 65-66). But commenting it out seems to give more dynamic range to the result. Again, I might be dead wrong on this.
2) The differences between S-log1 and S-log2 are only: A: for the same input, S-log1 doesn't have a rescaling step and S-log2 has one (see my previous point), B: there is a linear portion in the shadows, and C: the highlight-portion power-function is scaled by 219/155.
Linear S-gamut RGB to Linear ACES RGB
There are two distinct CTL transforms from S-gamut to ACES for two white points: 3200K and 5500K. Why? Would one get the 5500K transform matrix by applying a white-balance transform on the 3200K version (and vice-versa)?
I feel like I'm reverse engineering the whole thing and I'm not confident enough (yet) of the rigorous "scene-referredness" of the output. It looks good, but there has been too much guesswork involved to fully trust it. I would really appreciate some pointers.
Finally, on a side note, I would eventually accelerate the linear parts of the color transform through CUBLAS. I think I can achieve realtime speed both for offline conversion and field monitoring. Has anyone tried to port some of the OCIO code to CUDA?
Thanks for everything!