Re: parseColorSpaceFromString() issue


Larry Gritz <l...@...>
 

It's complicated. Metadata, file naming conventions, and color space dictates/restrictions in the specification of particular file formats may all be mutually contradictory. What do you do? What is the decision hierarchy?

Personally, I would prefer that filenames be arbitrary and that metadata be used to indicate colorspace. (Notwithstanding hard color space constraints of particular formats.)

Jeremy and I have discussed this many times before, and (if I may put words in his mouth) he believed strongly that metadata was so frequently wrong as to be useless (image apps may fail to set the color space metadata, drop it, or set it incorrectly).

It's true that the convention at SPI is to bake the color space name into the file name, and ignore everything else. This is, IMHO, largely a historical response to file formats that did not support any way to specify color space information, and a hodgepodge of image-handling applications and libraries that might botch it or simply not propagate the color space info from input to output (and more often than not, were almost completely ignorant of metadata). Perhaps with EXR gaining dominance (arbitrary metadata, yay) and OIIO being the basis of a growing majority of our image-handling software (a lot of attention to reading and propagating the metadata), relying on metadata for color space hints is more achievable than it once was. But there are still a lot of edge cases to struggle with.

Jeremy and I had, at some point, discussed the possibility of a coordinated attack involving a function (in one or both of our packages) that might take as parameters the filename, format name, and metadata (if any), and given all this information would return the best guess of color space, with some set of (documented) sensible rules to adjudicate any conflicts. It's even possible that the rules could be part of an OCIO configuration (i.e., a way to say "at studio TLA, filenames take precedence over metadata, here's the regex that isolates the color space name, but override that by knowing that '.blah' files are always color space 'foo'"). But we never quite got around to fully fleshing it out.
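To make the idea concrete, here is a rough sketch in Python of what such a function might look like. Nothing here exists in OCIO or OIIO: guess_color_space(), the TLA_RULES dictionary, its keys, and the regex are all hypothetical, and the choice of "oiio:ColorSpace" as the metadata key to consult is an assumption for illustration only.

```python
import re

# Hypothetical per-studio rules; in the idea above these might live in an OCIO config.
TLA_RULES = {
    "precedence": ["filename", "metadata"],          # filenames beat metadata at "studio TLA"
    "filename_regex": r"_([A-Za-z0-9]+)\.\d+\.\w+$",  # isolates "lnf" in "shot_lnf.0100.exr"
    "metadata_key": "oiio:ColorSpace",               # assumed metadata attribute to consult
    "format_overrides": {"blah": "foo"},             # '.blah' files are always 'foo'
    "default": "",
}

def guess_color_space(filename, format_name, metadata, rules):
    """Return a best-guess color space name, or rules['default'] if nothing applies."""
    # Hard per-format constraints override everything else.
    forced = rules.get("format_overrides", {}).get(format_name)
    if forced:
        return forced

    # Collect one candidate per information source.
    candidates = {}
    match = re.search(rules["filename_regex"], filename)
    if match:
        candidates["filename"] = match.group(1)
    value = metadata.get(rules["metadata_key"])
    if value:
        candidates["metadata"] = value

    # Adjudicate conflicts using the documented precedence order.
    for source in rules["precedence"]:
        if source in candidates:
            return candidates[source]
    return rules.get("default", "")
```

With these rules, a file named "shot_lnf.0100.exr" would resolve to "lnf" even if its metadata said "sRGB", while "brickKilnFront.exr" would fall back to the metadata. Swapping the order in "precedence" would give the metadata-first policy instead.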

-- lg


On Oct 1, 2014, at 2:23 PM, Mark Boorer <mark...@...> wrote:

Hi,

> May I suggest making parseColorSpaceFromString() optionally a little more strict?

I'm certainly happy to add in this functionality if it makes your lives easier.

> This is an especially egregious failure, considering that JPEG files are by specification LDR and sRGB (so very not "lnf").

As a quick aside, do many people use this functionality? I was under the impression that this was mostly an SPI thing (baking the colorspaces into the filenames). Personally, we use metadata / heuristics to determine which colorspaces our images are in (e.g., JPG files are sRGB, DPX files are in the camera's log space).

If you feel like knocking up a pull request containing your patch, I'll merge it (assuming it's all good). Otherwise I'm happy to do so on your behalf.

Cheers,
Mark


On Wed, Oct 1, 2014 at 9:49 PM, Sean Looper <sean....@...> wrote:
I've been bitten by this as well. Ended up writing my own parsing function.

On Wed, Oct 1, 2014 at 1:43 PM, Larry Gritz <l...@...> wrote:
+1 on the suggestion of recognizing OCIO color spaces in the filenames only if they are fully delimited.

    "brickKilnFront.jpg"              => "lnf"

This is an especially egregious failure, considering that JPEG files are by specification LDR and sRGB (so very not "lnf").


On Oct 1, 2014, at 12:46 PM, Michael Root <mi...@...> wrote:

May I suggest making parseColorSpaceFromString() optionally a little more strict?

The basic problem is that as we're transitioning to using OCIO, there are still many situations where a file might have an OCIO colorspace in the filename... but it may also not have one.

Even if strictparsing is true, the 'strictness' is still pretty lax. Any part of the filename that happens to line up with a colorspace suddenly becomes the colorspace.

For example, parseColorSpaceFromString() returns:

    "dialogSync1020.0100.dpx"         => "nc10"
    "brickKilnFront.jpg"              => "lnf"
    "digiDoubleForeign1029.0100.dpx"  => "gn10"

May I suggest a config option like:

    delimiters: _./

This would restrict the parser to consider only substrings delimited on both sides by one of those characters (or by the beginning or end of the string). By default, delimiters would be an empty string, which preserves the current behavior.
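As an illustration of the rule only (this is not the attached patch, which is against the C++ source), a minimal Python sketch might look like the following. The function name parse_color_space() and the explicit color_spaces list are hypothetical, and the "right-most, then longest" tie-break is my assumption about how competing matches should be ranked.

```python
def parse_color_space(text, color_spaces, delimiters=""):
    """Return the best color space name found in text, or "" if none qualifies.

    With an empty delimiters string any substring match counts; otherwise a
    match must be bounded on both sides by a delimiter or the string's edge.
    """
    lowered = text.lower()
    best = ""       # best name found so far
    best_end = -1   # index just past the best match
    for name in color_spaces:
        needle = name.lower()
        if not needle:
            continue
        start = lowered.find(needle)
        while start != -1:
            end = start + len(needle)
            left_ok = start == 0 or lowered[start - 1] in delimiters
            right_ok = end == len(lowered) or lowered[end] in delimiters
            if not delimiters or (left_ok and right_ok):
                # Assumed ranking: prefer the right-most match, then the longest.
                if (end, len(needle)) > (best_end, len(best)):
                    best, best_end = name, end
            start = lowered.find(needle, start + 1)
    return best
```

Called with delimiters "_./", "brickKilnFront.jpg" no longer yields "lnf", while a name such as "shot_010_lnf.0100.exr" still does.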

Attached is a patch against 1.0.9 that implements this idea.

-miker

--
Larry Gritz
l...@...




--
You received this message because you are subscribed to the Google Groups "OpenColorIO Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ocio-dev+u...@....
For more options, visit https://groups.google.com/d/optout.




