Review: YAML serialization


Malcolm Humphreys <malcolmh...@...>
 

- Initial checkin of YAML serialization
- Added ext/yaml-cpp-0.2.5.tar.gz with a patch to build it static
- Added yaml-cpp to the LICENCE file
- CDLTransform should be the only thing depending on tinyxml
- PyOCIO_Config_getXML now returns YAML not XML, this should be fixed soon. Possibly this python method should be removed and we use the config.__str__() instead, this would make it slightly more pythonic

http://github.com/malcolmhumphreys/OpenColorIO/commit/ef6cddf742b961e89061ddd9a68f968eb0d4e5fd

.malcolm


Jeremy Selan <jeremy...@...>
 

Nice work.

A few comments:

OCIOYaml.cpp
* There was a missing #include <cstring>
* For the Enums, (line 512+) this could could be greatly simplified by
relying on the helpers in OpenColorTypes.h.
* You mention in the code, "GroupTransform seems to be a mixed
concept". And, you dont allow for it to be serialized. Could you
elaborate your thinking behind this?

* The ExternalProject_Add stuff is slick, but appears to required
cmake 2.8. We're currently only requiring 2.6. Is this update
something we'd like to do?
* On the Imageworks linux system, there is a build error related to libyaml-cpp:

/usr/bin/ld: ../../ext/dist/lib/libyaml-cpp.a(node.cpp.o): relocation
R_X86_64_32S against `std::basic_string<char, std::char_traits<char>,
std::allocator<char> >::_Rep::_S_empty_rep_storage' can not be used
when making a shared object; recompile with -fPIC

I've addressed most of these in my local git repo in a low tech
manner, by manually copying yaml into src/core/yaml.
I've also patched yaml to put it into the OCIO namespace.

see http://github.com/jeremyselan/OpenColorIO/tree/yaml

While this approach isn't clean as your .tgz "ExternalProject_Add"
solution, it's potentially less prone to build errors, and addresses
all namespace issues. And, seeing as I imagine we wont be needing to
update yaml versions on a regular basis, the maintenance implications
aren't particularly troublesome either.

Of course, I'm open to your approach of ExternalProject_Add assuming
we actively agree to require cmake 2.8 (and address the build
problems.)

-- Jeremy

On Sun, Oct 17, 2010 at 5:09 PM, Malcolm Humphreys
<malcolmh...@...> wrote:
- Initial checkin of YAML serialization
- Added ext/yaml-cpp-0.2.5.tar.gz with a patch to build it static
- Added yaml-cpp to the LICENCE file
- CDLTransform should be the only thing depending on tinyxml
- PyOCIO_Config_getXML now returns YAML not XML, this should be fixed soon. Possibly this python method should be removed and we use the config.__str__() instead, this would make it slightly more pythonic

http://github.com/malcolmhumphreys/OpenColorIO/commit/ef6cddf742b961e89061ddd9a68f968eb0d4e5fd

.malcolm


Jeremy Selan <jeremy...@...>
 

I've also cleaned up the enum code.
See 1307a8730ffe745daeb53cff4298518ed5ceafc3


Malcolm Humphreys <malcolmh...@...>
 

Hi,

OCIOYaml.cpp
* There was a missing #include <cstring>
* For the Enums, (line 512+) this could could be greatly simplified by
relying on the helpers in OpenColorTypes.h.
I think you just fixed this, but yeah there was a bit of code dup.

* You mention in the code, "GroupTransform seems to be a mixed
concept". And, you dont allow for it to be serialized. Could you
elaborate your thinking behind this?
With the current structure it would be possible to have GroupTransform()s containing GroupTransform()s and possibly more GroupTransform()s and so on..

On a colorspace you can set ColorSpace::setTransform(GroupTransform, dir), to me this seems ok and I do serialize it in this case, but if I support the GroupTransform where I have support for all the other Transform()s then you could build multiple layer deep GroupTransform()s which doesn't seem necessary.

I would prefer to create a new type TransformList eg. ColorSpace::setTransform(TransformList, dir) which would replace GroupTransform and it would different than the base Transform()


* The ExternalProject_Add stuff is slick, but appears to required
cmake 2.8. We're currently only requiring 2.6. Is this update
something we'd like to do?
Did you try and copy the ExternalProject.cmake into share/cmake to see if that works. I have attached it, can you check to see if it imports ok as I'm not running 2.6.

I have been using 2.8 for about 6 months with no issues, so I would say it's safe to update esp if it helps to reduce the code base.

http://www.kitware.com/products/html/BuildingExternalProjectsWithCMake2.8.html

* On the Imageworks linux system, there is a build error related to libyaml-cpp:
I think I was getting a similar error when putting yaml-cpp into the OpenColorIO namespace, I was leaving this to fix later. Did you find a fix for this?

/usr/bin/ld: ../../ext/dist/lib/libyaml-cpp.a(node.cpp.o): relocation
R_X86_64_32S against `std::basic_string<char, std::char_traits<char>,
std::allocator<char> >::_Rep::_S_empty_rep_storage' can not be used
when making a shared object; recompile with -fPIC

I've addressed most of these in my local git repo in a low tech
manner, by manually copying yaml into src/core/yaml.
I've also patched yaml to put it into the OCIO namespace.

see http://github.com/jeremyselan/OpenColorIO/tree/yaml

While this approach isn't clean as your .tgz "ExternalProject_Add"
solution, it's potentially less prone to build errors, and addresses
all namespace issues. And, seeing as I imagine we wont be needing to
update yaml versions on a regular basis, the maintenance implications
aren't particularly troublesome either.
Umm I would prefer if this was a patch for the reasons you have stated that it isn't going to change all that much so having it all checked in seems like a waste, while also making it harder to absorb any changes when yaml-cpp changes. The dev on yaml-cpp is pretty active (http://code.google.com/p/yaml-cpp/updates/list)

If we were choosing to fork yaml-cpp (which I don't think we are) or other ext dependants, checking them in would be fine and would help to indicate a fork.

I was also hoping to look at using http://gcc.gnu.org/wiki/Visibility as well to help reduce symbol clashes, have you had any luck/experience with this?

Of course, I'm open to your approach of ExternalProject_Add assuming
we actively agree to require cmake 2.8 (and address the build
problems.)
+1

Let me know if the ExternalProject.cmake works in 2.6, I'm happy to clean up the other externals the same way while I'm still motivated to do so. :)

.malcolm



-- Jeremy

On Sun, Oct 17, 2010 at 5:09 PM, Malcolm Humphreys
<malcolmh...@...> wrote:
- Initial checkin of YAML serialization
- Added ext/yaml-cpp-0.2.5.tar.gz with a patch to build it static
- Added yaml-cpp to the LICENCE file
- CDLTransform should be the only thing depending on tinyxml
- PyOCIO_Config_getXML now returns YAML not XML, this should be fixed soon. Possibly this python method should be removed and we use the config.__str__() instead, this would make it slightly more pythonic

http://github.com/malcolmhumphreys/OpenColorIO/commit/ef6cddf742b961e89061ddd9a68f968eb0d4e5fd

.malcolm


Jeremy Selan <jeremy...@...>
 

Only addressing the GroupTransform issue...

I am in favor of keeping GroupTransform as is. Having a group of
transforms (at the API level) be interchangeable with a single
transform is a very powerful concept. The allows the API to be
pretty flexible, as class methods can seamlessly accept either a
single transform, or a group of transforms, using the same interface.

A good example of this is config.applyTransform(Transform *), which
can take either a single transform or a group of transforms. If we
were to create a separate object - TransformVec - we would need to
have 2 overloaded calls to expose the same functionality.

Another example of this is DisplayTransform.setDisplayCC(Transform *),
which controls the 'post display transform' color correction.
(Think the gamma slider in the nuke monitor).

In our internal compositing package, we actually let the user adjust
the white point, black point, and gamma. This is internally handled by
creating a group transform (at runtime), adding the appropriate
children, and then calling setDisplayCC. There is public code
analogous to this in nuke/Display/Display.cpp, Display::_validate.

In fact, I'd probably want to even go a step further and have the
ColorSpace.setTransform call take a Transform * rather than a
GroupTransform *. (If you want it to be a group, great! Otherwise,
why force it?) There's no reason to force a grouptransform anywhere
in the public interface, as far as I can tell.)

The potentially nested nature of groupTransforms / subgroups does not
bother me. If a group - containing a group - happens to be the
natural way to express a particular transform to the user, why not allow it?

-- Jeremy

* You mention in the code, "GroupTransform seems to be a mixed
concept". And, you dont allow for it to be serialized. Could you
elaborate your thinking behind this?
With the current structure it would be possible to have GroupTransform()s containing GroupTransform()s and possibly more GroupTransform()s and so on..

On a colorspace you can set ColorSpace::setTransform(GroupTransform, dir), to me this seems ok and I do serialize it in this case, but if I support the GroupTransform where I have support for all the other Transform()s then you could build multiple layer deep GroupTransform()s which doesn't seem necessary.

I would prefer to create a new type TransformList eg. ColorSpace::setTransform(TransformList, dir) which would replace GroupTransform and it would different than the base Transform()


Malcolm Humphreys <malcolmh...@...>
 

Hi,

Comments below.

Only addressing the GroupTransform issue...

I am in favor of keeping GroupTransform as is. Having a group of
transforms (at the API level) be interchangeable with a single
transform is a very powerful concept. The allows the API to be
pretty flexible, as class methods can seamlessly accept either a
single transform, or a group of transforms, using the same interface.

A good example of this is config.applyTransform(Transform *), which
can take either a single transform or a group of transforms. If we
were to create a separate object - TransformVec - we would need to
have 2 overloaded calls to expose the same functionality.

Another example of this is DisplayTransform.setDisplayCC(Transform *),
which controls the 'post display transform' color correction.
(Think the gamma slider in the nuke monitor).

In our internal compositing package, we actually let the user adjust
the white point, black point, and gamma. This is internally handled by
creating a group transform (at runtime), adding the appropriate
children, and then calling setDisplayCC. There is public code
analogous to this in nuke/Display/Display.cpp, Display::_validate.

In fact, I'd probably want to even go a step further and have the
ColorSpace.setTransform call take a Transform * rather than a
GroupTransform *. (If you want it to be a group, great! Otherwise,
why force it?) There's no reason to force a grouptransform anywhere
in the public interface, as far as I can tell.)
+1 this would clean the whole thing up for me.

The potentially nested nature of groupTransforms / subgroups does not
bother me. If a group - containing a group - happens to be the
natural way to express a particular transform to the user, why not allow it?
Yeah thats sounds cool, ColorSpace.setTransform(Transform *) clearly makes a GroupTransform just another Transform which I like a lot more than it being special in the ColorSpace() context.

.malcolm


-- Jeremy

* You mention in the code, "GroupTransform seems to be a mixed
concept". And, you dont allow for it to be serialized. Could you
elaborate your thinking behind this?
With the current structure it would be possible to have GroupTransform()s containing GroupTransform()s and possibly more GroupTransform()s and so on..

On a colorspace you can set ColorSpace::setTransform(GroupTransform, dir), to me this seems ok and I do serialize it in this case, but if I support the GroupTransform where I have support for all the other Transform()s then you could build multiple layer deep GroupTransform()s which doesn't seem necessary.

I would prefer to create a new type TransformList eg. ColorSpace::setTransform(TransformList, dir) which would replace GroupTransform and it would different than the base Transform()


Jeremy Selan <jeremy...@...>
 

Addressing the YAML topics:

* I have pushed your commit, plus my non-contentious updates, to
spi/yaml. Once build issues are worked out, this will be moved to
spi/master (and delete the yaml branch). +1 for GIT.

* We are on Cmake 2.8 @ Imageworks, I was just making the observation
that your use bumps up the required version from it's prior 2.6 value.

* Unfortunately, the compile issue I'm having does not appear to be
tied to the namespace stuff. Does your yaml branch currently build? On
my linux box, it does not. What OS are you testing on?

I tried for about 30 min today to find a fix, but could not figure it
out. I still believe the fix is to pass the -fPIC flag to the yaml
compilation, but was not able to figure out how to get this passed
along. (It's easy to verify though with make VERBOSE=1). I'll look
into this more tomorrow.

* "The dev on yaml-cpp is pretty active
(http://code.google.com/p/yaml-cpp/updates/list) ... If we were
choosing to fork yaml-cpp (which I don't think we are) or other ext
dependants, checking them in would be fine and would help to indicate
a fork.

Ok. I'll take the bait.

What's the argument against forking our internal dependencies? (I.e.,
locking off on the known working versions of yaml-cpp, tinyxml,
pystring, and md5)

If these were all live libraries (.tgz(s) in externals) - something
that could be updated easily - people would be inclined to bump up
when new versions of the libs are released. And, this feels counter
to the "if it aint broke, dont fix it" philosophy. If any of these
libraries were exposed in the public headers, it would be a different
situation. But seeing as they're all intended to be 100% private,
updating them on a semi-frequent basis (or making it easy to do so)
probably only has the potential to introduce bugs. Also, even if we
do go the .tgz route, we'll need non-trivial .patch files for the
namespace adjustments, which makes updating versions not trivial
anyways.

What new yaml development is being done? I had been presuming that the
core library we're considering using is rock solid in both API and
implementation.

The downside of forking, of course, is that we wouldn't be getting bug
fix releases, but when / if we encounter serialization bugs they'll
(hopefully) be obvious, and we could then update on an as-needed.
Yaml has 64 files, and it took me approx 15 min to namespace it so
re-doing that work as needed is not a huge issue.

I acknowledge that forking code feels a bit dirty though.
For the code voyeurs out there, here's what a fork'd yaml looks like:

http://github.com/jeremyselan/OpenColorIO/tree/yaml_fork

* http://gcc.gnu.org/wiki/Visibility

I hadn't seen this before (on non-windows), but it looks interesting.
I like the concept, I'll look into it more. (It will probably have
implications with regards to exceptions, but that's arguably a good
topic to revisit anyways).

-- Jeremy


Jeremy Selan <jeremy...@...>
 

It does compile on osx, so I'll do further linux investigations tomorrow.

-- Jeremy

On Mon, Oct 18, 2010 at 7:18 PM, Jeremy Selan <jeremy...@...> wrote:
* Unfortunately, the compile issue I'm having does not appear to be
tied to the namespace stuff. Does your yaml branch currently build? On
my linux box, it does not.  What OS are you testing on?


Malcolm Humphreys <malcolmh...@...>
 

Addressing the YAML topics:

* I have pushed your commit, plus my non-contentious updates, to
spi/yaml. Once build issues are worked out, this will be moved to
spi/master (and delete the yaml branch). +1 for GIT.

* We are on Cmake 2.8 @ Imageworks, I was just making the observation
that your use bumps up the required version from it's prior 2.6 value.
2.8 +1

* Unfortunately, the compile issue I'm having does not appear to be
tied to the namespace stuff. Does your yaml branch currently build? On
my linux box, it does not. What OS are you testing on?
- OSX 10.6
- CentOS 5.3
- Debian squeeze (testing)

I tried for about 30 min today to find a fix, but could not figure it
out. I still believe the fix is to pass the -fPIC flag to the yaml
compilation, but was not able to figure out how to get this passed
along. (It's easy to verify though with make VERBOSE=1). I'll look
into this more tomorrow.
do you mean adding this to the yaml-cpp project?

set_target_properties(<target_name> PROPERTIES
COMPILE_FLAGS -fPIC)

* "The dev on yaml-cpp is pretty active
(http://code.google.com/p/yaml-cpp/updates/list) ... If we were
choosing to fork yaml-cpp (which I don't think we are) or other ext
dependants, checking them in would be fine and would help to indicate
a fork.

Ok. I'll take the bait.

What's the argument against forking our internal dependencies? (I.e.,
locking off on the known working versions of yaml-cpp, tinyxml,
pystring, and md5)

If these were all live libraries (.tgz(s) in externals) - something
that could be updated easily - people would be inclined to bump up
when new versions of the libs are released. And, this feels counter
to the "if it aint broke, dont fix it" philosophy. If any of these
libraries were exposed in the public headers, it would be a different
situation. But seeing as they're all intended to be 100% private,
updating them on a semi-frequent basis (or making it easy to do so)
probably only has the potential to introduce bugs. Also, even if we
do go the .tgz route, we'll need non-trivial .patch files for the
namespace adjustments, which makes updating versions not trivial
anyways.

What new yaml development is being done? I had been presuming that the
core library we're considering using is rock solid in both API and
implementation.

The downside of forking, of course, is that we wouldn't be getting bug
fix releases, but when / if we encounter serialization bugs they'll
(hopefully) be obvious, and we could then update on an as-needed.
Yaml has 64 files, and it took me approx 15 min to namespace it so
re-doing that work as needed is not a huge issue.

I acknowledge that forking code feels a bit dirty though.
For the code voyeurs out there, here's what a fork'd yaml looks like:

http://github.com/jeremyselan/OpenColorIO/tree/yaml_fork
No bait intended :) I guess this comes down to a bit of personal choice.

Just because an external is in a .tgz shouldn't mean the version isn't any more unlocked than if it's checked in. Any private externals being updated should be tested the same way you would if you were merging in changes in the repo. dare I say test coverage? :)

I personally like being able to see in a single patch file the modifications that have been done to an external, regardless if the patch applies cleanly on a new version or not.

Updating an external with out a difference file seems like it would add more potential to miss any internal bug fixes we might do to an external ie the namespace change is pretty easy to spot, but if the changes are more esoteric then it might be missed in an update. Getting your hands on the un-patched version to do a diff to see the changes and then update the new version also seems a bit painful.

Regardless if we choose to have it as files or tgz in the repo, can we have the externals outside of the core/src dir? would make greping / searching for things like 'xml' or 'yaml' a bit more descriptive.

* http://gcc.gnu.org/wiki/Visibility

I hadn't seen this before (on non-windows), but it looks interesting.
I like the concept, I'll look into it more. (It will probably have
implications with regards to exceptions, but that's arguably a good
topic to revisit anyways).
I was also looking at GNU Export Maps (http://accu.org/index.php/journals/1372) which doesn't look as portable. eg.
http://open.rsp.com.au/projects/pyshake/browser/trunk/plugin.exp
http://open.rsp.com.au/projects/pyshake/browser/trunk/pyshake.export.map

This is also a good overview as well for maintaing ABI (http://people.redhat.com/drepper/dsohowto.pdf).

.malcolm


Malcolm Humphreys <malcolmh...@...>
 

Hi,

This should fix the link error on linux
http://github.com/malcolmhumphreys/OpenColorIO/commit/7b016f274aefd8a4ce2deb7c72639351db8de0b4

I always forget how to create these diffs so I guess other people might as well.
$> diff -Naur yaml-cpp-0.2.5 YAML_CPP_LOCAL > mynew.patch

I just edited the YAML_CPP_LOCAL src dir while running make in ext/build till OpenImageIO linked, then running the above command on a freshly extracted yaml-cpp-0.2.5.tar.gz which gives me the patch.

I found when building on a drd box we don't have any boost versions installed locally, these fix this issue.
http://github.com/malcolmhumphreys/OpenColorIO/commit/42c400a00f83a0e02734d9d216b6a07e1b7a4898
http://github.com/malcolmhumphreys/OpenColorIO/commit/8ec728181314d78be3a5d5b91b70793bb5b767b7

I'm not sure if we want to just include the boost/shared_ptr.hpp files or depend on boost.

.malcolm

On 19/10/2010, at 2:17 PM, Jeremy Selan wrote:

It does compile on osx, so I'll do further linux investigations tomorrow.

-- Jeremy

On Mon, Oct 18, 2010 at 7:18 PM, Jeremy Selan <jeremy...@...> wrote:
* Unfortunately, the compile issue I'm having does not appear to be
tied to the namespace stuff. Does your yaml branch currently build? On
my linux box, it does not. What OS are you testing on?


Jeremy Selan <jeremy...@...>
 

Works great on our linux distro now, thanks!

(LGTM)

I agree that it may make sense to include boost/shared_ptr.hpp, let me
look into this.

-- Jeremy

On Mon, Oct 18, 2010 at 11:27 PM, Malcolm Humphreys
<malcolmh...@...> wrote:
Hi,

This should fix the link error on linux
http://github.com/malcolmhumphreys/OpenColorIO/commit/7b016f274aefd8a4ce2deb7c72639351db8de0b4

I always forget how to create these diffs so I guess other people might as well.
$> diff -Naur yaml-cpp-0.2.5 YAML_CPP_LOCAL > mynew.patch

I just edited the YAML_CPP_LOCAL src dir while running make in ext/build till OpenImageIO linked, then running the above command on a freshly extracted yaml-cpp-0.2.5.tar.gz which gives me the patch.

I found when building on a drd box we don't have any boost versions installed locally, these fix this issue.
http://github.com/malcolmhumphreys/OpenColorIO/commit/42c400a00f83a0e02734d9d216b6a07e1b7a4898
http://github.com/malcolmhumphreys/OpenColorIO/commit/8ec728181314d78be3a5d5b91b70793bb5b767b7

I'm not sure if we want to just include the boost/shared_ptr.hpp files or depend on boost.

.malcolm

On 19/10/2010, at 2:17 PM, Jeremy Selan wrote:

It does compile on osx, so I'll do further linux investigations tomorrow.

-- Jeremy

On Mon, Oct 18, 2010 at 7:18 PM, Jeremy Selan <jeremy...@...> wrote:
* Unfortunately, the compile issue I'm having does not appear to be
tied to the namespace stuff. Does your yaml branch currently build? On
my linux box, it does not.  What OS are you testing on?