Review: Lut3d: Refactor to vector implementation

Jeremy Selan <jeremy...@...>

This is a refactor of the software trilinear 3D LUT interpolation.

It was originally done as part of an SSE optimization (the SSE code is
left as an intermediate commit), but performance testing showed that
the SSE implementation offered no improvement, so it has been removed.
Even so, the newer vectorized 3D LUT implementation is much simpler
code and includes unit tests, so it's worth getting into the codebase.
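
For anyone who wants a refresher on what trilinear 3D LUT lookup involves,
here is a rough standalone sketch. It is not the code under review: the
names (RGB, Lut3D, lerp, applyTrilinear, makeIdentityLut) and the memory
layout (red index varying fastest) are assumptions made for illustration.
It just shows the idea of blending whole RGB triples at once, which is one
plausible reading of a "vector" formulation.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration only; not the actual Lut3d code.
using RGB = std::array<float, 3>;

struct Lut3D {
    int size;               // entries per axis, assumed >= 2
    std::vector<RGB> data;  // size^3 entries, red index fastest (assumed)

    const RGB& at(int r, int g, int b) const {
        return data[static_cast<std::size_t>((b * size + g) * size + r)];
    }
};

// Component-wise linear blend of two RGB triples: a + t * (b - a).
inline RGB lerp(const RGB& a, const RGB& b, float t) {
    return RGB{{ a[0] + t * (b[0] - a[0]),
                 a[1] + t * (b[1] - a[1]),
                 a[2] + t * (b[2] - a[2]) }};
}

// Trilinear lookup: inputs are clamped to [0, 1], then the 8 surrounding
// lattice entries are blended along red, then green, then blue.
RGB applyTrilinear(const Lut3D& lut, const RGB& in) {
    const float maxIndex = static_cast<float>(lut.size - 1);
    int lo[3], hi[3];
    float frac[3];
    for (int i = 0; i < 3; ++i) {
        const float c = std::min(std::max(in[i], 0.0f), 1.0f);
        const float x = c * maxIndex;
        lo[i] = static_cast<int>(std::floor(x));
        hi[i] = std::min(lo[i] + 1, lut.size - 1);
        frac[i] = x - static_cast<float>(lo[i]);
    }

    // Four blends along red.
    const RGB c00 = lerp(lut.at(lo[0], lo[1], lo[2]), lut.at(hi[0], lo[1], lo[2]), frac[0]);
    const RGB c10 = lerp(lut.at(lo[0], hi[1], lo[2]), lut.at(hi[0], hi[1], lo[2]), frac[0]);
    const RGB c01 = lerp(lut.at(lo[0], lo[1], hi[2]), lut.at(hi[0], lo[1], hi[2]), frac[0]);
    const RGB c11 = lerp(lut.at(lo[0], hi[1], hi[2]), lut.at(hi[0], hi[1], hi[2]), frac[0]);

    // Two blends along green, then one along blue.
    const RGB c0 = lerp(c00, c10, frac[1]);
    const RGB c1 = lerp(c01, c11, frac[1]);
    return lerp(c0, c1, frac[2]);
}

// Hypothetical helper: builds an identity LUT, handy for unit tests.
Lut3D makeIdentityLut(int size) {
    Lut3D lut;
    lut.size = size;
    lut.data.reserve(static_cast<std::size_t>(size) * size * size);
    const float scale = 1.0f / static_cast<float>(size - 1);
    for (int b = 0; b < size; ++b)
        for (int g = 0; g < size; ++g)
            for (int r = 0; r < size; ++r)
                lut.data.push_back(RGB{{ r * scale, g * scale, b * scale }});
    return lut;
}
```

An identity LUT should pass values through unchanged (out-of-range inputs
are clamped in this sketch), which makes for an easy sanity check of this
kind of routine.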

-- Jeremy