
Re: Compiling OpenShadingLanguage under Windows

Wormszer <worm...@...>
 

I have been trying to get the shaders test to build, but I have run into lots of issues with linking, dependencies, imports/exports, etc.

The projects CMake generates, for example oslquery, use source files from oslexec. This is causing issues with the DLL_PUBLIC-style declarations, because other projects use oslexec as a library to import.
I guess this may work fine with GCC, but Windows pitches a fit: in one project the defines are the correct import/export, while in the other they are import when they should be nothing or export.

Are the source files being duplicated in the projects on purpose? Is this just a CMake issue?
Is this just an issue with VS, because GCC doesn't care and just links everything together?

I think I should be able to rearrange the import/export types for Windows, and then make oslquery depend on oslexec instead of rebuilding the lexer etc., and do the same for any of the other projects (I think the others are better).
And make sure the projects use the library imports rather than building the code themselves?

Or should I get them to build duplicating the code in areas, and just fix the special cases with more defines?

I prefer the first option, and at first I could probably isolate it to Windows in CMake and the source. But maybe it's an issue for GCC as well in some areas.
Or maybe oslquery needs to run standalone and you don't want to include a dll/so.

Any thoughts?

Jeremy


On Thu, Jan 21, 2010 at 12:39 AM, Wormszer <worm...@...> wrote:

I have mcpp integrated into the compiler now, working like CPP and writing everything to stdout.

A few things: the binary available for download, at least on Windows, doesn't support forcing an include file.
So I had to build it from source; luckily it wasn't too difficult to create a new project for it.

I built it as VS for VS. Now that I think about it, I bet I could build it as VS for GCC so that it would support the same command-line options.

Another weird issue: I guess the lexer doesn't support C-style /* */ comments? These might not be defined in the OSL language, and if so they probably should be removed from stdosl.h.
It's the only difference I could see that could have been causing the error.

CL would remove the /* */ comments at the end of the math defines in stdosl.h even though I told it to preserve comments.

mcpp, when I told it to leave comments, would actually leave them in there (which seems like correct behavior), and then the lexer would crash.

I am not sure why comments are really needed at this point; I think I must have included them for my own debugging. I don't think CPP was set to output them.

Thanks for the suggestions on an easier solution to the preprocessor issue.

I looked real quick at the boost option, but mcpp seemed easier since I could just get a binary and put it in my path (though then I had to rebuild it anyway).

Well now to see if I can get anything to happen with my compiled shaders.

Jeremy



On Wed, Jan 20, 2010 at 9:03 PM, Wormszer <worm...@...> wrote:
I am fine with either one. I think having something embedded or buildable would be useful.

Otherwise there may be issues with different compilers, and we would probably need some kind of config or something that CMake would generate, at least on Windows with several versions of VS etc.

Will just have to see how Larry or the other devs feel about using one of those two for Linux builds as well. I would assume it would be wise to have all the preprocessing done with the same tool when possible.

I will look at both real quick but I might lean towards mcpp.


On Jan 20, 2010, at 8:14 PM, Chris Foster <chri...@...> wrote:

On Thu, Jan 21, 2010 at 11:02 AM, Blair Zajac <bl...@...> wrote:
The main annoyance with wave is that it causes the compiler to issue truly
horrendous error messages if you get things wrong (the wave internals make
heavy use of template metaprogramming).

(Obviously this is only a problem when integrating wave into the project
source, and that's not really difficult at all.)

There's mcpp which is designed to be an embeddable C-preprocessor.

Ice, which we use at Sony Imageworks for all our middle-tier systems, uses
mcpp for its own internal IDL-type language, so I would recommend that.

mcpp looks nice.  In some sense, using wave would mean one less dependency
since OSL already relies on boost, but it does mean linking to libboost-wave,
so if you have a modular boost install the point may be moot...

~Chris

PS: Sorry to Blair for the duplicate message.  I intended to send it
to the list :-(
--
You received this message because you are subscribed to the Google Groups "OSL Developers" group.
To post to this group, send email to osl...@....
To unsubscribe from this group, send email to osl...@....
For more options, visit this group at http://groups.google.com/group/osl-dev?hl=en.





Re: run flags vs active intervals

Wormszer <worm...@...>
 

That's interesting; it kind of relates to my original question of whether the compiler was able to apply SIMD operations to the loop.
When you disabled vectorization, did it affect the active index case?

Are those numbers taking into account the setup time to create either the active index or the intervals? Or is it basically the same for each method?

The iterator idea crossed my mind too, but I wouldn't have expected it to have such a performance hit either. I guess it prevents the compiler from unrolling the loop?
I wonder if the way you use the iterator is having an effect; with a for(begin, end, ++) style implementation, the compiler might do something different.

It looks like active index is the way to go. I wonder why it doesn't perform as well on the full range; is it because of the indirection that the compiler won't vectorize it? That the memory addresses may not be consecutive?

If the two methods perform well under different conditions, is there enough of a benefit to implement both active intervals and active indices? Or a hybrid: active index, and if the number of indices == N, then everything is on and you could just do a loop without indirection and use a vectorized code path.

Is there enough coherence from frame to frame, or execution to execution, that you could score the run and use that method the next time?
Sort of like branch prediction: have some way to measure the coherence or incoherence of the current run to predict the next, even occasionally.


Jeremy


On Thu, Jan 21, 2010 at 8:36 PM, Chris Foster <chri...@...> wrote:
On Fri, Jan 22, 2010 at 4:56 AM, Larry Gritz <l...@...> wrote:
> I want to point out that Chris F tested out the "all on", "random on", and
> "alternating on/off" cases.  There's one more case that may be important,
> which is a few isolated "on" points with big "off" gaps in between -- and in
> that case, I expect Chris F's solution to perform even better (compared to
> what we have now) than the other cases.

Right, here's the initialization code for some sparse on-states:

   // Four isolated flags turned on.
   std::fill((Runflag*)r, r+len, (Runflag)Runflag_Off);
   r[0] = r[50] = r[100] = r[150] = Runflag_On;

results for this sparse case:

run flags:        1.710s
active intervals: 0.100s

Of course in this case, the run flags completely fail to capitalize on the
fact that most of the shading elements are turned off, so the active intervals
formulation thrashes it.  Out of curiosity, I've also implemented the direct
indexing array Chris K suggested (code attached, see the function
addIndexed() ):

active index:     0.050s

as expected, blazingly fast here!


Here's the rest of the benchmarks redone with the active indexing method added:


All flags on (completely coherent):

run flags:        1.310s
active intervals: 0.690s
active index:     1.330s


Random flags on (completely incoherent):

run flags:        5.440s
active intervals: 3.310s
active index:     0.760s


Alternate flags on (maximum number of active intervals):

run flags:        1.500s
active intervals: 2.150s
active index:     0.710s


The results are quite interesting.  They suggest that active indexing is likely
to be faster for highly incoherent data, but that active intervals is the clear
winner when all your flags are on.

The random flags benchmark is particularly interesting to me because the active
intervals formulation does a lot worse than the active index in this case.


As for the implementation, you can potentially have the ability to change
between any of these run state implementations if you introduce an iterator
abstraction.  The tricky thing seems to be making sure the abstraction is
getting optimized as efficiently as the plain loops.

Here's my crack at an active intervals iterator class, but alas, the benchmarks
show that this gives a time of 1.170s for the all-on case compared to 0.68s for
the loop-based implementation.


class RunStateIter
{
   private:
       const int* m_intervals; ///< active intervals
       const int* m_intervalsEnd; ///< end of active intervals
       int m_currEnd;          ///< end of current interval
       int m_idx;              ///< current index
   public:
       RunStateIter(const int* intervals, int nIntervals)
           : m_intervals(intervals),
           m_intervalsEnd(intervals + 2*nIntervals),
           m_currEnd(nIntervals ? intervals[1] : 0),
           m_idx(nIntervals ? intervals[0] : 0)
       { }

       RunStateIter& operator++()
       {
           ++m_idx;
           if (m_idx >= m_currEnd) {
               m_intervals += 2;
               if (m_intervals < m_intervalsEnd) {
                   m_idx = m_intervals[0];
                   m_currEnd = m_intervals[1];
               }
           }
           return *this;
       }

       bool valid() { return m_idx < m_currEnd; }

       int operator*() { return m_idx; }
};


Now why is this?  The above is essentially the loop-based code but unravelled
into an iterator.  I had to look at the assembly code to find out, and it turns
out that the compiler is optimizing addIntervals() using hardware SIMD!  (I spy
lots of movlps and an addps in there.)

Ack!  I hadn't expected that.  I was imagining that the efficiency gains in the
all-on case came from improving branch prediction or some such.  Oops :-)
Using the flag -fno-tree-vectorize causes performance of the active intervals
code in the all-on case to devolve to 1.33s, just the same as the runflags
code.

So, this changes a few things, because (1) I guess there's not that many
operations which the compiler can produce hardware SIMD code for, so the
efficiency gains I've just shown may evaporate if user-defined types come into
play and (2) the compiler (g++) seems to have trouble producing hardware SIMD
code when an iterator abstraction is involved.  That's a pity because using an
iterator would let you reimplement this stuff once and decide what the iterator
implementation should be later, based on real benchmarks.

Hum, so suddenly the way forward doesn't seem quite so clear anymore.  The
active index method starts to look very attractive if we discount hardware
vectorization.

~Chris.




Re: run flags vs active intervals

Chris Foster <chri...@...>
 

On Fri, Jan 22, 2010 at 4:56 AM, Larry Gritz <l...@...> wrote:
I want to point out that Chris F tested out the "all on", "random on", and
"alternating on/off" cases.  There's one more case that may be important,
which is a few isolated "on" points with big "off" gaps in between -- and in
that case, I expect Chris F's solution to perform even better (compared to
what we have now) than the other cases.
Right, here's the initialization code for some sparse on-states:

   // Four isolated flags turned on.
   std::fill((Runflag*)r, r+len, (Runflag)Runflag_Off);
   r[0] = r[50] = r[100] = r[150] = Runflag_On;

results for this sparse case:

run flags: 1.710s
active intervals: 0.100s

Of course in this case, the run flags completely fail to capitalize on the
fact that most of the shading elements are turned off, so the active intervals
formulation thrashes it. Out of curiosity, I've also implemented the direct
indexing array Chris K suggested (code attached, see the function
addIndexed() ):

active index: 0.050s

as expected, blazingly fast here!


Here's the rest of the benchmarks redone with the active indexing method added:


All flags on (completely coherent):

run flags: 1.310s
active intervals: 0.690s
active index: 1.330s


Random flags on (completely incoherent):

run flags: 5.440s
active intervals: 3.310s
active index: 0.760s


Alternate flags on (maximum number of active intervals):

run flags: 1.500s
active intervals: 2.150s
active index: 0.710s


The results are quite interesting. They suggest that active indexing is likely
to be faster for highly incoherent data, but that active intervals is the clear
winner when all your flags are on.

The random flags benchmark is particularly interesting to me because the active
intervals formulation does a lot worse than the active index in this case.


As for the implementation, you can potentially have the ability to change
between any of these run state implementations if you introduce an iterator
abstraction. The tricky thing seems to be making sure the abstraction is
getting optimized as efficiently as the plain loops.

Here's my crack at an active intervals iterator class, but alas, the benchmarks
show that this gives a time of 1.170s for the all-on case compared to 0.68s for
the loop-based implementation.


class RunStateIter
{
    private:
        const int* m_intervals;    ///< active intervals
        const int* m_intervalsEnd; ///< end of active intervals
        int m_currEnd;             ///< end of current interval
        int m_idx;                 ///< current index
    public:
        RunStateIter(const int* intervals, int nIntervals)
            : m_intervals(intervals),
            m_intervalsEnd(intervals + 2*nIntervals),
            m_currEnd(nIntervals ? intervals[1] : 0),
            m_idx(nIntervals ? intervals[0] : 0)
        { }

        RunStateIter& operator++()
        {
            ++m_idx;
            if (m_idx >= m_currEnd) {
                m_intervals += 2;
                if (m_intervals < m_intervalsEnd) {
                    m_idx = m_intervals[0];
                    m_currEnd = m_intervals[1];
                }
            }
            return *this;
        }

        bool valid() { return m_idx < m_currEnd; }

        int operator*() { return m_idx; }
};


Now why is this? The above is essentially the loop-based code but unravelled
into an iterator. I had to look at the assembly code to find out, and it turns
out that the compiler is optimizing addIntervals() using hardware SIMD! (I spy
lots of movlps and an addps in there.)

Ack! I hadn't expected that. I was imagining that the efficiency gains in the
all-on case came from improving branch prediction or some such. Oops :-)
Using the flag -fno-tree-vectorize causes performance of the active intervals
code in the all-on case to devolve to 1.33s, just the same as the runflags
code.

So, this changes a few things, because (1) I guess there's not that many
operations which the compiler can produce hardware SIMD code for, so the
efficiency gains I've just shown may evaporate if user-defined types come into
play and (2) the compiler (g++) seems to have trouble producing hardware SIMD
code when an iterator abstraction is involved. That's a pity because using an
iterator would let you reimplement this stuff once and decide what the iterator
implementation should be later, based on real benchmarks.

Hum, so suddenly the way forward doesn't seem quite so clear anymore. The
active index method starts to look very attractive if we discount hardware
vectorization.

~Chris.


Re: Add derivatives to I in shader globals (issue186262)

cku...@...
 


Add derivatives to I in shader globals (issue186262)

aco...@...
 

Reviewers: osl-dev_googlegroups.com,

Description:
We were missing the derivatives in the I field, which is important for
the background shader. This little patch fixes the problem.

Please review this at http://codereview.appspot.com/186262/show

Affected files:
src/include/oslexec.h
src/liboslexec/exec.cpp


Index: src/include/oslexec.h
===================================================================
--- src/include/oslexec.h (revision 538)
+++ src/include/oslexec.h (working copy)
@@ -195,6 +195,7 @@ public:
     VaryingRef<Vec3> P;            ///< Position
     VaryingRef<Vec3> dPdx, dPdy;   ///< Partials
     VaryingRef<Vec3> I;            ///< Incident ray
+    VaryingRef<Vec3> dIdx, dIdy;   ///< Partial derivatives for I
     VaryingRef<Vec3> N;            ///< Shading normal
     VaryingRef<Vec3> Ng;           ///< True geometric normal
     VaryingRef<float> u, v;        ///< Surface parameters
Index: src/liboslexec/exec.cpp
===================================================================
--- src/liboslexec/exec.cpp (revision 538)
+++ src/liboslexec/exec.cpp (working copy)
@@ -195,8 +195,16 @@ ShadingExecution::bind (ShadingContext *context, ShaderUse use,
             sym.data (globals->P.ptr()); sym.step (globals->P.step());
         }
     } else if (sym.name() == Strings::I) {
-        sym.has_derivs (false);
-        sym.data (globals->I.ptr()); sym.step (globals->I.step());
+        if (globals->dIdx.ptr() && globals->dIdy.ptr()) {
+            sym.has_derivs (true);
+            void *addr = m_context->heap_allot (sym, true);
+            VaryingRef<Dual2<Vec3> > I ((Dual2<Vec3> *)addr, sym.step());
+            for (int i = 0; i < npoints(); ++i)
+                I[i].set (globals->I[i], globals->dIdx[i], globals->dIdy[i]);
+        } else {
+            sym.has_derivs (false);
+            sym.data (globals->I.ptr()); sym.step (globals->I.step());
+        }
     } else if (sym.name() == Strings::N) {
         sym.has_derivs (false);
         sym.data (globals->N.ptr()); sym.step (globals->N.step());


Re: run flags vs active intervals

Larry Gritz <l...@...>
 

I want to point out that Chris F tested out the "all on", "random on", and "alternating on/off" cases. There's one more case that may be important, which is a few isolated "on" points with big "off" gaps in between -- and in that case, I expect Chris F's solution to perform even better (compared to what we have now) than the other cases.

I'm not looking forward to doing this overhaul (only because it's tedious and extensive, there's nothing hard about it), but I think it's potentially a big win. Thanks, Chris!

-- lg


On Jan 21, 2010, at 10:49 AM, Christopher wrote:

I like this idea too. What we were discussing yesterday was something
like:

index: [ 0 1 2 3 4 5 6 7 8 9 ]
flags: [ 0 0 1 1 1 1 0 0 1 0 ]
active_points: [ 2 3 4 5 8 ]

for (int i = 0; i < num_active; i++)
do_something(active_points[i]);

This would be slightly more efficient if you had a single point active
(or isolated single points), but it requires a lot more indirections
in the common case.

So I'm all in favor of trying this out - though it is a pretty big
overhaul ...

I would maintain these ranges as little int arrays on the stack just
like we maintain the runflags now. The upper bound for the
active_range array size is just npoints (run flags: on off on off on
off ...) - flip any bit and you get fewer "runs".

-Chris


On Jan 21, 9:16 am, Larry Gritz <l...@...> wrote:
Awesome, Chris. Would you believe we were just talking internally about this topic yesterday? We were considering the amount of waste if there were big gaps of "off" points in the middle. But I think your solution is quite a bit more elegant than what we were discussing (a list of "on" points). I like how it devolves into (begin,end] for the common case of all points on.

It's a pretty big overhaul, touches every shadeop and a lot of templates, and the call signatures have to be changed (to what, exactly? passing a std::vector<int>& ? or a combo of int *begend and int segments?). Ugh, and we may wish to change/add OIIO texture routines that take runflags, too. But your experiment is quite convincing.

What does everybody else think? Chris/Cliff/Alex? (I'm happy to do the coding, but I want consensus because it touches so much.)

-- lg

On Jan 21, 2010, at 8:21 AM, Chris Foster wrote:



Hi all,
I've been looking through the OSL source a little, and I'm interested to see
that you're using runflags for the SIMD state. I know that's a really
conventional solution, but there's an alternative representation which I reckon
is significantly faster, and I wondered if you considered it.
Imagine a SIMD runstate as follows
index: [ 0 1 2 3 4 5 6 7 8 9 ]
flags: [ 0 0 1 1 1 1 0 0 1 0 ]
An alternative to the flags is to represent this state as a list of active
intervals. As a set of active start/stop pairs, the state looks like
active = [ 2 6 8 9 ]
ie,
state: [ 0  0  1  1  1  1  0  0  1  0 ]
              ^           v     ^  v
              2           6     8  9
The advantage of doing this is that the inner SIMD loops become tighter since
there's one less test. Instead of
for (int i = 0; i < len; ++i)
{
    if (state[i])
        do_something(i);
}
we'd have
for (int j = 0; j < nActiveTimes2; j+=2)
{
    for (int i = active[j]; i < active[j+1]; ++i)
        do_something(i);
}
and the inner loop is now completely coherent.
I can see you have the beginnings of this idea in the current OSL code, since
you pass beginpoint and endpoint along with the runflags everywhere. However,
why not take it to its logical conclusion?
Given that you already have beginpoint and endpoint, the particularly big win
here is when most of the flags are turned on. If do_something() is a simple
arithmetic operation (eg, float addition) the difference between the two
formulations can be a factor of two in speed.
I'm attaching some basic test code which I whipped up. Timings on my core2 duo
with gcc -O3 look like:
All flags on (completely coherent):
run flags: 1.310s
active intervals: 0.690s
Random flags on (completely incoherent):
run flags: 5.440s
active intervals: 3.310s
Alternate flags on (maximum number of active intervals -> worst case):
run flags: 1.500s
active intervals: 2.150s
Thoughts?
~Chris.
<runstate.cpp>
--
Larry Gritz
l...@...
--
Larry Gritz
l...@...


Re: run flags vs active intervals

Christopher <cku...@...>
 

I like this idea too. What we were discussing yesterday was something
like:

index: [ 0 1 2 3 4 5 6 7 8 9 ]
flags: [ 0 0 1 1 1 1 0 0 1 0 ]
active_points: [ 2 3 4 5 8 ]

for (int i = 0; i < num_active; i++)
    do_something(active_points[i]);

This would be slightly more efficient if you had a single point active
(or isolated single points), but it requires a lot more indirections
in the common case.

So I'm all in favor of trying this out - though it is a pretty big
overhaul ...

I would maintain these ranges as little int arrays on the stack just
like we maintain the runflags now. The upper bound for the
active_range array size is just npoints (run flags: on off on off on
off ...) - flip any bit and you get fewer "runs".

-Chris

On Jan 21, 9:16 am, Larry Gritz <l...@...> wrote:
Awesome, Chris.  Would you believe we were just talking internally about this topic yesterday?  We were considering the amount of waste if there were big gaps of "off" points in the middle.  But I think your solution is quite a bit more elegant than what we were discussing (a list of "on" points).  I like how it devolves into (begin,end] for the common case of all points on.

It's a pretty big overhaul, touches every shadeop and a lot of templates, and the call signatures have to be changed (to what, exactly? passing a std::vector<int>& ? or a combo of int *begend and int segments?).  Ugh, and we may wish to change/add OIIO texture routines that take runflags, too.  But your experiment is quite convincing.

What does everybody else think?  Chris/Cliff/Alex?  (I'm happy to do the coding, but I want consensus because it touches so much.)

        -- lg

On Jan 21, 2010, at 8:21 AM, Chris Foster wrote:



Hi all,
I've been looking through the OSL source a little, and I'm interested to see
that you're using runflags for the SIMD state.  I know that's a really
conventional solution, but there's an alternative representation which I reckon
is significantly faster, and I wondered if you considered it.
Imagine a SIMD runstate as follows
index: [ 0  1  2  3  4  5  6  7  8  9 ]
flags: [ 0  0  1  1  1  1  0  0  1  0 ]
An alternative to the flags is to represent this state as a list of active
intervals.  As a set of active start/stop pairs, the state looks like
active = [ 2 6  8 9 ]
ie,
state: [ 0  0  1  1  1  1  0  0  1  0 ]
              ^           v     ^  v
              2           6     8  9
The advantage of doing this is that the inner SIMD loops become tighter since
there's one less test.  Instead of
for (int i = 0; i < len; ++i)
{
   if (state[i])
       do_something(i);
}
we'd have
for (int j = 0; j < nActiveTimes2; j+=2)
{
   for (int i = active[j]; i < active[j+1]; ++i)
       do_something(i);
}
and the inner loop is now completely coherent.
I can see you have the beginnings of this idea in the current OSL code, since
you pass beginpoint and endpoint along with the runflags everywhere.  However,
why not take it to its logical conclusion?
Given that you already have beginpoint and endpoint, the particularly big win
here is when most of the flags are turned on.  If do_something() is a simple
arithmetic operation (eg, float addition) the difference between the two
formulations can be a factor of two in speed.
I'm attaching some basic test code which I whipped up.  Timings on my core2 duo
with gcc -O3 look like:
All flags on (completely coherent):
run flags:        1.310s
active intervals: 0.690s
Random flags on (completely incoherent):
run flags:        5.440s
active intervals: 3.310s
Alternate flags on (maximum number of active intervals -> worst case):
run flags:        1.500s
active intervals: 2.150s
Thoughts?
~Chris.
<runstate.cpp>
--
Larry Gritz
l...@...


Re: run flags vs active intervals

Xavier Ho <con...@...>
 

On Fri, Jan 22, 2010 at 4:15 AM, Wormszer <worm...@...> wrote:
Is it ok to ask these questions in this thread, or should I be starting a new one? I don't want to fill this thread with off-topic information from the original questions. I am somewhat new to this process, and unclear on what is looked down upon.

As a uni student who is keen on the OSL open source release, I'm glad to be reading these discussions, particularly the questions and answers. They're a source of knowledge and freely shared insight. It excites the community in its common understanding of this project, and benefits those who, like myself, digest the information that goes through mailing lists. Not to mention this is probably archived, and later searchable if you ever need it again. My opinion is, if it's related, it's probably okay. Unless you intend to start a newer/bigger conversation, there isn't really a need to start a new thread. Having multiple threads on similar topics only makes digging up information harder, anyhow.

My 2 cp,
Xavier


Re: run flags vs active intervals

Wormszer <worm...@...>
 

Thanks, that makes sense. I know compilers are getting smarter all the time, but I wasn't sure if they were there yet. The cache and memory coherence makes sense, and sorting things like that would allow for easier parallel implementations on GPUs, etc.

Another quick question: what is n? Is n the total number of pixels in the image, or possibly rays? And is that why you have disabled ops, pixels that the shader is not applied to?
Is there a good place to look or read about that, or do I need to dig into the source?

Is it ok to ask these questions in this thread, or should I be starting a new one? I don't want to fill this thread with off-topic information from the original questions. I am somewhat new to this process, and unclear on what is looked down upon.

Thanks
Jeremy


On Thu, Jan 21, 2010 at 12:44 PM, Larry Gritz <l...@...> wrote:
It's not really referring to hardware SIMD, but just that we are shading many points at once, "in lock-step."  In other words, if you had many points to shade, you could do it in this order:

    point 0:
        add
        texture
        assign
    point 1:
        add
        texture
        assign
    ...
    point n:
        add
        texture
        assign

or you could do it in this order:

    add all points on [0,n]
    texture all points on [0,n]
    assign all points on [0,n]

We call the latter "SIMD" (single instruction, multiple data), because each instruction operates on big arrays of data (one value for each point being shaded).

SIMD helps by:

   * increasing interpreter speed, by moving much of the interpreter overhead to "per batch" rather than "per point".
   * improving memory coherence, cache behavior, and allowing hardware SIMD, since it's naturally using "structure-of-array" layout, i.e. it's faster to add contiguous arrays than separate floats in more sparse memory locations.
   * improving texture coherence, because you're doing lots of lookups from the same texture on all points at the same time (which are in turn likely to be on the same tiles).




On Jan 21, 2010, at 9:30 AM, Wormszer wrote:

Is there a good resource on this topic? I did some googling and didn't see what I was looking for.

Where is the actual SIMD taking place?
Is the compiler figuring it out from the loop and setting up the correct instructions, or is it relying on the CPU to recognize consecutive ops with different data, modifying instructions in its pipe by combining them into a parallel one?

Or maybe i'm way off.

Thanks,

Jeremy

On Thu, Jan 21, 2010 at 12:16 PM, Larry Gritz <l...@...> wrote:
Awesome, Chris.  Would you believe we were just talking internally about this topic yesterday?  We were considering the amount of waste if there were big gaps of "off" points in the middle.  But I think your solution is quite a bit more elegant than what we were discussing (a list of "on" points).  I like how it devolves into (begin,end] for the common case of all points on.

Re: run flags vs active intervals

Larry Gritz <l...@...>
 

It's not really referring to hardware SIMD, but just that we are shading many points at once, "in lock-step."  In other words, if you had many points to shade, you could do it in this order:

    point 0:
        add
        texture
        assign
    point 1:
        add
        texture
        assign
    ...
    point n:
        add
        texture
        assign

or you could do it in this order:

    add all points on [0,n]
    texture all points on [0,n]
    assign all points on [0,n]

We call the latter "SIMD" (single instruction, multiple data), because each instruction operates on big arrays of data (one value for each point being shaded).

SIMD helps by:

   * increasing interpreter speed, by moving much of the interpreter overhead to "per batch" rather than "per point".
   * improving memory coherence, cache behavior, and allowing hardware SIMD, since it's naturally using "structure-of-array" layout, i.e. it's faster to add contiguous arrays than separate floats in more sparse memory locations.
   * improving texture coherence, because you're doing lots of lookups from the same texture on all points at the same time (which are in turn likely to be on the same tiles).
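As a sketch of that batched model (the names and layout here are illustrative, not from the OSL sources), a single "add" shadeop runs once over the whole batch in structure-of-array form, so interpreter dispatch is paid per batch and the loop touches contiguous memory:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical batched "add" instruction: one dispatch for the whole
// batch of shaded points; the loop body is contiguous and trivially
// vectorizable by the compiler (or by hand with hardware SIMD).
void add_op(const std::vector<float>& a, const std::vector<float>& b,
            std::vector<float>& result)
{
    for (std::size_t i = 0; i < a.size(); ++i)
        result[i] = a[i] + b[i];
}
```

Executing `add_op` for all points, then the next op for all points, is the "SIMD" ordering described above.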




On Jan 21, 2010, at 9:30 AM, Wormszer wrote:

Is there a good resource on this topic? I did some googling and didn't see what I was looking for.

Where is the actual SIMD taking place?
Is the compiler figuring it out from the loop and setting up the correct instructions? Or is it relying on the CPU to recognize consecutive ops with different data, modifying instructions in its pipeline by combining instructions into a parallel one?

Or maybe I'm way off.

Thanks,

Jeremy

On Thu, Jan 21, 2010 at 12:16 PM, Larry Gritz <l...@...> wrote:
Awesome, Chris.  Would you believe we were just talking internally about this topic yesterday?  We were considering the amount of waste if there were big gaps of "off" points in the middle.  But I think your solution is quite a bit more elegant than what we were discussing (a list of "on" points).  I like how it devolves into [begin,end) for the common case of all points on.

It's a pretty big overhaul, touches every shadeop and a lot of templates, and the call signatures have to be changed (to what, exactly? passing a std::vector<int>& ? or a combo of int *begend and int segments?).  Ugh, and we may wish to change/add OIIO texture routines that take runflags, too.  But your experiment is quite convincing.

What does everybody else think?  Chris/Cliff/Alex?  (I'm happy to do the coding, but I want consensus because it touches so much.)

       -- lg


On Jan 21, 2010, at 8:21 AM, Chris Foster wrote:

> Hi all,
>
> I've been looking through the OSL source a little, and I'm interested to see
> that you're using runflags for the SIMD state.  I know that's a really
> conventional solution, but there's an alternative representation which I reckon
> is significantly faster, and I wondered if you considered it.
>
> Imagine a SIMD runstate as follows
>
> index: [ 0  1  2  3  4  5  6  7  8  9 ]
>
> flags: [ 0  0  1  1  1  1  0  0  1  0 ]
>
>
> An alternative to the flags is to represent this state as a list of active
> intervals.  As a set of active start/stop pairs, the state looks like
>
> active = [ 2 6  8 9 ]
>
> ie,
>
> state: [ 0  0  1  1  1  1  0  0  1  0 ]
>               ^           v     ^  v
>               2           6     8  9
>
> The advantage of doing this is that the inner SIMD loops become tighter since
> there's one less test.  Instead of
>
> for (int i = 0; i < len; ++i)
> {
>    if (state[i])
>        do_something(i);
> }
>
> we'd have
>
> for (int j = 0; j < nActiveTimes2; j+=2)
> {
>    for (int i = active[j]; i < active[j+1]; ++i)
>        do_something(i);
> }
>
> and the inner loop is now completely coherent.
>
> I can see you have the beginnings of this idea in the current OSL code, since
> you pass beginpoint and endpoint along with the runflags everywhere.  However,
> why not take it to its logical conclusion?
>
> Given that you already have beginpoint and endpoint, the particularly big win
> here is when most of the flags are turned on.  If do_something() is a simple
> arithmetic operation (eg, float addition) the difference between the two
> formulations can be a factor of two in speed.
>
>
> I'm attaching some basic test code which I whipped up.  Timings on my core2 duo
> with gcc -O3 look like:
>
>
> All flags on (completely coherent):
>
> run flags:        1.310s
> active intervals: 0.690s
>
>
> Random flags on (completely incoherent):
>
> run flags:        5.440s
> active intervals: 3.310s
>
>
> Alternate flags on (maximum number of active intervals -> worst case):
>
> run flags:        1.500s
> active intervals: 2.150s
>
>
> Thoughts?
>
> ~Chris.
> <ATT00001..txt><runstate.cpp>

--
Larry Gritz
l...@...





--
You received this message because you are subscribed to the Google Groups "OSL Developers" group.
To post to this group, send email to osl...@....
To unsubscribe from this group, send email to osl...@....
For more options, visit this group at http://groups.google.com/group/osl-dev?hl=en.





--
Larry Gritz




Re: run flags vs active intervals

Wormszer <worm...@...>
 

Is there a good resource on this topic? I did some googling and didn't see what I was looking for.

Where is the actual SIMD taking place?
Is the compiler figuring it out from the loop and setting up the correct instructions? Or is it relying on the CPU to recognize consecutive ops with different data, modifying instructions in its pipeline by combining instructions into a parallel one?

Or maybe I'm way off.

Thanks,

Jeremy


On Thu, Jan 21, 2010 at 12:16 PM, Larry Gritz <l...@...> wrote:
Awesome, Chris.  Would you believe we were just talking internally about this topic yesterday?  We were considering the amount of waste if there were big gaps of "off" points in the middle.  But I think your solution is quite a bit more elegant than what we were discussing (a list of "on" points).  I like how it devolves into [begin,end) for the common case of all points on.

It's a pretty big overhaul, touches every shadeop and a lot of templates, and the call signatures have to be changed (to what, exactly? passing a std::vector<int>& ? or a combo of int *begend and int segments?).  Ugh, and we may wish to change/add OIIO texture routines that take runflags, too.  But your experiment is quite convincing.

What does everybody else think?  Chris/Cliff/Alex?  (I'm happy to do the coding, but I want consensus because it touches so much.)

       -- lg


On Jan 21, 2010, at 8:21 AM, Chris Foster wrote:

> Hi all,
>
> I've been looking through the OSL source a little, and I'm interested to see
> that you're using runflags for the SIMD state.  I know that's a really
> conventional solution, but there's an alternative representation which I reckon
> is significantly faster, and I wondered if you considered it.
>
> Imagine a SIMD runstate as follows
>
> index: [ 0  1  2  3  4  5  6  7  8  9 ]
>
> flags: [ 0  0  1  1  1  1  0  0  1  0 ]
>
>
> An alternative to the flags is to represent this state as a list of active
> intervals.  As a set of active start/stop pairs, the state looks like
>
> active = [ 2 6  8 9 ]
>
> ie,
>
> state: [ 0  0  1  1  1  1  0  0  1  0 ]
>               ^           v     ^  v
>               2           6     8  9
>
> The advantage of doing this is that the inner SIMD loops become tighter since
> there's one less test.  Instead of
>
> for (int i = 0; i < len; ++i)
> {
>    if (state[i])
>        do_something(i);
> }
>
> we'd have
>
> for (int j = 0; j < nActiveTimes2; j+=2)
> {
>    for (int i = active[j]; i < active[j+1]; ++i)
>        do_something(i);
> }
>
> and the inner loop is now completely coherent.
>
> I can see you have the beginnings of this idea in the current OSL code, since
> you pass beginpoint and endpoint along with the runflags everywhere.  However,
> why not take it to its logical conclusion?
>
> Given that you already have beginpoint and endpoint, the particularly big win
> here is when most of the flags are turned on.  If do_something() is a simple
> arithmetic operation (eg, float addition) the difference between the two
> formulations can be a factor of two in speed.
>
>
> I'm attaching some basic test code which I whipped up.  Timings on my core2 duo
> with gcc -O3 look like:
>
>
> All flags on (completely coherent):
>
> run flags:        1.310s
> active intervals: 0.690s
>
>
> Random flags on (completely incoherent):
>
> run flags:        5.440s
> active intervals: 3.310s
>
>
> Alternate flags on (maximum number of active intervals -> worst case):
>
> run flags:        1.500s
> active intervals: 2.150s
>
>
> Thoughts?
>
> ~Chris.
> <ATT00001..txt><runstate.cpp>

--
Larry Gritz
l...@...





--
You received this message because you are subscribed to the Google Groups "OSL Developers" group.
To post to this group, send email to osl...@....
To unsubscribe from this group, send email to osl...@....
For more options, visit this group at http://groups.google.com/group/osl-dev?hl=en.





Re: run flags vs active intervals

Larry Gritz <l...@...>
 

Awesome, Chris. Would you believe we were just talking internally about this topic yesterday? We were considering the amount of waste if there were big gaps of "off" points in the middle. But I think your solution is quite a bit more elegant than what we were discussing (a list of "on" points). I like how it devolves into [begin,end) for the common case of all points on.

It's a pretty big overhaul, touches every shadeop and a lot of templates, and the call signatures have to be changed (to what, exactly? passing a std::vector<int>& ? or a combo of int *begend and int segments?). Ugh, and we may wish to change/add OIIO texture routines that take runflags, too. But your experiment is quite convincing.

What does everybody else think? Chris/Cliff/Alex? (I'm happy to do the coding, but I want consensus because it touches so much.)

-- lg


On Jan 21, 2010, at 8:21 AM, Chris Foster wrote:

Hi all,

I've been looking through the OSL source a little, and I'm interested to see
that you're using runflags for the SIMD state. I know that's a really
conventional solution, but there's an alternative representation which I reckon
is significantly faster, and I wondered if you considered it.

Imagine a SIMD runstate as follows

index: [ 0 1 2 3 4 5 6 7 8 9 ]

flags: [ 0 0 1 1 1 1 0 0 1 0 ]


An alternative to the flags is to represent this state as a list of active
intervals. As a set of active start/stop pairs, the state looks like

active = [ 2 6 8 9 ]

ie,

state: [ 0 0 1 1 1 1 0 0 1 0 ]
             ^       v   ^ v
             2       6   8 9

The advantage of doing this is that the inner SIMD loops become tighter since
there's one less test. Instead of

for (int i = 0; i < len; ++i)
{
    if (state[i])
        do_something(i);
}

we'd have

for (int j = 0; j < nActiveTimes2; j+=2)
{
    for (int i = active[j]; i < active[j+1]; ++i)
        do_something(i);
}

and the inner loop is now completely coherent.

I can see you have the beginnings of this idea in the current OSL code, since
you pass beginpoint and endpoint along with the runflags everywhere. However,
why not take it to its logical conclusion?

Given that you already have beginpoint and endpoint, the particularly big win
here is when most of the flags are turned on. If do_something() is a simple
arithmetic operation (eg, float addition) the difference between the two
formulations can be a factor of two in speed.


I'm attaching some basic test code which I whipped up. Timings on my core2 duo
with gcc -O3 look like:


All flags on (completely coherent):

run flags:        1.310s
active intervals: 0.690s


Random flags on (completely incoherent):

run flags:        5.440s
active intervals: 3.310s


Alternate flags on (maximum number of active intervals -> worst case):

run flags:        1.500s
active intervals: 2.150s


Thoughts?

~Chris.
<ATT00001..txt><runstate.cpp>
--
Larry Gritz
l...@...


run flags vs active intervals

Chris Foster <chri...@...>
 

Hi all,

I've been looking through the OSL source a little, and I'm interested to see
that you're using runflags for the SIMD state. I know that's a really
conventional solution, but there's an alternative representation which I reckon
is significantly faster, and I wondered if you considered it.

Imagine a SIMD runstate as follows

index: [ 0 1 2 3 4 5 6 7 8 9 ]

flags: [ 0 0 1 1 1 1 0 0 1 0 ]


An alternative to the flags is to represent this state as a list of active
intervals. As a set of active start/stop pairs, the state looks like

active = [ 2 6 8 9 ]

ie,

state: [ 0 0 1 1 1 1 0 0 1 0 ]
             ^       v   ^ v
             2       6   8 9

The advantage of doing this is that the inner SIMD loops become tighter since
there's one less test. Instead of

for (int i = 0; i < len; ++i)
{
    if (state[i])
        do_something(i);
}

we'd have

for (int j = 0; j < nActiveTimes2; j+=2)
{
    for (int i = active[j]; i < active[j+1]; ++i)
        do_something(i);
}

and the inner loop is now completely coherent.
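For concreteness, here's my own sketch (not from the attached runstate.cpp) of converting a runflags array into the interval list described above; each pair in the result is [start, end) — the first active index and one past the last:

```cpp
#include <vector>

// Convert a runflags array into a flat list of active intervals.
// Each maximal run of 1s contributes a (begin, end) pair, with begin
// inclusive and end exclusive.
std::vector<int> runflags_to_intervals(const std::vector<int>& flags)
{
    std::vector<int> active;
    const int n = static_cast<int>(flags.size());
    for (int i = 0; i < n; ) {
        if (flags[i]) {
            active.push_back(i);       // interval begin (inclusive)
            while (i < n && flags[i])
                ++i;
            active.push_back(i);       // interval end (exclusive)
        } else {
            ++i;
        }
    }
    return active;
}
```

For the flags shown above, [ 0 0 1 1 1 1 0 0 1 0 ], this yields active = [ 2 6 8 9 ].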

I can see you have the beginnings of this idea in the current OSL code, since
you pass beginpoint and endpoint along with the runflags everywhere. However,
why not take it to its logical conclusion?

Given that you already have beginpoint and endpoint, the particularly big win
here is when most of the flags are turned on. If do_something() is a simple
arithmetic operation (eg, float addition) the difference between the two
formulations can be a factor of two in speed.


I'm attaching some basic test code which I whipped up. Timings on my core2 duo
with gcc -O3 look like:


All flags on (completely coherent):

run flags:        1.310s
active intervals: 0.690s


Random flags on (completely incoherent):

run flags:        5.440s
active intervals: 3.310s


Alternate flags on (maximum number of active intervals -> worst case):

run flags:        1.500s
active intervals: 2.150s


Thoughts?

~Chris.


Re: Volume Shaders

Daniel <night-...@...>
 

On 21 Jan., 16:13, Larry Gritz <l...@...> wrote:
> Stay tuned, this will all be done very soon, and probably discussed in detail here as it's happening.
Thanks so far for the answers, I'm looking forward to seeing OSL
progress. Good stuff!

Regards,
Daniel


Re: Volume Shaders

Larry Gritz <l...@...>
 

Er, I'm not sure. I think this is exactly the kind of thing we will be working out over the next few weeks. When we actually implemented surface integrators, we discovered all sorts of issues we hadn't realized when we spec'ed it. The gist is what we imagined, but the little details are different; I expect the same will happen with volumes.

The "density" is already part of the volume's closure, in much the same way that opacity is already part of a surface closure (by virtue of its weighting of closure elements that know they are "transparency-like"). It's possible that we'll want the shader to also return some kind of hint about the frequency that it needs to be sampled (as part of, or in addition to, the closure). Or maybe that's strictly the job of the integrator. The integrator can itself have parameters set by the renderer, and there can be multiple integrators for very different volume situations that need different sampling strategies. Also, it should be obvious that when we add volumes, we will also add several volume scattering closures, as the set of surface closures will not be adequate.

Stay tuned, this will all be done very soon, and probably discussed in detail here as it's happening.

-- lg


On Jan 21, 2010, at 1:12 AM, Daniel wrote:

On 21 Jan., 08:33, Larry Gritz <l...@...> wrote:
OK, that is how I interpreted the spec.
The volume integrator will, presumably, need some information about
the density of the volume, right? Do you already have ideas for
specifying that in OSL? Or will volume density information simply be
part of the scene description so integrators can choose to do this in
a way that fits their scene format?

Regards,
Daniel
--
Larry Gritz
l...@...


Re: Volume Shaders

Daniel <night-...@...>
 

On 21 Jan., 10:12, Daniel <night...@...> wrote:
> Or will volume density information simply be part of the scene description so integrators can choose to do this in a way that fits their scene format?
Meh, I just realized my ambiguous use of "integrator". In this case I
meant "party integrating OSL into their system"...

Regards,
Daniel


Re: Volume Shaders

Daniel <night-...@...>
 

On 21 Jan., 08:33, Larry Gritz <l...@...> wrote:
> A "volume integrator" in the renderer is responsible for doing the actual ray marching, evaluating the closure for specific light directions, and accumulating the contributions along the viewing ray.
OK, that is how I interpreted the spec.
The volume integrator will, presumably, need some information about
the density of the volume, right? Do you already have ideas for
specifying that in OSL? Or will volume density information simply be
part of the scene description so integrators can choose to do this in
a way that fits their scene format?

Regards,
Daniel


Re: Volume Shaders

Larry Gritz <l...@...>
 

We're tackling the volume shaders fairly soon. The spec isn't very clear about them, but that will be beefed up as we implement it. Basically the idea is that it will be very analogous to surface shaders -- returning a closure that describes in a view-independent way what the scattering of the volume is at a particular point. A "volume integrator" in the renderer is responsible for doing the actual ray marching, evaluating the closure for specific light directions, and accumulating the contributions along the viewing ray.
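As a hypothetical single-channel sketch of that division of labor (the names VolumeSample, sample_volume, and march are made up for illustration; this is not OSL or renderer API), the shader reports view-independent scattering at a point and the renderer's integrator does the marching:

```cpp
#include <cmath>

// Stand-in for what the volume shader's closure reports at one point.
struct VolumeSample {
    float density;   // extinction coefficient at the sample point
    float emission;  // simplified stand-in for evaluating the closure
};

// Stand-in volume shader: a homogeneous emissive medium.
VolumeSample sample_volume(float /*t*/)
{
    return { 0.5f, 1.0f };
}

// The renderer's volume integrator owns the ray march: step along the
// viewing ray, evaluate the closure at each sample, and accumulate
// radiance attenuated by the transmittance accumulated so far.
float march(float t0, float t1, int nsteps)
{
    float dt = (t1 - t0) / nsteps;
    float transmittance = 1.0f;
    float radiance = 0.0f;
    for (int i = 0; i < nsteps; ++i) {
        VolumeSample s = sample_volume(t0 + (i + 0.5f) * dt);
        radiance += transmittance * s.emission * s.density * dt;
        transmittance *= std::exp(-s.density * dt);
    }
    return radiance;
}
```

For this homogeneous medium (extinction 0.5 over a ray of length 2) the march converges to the analytic answer 1 - e^(-1) ≈ 0.632.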

This will all get fleshed out over the next few weeks, there's a pretty strong near-term deadline on our own shows to get this working.

-- lg


On Jan 20, 2010, at 12:29 AM, Daniel wrote:

Guys,

I just started looking into OSL. It certainly looks very interesting!
Did I miss something or is it true that volume shaders are currently
neither implemented, nor truly specified?
I see that you filed issue #3 about this. Can you already comment on
your plans for volume shaders? I'm curious... Which part of the
illumination will the shaders actually compute?

Regards,
Daniel
--
Larry Gritz
l...@...


Re: Compiling OpenShadingLanguage under Windows

Wormszer <worm...@...>
 


I have mcpp integrated into the compiler now, working like CPP and writing everything to stdout.

A few things: the binary available for download, at least on Windows, doesn't support forcing an include file.
So I had to build it from source; luckily it wasn't too difficult to create a new project for it.

I built it as VS for VS. Now that I think about it I bet I could build it as VS for GCC so that it would support the same command line options.

Another weird issue: I guess the lexer doesn't support C-style /* */ comments? These might not be defined in the OSL language, and if so they should probably be removed from stdosl.h.
It's the only difference I could see that could have been causing the error.

CL would remove the /* */ comments at the end of the math defines in stdosl.h even though I told it to preserve comments.

mcpp, when I told it to leave comments, would actually leave them in (which seems like correct behavior), and then the lexer would crash.

I am not sure why comments are really needed at this point; I think I must have included them for my own debugging. I don't think CPP was set to output them.

Thanks for the suggestions on an easier solution to the pre-processor issue.

I looked real quick at the Boost option, but mcpp seemed easier since I could just get a binary and set it in my path (but then I had to rebuild it anyway).

Well now to see if I can get anything to happen with my compiled shaders.

Jeremy


On Wed, Jan 20, 2010 at 9:03 PM, Wormszer <worm...@...> wrote:
I am fine with either one. I think having something embedded or buildable would be useful.

Otherwise there may be issues with different compilers, and we would probably need some kind of config that CMake would generate, at least on Windows with several versions of VS, etc.

Will just have to see how Larry or the other devs feel about using one of those two for Linux builds as well. I would assume it would be wise to have all the preprocessing done with the same tool when possible.

I will look at both real quick but I might lean towards mcpp.


On Jan 20, 2010, at 8:14 PM, Chris Foster <chri...@...> wrote:

On Thu, Jan 21, 2010 at 11:02 AM, Blair Zajac <bl...@...> wrote:
> The main annoyance with wave is that it causes the compiler to issue truly
> horrendous error messages if you get things wrong (the wave internals make
> heavy use of template metaprogramming).
>
> (Obviously this is only a problem when integrating wave into the project
> source, and that's not really difficult at all.)
>
> There's mcpp, which is designed to be an embeddable C preprocessor.
>
> Ice, which we use at Sony Imageworks for all our middle-tier systems, uses
> mcpp for its own internal IDL type language, so I would recommend that.

mcpp looks nice.  In some sense, using wave would mean one less dependency
since OSL already relies on Boost, but it does mean linking to libboost-wave,
so if you have a modular Boost install the point may be moot...

~Chris

PS: Sorry to Blair for the duplicate message.  I intended to send it
to the list :-(




Re: Compiling OpenShadingLanguage under Windows

Wormszer <worm...@...>
 

I am fine with either one. I think having something embedded or buildable would be useful.

Otherwise there may be issues with different compilers, and we would probably need some kind of config that CMake would generate, at least on Windows with several versions of VS, etc.

Will just have to see how Larry or the other devs feel about using one of those two for Linux builds as well. I would assume it would be wise to have all the preprocessing done with the same tool when possible.

I will look at both real quick but I might lean towards mcpp.

On Jan 20, 2010, at 8:14 PM, Chris Foster <chri...@...> wrote:

On Thu, Jan 21, 2010 at 11:02 AM, Blair Zajac <bl...@...> wrote:
> The main annoyance with wave is that it causes the compiler to issue truly
> horrendous error messages if you get things wrong (the wave internals make
> heavy use of template metaprogramming).
>
> (Obviously this is only a problem when integrating wave into the project
> source, and that's not really difficult at all.)
>
> There's mcpp, which is designed to be an embeddable C preprocessor.
>
> Ice, which we use at Sony Imageworks for all our middle-tier systems, uses
> mcpp for its own internal IDL type language, so I would recommend that.

mcpp looks nice. In some sense, using wave would mean one less dependency
since OSL already relies on Boost, but it does mean linking to libboost-wave,
so if you have a modular Boost install the point may be moot...

~Chris

PS: Sorry to Blair for the duplicate message. I intended to send it
to the list :-(
