Pre-review: run state overhaul (issue206045)
We've done extensive analysis showing that for typical shading batches
we see in the wild, our method of having an array of true/false runflags
is very wasteful, and despite the extra indirection involved, having a
flat list of the "on" points is much more efficient (5x or more speed
gain on tight inner loops). We're not sure yet exactly what this will
translate to in overall runtime gains, but it's got to help.
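To make the comparison concrete, here is a rough sketch of the same inner loop written both ways. This is not code from the patch: the function names, signatures, and the use of unsigned char for runflags are assumptions made up for illustration.

```cpp
// Hypothetical helpers for illustration only; not taken from the patch.

// Old style: walk every point in the batch and test its runflag.
inline void add_with_runflags(float *result, const float *a, const float *b,
                              const unsigned char *runflags, int npoints)
{
    for (int i = 0; i < npoints; ++i)
        if (runflags[i])
            result[i] = a[i] + b[i];
}

// New style: walk only the "on" points, at the cost of one indirection.
inline void add_with_indices(float *result, const float *a, const float *b,
                             const int *indices, int nindices)
{
    for (int k = 0; k < nindices; ++k) {
        int i = indices[k];   // index of a point that is turned on
        result[i] = a[i] + b[i];
    }
}
```

When only a few points in the batch are on, the second loop does work proportional to the number of on points rather than the batch size, and it has no per-point branch.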
This is not a final review, just showing progress along the way.
The code in this pre-review adds the index list to the Runstate, and
maintains it properly. I validate this by checking at every instruction
that the runflags and indices agree, and all tests pass! So I know for
sure that the indices are being set up properly.
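The kind of per-instruction check described above might look roughly like the sketch below. The function name and container types are hypothetical, and it assumes the index list is kept in increasing order, which the patch may or may not do.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical consistency check; names and types are assumptions.
// Returns true when `indices` lists exactly the points whose runflag is on,
// in increasing order (the ordering is itself an assumption).
inline bool runstate_consistent(const std::vector<unsigned char> &runflags,
                                const std::vector<int> &indices)
{
    std::size_t k = 0;   // position of the next expected entry in indices
    for (std::size_t i = 0; i < runflags.size(); ++i) {
        if (runflags[i]) {
            // Every "on" point must appear next in the index list.
            if (k >= indices.size() || indices[k] != static_cast<int>(i))
                return false;
            ++k;
        }
    }
    // No leftover indices that point at "off" or out-of-range points.
    return k == indices.size();
}
```

Running a check like this in debug builds before each instruction is cheap relative to the shading work and catches any shadeop that updates one representation but not the other.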
The whole point is that now I can convert the shadeops one by one, at
all times keeping the renderer working, and then only when everything is
converted rip out the old runflags entirely.
Please review this at http://codereview.appspot.com/206045/show