Crash with shader compiled to memory


Thomas METAIS
 

Hello,

We have a weird crash with a shader that is compiled on the fly, using an OSLCompiler created on the stack, and loaded with LoadMemoryCompiledShader.
The crash occurs later, when calling optimize_group. The compiled shader itself is very simple (it just does out = in) and its input is connected to a pre-compiled shader.

/lib64/libc.so.6(+0x36400)[0x7fb729d97400]
/u/opt3production/production.CentOS-7.4/Opt/P-20210519-Opt-21_0002_0004/lib64/liboslexec.so.1.11(_ZN9OSL_v1_113pvt16RuntimeOptimizer19resolve_isconnectedEv+0x1e8)[0x7fb73366f5f8]
/u/opt3production/production.CentOS-7.4/Opt/P-20210519-Opt-21_0002_0004/lib64/liboslexec.so.1.11(_ZN9OSL_v1_113pvt16RuntimeOptimizer3runEv+0x318)[0x7fb7336746e8]
/u/opt3production/production.CentOS-7.4/Opt/P-20210519-Opt-21_0002_0004/lib64/liboslexec.so.1.11(_ZN9OSL_v1_113pvt17ShadingSystemImpl14optimize_groupERNS_11ShaderGroupEPNS_14ShadingContextEb+0x37f)[0x7fb7335a94df]

I ended up running with AddressSanitizer (ASan), which reports an error during the destruction of the compiler:

==11203==ERROR: AddressSanitizer: new-delete-type-mismatch on 0x60d000064760 in thread T0:
  object passed to delete has wrong type:
  size of the allocated type:   136 bytes;
  size of the deallocated type: 96 bytes.
    #0 0xa82280 in operator delete(void*, unsigned long) (/u/devstuff/releases.CentOS-7.8/mgl/tomtom/dist.21.1.asan/bin/mglr.bin+0xa82280)
    #1 0x7fc55591c2d8 in OSL_v1_11::pvt::SymbolTable::delete_syms() /u/local/imgdev/imgdev_installer/production/workspace/CentOS-7.4/release/build/Opt/P-20210519-Opt-21_0002_0004-r75747/Dist/CentOS-7.4/build/sources/OpenShadingLanguage/sources/OpenShadingLanguage-1.11.13.0/src/liboslcomp/symtab.cpp:278
    #2 0x7fc55590e392 in OSL_v1_11::pvt::SymbolTable::~SymbolTable() /u/local/imgdev/imgdev_installer/production/workspace/CentOS-7.4/release/build/Opt/P-20210519-Opt-21_0002_0004-r75747/Dist/CentOS-7.4/build/sources/OpenShadingLanguage/sources/OpenShadingLanguage-1.11.13.0/src/liboslcomp/symtab.h:208
    #3 0x7fc55590e392 in OSL_v1_11::pvt::OSLCompilerImpl::~OSLCompilerImpl() /u/local/imgdev/imgdev_installer/production/workspace/CentOS-7.4/release/build/Opt/P-20210519-Opt-21_0002_0004-r75747/Dist/CentOS-7.4/build/sources/OpenShadingLanguage/sources/OpenShadingLanguage-1.11.13.0/src/liboslcomp/oslcomp.cpp:120
    #4 0x7fc55590e4b0 in OSL_v1_11::OSLCompiler::~OSLCompiler() /u/local/imgdev/imgdev_installer/production/workspace/CentOS-7.4/release/build/Opt/P-20210519-Opt-21_0002_0004-r75747/Dist/CentOS-7.4/build/sources/OpenShadingLanguage/sources/OpenShadingLanguage-1.11.13.0/src/liboslcomp/oslcomp.cpp:47
    #5 0x1100654 in mg::rt::OslShaderConverter::compileShader(OSL_v1_11::ShadingSystem*, OpenImageIO_v2_2::string_view, std::string&, std::string&) src/rt/shading/src/OslShaderConverter.cpp:1817


It seems that when a FunctionSymbol object in the symbol table is deleted, its destructor is not called.
The FunctionSymbol class (symtab.h) is derived from the Symbol class (osl_pvt.h), but Symbol's destructor is not virtual.

We patched our OSL version (1.11.13) by simply making Symbol's destructor virtual, and that solved our problem: no more crashes.
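
For readers who want to see the pattern in isolation, here is a minimal sketch of the fix described above. The names Sym, FuncSym and delete_all_through_base are hypothetical stand-ins, not the real OSL classes or functions; the only thing taken from the report is the idea that objects of a derived type are owned and deleted through base-class pointers, and that giving the base a virtual destructor makes that deletion well defined.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a base symbol class (NOT OSL's actual Symbol).
struct Sym {
    // The fix: a virtual destructor, so `delete` through a Sym* runs the
    // destructor of the dynamic type and frees the full allocation.
    virtual ~Sym() = default;
    int type = 0;
};

// Hypothetical stand-in for a derived symbol class with extra fields,
// so that sizeof(FuncSym) > sizeof(Sym).
struct FuncSym : Sym {
    ~FuncSym() override { ++destroyed; }
    static int destroyed;   // counts FuncSym destructor calls
    char extra[40] = {};    // extra payload carried only by the derived type
};
int FuncSym::destroyed = 0;

// Owns symbols through base pointers, as a symbol table might, then
// deletes them all. Returns how many FuncSym destructors actually ran.
int delete_all_through_base() {
    FuncSym::destroyed = 0;
    std::vector<Sym*> table;
    table.push_back(new Sym);
    table.push_back(new FuncSym);
    table.push_back(new FuncSym);
    for (Sym* s : table)
        delete s;           // well defined only because ~Sym() is virtual
    return FuncSym::destroyed;
}
```

If the `virtual` keyword were removed from ~Sym(), the `delete s` on a FuncSym would be undefined behavior in C++, and a sanitizer could flag it as a new-delete-type-mismatch because the allocated and deallocated sizes differ.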

I hope this will help (and that I'm not wrong somewhere...)!

Thanks for your work,

Thomas METAIS,
Illumination MacGuff


Larry Gritz
 

Apologies for the delay. How about this? https://github.com/AcademySoftwareFoundation/OpenShadingLanguage/pull/1397

I thought it would be safe to do what we did: the derived classes don't in fact override the destructor, and they don't add any fields that need nontrivial teardown. But I agree that we've done something odd here, probably illegal or undefined behavior. I think your solution is probably the simplest and most foolproof fix.



On Aug 5, 2021, at 8:47 AM, Thomas METAIS <thomas.metais@...> wrote:


--
Larry Gritz