Thread: Issues with C++ exception handling in an FDW
Hello,

I've run into a very odd issue calling C++ code that uses exceptions from within our PostgreSQL FDW.

Specifically, we have broken our FDW into two components: a C layer that looks quite similar to the FDW for text files, and a C++ layer that is called by the C layer to interface with our storage file format. We compile these two components into separate shared libraries, thus we have:

  c-fdw.so
  c++-fdw.so

and c-fdw.so is built with -Wl,-rpath so that it can find c++-fdw.so at load time.

When there are no errors everything works flawlessly. However, we noticed that merely throwing an exception in the C++ layer causes an immediate segmentation fault, even when the throw is enclosed in a try { } catch(...) { } block.

If anyone has seen anything like this, any pointers or suggestions would be much appreciated. I have followed all of the recommendations in the PostgreSQL documentation, with no luck. I am not overloading the _init() functions in either shared library (another potential source of errors I have read about).

Thanks!
Craig Soules
On Mon, Jan 30, 2012 at 6:04 PM, Soules, Craig <craig.soules@hp.com> wrote:
> When there are no errors everything works flawlessly, however, we noticed that even throwing an exception in the C++ layer was causing an immediate segmentation fault. Even when encapsulated in a try { } catch(...) { } block.

Stack backtrace?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 30 January 2012 23:04, Soules, Craig <craig.soules@hp.com> wrote:
> When there are no errors everything works flawlessly, however, we noticed that even throwing an exception in the C++ layer was causing an immediate segmentation fault. Even when encapsulated in a try { } catch(...) { } block.
>
> If anyone has seen anything like this, any pointers or suggestions would be much appreciated. I have followed all of the recommendations in the PostgreSQL documentation, with no luck. I am not overloading the _init() functions in either shared library (another potential source of errors I have read about).

I suggest that you generalise from the example of PLV8. The basic problem is that the effect of longjmp()ing over an area of the stack with a C++ non-POD type is undefined. I don't think you can even use structs, as they have implicit destructors in C++.

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services
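The longjmp() constraint mentioned above can be illustrated with a small sketch. It assumes a hypothetical backend_call_that_may_error() that reports errors by jumping to a hypothetical error_context; both are stand-ins for illustration and are not PostgreSQL's real API. The idea shown is to finish all non-POD work (here a std::string) in a helper that returns before the backend call, so that only plain data sits on the frames a longjmp() might skip.

```cpp
#include <csetjmp>
#include <string>

// Hypothetical jump target standing in for a backend error handler.
static std::jmp_buf error_context;

// Hypothetical backend function that reports an error via longjmp().
extern "C" void backend_call_that_may_error(int fail) {
    if (fail) {
        std::longjmp(error_context, 1);
    }
}

// All non-POD work happens here; the std::string is destroyed
// before this function returns, so no longjmp() can skip it.
static int compute_length(const char *name) {
    std::string s(name);
    s += ".dat";
    return static_cast<int>(s.size());
}

// Only plain ints and pointers live on this frame while the
// backend call is in flight.
static int scan_step(const char *name, int fail) {
    int len = compute_length(name);
    backend_call_that_may_error(fail);
    return len;
}

// Returns the computed length, or -1 if the backend call "errored".
int run(const char *name, int fail) {
    if (setjmp(error_context) != 0) {
        return -1;
    }
    return scan_step(name, fail);
}
```

With this layout, run("abc", 0) returns the length of "abc.dat", and run("abc", 1) takes the longjmp() path and returns -1 without skipping any destructor.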
On Tuesday, January 31, 2012 07:52:52 PM Peter Geoghegan wrote:
> On 30 January 2012 23:04, Soules, Craig <craig.soules@hp.com> wrote:
> > When there are no errors everything works flawlessly, however, we noticed
> > that even throwing an exception in the C++ layer was causing an
> > immediate segmentation fault. Even when encapsulated in a try { }
> > catch(...) { } block.
> >
> > If anyone has seen anything like this, any pointers or suggestions would
> > be much appreciated. I have followed all of the recommendations in the
> > PostgreSQL documentation, with no luck. I am not overloading the
> > _init() functions in either shared library (another potential source of
> > errors I have read about).
>
> I suggest that you generalise from the example of PLV8. The basic
> problem is that the effect of longjmp()ing over an area of the stack
> with a C++ non-POD type is undefined. I don't think you can even use
> structs, as they have implicit destructors in C++.

The PODness of a struct depends on its contents.

Andres
On 31 January 2012 19:01, Andres Freund <andres@anarazel.de> wrote:
>> I suggest that you generalise from the example of PLV8. The basic
>> problem is that the effect of longjmp()ing over an area of the stack
>> with a C++ non-POD type is undefined. I don't think you can even use
>> structs, as they have implicit destructors in C++.
> The PODness of a struct depends on its contents.

Right. If I was going to invest much effort in this sort of thing, I might even write a static assertion that verified a given type's POD-ness over time, by declaring it within a union... which would work... unless you were using C++11. Or, just use std::is_pod to build a static assertion.

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services
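A minimal sketch of such a static assertion follows, assuming a C++11 or later compiler. The types PlainState and RichState and the helper is_pod_like() are hypothetical names used only for illustration. The check is written as std::is_trivial combined with std::is_standard_layout, which together match the C++11 definition of POD; std::is_pod itself also works in C++11 but is deprecated as of C++20.

```cpp
#include <string>
#include <type_traits>

// Hypothetical state types, for illustration only.
struct PlainState {
    int  fd;
    long offset;
    char name[64];
};

struct RichState {
    std::string name;  // non-trivial destructor, so not POD
};

// True when T is both trivial and standard-layout (the C++11 POD definition).
template <typename T>
constexpr bool is_pod_like() {
    return std::is_trivial<T>::value && std::is_standard_layout<T>::value;
}

// Compilation fails here if someone later adds a non-POD member to PlainState.
static_assert(is_pod_like<PlainState>(), "PlainState must remain POD");
static_assert(!is_pod_like<RichState>(), "RichState is expected to be non-POD");
```

Placing the static_assert next to the type definition means a later change that breaks POD-ness is caught at compile time rather than as a crash at run time.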
> I suggest that you generalise from the example of PLV8. The basic
> problem is that the effect of longjmp()ing over an area of the stack
> with a C++ non-POD type is undefined. I don't think you can even use
> structs, as they have implicit destructors in C++.

I had thought that this was only an issue if you tried to longjmp() over a section of C++ code starting from a postgres backend C function? From the PostgreSQL documentation:

"If calling backend functions from C++ code, be sure that the C++ call stack contains only plain old data structures (POD). This is necessary because backend errors generate a distant longjmp() that does not properly unroll a C++ call stack with non-POD objects."

But this is not what our code is doing. Our code is a C++ function that only does the following:

    try {
        throw 1;
    } catch (int e) {
    } catch (...) {
    }

which causes an immediate segmentation fault.

To answer another responder's question, the stack trace looks as follows:

    #0  0x00002b3ce8f40fa5 in __cxa_allocate_exception () from /usr/lib64/libstdc++.so.6
    #1  0x00002b3ce77b6256 in initMBSource (state=0x1ab87a80) at /data/soules/metaboxA-bugfix/Metabox/debug_build/src/lib/query/dsFdwShim.cpp:16791
    #2  0x00002b3ce6c0b0aa in dsBeginForeignScan (node=0x1ab872d0, eflags=<value optimized out>) at dataseries_fdw.c:819
    #3  0x000000000057606c in ExecInitForeignScan ()
    #4  0x000000000055c715 in ExecInitNode ()
    #5  0x000000000056874c in ExecInitAgg ()
    #6  0x000000000055c6a5 in ExecInitNode ()
    #7  0x000000000055b944 in standard_ExecutorStart ()
    #8  0x0000000000621b96 in PortalStart ()
    #9  0x000000000061edad in exec_simple_query ()
    #10 0x000000000061f624 in PostgresMain ()
    #11 0x00000000005e4c5c in ServerLoop ()
    #12 0x00000000005e595c in PostmasterMain ()
    #13 0x000000000058a77e in main ()

Note: #2 is the entry into our C library and #1 is the entry into our C++ library.

This appears to be some kind of allocation error, but the machine on which I'm running has plenty of free RAM:

    [~]$ free
                 total       used       free     shared    buffers     cached
    Mem:      24682888   10505920   14176968          0    1220496    7412352
    -/+ buffers/cache:    1873072   22809816
    Swap:      2096472          0    2096472

I also don't understand how it could truly be an allocation issue, since we new/delete plenty of memory during a successful run (as well as using plenty of C++ containers which do internal allocation).

Hopefully this helps jog thoughts on my issue!

Thanks again!
Craig
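For reference, the failing test above can be written as a small self-contained function, which makes it easy to check whether the same code behaves correctly outside the backend process. This is only a sketch; the function name throw_and_catch() is hypothetical and does not appear in the original code.

```cpp
// Throws an int and catches it locally.
// Returns the caught value, or -1 if the catch-all handler ran instead.
int throw_and_catch() {
    try {
        throw 1;
    } catch (int e) {
        return e;
    } catch (...) {
        return -1;
    }
}
```

In a correctly working C++ runtime this returns 1. If the function works when called from a standalone program but crashes when called through the C layer inside the backend, that would point at how the libraries are built or loaded rather than at the code itself.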