Thread: Writing triggers in C++

From

Jacob Rief

Date:

13 February 2007, 19:22:53

I tried to write a trigger using C++. This requires including the
following header files:

extern "C" {
#include <postgres.h>
#include <executor/spi.h>
#include <commands/trigger.h>
#include <fmgr.h>
}

Unfortunately, some of the included headers define structs and
functions in which a few identifiers are C++ keywords.
The linkage specification 'extern "C"' does not help here; it only tells
the compiler not to mangle C identifiers. It does not rename
C++ keywords into something else. Therefore, AFAIK, anyone who wants to
include those header files in a C++ program has to rename the offending
identifiers manually.

For instance, in PostgreSQL version 8.2.3:
/usr/include/pgsql/server/nodes/primnodes.h:950:   List  *using;    /* USING clause, if any (list of String) */
'using' is a C++ keyword

/usr/include/pgsql/server/nodes/parsenodes.h:179:   Oid   typeid;    /* type identified by OID */
'typeid' is a C++ keyword

/usr/include/pgsql/server/nodes/parsenodes.h:249,265,401,943,1309:   TypeName   *typename;
'typename' is a C++ keyword

/usr/include/pgsql/server/utils/builtins.h:544:
extern char *quote_qualified_identifier(const char *namespace,
'namespace' is a C++ keyword

Is there any convention how to rename such identifiers? If I would
rename those identifiers (I simply would add an underscore to each of
them), would such a patch be accepted and adopted onto one of the next
releases? 

Regards, Jacob

Re: Writing triggers in C++

From

Peter Eisentraut

Date:

13 February 2007, 19:56:08

Jacob Rief wrote:
> Is there any convention how to rename such identifiers? If I would
> rename those identifiers (I simply would add an underscore to each of
> them), would such a patch be accepted and adopted onto one of the
> next releases?

Couldn't you do the required renamings as preprocessor macros, e.g.,

#define typename _typename
#include <postgres_stuff>
#undef typename

#include <c++_stuff>

your_code;

I would expect very little enthusiasm for making PostgreSQL code C++ 
safe.  There is already too much trouble keeping up with all the 
variants of C.
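
A minimal sketch of that macro workaround, under stated assumptions:
FakeTypeCast is a made-up stand-in for the contents of a C header (it is
not a PostgreSQL struct), and the replacement name typename_ is arbitrary.
As far as I know the C++ standard formally disallows defining a macro
whose name is a keyword, so this relies on the compiler tolerating it, and
any standard headers should be included before the #define.

```cpp
#include <cstring>   // included BEFORE the keyword macro is defined

// Stand-in for "#define typename ..., #include <C header>, #undef typename".
// The struct below is hypothetical and only plays the role of the C header.
#define typename typename_
extern "C" {
struct FakeTypeCast {
    const char *typename;   /* the preprocessor rewrites this to typename_ */
};
}
#undef typename

// From here on 'typename' is an ordinary C++ keyword again, and C++ code
// has to refer to the member by its renamed spelling.
bool has_type_name(const FakeTypeCast &tc, const char *expected) {
    return std::strcmp(tc.typename_, expected) == 0;
}

bool demo_keyword_rename() {
    FakeTypeCast tc = { "int4" };
    return has_type_name(tc, "int4");
}
```

The C side is unaffected by this, because the rename only exists while the
C++ compiler reads the header; struct layout does not depend on member names.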

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

Re: Writing triggers in C++

From

Tom Lane

Date:

13 February 2007, 21:07:10

Jacob Rief <jacob.rief@gmx.at> writes:
> I tried to write a trigger using C++.

That is most likely not going to work anyway, because the backend
operating environment is C not C++.  If you dumb it down enough
--- no exceptions, no RTTI, no use of C++ library --- then it might
work, but at that point you're really coding in C anyway.

> Is there any convention how to rename such identifiers? If I would
> rename those identifiers (I simply would add an underscore to each of
> them), would such a patch be accepted and adopted onto one of the next
> releases? 

No.  Because of the above problems, we don't see much reason to avoid
C++'s extra keywords.
        regards, tom lane

Re: Writing triggers in C++

From

Andreas Pflug

Date:

14 February 2007, 08:06:13

Tom Lane wrote:
> Jacob Rief <jacob.rief@gmx.at> writes:
>   
>> I tried to write a trigger using C++.
>>     
>
> That is most likely not going to work anyway, because the backend
> operating environment is C not C++.  If you dumb it down enough
> --- no exceptions, no RTTI, no use of C++ library --- then it might
> work, 
I can confirm that it does work this way.

Regards,
Andreas

Re: Writing triggers in C++

From

"Florian G. Pflug"

Date:

14 February 2007, 09:05:30

Andreas Pflug wrote:
> Tom Lane wrote:
>> Jacob Rief <jacob.rief@gmx.at> writes:
>>   
>>> I tried to write a trigger using C++.
>>>     
>> That is most likely not going to work anyway, because the backend
>> operating environment is C not C++.  If you dumb it down enough
>> --- no exceptions, no RTTI, no use of C++ library --- then it might
>> work, 
> I can confirm that it does work this way.

I've written an aggregate function that uses c++ stl hashes, and it 
seems to work pretty well. I'd think that using exceptions should be
fine, as long as you make sure to _always_ catch any exception that
might be thrown inside your own c++ code, and don't let it propagate
into backend code. STL allows you to specify custom allocator classes
as template parameters to hash, vector and the like. You can use that
to let STL allocate memory from the correct memory context.
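
A rough sketch of the allocator idea, not the code actually used here:
fake_palloc() and fake_pfree() are hypothetical stand-ins built on
malloc/free so that the example compiles without the backend headers. In
a real extension they would be replaced by palloc()/pfree(), whose
failure behaviour differs (see the discussion of elog(ERROR) further down
the thread).

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical stand-ins for palloc()/pfree(); they only count live blocks.
static std::size_t live_blocks = 0;

void *fake_palloc(std::size_t size) {
    void *p = std::malloc(size);
    if (p != nullptr) ++live_blocks;
    return p;
}

void fake_pfree(void *p) {
    if (p != nullptr) {
        --live_blocks;
        std::free(p);
    }
}

// Minimal C++11 allocator that routes container memory through the stand-ins.
template <typename T>
struct ContextAllocator {
    using value_type = T;

    ContextAllocator() noexcept {}
    template <typename U>
    ContextAllocator(const ContextAllocator<U> &) noexcept {}

    T *allocate(std::size_t n) {
        void *p = fake_palloc(n * sizeof(T));
        if (p == nullptr) throw std::bad_alloc();
        return static_cast<T *>(p);
    }
    void deallocate(T *p, std::size_t) noexcept { fake_pfree(p); }
};

// All instances are interchangeable, so they always compare equal.
template <typename T, typename U>
bool operator==(const ContextAllocator<T> &, const ContextAllocator<U> &) noexcept { return true; }
template <typename T, typename U>
bool operator!=(const ContextAllocator<T> &, const ContextAllocator<U> &) noexcept { return false; }

// Uses a vector whose storage comes from the custom allocator.
int sum_with_context_vector() {
    std::vector<int, ContextAllocator<int>> values;
    for (int i = 1; i <= 4; ++i) values.push_back(i);
    int sum = 0;
    for (int v : values) sum += v;
    return sum;
}

std::size_t outstanding_blocks() { return live_blocks; }
```

The same allocator type can be passed as the last template argument of the
other standard containers.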

greetings, Florian Pflug

Re: Writing triggers in C++

From

Alvaro Herrera

Date:

14 February 2007, 09:21:12

Florian G. Pflug wrote:
> Andreas Pflug wrote:
> >Tom Lane wrote:
> >>Jacob Rief <jacob.rief@gmx.at> writes:
> >>  
> >>>I tried to write a trigger using C++.
> >>>    
> >>That is most likely not going to work anyway, because the backend
> >>operating environment is C not C++.  If you dumb it down enough
> >>--- no exceptions, no RTTI, no use of C++ library --- then it might
> >>work, 
> >I can confirm that it does work this way.
> 
> I've written an aggregate function that uses c++ stl hashes, and it 
> seems to work pretty well. I'd think that using exceptions should be
> fine, as long as you make sure to _always_ catch any exception that
> might be thrown inside your own c++ code, and don't let it propagate
> into backend code. STL allows you to specify custom allocator classes
> as template parameters to hash, vector and the like. You can use that
> to let STL allocate memory from the correct memory context.

What happens if Postgres raises an elog(ERROR) in the code you're
catching exceptions in?  Is it propagated outwards?

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: Writing triggers in C++

From

"Florian G. Pflug"

Date:

14 February 2007, 12:03:02

Alvaro Herrera wrote:
> Florian G. Pflug wrote:
>> Andreas Pflug wrote:
>>> Tom Lane wrote:
>>>> Jacob Rief <jacob.rief@gmx.at> writes:
>>>>  
>>>>> I tried to write a trigger using C++.
>>>>>    
>>>> That is most likely not going to work anyway, because the backend
>>>> operating environment is C not C++.  If you dumb it down enough
>>>> --- no exceptions, no RTTI, no use of C++ library --- then it might
>>>> work, 
>>> I can confirm that it does work this way.
>> I've written an aggregate function that uses c++ stl hashes, and it 
>> seems to work pretty well. I'd think that using exceptions should be
>> fine, as long as you make sure to _always_ catch any exception that
>> might be thrown inside your own c++ code, and don't let it propagate
>> into backend code. STL allows you to specify custom allocator classes
>> as template parameters to hash, vector and the like. You can use that
>> to let STL allocate memory from the correct memory context.
> 
> What happens if Postgres raises an elog(ERROR) in the code you're
> catching exceptions in?  Is it propagated outwards?

In my case, the only possible source of an elog(ERROR) would be palloc(),
when the machine is out of memory (Does it even throw elog(ERROR), or
does it return NULL just as malloc() ?). Since this is rather unlikely,
and would probably lead to a postgres shutdown anyway, I didn't really
care about that case.

You're right of course that this is different for triggers - they're 
much more likely to call SPI functions or otherwise interact with the
backend than my rather self-contained aggregate function. Still, I'd 
think that an elog(ERROR) would propagate outwards - but any C++
destructors of local (stack-allocated) objects wouldn't be called.

So, to be safe, I guess one would need to surround any call that could
call elog(ERROR) with an appropriate handler that translates the 
elog(ERROR) into a C++ exception. This C++ exception would have to be
translated back into an elog(ERROR) at the outmost level of C++ code.
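
One way the first half of this (turning a longjmp-style error into a C++
exception) might look is sketched below. Everything named fake_* is a
hypothetical stand-in for backend machinery, written with plain
setjmp/longjmp so the example is self-contained; in a real extension the
jump target would presumably be set up with the backend's own
PG_TRY()/PG_CATCH() macros instead. The sketch is only valid because no
object with a destructor is alive in the frames that longjmp skips.

```cpp
#include <csetjmp>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for the backend's error state.
static std::jmp_buf *current_handler = nullptr;
static const char *last_error = "";

// Behaves like elog(ERROR) in one respect only: it never returns,
// it jumps to the innermost registered handler.
[[noreturn]] void fake_elog_error(const char *msg) {
    last_error = msg;
    std::longjmp(*current_handler, 1);
}

// A "backend function" that may raise an error.
int fake_backend_divide(int a, int b) {
    if (b == 0) fake_elog_error("division by zero");
    return a / b;
}

// Wrapper used by C++ code: calls the backend function and converts a
// longjmp-style error into a C++ exception.
int call_backend_divide(int a, int b) {
    std::jmp_buf local;
    std::jmp_buf *saved = current_handler;   // not modified after setjmp
    current_handler = &local;
    if (setjmp(local) == 0) {
        int result = fake_backend_divide(a, b);
        current_handler = saved;
        return result;
    }
    // Reached via longjmp: restore the outer handler, then throw.
    current_handler = saved;
    throw std::runtime_error(last_error);
}

// Shows that ordinary C++ code can now catch the failure.
int demo_translation() {
    try {
        call_backend_divide(1, 0);
    } catch (const std::runtime_error &e) {
        return std::string(e.what()) == "division by zero" ? 1 : 0;
    }
    return -1;
}
```

The reverse translation belongs at the outermost C++ function, which has
to catch everything before control returns to C code.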

Maybe we should create some wiki page or pgfoundry project that collects
all glue code, tips and tricks that people invented to glue C++ into
the postgres backend.

greetings, Florian Pflug

Re: Writing triggers in C++

From

Alvaro Herrera

Date:

14 February 2007, 12:20:21

Florian G. Pflug wrote:
> Alvaro Herrera wrote:
> >Florian G. Pflug wrote:
> >>Andreas Pflug wrote:
> >>>Tom Lane wrote:
> >>>>Jacob Rief <jacob.rief@gmx.at> writes:
> >>>> 
> >>>>>I tried to write a trigger using C++.
> >>>>>   
> >>>>That is most likely not going to work anyway, because the backend
> >>>>operating environment is C not C++.  If you dumb it down enough
> >>>>--- no exceptions, no RTTI, no use of C++ library --- then it might
> >>>>work, 
> >>>I can confirm that it does work this way.
> >>I've written an aggregate function that uses c++ stl hashes, and it 
> >>seems to work pretty well. I'd think that using exceptions should be
> >>fine, as long as you make sure to _always_ catch any exception that
> >>might be thrown inside your own c++ code, and don't let it propagate
> >>into backend code. STL allows you to specify custom allocator classes
> >>as template parameters to hash, vector and the like. You can use that
> >>to let STL allocate memory from the correct memory context.
> >
> >What happens if Postgres raises an elog(ERROR) in the code you're
> >catching exceptions in?  Is it propagated outwards?
> 
> In my case, the only possible source of an elog(ERROR) would be palloc(),
> when the machine is out of memory (Does it even throw elog(ERROR), or
> does it return NULL just as malloc() ?). Since this is rather unlikely,
> and would probably lead to a postgres shutdown anyway, I didn't really
> care about that case.

No, an out-of-memory leads to elog(ERROR), which rolls back the current
transaction.  This releases some memory so the system can continue
working.  In fact we periodically see out-of-memory reports, and they
certainly _don't_ cause a general shutdown.

> You're right of course that this is different for triggers - they're 
> much more likely to call SPI functions or otherwise interact with the
> backend than my rather self-contained aggregate function. Still, I'd 
> think that an elog(ERROR) would propagate outwards - but any C++
> destructors of local (stack-allocated) objects wouldn't be called.

Probably stack allocation doesn't matter much, as I think that would be
unwound by the longjmp call.  I don't know a lot about C++, but if
there are allocations in the data area then those would probably not be
freed.  But it makes me wonder -- is longjmp very compatible with C++
exceptions at all?  I know that it causes problems with the POSIX thread
functions pthread_cleanup_push() and pthread_cleanup_pop(), for example
(meaning they can't be used).

> So, to be safe, I guess one would need to surround any call that could
> call elog(ERROR) with an appropriate handler that translates the 
> elog(ERROR) into a C++ exception. This C++ exception would have to be
> translated back into an elog(ERROR) at the outmost level of C++ code.

Sort of a PG_RE_THROW() in the exception handler, I guess.

> Maybe we should create some wiki page or pgfoundry project that collects
> all glue code, tips and tricks that people invented to glue C++ into
> the postgres backend.

If it can be made to work, sure; in techdocs.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: Writing triggers in C++

From

Neil Conway

Date:

14 February 2007, 12:38:49

On Wed, 2007-02-14 at 13:19 -0300, Alvaro Herrera wrote:
> Probably stack allocation doesn't matter much, as I think that would be
> unwound by the longjmp call.  I don't know a lot about C++, but if
> there are allocations in the data area then those would probably not be
> freed.  But it makes me wonder -- is longjmp very compatible with C++
> exceptions at all?

"C-style stack unwinding (using setjmp and longjmp from <csetjmp>) is
incompatible with exception-handling and is best avoided." (Stroustrup,
p. 433).

Which presumably means that in practice, the interaction between these
features is implementation-defined.

-Neil

Re: Writing triggers in C++

From

"Florian G. Pflug"

Date:

14 February 2007, 13:20:27

Alvaro Herrera wrote:
> Florian G. Pflug wrote:
>> Alvaro Herrera wrote:
>>> Florian G. Pflug wrote:
>>>> Andreas Pflug wrote:
>>>>> Tom Lane wrote:
>>>>>> Jacob Rief <jacob.rief@gmx.at> writes:
>>>>>>
>>>>>>> I tried to write a trigger using C++.
>>>>>>>   
>>>>>> That is most likely not going to work anyway, because the backend
>>>>>> operating environment is C not C++.  If you dumb it down enough
>>>>>> --- no exceptions, no RTTI, no use of C++ library --- then it might
>>>>>> work, 
>>>>> I can confirm that it does work this way.
>>>> I've written an aggregate function that uses c++ stl hashes, and it 
>>>> seems to work pretty well. I'd think that using exceptions should be
>>>> fine, as long as you make sure to _always_ catch any exception that
>>>> might be thrown inside your own c++ code, and don't let it propagate
>>>> into backend code. STL allows you to specify custom allocator classes
>>>> as template parameters to hash, vector and the like. You can use that
>>>> to let STL allocate memory from the correct memory context.
>>> What happens if Postgres raises an elog(ERROR) in the code you're
>>> catching exceptions in?  Is it propagated outwards?
>> In my case, the only possible source of an elog(ERROR) would be palloc(),
>> when the machine is out of memory (Does it even throw elog(ERROR), or
>> does it return NULL just as malloc() ?). Since this is rather unlikely,
>> and would probably lead to a postgres shutdown anyway, I didn't really
>> care about that case.
> 
> No, an out-of-memory leads to elog(ERROR), which rolls back the current
> transaction.  This releases some memory so the system can continue
> working.  In fact we periodically see out-of-memory reports, and they
> certainly _don't_ cause a general shutdown.

Sorry, I explained my point badly. What I actually meant is that in my
specific use-case (lots of small transactions, none of which use much
memory), the only reason for out-of-memory conditions I've ever seen
was some application gone wild that ate up all available memory. In that
case, postgres dies sooner or later, because any memory freed during
rollback is immediately used by that other application. In general, of
course, you're right.

>> You're right of course that this is different for triggers - they're 
>> much more likely to call SPI functions or otherwise interact with the
>> backend than my rather self-contained aggregate function. Still, I'd 
>> think that an elog(ERROR) would propagate outwards - but any C++
>> destructors of local (stack-allocated) objects wouldn't be called.
> 
> Probably stack allocation doesn't matter much, as I think that would be
> unwound by the longjmp call.  I don't know a lot about C++, but if
> there are allocations in the data area then those would probably not be
> freed.  But it makes me wonder -- is longjmp very compatible with C++
> exceptions at all?  I know that it causes problems with the POSIX thread
> functions pthread_cleanup_push() and pthread_cleanup_pop(), for example
> (meaning they can't be used).

Yeah, the memory taken by stack-allocated objects is freed (basically by
just resetting the stack pointer). But normally, C++ would call the
destructor of each stack-allocated object _before_ resetting the
stack-pointer. Since setjmp/longjmp don't know anything about C++, they
will omit this step. Whether this causes problems or not depends on the
objects that you allocated on the stack...
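
The C++ half of that contrast can be shown with a small self-contained
sketch; Guard and the counter are made up for illustration. The longjmp
half is deliberately not shown, because jumping past a live object with a
non-trivial destructor is undefined behaviour and cannot be demonstrated
reliably.

```cpp
#include <stdexcept>

// Counts how many Guard destructors have run.
static int destructor_calls = 0;

struct Guard {
    ~Guard() { ++destructor_calls; }
};

// Leaves the function by throwing while a Guard is alive on the stack.
void work_that_throws() {
    Guard guard;
    throw std::runtime_error("failure");
}

// Returns the number of destructors that ran during exception unwinding.
int destructors_run_on_throw() {
    destructor_calls = 0;
    try {
        work_that_throws();
    } catch (const std::runtime_error &) {
        // By the time the handler runs, 'guard' has already been destroyed.
    }
    return destructor_calls;
}
```

If Guard owned a lock, a file handle or heap memory, skipping its
destructor would leak that resource, which is the case to worry about
when backend code jumps out from underneath C++ frames.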

>> So, to be safe, I guess one would need to surround any call that could
>> call elog(ERROR) with an appropriate handler that translates the 
>> elog(ERROR) into a C++ exception. This C++ exception would have to be
>> translated back into an elog(ERROR) at the outmost level of C++ code.
> 
> Sort of a PG_RE_THROW() in the exception handler, I guess.
> 
>> Maybe we should create some wiki page or pgfoundry project that collects
>> all glue code, tips and tricks that people invented to glue C++ into
>> the postgres backend.
> 
> If it can be made to work, sure; in techdocs.

I was thinking that two pairs of macros,
PG_BEGIN_CPP, PG_END_CPP and
PG_CPP_BEGIN_BACKEND, PG_CPP_END_BACKEND
should be able to take care of the exception handling issues.

You'd need to wrap any code-block that calls postgres functions that 
might do an elog(ERROR) inside PG_CPP_BEGIN_BACKEND, PG_CPP_END_BACKEND.

Vice versa, any block of c++ code that is called from the backend would
need to start with PG_BEGIN_CPP, and end with PG_END_CPP.
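
Only the macro names come from this proposal; the definitions below are a
guess at how the PG_BEGIN_CPP/PG_END_CPP pair could be written, and
fake_report_error() is a hypothetical stand-in for elog(ERROR). The
message is copied out and reported only after the catch block has been
left, so that a real, non-returning elog(ERROR) would not jump out of the
middle of a C++ exception handler.

```cpp
#include <cstddef>
#include <cstdio>
#include <exception>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for elog(ERROR): records the message and returns.
static char reported_error[128] = "";

void fake_report_error(const char *msg) {
    std::snprintf(reported_error, sizeof reported_error, "%s", msg);
}

// Guessed definitions: open a try block, catch everything, and report the
// failure once all handlers have finished.
#define PG_BEGIN_CPP \
    { char pg_cpp_msg[128] = ""; bool pg_cpp_failed = false; try {
#define PG_END_CPP \
    } catch (const std::exception &e) { \
        pg_cpp_failed = true; \
        std::snprintf(pg_cpp_msg, sizeof pg_cpp_msg, "%s", e.what()); \
    } catch (...) { \
        pg_cpp_failed = true; \
        std::snprintf(pg_cpp_msg, sizeof pg_cpp_msg, "unknown C++ exception"); \
    } \
    if (pg_cpp_failed) fake_report_error(pg_cpp_msg); }

// Entry point as the backend would see it: C linkage, no exception escapes.
extern "C" int element_at(int index) {
    int value = -1;
    PG_BEGIN_CPP
        std::vector<int> data = {10, 20, 30};
        value = data.at(static_cast<std::size_t>(index));  // may throw
    PG_END_CPP
    return value;
}
```

With a real elog(ERROR) the final return would not be reached on failure;
here the stand-in returns, so the function falls through with -1.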

Do you see any other possible problems, aside from memory management issues?

greetings, Florian Pflug

Re: Writing triggers in C++

From

"Florian G. Pflug"

Date:

14 February 2007, 13:25:41

Neil Conway wrote:
> On Wed, 2007-02-14 at 13:19 -0300, Alvaro Herrera wrote:
>> Probably stack allocation doesn't matter much, as I think that would be
>> unwound by the longjmp call.  I don't know a lot about C++, but if
>> there are allocations in the data area then those would probably not be
>> freed.  But it makes me wonder -- is longjmp very compatible with C++
>> exceptions at all?
> 
> "C-style stack unwinding (using setjmp and longjmp from <csetjmp>) is
> incompatible with exception-handling and is best avoided." (Stroustrup,
> p. 433).
> 
> Which presumably means that in practice, the interaction between these
> features is implementation-defined.

Well, as long as you don't longjmp "past" a C++ catch block, and don't
throw a C++ exception "past" a setjmp handler, there should be no
problem I think. Or at least I can't imagine how a problem could arise.

greetings, Florian Pflug

Re: Writing triggers in C++

From

Andreas Seltenreich

Date:

14 February 2007, 14:00:10

Florian G. Pflug writes:

>>> Maybe we should create some wiki page or pgfoundry project that collects
>>> all glue code, tips and tricks that people invented to glue C++ into
>>> the postgres backend.
>>
>> If it can be made to work, sure; in techdocs.
>
> I was thinking that two pairs of macros,
> PG_BEGIN_CPP, PG_END_CPP and
> PG_CPP_BEGIN_BACKEND, PG_CPP_END_BACKEND
> should be able to take care of the exception handling issues.
>
> You'd need to wrap any code-block that calls postgres functions that
> might do an elog(ERROR) inside PG_CPP_BEGIN_BACKEND,
> PG_CPP_END_BACKEND.
>
> Vice versa, any block of c++ code that is called from the backend would
> need to start with PG_BEGIN_CPP, and end with PG_END_CPP.

I've had positive experiences with such a setup, although I've avoided
the need for PG_BEGIN_CPP/PG_END_CPP by doing the exception conversion
in a C++ language handler that instantiates functors using the portable
class loading technique described in this paper:

<http://www.s11n.net/papers/classloading_cpp.html>

I'd be glad to help out on a pgfoundry project to make C++ a better
citizen for extending postgres.

regards,
andreas

Re: Writing triggers in C++

From

"bjarne"

Date:

15 February 2007, 13:46:17

On Feb 14, 11:26 am, f...@phlo.org ("Florian G. Pflug") wrote:
> Neil Conway wrote:
> > On Wed, 2007-02-14 at 13:19 -0300, Alvaro Herrera wrote:
> >> Probably stack allocation doesn't matter much, as I think that would be
> >> unwound by the longjmp call.  I don't know a lot about C++, but if
> >> there are allocations in the data area then those would probably not be
> >> freed.  But it makes me wonder -- is longjmp very compatible with C++
> >> exceptions at all?
>
> > "C-style stack unwinding (using setjmp and longjmp from <csetjmp>) is
> > incompatible with exception-handling and is best avoided." (Stroustrup,
> > p. 433).
>
> > Which presumably means that in practice, the interaction between these
> > features is implementation-defined.
>
> Well, as long as you don't longjmp "past" a C++ catch block, and don't
> throw a C++ exception "past" a setjmp handler, there should be no
> problem I think. Or at least I can't imagine how a problem could arise..
>

Also, don't jump out of (past) the scope of any local variable with a
destructor.

If you are in a C++ program, use exceptions. If you are in a C
program, fake the equivalent using setjmp/longjmp. Don't mix the two -
it's too tricky.
 -- Bjarne Stroustrup; http://www.research.att.com/~bs

Re: Writing triggers in C++

From

Jacob Rief

Date:

18 February 2007, 18:47:50

Tom Lane wrote:

> That is most likely not going to work anyway, because the backend
> operating environment is C not C++.  If you dumb it down enough
> --- no exceptions, no RTTI, no use of C++ library --- then it might
> work, but at that point you're really coding in C anyway.

Writing "normal" user-defined functions in C++ has not been a problem so
far. I even handle C++ exceptions, by catching each C++ exception inside
my functions. The catch blocks in those functions raise Postgres errors
using elog whenever something is thrown. Writing "normal"
user-defined functions in C++ is even encouraged by the documentation,
which says: "User-defined functions can be written in C (or a language
that can be made compatible with C, such as C++)." [chapter 33.9.]
The question is: why not write user-defined trigger functions in
C++ as well? The difference between a "normal" function and a trigger
function is not that big. The main difference is that one must include
some more header files (executor/spi.h and commands/trigger.h), which
themselves include other header files containing identifiers that
unfortunately are C++ keywords.

> > Is there any convention how to rename such identifiers? If I would
> > rename those identifiers (I simply would add an underscore to each of
> > them), would such a patch be accepted and adopted onto one of the next
> > releases? 
> 
> No.  Because of the above problems, we don't see much reason to avoid
> C++'s extra keywords.

To check how much code would have to be changed, I renamed
the affected identifiers in the PostgreSQL 8.2.3 header files, patched the
affected sources and recompiled the code. The resulting patch affects
only 189 lines of code in 23 files.
Applying this patch would encourage authors of external trigger
functions to write their code in C++ instead of using PL/pgSQL and calling
"normal" user-defined functions, or writing wrappers in C to hide the
C++ keywords.
I will recreate this patch for CVS HEAD, if there
is a chance that it will ever be committed.

Regards, Jacob