Thread: Maintaining state across function calls
I want to process all the records in a table through a C-language (well, C++) function (i.e. one function call per row of the table) in such a way that the function hangs onto its internal state across calls. Something like

    SELECT my_function(a, b, c) FROM my_table ORDER BY d;

The value returned in the last row of the table would be the result I'm looking for. (This could be neatened up by using a custom aggregate and putting my calculation in the sfunc, but that's a minor detail.)

The question is: what's the "best practice" way of letting a C/C++-language function hang onto internal state across calls? So far I'm thinking something along the lines of:

    Datum my_function(int a, int b, int c, int reset)
    {
        static my_data *p = NULL;

        if (reset)  // (re)initialise internal state
        {
            delete p;
            p = NULL;
        }
        else
        {
            if (!p)
            {
                p = new my_data;
            }
            // make use of internal state to do calculations or whatever
        }
    }

The user would be responsible for calling my_function with "reset" set to true to wipe previous internal state before using the function in a new query; doing this also frees the memory associated with the function. This system is of course prone to leakage if the user forgets to wipe the internal state after use, but it will only leak sizeof(my_data) per connection, and the OS will reclaim all that when the connection dies anyway.

Alternatively, use this in a custom aggregate and make the ffunc do the garbage collection, which should prevent leakage altogether.

Is this a reasonable thing to do? What are the risks? Is there a more "best-practice" way to achieve the same result?

Many thanks,
Matt
On 11/19/2012 08:41 PM, matt@byrney.com wrote:
> I want to process all the records in a table through a C-language (well,
> C++) function (i.e. one function call per row of the table) in such a way
> that the function hangs onto its internal state across calls. Something
> like
>
> SELECT my_function(a, b, c) FROM my_table ORDER BY d;
>
> The value returned in the last row of the table would be the result I'm
> looking for. (This could be neatened up by using a custom aggregate and
> putting my calculation in the sfunc but that's a minor detail).
[snip]
> Alternatively, use this in a custom aggregate and make the ffunc do the
> garbage collection, which should prevent leakage altogether.

You don't generally need to do this cleanup yourself. Use appropriate palloc memory contexts and it'll be done for you when the memory context is destroyed.

I would want to implement this as an aggregate using the standard aggregate / window function machinery. Have a look at how the existing aggregates like string_agg are implemented in the Pg source code.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
> On 11/19/2012 08:41 PM, matt@byrney.com wrote:
>> I want to process all the records in a table through a C-language (well,
>> C++) function (i.e. one function call per row of the table) in such a
>> way that the function hangs onto its internal state across calls.
>> Something like
>>
>> SELECT my_function(a, b, c) FROM my_table ORDER BY d;
>>
>> The value returned in the last row of the table would be the result I'm
>> looking for. (This could be neatened up by using a custom aggregate and
>> putting my calculation in the sfunc but that's a minor detail).
> [snip]
>> Alternatively, use this in a custom aggregate and make the ffunc do the
>> garbage collection, which should prevent leakage altogether.
> You don't generally need to do this cleanup yourself. Use appropriate
> palloc memory contexts and it'll be done for you when the memory context
> is destroyed.
>
> I would want to implement this as an aggregate using the standard
> aggregate / window function machinery. Have a look at how the existing
> aggregates like string_agg are implemented in the Pg source code.

Thanks for your reply. A follow-up question: to use the palloc/pfree functions with a C++ STL container, do I simply give the container an allocator which uses palloc and pfree instead of the default allocator, which uses new and delete?

Matt
matt@byrney.com writes:
> The question is: what's the "best practice" way of letting a
> C/C++-language function hang onto internal state across calls?

A static variable for that is a really horrid idea. Instead use fcinfo->flinfo->fn_extra to point to some workspace palloc'd in the appropriate context. If you grep the PG sources for fn_extra you'll find plenty of examples.

regards, tom lane
> matt@byrney.com writes:
>> The question is: what's the "best practice" way of letting a
>> C/C++-language function hang onto internal state across calls?
>
> A static variable for that is a really horrid idea. Instead use
> fcinfo->flinfo->fn_extra to point to some workspace palloc'd in the
> appropriate context. If you grep the PG sources for fn_extra you'll
> find plenty of examples.
>
> regards, tom lane
>

Thanks for this. Out of curiosity, why is a static a bad way to do this?
matt@byrney.com writes:
> Thanks for this. Out of curiosity, why is a static a bad way to do this?

Well, it wouldn't allow more than one instance of the function per query, and it wouldn't reset correctly after an error, and surely you agree that your proposal of making the user do a separate "reset" step is an unreliable and unpleasant-to-use kluge.

regards, tom lane
On 11/19/2012 10:09 PM, matt@byrney.com wrote:
> Thanks for your reply. A follow-up question: to use the palloc/pfree
> functions with a C++ STL container, do I simply give the container an
> allocator which uses palloc and pfree instead of the default allocator,
> which uses new and delete?

If at all possible, isolate your C++ code from the PostgreSQL aggregate implementation. Pass the C++ code pre-allocated buffers to work with if you can, and manage the allocations in the Pg C code. Turn your C++ code into a library that presents only `extern "C"` interfaces and opaque types if you can.

C++ exception handling and the PostgreSQL backend's longjmp() based error handling will interact in exciting and interesting ways. Avoid calling `palloc`, `pfree` etc from within C++ if you can. If you really must, ensure that your C++ code doesn't use any RAII, stack-allocated objects with dtors, etc. Otherwise you'll have to translate error handling mechanisms at every boundary between C++ and Pg code, something I'm not even certain is possible to do reliably.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 20 November 2012 01:30, Craig Ringer <craig@2ndquadrant.com> wrote:
> Otherwise you'll have to translate error handling mechanisms at every
> boundary between C++ and Pg code, something I'm not even certain is
> possible to do reliably.

I think it's probably the case that PLV8 is the most mature example of wrapping a C++ library that is liable to throw C++ exceptions within Postgres backend code, in a sane way (that is, avoiding unwinding the stack via longjmp() over a part of the stack where a destructor needs to be called, which is undefined in C++).

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services
Craig Ringer wrote:
> If at all possible, isolate your C++ code from the PostgreSQL
> aggregate implementation. Pass the C++ code pre-allocated buffers
> to work with if you can, and manage the allocations in the Pg C
> code. Turn your C++ code into a library that presents only `extern
> "C"` interfaces and opaque types if you can.

+1

You definitely want to separately compile the C code which interfaces with PostgreSQL and calls C entry points to the C++ code. A clear and clean boundary here is critical to reliability and maintainability.

-Kevin
On Tue, Nov 20, 2012 at 12:30 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
> C++ exception handling and the PostgreSQL backend's longjmp() based
> error handling will interact in exciting and interesting ways.

Define "interesting"? You mean in Wash's sense of "Oh God, oh God, we're going to receive signal 9"?

Not a huge fan of C++ exception handling myself, it seems to interact with a few things that way.

ChrisA