Thread: Clean-up callbacks for non-SR functions
Greets, Is there a way for me to clean up fn_extra in flinfo when the function is not a Set Returning Function(SRF)? I know I can use RegisterExprContextCallback and the RSI's econtext to register a callback for SRFs, but this--or similar functionality--does not appear to be available for non-SRFs. Am I missing something? (This question comes from trying to keep non-SRF generator support in PL/Py from leaking all over the floor..) Regards, James William Pye
James William Pye <flaw@rhid.com> writes: > I know I can use RegisterExprContextCallback and the RSI's econtext to > register a callback for SRFs, but this--or similar > functionality--does not appear to be available for non-SRFs. Hm? That functionality works for any function, whether it returns a set or not. regards, tom lane
On 05/18/04:20/2, Tom Lane wrote: > Hm? That functionality works for any function, whether it returns a set > or not. Okay, then I think I found a bug: SELECT * FROM aFunction(); Gives fcinfo->resultinfo != NULL, regardless of the type of return. SELECT aFunction(); Gives fcinfo->resultinfo != NULL, ONLY IF it is a SRF.(fn_retset != 0) I think the culprit is in the function ExecMakeFunctionResult in file execQual.c, line ~1230: :e execQual.c /retset ....... /* * If function returns set, prepare a resultinfo node for * communication */ if (fcache->func.fn_retset) { fcinfo.resultinfo = (Node *) &rsinfo; ........ And to be nagging: Utility functions like OidFunctionCall# don't setup resultinfo, and probably rightfully so in some regards, but ISTM that there should be a mechanism that is independent of the executor. Maybe an explicit requirement to call a "FunctionCallCleanup(fcinfo)", or, dare I say, free hooks on pointers? :) I attached a simple patch that seems to make it work, but I'm not sure if there will be any unwanted side effects, as I'm barely familiar with the executor.... Here's a bunch of data that I collected, probably not very enlightening after the preceding summary(Discovered the pattern after gathering all this data).. ----------------------- uname -a FreeBSD void 4.10-PRERELEASE FreeBSD 4.10-PRERELEASE #5: Wed Apr 28 06:12:01 MST 2004 root@void:/files/src/freebsd/RELENG_4/sys/compile/void i386 backend> SELECT version(); 1: version = "PostgreSQL 7.4.1 on i386-portbld-freebsd4.9, compiled by GCC 2.95.4" All this data is retrieved as soon as it hits my handler: -------------------------------------------- Regular function: -------------------------------------------- CREATE FUNCTION count() RETURNS numeric LANGUAGE plpy AS ' i = 0L while True: i += 1 yield i '; backend> SELECT count(), * FROM someTable; Breakpoint 1, pl_handler (fcinfo=0xbfbff154) at src/pl.c:468 (gdb) print *fcinfo $2 = { flinfo = 0x841d234, context = 0x0, resultinfo = 0x0, <---------------------- :( isnull = 0 '\000', nargs = 0, arg = {0 <repeats 32 times>}, argnull = '\000' <repeats 31 times> } (gdb) print *fcinfo->flinfo $4 = { fn_addr = 0x285dc118 <pl_handler>, fn_oid = 554021, fn_nargs = 0, fn_strict = 0 '\000', fn_retset = 0 '\000', fn_extra = 0x0, fn_mcxt = 0x82ab858, fn_expr = 0x8422838 } (gdb) bt #0 pl_handler (fcinfo=0xbfbff154) at src/pl.c:468 #1 0x80f116e in ExecMakeFunctionResult () <--------------- #2 0x80f177a in ExecMakeTableFunctionResult () #3 0x80f2911 in ExecEvalExpr () #4 0x80f34d4 in ExecCleanTargetListLength () #5 0x80f36d2 in ExecProject () #6 0x80f37ac in ExecScan () #7 0x80fa432 in ExecSeqScan () #8 0x80efd11 in ExecProcNode () #9 0x80eea48 in ExecEndPlan () #10 0x80edff4 in ExecutorRun () #11 0x8153d56 in PortalRun () #12 0x8153c56 in PortalRun () #13 0x8150d87 in pg_plan_queries () #14 0x815310f in PostgresMain () #15 0x8107096 in main () #16 0x806d772 in _start () I also tried a simpler SELECT count();, it gave a NULL resultinfo as well. backend> SELECT * FROM count(); -- gives a not null resultinfo, and puts me: #0 pl_handler (fcinfo=0xbfbff1f4) at src/pl.c:468 #1 0x80f13a3 in ExecMakeTableFunctionResult () #2 0x80f9d47 in ExecReScanNestLoop () #3 0x80f3758 in ExecScan () #4 0x80f9e0a in ExecFunctionScan () #5 0x80efd51 in ExecProcNode () #6 0x80eea48 in ExecEndPlan () #7 0x80edff4 in ExecutorRun () #8 0x8153d56 in PortalRun () #9 0x8153c56 in PortalRun () #10 0x8150d87 in pg_plan_queries () #11 0x815310f in PostgresMain () #12 0x8107096 in main () #13 0x806d772 in _start () -------------------------------------------------- Composite type returning, on the other hand: -------------------------------------------------- CREATE FUNCTION Composite() RETURNS someTable LANGUAGE plpy AS 'return [0L]'; backend> SELECT * FROM Composite(); Breakpoint 1, pl_handler (fcinfo=0xbfbff1f4) at src/pl.c:468 (gdb) print *fcinfo $5 = { flinfo = 0x841d31c, context = 0x0, resultinfo = 0xbfbff1d4, isnull = 0 '\000', nargs = 0, arg = {0 <repeats 32 times>}, argnull = '\000' <repeats 31 times> } (gdb) print *fcinfo->flinfo $6 = { fn_addr = 0x285dc118 <pl_handler>, fn_oid = 554025, fn_nargs = 0, fn_strict = 0 '\000', fn_retset = 0 '\000', fn_extra = 0x0, fn_mcxt = 0x82ab858, fn_expr = 0x8313a60 } (gdb) print *((ReturnSetInfo *)fcinfo->resultinfo) $8 = { type = T_ReturnSetInfo, econtext = 0x841d1b0, expectedDesc = 0x841d258, allowedModes = 3, returnMode = SFRM_ValuePerCall, isDone = ExprSingleResult, setResult = 0x0, setDesc = 0x0 } (gdb) bt #0 pl_handler (fcinfo=0xbfbff1f4) at src/pl.c:468 #1 0x80f13a3 in ExecMakeTableFunctionResult () #2 0x80f9d47 in ExecReScanNestLoop () #3 0x80f3758 in ExecScan () #4 0x80f9e0a in ExecFunctionScan () #5 0x80efd51 in ExecProcNode () #6 0x80eea48 in ExecEndPlan () #7 0x80edff4 in ExecutorRun () #8 0x8153d56 in PortalRun () #9 0x8153c56 in PortalRun () #10 0x8150d87 in pg_plan_queries () #11 0x815310f in PostgresMain () #12 0x8107096 in main () #13 0x806d772 in _start () NOTE: If I SELECT Composite(); resultinfo is NULL.. -------------------------------------------------- And finally, an actual SRF: -------------------------------------------------- CREATE FUNCTION SRF() RETURNS SETOF someTable LANGUAGE plpy AS 'return [[0L]]'; backend> SELECT * FROM SRF(); Breakpoint 1, pl_handler (fcinfo=0xbfbff1f4) at src/pl.c:468 (gdb) print *fcinfo $11 = { flinfo = 0x842631c, context = 0x0, resultinfo = 0xbfbff1d4, isnull = 0 '\000', nargs = 0, arg = {0 <repeats 32 times>}, argnull = '\000' <repeats 31 times> } (gdb) print *fcinfo->flinfo $12 = { fn_addr = 0x285dc118 <pl_handler>, fn_oid = 554026, fn_nargs = 0, fn_strict = 0 '\000', fn_retset = 1 '\001', fn_extra = 0x0, fn_mcxt = 0x82ab858, fn_expr = 0x8313a60 } (gdb) print *((ReturnSetInfo *)fcinfo->resultinfo) $13 = { type = T_ReturnSetInfo, econtext = 0x84261b0, expectedDesc = 0x8426258, allowedModes = 3, returnMode = SFRM_ValuePerCall, isDone = ExprSingleResult, setResult = 0x0, setDesc = 0x0 } (gdb) bt #0 pl_handler (fcinfo=0xbfbff1f4) at src/pl.c:468 #1 0x80f13a3 in ExecMakeTableFunctionResult () #2 0x80f9d47 in ExecReScanNestLoop () #3 0x80f3758 in ExecScan () #4 0x80f9e0a in ExecFunctionScan () #5 0x80efd51 in ExecProcNode () #6 0x80eea48 in ExecEndPlan () #7 0x80edff4 in ExecutorRun () #8 0x8153d56 in PortalRun () #9 0x8153c56 in PortalRun () #10 0x8150d87 in pg_plan_queries () #11 0x815310f in PostgresMain () #12 0x8107096 in main () #13 0x806d772 in _start () Probably more info than you need/want, but gdb is fun, so here's lots! 8) Regards, James William Pye
Attachment
James William Pye <flaw@rhid.com> writes: > SELECT aFunction(); > Gives fcinfo->resultinfo != NULL, ONLY IF it is a SRF.(fn_retset != 0) Indeed. Since passing a ReturnSetInfo in resultinfo occurs only when the system is expecting a set result (and is prepared to handle one), I do not see what you would expect different here. > I attached a simple patch that seems to make it work, s/makes it work/breaks it/ ... this patch would effectively inform functions that *aren't* supposed to return set that a set result is expected. Which would certainly break plpgsql, and probably any other callee that is coded to handle both cases. It's true that this setup doesn't allow non-SRFs to get at the econtext, but I'm not sure that they need to. We have not previously seen any example where that's important. If it really were important then we'd need to invent a different result node type (*not* ReturnSetInfo) to carry only econtext. Such a function would however fail in situations where there simply isn't any econtext, which I think includes a lot of the system's internal uses. So let's see the use-case first. regards, tom lane
On 05/20/04:20/4, Tom Lane wrote: > It's true that this setup doesn't allow non-SRFs to get at the econtext, > but I'm not sure that they need to. The only thing my goal has in common with getting at the econtext is the ability to register a callback that can clean up my fn_extra at a relatively appropriate time. Effectively, the looked-up FmgrInfo "owns" a reference to the PyObject* stored in fn_extra. Ideally, the reference count of the object that fn_extra points to would be decremented by a callback/hook before the FmgrInfo is destroyed/pfree'd.. > Such a function would however fail in situations > where there simply isn't any econtext, which I think includes a lot of > the system's internal uses. Aye; this is why I was hoping for an ExprContext[normally executor?] independent mechanism. Although, I am mostly concerned with normal usage(for now), which I think constitutes the context of the executor, and thus ExprContext being available. > We have not previously seen any example where that's important. > .... > So let's see the use-case first. Okay, [a little foreword] I use fn_extra to store state information, specifically[normally] a pointer to an iterator(PyObject*). This usage occurs when the user chooses to return an iterator object(A generator is an iterator). So that the next time the function is called, the next iteration is retrieved rather than calling the function. A generator will execute the function, but it continues the execution after the last yield statement. Initially, I wasn't sure how such functions differed from an explicit VPC-SRF, but they do, especially in regard to isDone. When a VPC-SRF is done, it should of course mark isDone, but one of these "mock VPC-SRFs" should, effectively, restart, and continue on a given state until its FmgrInfo is thrown away.. COUNTER:i=0Lwhile True: # It doesn't stop, and shouldn't i+=1 yield i ... CREATE FUNCTION srfcount() RETURNS SETOF numeric LANGUAGE plpy AS '$COUNTER'; SELECT srfcount(); ... Well, you're gonna be waiting a while, as it will never be done, because isDone in the RSI will never be ExprEndResult. ----------- CREATE FUNCTION count() RETURNS numeric LANGUAGE plpy AS '$COUNTER'; SELECT count(), * FROM someTable; Renders the desired result: [snip previous 7] 1: count = "8" (typeid = 1700, len = -1, typmod = -1,byval = f) 2: somec = "foo" (typeid = 25, len = -1, typmod = -1, byval = f) Indeed this is a trivial case, but I believe this functionality would solve some of the issues that Elein discussed in a talk about plpython at OSCON2003. And in an elegant, Pythonic way. The code at the end: http://www.varlena.com/varlena/Talks/PyAggs/pyaggs.html > Indeed. Since passing a ReturnSetInfo in resultinfo occurs only when > the system is expecting a set result (and is prepared to handle one), > I do not see what you would expect different here. Well, I guess I wasn't expecting the system to request a set from a function that is not said to return a set(proretset): SELECT * FROM Composite(); -- Example from my preceding letter ... >> resultinfo = 0xbfbff1d4, Shows that the system gives it resultinfo, thus implying that the system wants a set, but Composite() merely returns a single complex type(tuple). I understand that this is to be seen as a feature rather than a flaw. > s/makes it work/breaks it/ ... this patch would effectively inform > functions that *aren't* supposed to return set that a set result is > expected. Indeed, it does. The new patch attached isn't much better, and is included for mostly discussion purposes. The patch stands to imply that a function should only attempt to return a set if the _contents_ of ResultSetInfo dictates such an action. ie, allowedModes should be a valid mode.(It is set to 0 in my patch) This seems to have some hackish qualities, so your solution would probably be better: > If it really were important then we'd > need to invent a different result node type (*not* ReturnSetInfo) to > carry only econtext. Although, I'm not sure if it's really that important.. > Which would certainly break plpgsql, [From a _brief_ analysis of pl/plpgsql] AFAICS, plpgsql only thinks about returning a set when retisset(fn_retset) is true, which is only true when proretset is true, which is only true if the function was explicitly defined to return a SETOF. So I'm not sure why this would break plpgsql. Needless to say, I don't see why plpgsql would act any differently if it were given a not null RSI when !fn_retset, considering that it only seems to use it when fn_retset is true. I didn't test it, so I might need to eat my hat, but... Line 353 of pl/plpgsql/pl_exec.c in plpgsql_exec_function() Also, return next is only allowed if fn_retset: In gram.y, line ~1179. > and probably any other callee that is coded to handle both cases. Aye. Can't argue with that. Regards, James William Pye
James William Pye <flaw@rhid.com> writes: > On 05/20/04:20/4, Tom Lane wrote: >> It's true that this setup doesn't allow non-SRFs to get at the econtext, >> but I'm not sure that they need to. > The only thing my goal has in common with getting at the econtext is > the ability to register a callback that can clean up my fn_extra at a > relatively appropriate time. > Effectively, the looked-up FmgrInfo "owns" a reference to the PyObject* > stored in fn_extra. Ideally, the reference count of the object that fn_extra > points to would be decremented by a callback/hook before the FmgrInfo is > destroyed/pfree'd.. Hm. I do not think you can use an expression context callback for this anyway, because the callback won't be called in the case that query execution is abandoned due to an error. What you'd need for something like that is a global data structure that is traversed at transaction commit/abort and tells you which PyObjects you are holding refcounts on. You might want to look at plpgsql's plpgsql_eoxact() function for something vaguely similar. regards, tom lane
On 05/21/04:20/5, Tom Lane wrote: > Hm. I do not think you can use an expression context callback for this > anyway, because the callback won't be called in the case that query > execution is abandoned due to an error. > What you'd need for something like that is a global data structure that > is traversed at transaction commit/abort and tells you which PyObjects > you are holding refcounts on. Indeed. I was planning to implement something along those lines for that case at a later point in time, just wasn't sure when the cleanup should occur, and how it was to be triggered at that time. I still have reservations about the temporary leakage, but I can't help but think that a "solution" to that is likely cost more than its worth. I can't imagine anything other than an explicit "end of usage" callback function stored in FmgrInfo that takes fn_extra as an argument, and this would still require ERROR handling... > You might want to look at plpgsql's > plpgsql_eoxact() function for something vaguely similar. Aye! Thanks for pointing me at EOXact! It's quite the perfect point to cleanup my fn_extra usage. I think that it is quite reasonable to specify that "inter-transaction" usage of the feature to be forbidden; thus further qualifying it as an appropriate cleanup point. Disqualifying my own idea: AFA free-hooks are concerned, it is clear to me now that they are BAD for my application, as it assumes FmgrInfo is allocated by Postgres's memory manager, and after a quick grep I see that FmgrInfo is statically declared/allocated is many places.. -- Regards, James William Pye