Thread: Security definer "generated column" function used in index
PostgreSQL version 9.0.4, 64 bit. Linux version 2.6.16.60-0.39.3-smp (geeko@buildhost) (gcc version 4.1.2 20070115 (SUSE Linux)) #1 SMP Mon May 11 11:46:34 UTC 2009 SUSE Linux Enterprise Server 10 (x86_64) VERSION = 10 PATCHLEVEL = 2 We flagged some functions as SECURITY DEFINER and had queries which had been in use for months suddenly fail to complete. We set them back to SECURITY INVOKER and things returned to normal. I took stack traces of the four connections with queries which seemed to be "stuck". They all had this sequence of calls in the middle: #13 0x000000000054a5b6 in fmgr_sql (fcinfo=0x7fff9eec0b30) at functions.c:441 #14 0x00000000006a2f05 in fmgr_security_definer (fcinfo=0x30006) at fmgr.c:957 #15 0x0000000000544047 in ExecMakeFunctionResult (fcache=0x512ac70, econtext=0x512aa80, isNull=0x512b8a8 "", isDone=0x512b9c0) at execQual.c:1827 #16 0x0000000000540fb7 in ExecProject (projInfo=<value optimized out>, isDone=0x7fff9eec101c) at execQual.c:5089 #17 0x0000000000555403 in ExecResult (node=0x512a970) at nodeResult.c:155 #18 0x00000000005409a6 in ExecProcNode (node=0x512a970) at execProcnode.c:355 #19 0x000000000053f891 in standard_ExecutorRun (queryDesc=0x2b4121b11280, direction=-1628699568, count=1) at execMain.c:1188 #20 0x000000000054a656 in fmgr_sql (fcinfo=0x7fff9eec1310) at functions.c:475 #21 0x00000000006a2f05 in fmgr_security_definer (fcinfo=0x30006) at fmgr.c:957 #22 0x0000000000545ef0 in ExecMakeFunctionResultNoSets (fcache=0x2b4121aa2b98, econtext=0x2b4121aa1798, isNull=0x7fff9eec1a90 "", isDone=<value optimized out>) at execQual.c:1894 #23 0x0000000000545e6c in ExecMakeFunctionResultNoSets (fcache=0x2b4121aa2358, econtext=0x2b4121aa1798, isNull=0x7fff9eec1b8f "", isDone=<value optimized out>) at execQual.c:1866 #24 0x0000000000545f8f in ExecQual (qual=<value optimized out>, econtext=0x2b4121aa1798, resultForNull=0 '\0') at execQual.c:4991 #25 0x00000000005476ef in ExecScan (node=0x2b4121aa1688, accessMtd=0x5511a0 <IndexNext>, recheckMtd=0x551150 <IndexRecheck>) at execScan.c:192 #26 0x00000000005409ea in ExecProcNode (node=0x2b4121aa1688) at execProcnode.c:382 #27 0x0000000000554935 in ExecNestLoop (node=0x2b4121a9dea0) at nodeNestloop.c:154 #28 0x0000000000540a6a in ExecProcNode (node=0x2b4121a9dea0) at execProcnode.c:419 Full (unedited) stack traces for all four attached. Notice the recursive calls to fmgr_security_definer(). I wonder whether that might be a problem, since the comment for that function says: | This is not re-entrant, but then the fcinfo itself can't be used | re-entrantly anyway. All of these queries are similar, and involved searches using a LIKE clause against a "searchName" "generated column" -- a function taking the record type of the table as its parameter. That function then calls a function which takes several parameters, Both functions were changed to SECURITY DEFINER when the problems started. The functions are: CREATE OR REPLACE FUNCTION "searchName"(rec "Party") RETURNS "SearchNameT" LANGUAGE sql IMMUTABLE AS $$ select "searchName"($1."nameL", $1."nameF", $1."nameM", $1."suffix"); $$; CREATE OR REPLACE FUNCTION "searchName"("nameL" "LastNameT", "nameF" "FirstNameT", "nameM" "MiddleNameT", "suffix" "NameSuffixT") RETURNS "SearchNameT" LANGUAGE sql IMMUTABLE AS $$ select regexp_replace(upper( $1 || case when $2 is not null or $3 is not null or $4 is not null then ',' || coalesce($2, '') || coalesce($3, '') || coalesce($4, '') else '' end), '[^A-Z0-9\,]', '', 'g')::"SearchNameT" $$ And there is an index on "Party": "Party_SearchName" btree ("searchName"("Party".*)) First off, is there much chance that this is fixed between 9.0.4 and 9.0.6? If not, what do people feel would be the most useful information for diagnosing the problem? -Kevin
Attachment
No comments on this? It seems to me that at a minimum this needs better documentation of a limitation, and the conditions under which you hit the problem. I'm not sure there isn't an outright bug here. We would like to flag all of our trigger functions as SECURITY DEFINER, but there are triggers which do DML which can fire other triggers, and at this point I'm not sure whether that's safe. Anyone? -Kevin On 2011-12-09 12:49 PM I wrote: PostgreSQL version 9.0.4, 64 bit. Linux version 2.6.16.60-0.39.3-smp (geeko@buildhost) (gcc version 4.1.2 20070115 (SUSE Linux)) #1 SMP Mon May 11 11:46:34 UTC 2009 SUSE Linux Enterprise Server 10 (x86_64) VERSION = 10 PATCHLEVEL = 2 We flagged some functions as SECURITY DEFINER and had queries which had been in use for months suddenly fail to complete. We set them back to SECURITY INVOKER and things returned to normal. I took stack traces of the four connections with queries which seemed to be "stuck". They all had this sequence of calls in the middle: #13 0x000000000054a5b6 in fmgr_sql (fcinfo=0x7fff9eec0b30) at functions.c:441 #14 0x00000000006a2f05 in fmgr_security_definer (fcinfo=0x30006) at fmgr.c:957 #15 0x0000000000544047 in ExecMakeFunctionResult (fcache=0x512ac70, econtext=0x512aa80, isNull=0x512b8a8 "", isDone=0x512b9c0) at execQual.c:1827 #16 0x0000000000540fb7 in ExecProject (projInfo=<value optimized out>, isDone=0x7fff9eec101c) at execQual.c:5089 #17 0x0000000000555403 in ExecResult (node=0x512a970) at nodeResult.c:155 #18 0x00000000005409a6 in ExecProcNode (node=0x512a970) at execProcnode.c:355 #19 0x000000000053f891 in standard_ExecutorRun (queryDesc=0x2b4121b11280, direction=-1628699568, count=1) at execMain.c:1188 #20 0x000000000054a656 in fmgr_sql (fcinfo=0x7fff9eec1310) at functions.c:475 #21 0x00000000006a2f05 in fmgr_security_definer (fcinfo=0x30006) at fmgr.c:957 #22 0x0000000000545ef0 in ExecMakeFunctionResultNoSets (fcache=0x2b4121aa2b98, econtext=0x2b4121aa1798, isNull=0x7fff9eec1a90 "", isDone=<value optimized out>) at execQual.c:1894 #23 0x0000000000545e6c in ExecMakeFunctionResultNoSets (fcache=0x2b4121aa2358, econtext=0x2b4121aa1798, isNull=0x7fff9eec1b8f "", isDone=<value optimized out>) at execQual.c:1866 #24 0x0000000000545f8f in ExecQual (qual=<value optimized out>, econtext=0x2b4121aa1798, resultForNull=0 '\0') at execQual.c:4991 #25 0x00000000005476ef in ExecScan (node=0x2b4121aa1688, accessMtd=0x5511a0 <IndexNext>, recheckMtd=0x551150 <IndexRecheck>) at execScan.c:192 #26 0x00000000005409ea in ExecProcNode (node=0x2b4121aa1688) at execProcnode.c:382 #27 0x0000000000554935 in ExecNestLoop (node=0x2b4121a9dea0) at nodeNestloop.c:154 #28 0x0000000000540a6a in ExecProcNode (node=0x2b4121a9dea0) at execProcnode.c:419 Full (unedited) stack traces for all four attached. Notice the recursive calls to fmgr_security_definer(). I wonder whether that might be a problem, since the comment for that function says: | This is not re-entrant, but then the fcinfo itself can't be used | re-entrantly anyway. All of these queries are similar, and involved searches using a LIKE clause against a "searchName" "generated column" -- a function taking the record type of the table as its parameter. That function then calls a function which takes several parameters, Both functions were changed to SECURITY DEFINER when the problems started. The functions are: CREATE OR REPLACE FUNCTION "searchName"(rec "Party") RETURNS "SearchNameT" LANGUAGE sql IMMUTABLE AS $$ select "searchName"($1."nameL", $1."nameF", $1."nameM", $1."suffix"); $$; CREATE OR REPLACE FUNCTION "searchName"("nameL" "LastNameT", "nameF" "FirstNameT", "nameM" "MiddleNameT", "suffix" "NameSuffixT") RETURNS "SearchNameT" LANGUAGE sql IMMUTABLE AS $$ select regexp_replace(upper( $1 || case when $2 is not null or $3 is not null or $4 is not null then ',' || coalesce($2, '') || coalesce($3, '') || coalesce($4, '') else '' end), '[^A-Z0-9\,]', '', 'g')::"SearchNameT" $$ And there is an index on "Party": "Party_SearchName" btree ("searchName"("Party".*)) First off, is there much chance that this is fixed between 9.0.4 and 9.0.6? If not, what do people feel would be the most useful information for diagnosing the problem? -Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > No comments on this? If there was a reproducible test case in your original message, I didn't see it, so I assumed you intended to investigate further on your own. It wasn't even clear to me that this was a Postgres bug rather than some error in your trigger logic. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote: > "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: >> No comments on this? > > If there was a reproducible test case in your original message, > I didn't see it, so I assumed you intended to investigate further > on your own. It wasn't even clear to me that this was a Postgres > bug rather than some error in your trigger logic. Sorry if my first post wasn't clear. It was happening on SELECT statements; no triggers involved. (I had *intended* just to get trigger functions, but had accidentally included some others.) I wasn't able to create a small, self-contained test case with a few hours of attempts, so I was hoping someone could suggest (from the stack traces and other clues) how best to attempt that or what other information might be useful. It wasn't even clear to me that it was OK to have one security definer function call another, based on the code comment I quoted, so I didn't want to spend more hours on attempting to create a test case if it simply wasn't supported. Sad to say, the script which flagged the functions as security definer didn't cause problems in normal testing, and were deployed to production (in advance of a software release which will need the expanded permissions), where the problem surfaced under user load. The fact that the larger number of concurrent users hit the problem where my test scripts haven't suggests some race condition, so even if I create it here, it will probably be something where I need to know what information to capture while it is happening. We only need to add the security definer flag on trigger functions at this point for the upcoming application release, but I'm not yet confident that this is safe. -Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > ... It wasn't even clear to me that it was > OK to have one security definer function call another, based on the > code comment I quoted, so I didn't want to spend more hours on > attempting to create a test case if it simply wasn't supported. Yes, that's definitely *supposed* to work, though I'll grant that there could be bugs there. It's hard to see how it'd be a race condition though. On reflection what seems most likely is simply that turning these otherwise-inlineable SQL functions into SECURITY DEFINER disabled inline-ing them, resulting in catastrophic degradation of the generated plans, such that they took a lot longer than you were accustomed to (they shouldn't have been "hung" though). regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote: > On reflection what seems most likely is simply that turning these > otherwise-inlineable SQL functions into SECURITY DEFINER disabled > inline-ing them, resulting in catastrophic degradation of the > generated plans, such that they took a lot longer than you were > accustomed to (they shouldn't have been "hung" though). Ah, I had not considered that. That also explains why my attempts to recreate the situation with "toy" tables didn't show the issue. Also, it didn't occur to me until later to check whether a continue and another backtrace showed things moving; all the evidence suggested (in retrospect) that it was "doing something" rather than being blocked, per se; but these are normally sub-second queries which were killed after running over an hour, so I (probably wrongly) assumed they were in an endless loop. I will try again in just one site with a bit more care about which functions I flag. If that goes OK, I'll have the confidence to go forward with the application release. Thanks! -Kevin