Thread: BUG #5798: Some weird error with pl/pgsql procedure
The following bug has been logged online:

Bug reference:      5798
Logged by:          Maxim Boguk
Email address:      Maxim.Boguk@gmail.com
PostgreSQL version: 8.4.4
Operating system:   FreeBSD 7.2
Description:        Some weird error with pl/pgsql procedure
Details:

While testing my table compactor utility (http://code.google.com/p/compacttable/), I sometimes get an exceptionally weird error from a PL/pgSQL procedure:

2010-12-21 05:17:19 MSK 17263 postgres@hh from 192.168.1.129 [vxid:22/1898656 txid:20671567] [SELECT] ERROR:  cannot extract system attribute from virtual tuple
2010-12-21 05:17:19 MSK 17263 postgres@hh from 192.168.1.129 [vxid:22/1898656 txid:20671567] [SELECT] CONTEXT:  PL/pgSQL function "__clear_table_tail" line 50 at FOR over EXECUTE statement
2010-12-21 05:17:19 MSK 17263 postgres@hh from 192.168.1.129 [vxid:22/1898656 txid:20671567] [SELECT] STATEMENT:  SELECT * from __clear_table_tail('vacancy_view', 'public', 'count_', 25053, 25)

Line 50 of the procedure is:

    --trying update all tuples in current page range
    FOR _new_tid IN EXECUTE
        'UPDATE ONLY ' || quote_ident(arg_relnamespace) || '.' || quote_ident(arg_relname) ||
        ' SET ' || arg_field || ' = ' || arg_field ||
        ' WHERE ctid=ANY($1) RETURNING ctid'
        USING _current_tid_list
    LOOP

This error usually shows up together with an error on the Slony side (that one is easy to understand and not a problem at all):

2010-12-21 05:15:08 MSK 24679 slony@hh from [local] [vxid:37/4336609 txid:20671079] [UPDATE] ERROR:  could not serialize access due to concurrent update
2010-12-21 05:15:08 MSK 24679 slony@hh from [local] [vxid:37/4336609 txid:20671079] [UPDATE] STATEMENT:  update only "public"."vacancy_view" set "count_"='1428' where "vacancy_id"='3568635';

I suspect some strange concurrency issue between PL/pgSQL and Slony, but I don't understand the reason for the error. It is not easy to reproduce, but it has happened three times while running my utility against a 130GB slave database.

Can anyone enlighten me as to what happens here?
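For readers unfamiliar with the construct, the failing statement follows this general pattern. The sketch below is hypothetical and not the reporter's actual procedure; only the argument names and the dynamic UPDATE ... RETURNING ctid come from the report:

    -- Hypothetical minimal wrapper around the construct from the report:
    -- a dynamic UPDATE ... WHERE ctid = ANY($1) RETURNING ctid driven by
    -- FOR ... IN EXECUTE ... USING (8.4+ syntax).
    CREATE OR REPLACE FUNCTION touch_tids(arg_relnamespace text,
                                          arg_relname text,
                                          arg_field text,
                                          _current_tid_list tid[])
    RETURNS SETOF tid LANGUAGE plpgsql AS $$
    DECLARE
        _new_tid tid;
    BEGIN
        FOR _new_tid IN EXECUTE
            'UPDATE ONLY ' || quote_ident(arg_relnamespace) || '.' || quote_ident(arg_relname)
            || ' SET ' || quote_ident(arg_field) || ' = ' || quote_ident(arg_field)
            || ' WHERE ctid = ANY($1) RETURNING ctid'
            USING _current_tid_list
        LOOP
            RETURN NEXT _new_tid;   -- ctid of each row that was actually updated
        END LOOP;
    END;
    $$;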
On Mon, Dec 20, 2010 at 9:48 PM, Maxim Boguk <Maxim.Boguk@gmail.com> wrote:
> Anyone can enlighten me what happens here?

That does look weird, but without a simple test case I think it's going to be hard to investigate this.  Installing Slony and your code and building a 130GB database is more effort than I'm willing to put in...

Any way you can get a stack backtrace at the point the error occurs?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
It has happened again in a much simpler case (no Slony and no serialization error involved; in fact, no Slony at all on that database).

The same error:

DBD::Pg::db selectall_arrayref failed: ERROR:  cannot extract system attribute from virtual tuple
CONTEXT:  PL/pgSQL function "__clear_table_tail" line 50 at FOR over EXECUTE statement

This time the function was working on a small table of only 33 pages and about 600 live tuples. What makes the situation easier to track is that I know this table has had no inserts or deletes for ages (only SELECTs and UPDATEs), and I have a full log of database queries at the time of the problem.

Log of the procedure call:

2010-12-23 00:21:04 MSK pgsql@billing 55717 ERROR:  cannot extract system attribute from virtual tuple
2010-12-23 00:21:04 MSK pgsql@billing 55717 CONTEXT:  PL/pgSQL function "__clear_table_tail" line 50 at FOR over EXECUTE statement
2010-12-23 00:21:04 MSK pgsql@billing 55717 STATEMENT:  SELECT * from __clear_table_tail('ctrl_servers', 'bill', 'opt', 21, 1)

And here is everything that changed the table ctrl_servers within +/- 3 seconds of the ERROR:

2010-12-23 00:21:03 MSK bill@billing 42078 LOG:  duration: 0.736 ms  statement: UPDATE ctrl_servers SET last_connect = 'now' WHERE ( id = '1514' )

Just one query.

The line where the ERROR happens:

    FOR _new_tid IN EXECUTE
        'UPDATE ONLY ' || quote_ident(arg_relnamespace) || '.' || quote_ident(arg_relname) ||
        ' SET ' || arg_field || ' = ' || arg_field ||
        ' WHERE ctid=ANY($1) RETURNING ctid'
        USING _current_tid_list
    LOOP

It should be said that at the start of the call _current_tid_list contained every possible TID for page 21 of the relation (not just the live tuples, but all possible positions, roughly from (21,1) to (21,300+)).

My guess is that the error comes from some rare race condition between the TID scan and a concurrent update in the same page (probably HOT is somehow involved, because both the field updated by the function and the field touched by the concurrent query are non-indexed).

I am currently working on a self-contained test case, unfortunately without much success so far.

As for a stack backtrace, I am not sure how to get one (and, worse, the database is a heavily loaded production DB, so I'm not sure creating a core dump would be a good idea). Still, I would appreciate any help with creating a stack backtrace.

On Wed, Dec 22, 2010 at 5:22 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Dec 20, 2010 at 9:48 PM, Maxim Boguk <Maxim.Boguk@gmail.com> wrote:
>> Anyone can enlighten me what happens here?
>
> That does look weird, but without a simple test case I think it's
> going to be hard to investigate this.  Installing Slony and your code
> and building a 130GB database is more effort than I'm willing to put in...
>
> Any way you can get a stack backtrace at the point the error occurs?
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>

-- 
Maxim Boguk
Senior Postgresql DBA.

Skype: maxim.boguk
Jabber: maxim.boguk@gmail.com

LinkedIn profile: http://nz.linkedin.com/in/maximboguk
МойКруг: http://mboguk.moikrug.ru/
"Strength breaks straw, yet not everything in our life is straw, and strength is far from everything."
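For reference, an all-possible-TIDs array for one page, as described above, can be produced with something like the following (a sketch only; the page number 21 and the 300-slot upper bound are taken from the description, not from the actual procedure):

    -- Build a tid[] covering every possible line pointer of page 21.
    SELECT array_agg(('(21,' || lp || ')')::tid) AS _current_tid_list
    FROM generate_series(1, 300) AS lp;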
On Wed, Dec 22, 2010 at 4:53 PM, Maxim Boguk <maxim.boguk@gmail.com> wrote:
> As for a stack backtrace, I am not sure how to get one (and, worse, the
> database is a heavily loaded production DB, so I'm not sure creating a
> core dump would be a good idea).
> Still, I would appreciate any help with creating a stack backtrace.

Well, if you can reproduce this problem in a given session, you can use pg_backend_pid() at the start of the session to get the backend PID, and then if your build has debugging symbols (or separate debuginfos you can install), you could attach gdb to it, set a breakpoint at the lines in heaptuple.c that can throw this error, reproduce the problem, and then get a backtrace.  But if it's happening randomly I don't have a lot of ideas.

I do wonder if this could be related to using TIDs that don't actually exist in the table.  That might be worth some experimentation.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
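The workflow described here looks roughly like the following (a sketch; the heaptuple.c line number is the one that appears in the 8.4.7 backtrace later in the thread, and __clear_table_tail is the reporter's own procedure):

    -- In the psql session that will run the failing call:
    SELECT pg_backend_pid();   -- note this backend's PID

    -- From a shell on the server (gdb steps shown as comments):
    --   gdb -p <pid from above>
    --   (gdb) break heaptuple.c:1185   -- the elog() for "cannot extract system attribute ..."
    --   (gdb) continue
    -- Then re-run the failing call in the same session; when the breakpoint fires, run "bt":
    SELECT * FROM __clear_table_tail('ctrl_servers', 'bill', 'opt', 21, 1);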
Hi all, and hi Robert,

I finally managed to set up a working slave server with non-stripped PostgreSQL 8.4.7 and a working gdb (sorry for the long delay).
I can reproduce the situation quite easily now, so I managed to get a stack backtrace of the error.
What do I need to do now? I have some experience with gdb now, but I have no idea where to look next.

Full backtrace:

Breakpoint 1, slot_getattr (slot=0xdd09ea468, attnum=-1, isnull=0xdd09b4a68 "") at heaptuple.c:1185
1185                            elog(ERROR, "cannot extract system attribute from virtual tuple");
(gdb) bt
#0  slot_getattr (slot=0xdd09ea468, attnum=-1, isnull=0xdd09b4a68 "") at heaptuple.c:1185
#1  0x000000000058a5f8 in ExecEvalScalarVar (exprstate=0xdd09b4b60, econtext=0xdd09b4a80, isNull=0xdd09b4a68 "", isDone=0xdd09b4ce0) at execQual.c:787
#2  0x00000000005933d7 in ExecTargetList (targetlist=0xdd09b4cb0, econtext=0xdd09b4a80, values=0xdd09b4a50, isnull=0xdd09b4a68 "", itemIsDone=0xdd09b4ce0, isDone=0x0) at execQual.c:5118
#3  0x00000000005939a3 in ExecProject (projInfo=0xdd09b4be0, isDone=0x0) at execQual.c:5333
#4  0x00000000005879c7 in ExecProcessReturning (projectReturning=0xdd09b4be0, tupleSlot=0xdd09ea468, planSlot=0xdd09ea3a8, dest=0xdd09de5c8) at execMain.c:2273
#5  0x00000000005875fe in ExecUpdate (slot=0xdd09ea468, tupleid=0x7fffffffd2e0, planSlot=0xdd09ea3a8, dest=0xdd09de5c8, estate=0xdd09e6030) at execMain.c:2142
#6  0x0000000000586c68 in ExecutePlan (estate=0xdd09e6030, planstate=0xdd09ea550, operation=CMD_UPDATE, numberTuples=0, direction=ForwardScanDirection, dest=0xdd09de5c8) at execMain.c:1681
#7  0x00000000005849a5 in standard_ExecutorRun (queryDesc=0xdd09de658, direction=ForwardScanDirection, count=0) at execMain.c:309
#8  0x00000000005848a5 in ExecutorRun (queryDesc=0xdd09de658, direction=ForwardScanDirection, count=0) at execMain.c:258
#9  0x0000000000673d43 in ProcessQuery (plan=0xdd0998890, sourceText=0xdd09c5830 "UPDATE ONLY bill.ctrl_servers SET opt = opt WHERE ctid=ANY($1) RETURNING ctid", params=0xdd09c58c0, dest=0xdd09de5c8, completionTag=0x7fffffffd4e0 "") at pquery.c:196
#10 0x00000000006754e5 in PortalRunMulti (portal=0xdd08c8140, isTopLevel=0 '\000', dest=0xdd09de5c8, altdest=0x9dfdc0, completionTag=0x7fffffffd4e0 "") at pquery.c:1269
#11 0x0000000000675127 in FillPortalStore (portal=0xdd08c8140, isTopLevel=0 '\000') at pquery.c:1061
#12 0x00000000006757f8 in PortalRunFetch (portal=0xdd08c8140, fdirection=FETCH_FORWARD, count=10, dest=0x9dfe40) at pquery.c:1397
#13 0x00000000005b24e1 in _SPI_cursor_operation (portal=0xdd08c8140, direction=FETCH_FORWARD, count=10, dest=0x9dfe40) at spi.c:2088
#14 0x00000000005b11a8 in SPI_cursor_fetch (portal=0xdd08c8140, forward=1 '\001', count=10) at spi.c:1258
#15 0x0000000dd0b1e199 in exec_for_query (estate=0x7fffffffda50, stmt=0xdd098bfc8, portal=0xdd08c8140, prefetch_ok=1 '\001') at pl_exec.c:4262
#16 0x0000000dd0b1bc43 in exec_stmt_dynfors (estate=0x7fffffffda50, stmt=0xdd098bfc8) at pl_exec.c:3107
#17 0x0000000dd0b1881a in exec_stmt (estate=0x7fffffffda50, stmt=0xdd098bfc8) at pl_exec.c:1308
#18 0x0000000dd0b1859a in exec_stmts (estate=0x7fffffffda50, stmts=0xdd098bc28) at pl_exec.c:1203
#19 0x0000000dd0b18f38 in exec_stmt_while (estate=0x7fffffffda50, stmt=0xdd098d6f8) at pl_exec.c:1608
#20 0x0000000dd0b18733 in exec_stmt (estate=0x7fffffffda50, stmt=0xdd098d6f8) at pl_exec.c:1264
#21 0x0000000dd0b1859a in exec_stmts (estate=0x7fffffffda50, stmts=0xdd098a650) at pl_exec.c:1203
#22 0x0000000dd0b183df in exec_stmt_block (estate=0x7fffffffda50, block=0xdd098d778) at pl_exec.c:1140
#23 0x0000000dd0b16a29 in plpgsql_exec_function (func=0xdd087abd8, fcinfo=0x7fffffffdcf0) at pl_exec.c:315
#24 0x0000000dd0b1260f in plpgsql_call_handler (fcinfo=0x7fffffffdcf0) at pl_handler.c:95
#25 0x000000000058c902 in ExecMakeTableFunctionResult (funcexpr=0xdd08ca730, econtext=0xdd08ca460, expectedDesc=0xdd08ca5f0, randomAccess=0 '\000') at execQual.c:2016
#26 0x00000000005a3e7d in FunctionNext (node=0xdd08ca350) at nodeFunctionscan.c:64
#27 0x0000000000593a10 in ExecScan (node=0xdd08ca350, accessMtd=0x5a3e10 <FunctionNext>) at execScan.c:68
#28 0x00000000005a3eea in ExecFunctionScan (node=0xdd08ca350) at nodeFunctionscan.c:98
#29 0x0000000000588fdf in ExecProcNode (node=0xdd08ca350) at execProcnode.c:385
#30 0x0000000000586794 in ExecutePlan (estate=0xdd08ca030, planstate=0xdd08ca350, operation=CMD_SELECT, numberTuples=0, direction=ForwardScanDirection, dest=0xdd08996f0) at execMain.c:1504
#31 0x00000000005849a5 in standard_ExecutorRun (queryDesc=0xdd082f430, direction=ForwardScanDirection, count=0) at execMain.c:309
#32 0x00000000005848a5 in ExecutorRun (queryDesc=0xdd082f430, direction=ForwardScanDirection, count=0) at execMain.c:258
#33 0x0000000000674e18 in PortalRunSelect (portal=0xdd08c8030, forward=1 '\001', count=0, dest=0xdd08996f0) at pquery.c:953
#34 0x0000000000674ab9 in PortalRun (portal=0xdd08c8030, count=9223372036854775807, isTopLevel=1 '\001', dest=0xdd08996f0, altdest=0xdd08996f0, completionTag=0x7fffffffe570 "") at pquery.c:779
#35 0x000000000066f0e6 in exec_simple_query (query_string=0x8037e5030 "\nSELECT * from __clear_table_tail('ctrl_servers', 'bill', 'opt', 13, 1)\n") at postgres.c:997
#36 0x0000000000672f72 in PostgresMain (argc=4, argv=0x803732930, username=0x803732900 "pgsql") at postgres.c:3676
#37 0x000000000063b5cd in BackendRun (port=0x803703c00) at postmaster.c:3467
#38 0x000000000063ab2a in BackendStartup (port=0x803703c00) at postmaster.c:3081
#39 0x00000000006382dc in ServerLoop () at postmaster.c:1387
#40 0x0000000000637abb in PostmasterMain (argc=3, argv=0x7fffffffeb88) at postmaster.c:1040
#41 0x00000000005c0b5a in main (argc=3, argv=0x7fffffffeb88) at main.c:188

On Wed, Jan 5, 2011 at 4:28 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Dec 22, 2010 at 4:53 PM, Maxim Boguk <maxim.boguk@gmail.com> wrote:
>> As for a stack backtrace, I am not sure how to get one (and, worse, the
>> database is a heavily loaded production DB, so I'm not sure creating a
>> core dump would be a good idea).
>> Still, I would appreciate any help with creating a stack backtrace.
>
> Well, if you can reproduce this problem in a given session, you can
> use pg_backend_pid() at the start of the session to get the backend
> PID, and then if your build has debugging symbols (or separate
> debuginfos you can install), you could attach gdb to it, set a
> breakpoint at the lines in heaptuple.c that can throw this error,
> reproduce the problem, and then get a backtrace.  But if it's
> happening randomly I don't have a lot of ideas.
>
> I do wonder if this could be related to using TIDs that don't actually
> exist in the table.  That might be worth some experimentation.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>

-- 
Maxim Boguk
Senior Postgresql DBA.

Skype: maxim.boguk
Jabber: maxim.boguk@gmail.com

LinkedIn profile: http://nz.linkedin.com/in/maximboguk
If they can send one man to the moon... why can't they send them all?

МойКруг: http://mboguk.moikrug.ru/
Just an idea from examining the code: I think the problem is in the "RETURNING ctid" part of the query, together with this comment in slot_getattr:

    /*
     * system attributes are handled by heap_getsysattr
     */

So maybe the RETURNING implementation somehow handles system attributes in the returning value list incorrectly.

On Sun, Feb 20, 2011 at 1:13 AM, Maxim Boguk <maxim.boguk@gmail.com> wrote:
> Hi all, and hi Robert,
>
> I finally managed to set up a working slave server with non-stripped
> PostgreSQL 8.4.7 and a working gdb (sorry for the long delay).
> I can reproduce the situation quite easily now, so I managed to get a
> stack backtrace of the error.
> What do I need to do now? I have some experience with gdb now, but I
> have no idea where to look next.
-- 
Maxim Boguk
Senior Postgresql DBA.

Skype: maxim.boguk
Jabber: maxim.boguk@gmail.com

LinkedIn profile: http://nz.linkedin.com/in/maximboguk
If they can send one man to the moon... why can't they send them all?

МойКруг: http://mboguk.moikrug.ru/
Maxim Boguk <maxim.boguk@gmail.com> writes:
> I finally managed to set up a working slave server with non-stripped
> PostgreSQL 8.4.7 and a working gdb (sorry for the long delay).
> I can reproduce the situation quite easily now, so I managed to get a
> stack backtrace of the error.
> What do I need to do now? I have some experience with gdb now, but I
> have no idea where to look next.

Now that you know what query is causing the problem:

> #9  0x0000000000673d43 in ProcessQuery (plan=0xdd0998890,
>     sourceText=0xdd09c5830 "UPDATE ONLY bill.ctrl_servers SET opt = opt
>     WHERE ctid=ANY($1) RETURNING ctid", params=0xdd09c58c0,
>     dest=0xdd09de5c8, completionTag=0x7fffffffd4e0 "") at pquery.c:196

it shouldn't be too hard to prepare a self-contained test case so that others can look at this.

			regards, tom lane
On Sun, Feb 20, 2011 at 5:08 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Maxim Boguk <maxim.boguk@gmail.com> writes:
>> I finally managed to set up a working slave server with non-stripped
>> PostgreSQL 8.4.7 and a working gdb (sorry for the long delay).
>> I can reproduce the situation quite easily now, so I managed to get a
>> stack backtrace of the error.
>> What do I need to do now? I have some experience with gdb now, but I
>> have no idea where to look next.
>
> Now that you know what query is causing the problem:
>
>> #9  0x0000000000673d43 in ProcessQuery (plan=0xdd0998890,
>>     sourceText=0xdd09c5830 "UPDATE ONLY bill.ctrl_servers SET opt = opt
>>     WHERE ctid=ANY($1) RETURNING ctid", params=0xdd09c58c0,
>>     dest=0xdd09de5c8, completionTag=0x7fffffffd4e0 "") at pquery.c:196
>
> it shouldn't be too hard to prepare a self-contained test case so that
> others can look at this.
>
>                        regards, tom lane

Sorry, but it's very unlikely I can create a self-contained test case. The error doesn't happen on a standalone idle server. It happens only at random times, on a table with a very high update rate, in the production environment, and only after 1-30 minutes of calling my stored procedure in an endless loop (i.e. roughly once per 1000+ calls).

I was sure long ago about the exact place where the error happens (because my stored procedure contains only one real SQL query), but since I couldn't create a self-contained test case I was asked to use gdb and try to get a backtrace at the point of the error.

So what I can do now is examine and provide variable values at that point, to give a clue about what is happening here.

-- 
Maxim Boguk
Senior Postgresql DBA.

Skype: maxim.boguk
Jabber: maxim.boguk@gmail.com

LinkedIn profile: http://nz.linkedin.com/in/maximboguk
If they can send one man to the moon... why can't they send them all?

МойКруг: http://mboguk.moikrug.ru/
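For context, the reproduction conditions described above boil down to two concurrent sessions like these (both statements are taken from the server log earlier in the thread; the endless loop itself has to be driven externally):

    -- Session 1: call the procedure repeatedly (e.g. from a shell loop):
    SELECT * FROM __clear_table_tail('ctrl_servers', 'bill', 'opt', 21, 1);

    -- Session 2: keep issuing the kind of single-row update seen in the log
    -- against the same table (and, with luck, the same page):
    UPDATE ctrl_servers SET last_connect = 'now' WHERE ( id = '1514' );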
Today I did some more digging through the code with gdb. I am not sure, but I may have an idea of what happens here:

RETURNING ctid fails when the code implementing RETURNING happens to get a "virtual" tuple as input. That of course fails, because a virtual tuple does not have any system columns (such as ctid). (It seems nobody hit this error before because nobody tried to use system columns in a RETURNING list.)

I tracked the virtual tuple from heaptuple.c::slot_getattr down to execMain.c::ExecUpdate, and I think somewhere along that path the virtual tuple needs to be "materialized" into a physical tuple.

I have zero experience with reporting such results, so I will try my best. The path of that virtual tuple in the problem case looks like:

*(slot_getattr::slot)
*(ExecEvalVar::econtext->ecxt_scantuple)
*(ExecTargetList::econtext->ecxt_scantuple)
*(ExecProject::econtext->ecxt_scantuple)
*(ExecProject::projInfo->pi_exprContext->ecxt_scantuple)
*(ExecProcessReturning::projectReturning->pi_exprContext->ecxt_scantuple)
*(ExecUpdate::resultRelInfo->ri_projectReturning->pi_exprContext->ecxt_scantuple)

All of these variables contained the same TupleTableSlot structure:

(gdb) print *(ExecUpdate::resultRelInfo->ri_projectReturning->pi_exprContext->ecxt_scantuple)
$29 = {
  type = T_TupleTableSlot,
  tts_isempty = 0 '\000',
  tts_shouldFree = 0 '\000',
  tts_shouldFreeMin = 0 '\000',
  tts_slow = 0 '\000',
  tts_tuple = 0x0,
  tts_tupleDescriptor = 0xdd09d1030,
  tts_mcxt = 0xdd08da2e0,
  tts_buffer = 0,
  tts_nvalid = 11,
  tts_values = 0xdd09d1840,
  tts_isnull = 0xdd09d18d0 "",
  tts_mintuple = 0x0,
  tts_minhdr = {
    t_len = 0,
    t_self = {
      ip_blkid = {
        bi_hi = 0,
        bi_lo = 0
      },
      ip_posid = 44192
    },
    t_tableOid = 13,
    t_data = 0x10
  },
  tts_off = 59333634048
}

Again, I am not sure, but ExecProcessReturning looks like a good candidate to perform the tuple materialization.

PS: Maybe this is just a lot of handwaving, but I have only just started studying gdb and the PostgreSQL source code.
PPS: I must say thanks to the authors of http://doxygen.postgresql.org; it is a great tool and saved me a lot of time today.

On Sun, Feb 20, 2011 at 12:01 PM, Maxim Boguk <maxim.boguk@gmail.com> wrote:
> Sorry, but it's very unlikely I can create a self-contained test case.
> The error doesn't happen on a standalone idle server. It happens only at
> random times, on a table with a very high update rate, in the production
> environment, and only after 1-30 minutes of calling my stored procedure
> in an endless loop (i.e. roughly once per 1000+ calls).
-- 
Maxim Boguk
Senior Postgresql DBA.

Skype: maxim.boguk
Jabber: maxim.boguk@gmail.com

LinkedIn profile: http://nz.linkedin.com/in/maximboguk
If they can send one man to the moon... why can't they send them all?

МойКруг: http://mboguk.moikrug.ru/
Maxim Boguk <maxim.boguk@gmail.com> writes:
> I tracked the virtual tuple from heaptuple.c::slot_getattr down to
> execMain.c::ExecUpdate, and I think somewhere along that path the virtual
> tuple needs to be "materialized" into a physical tuple.

That happens at the top of ExecUpdate:

    /*
     * get the heap tuple out of the tuple table slot, making sure we have a
     * writable copy
     */
    tuple = ExecMaterializeSlot(slot);

I don't see any code path there that could re-virtualize the slot.

Do you have any triggers or constraints on the target table?  Because those are about the only things that influence this code path at all...

> Again, I am not sure, but ExecProcessReturning looks like a good candidate
> to perform the tuple materialization.

If ExecProcessReturning were given a virtual slot, materializing it would do no good at all for this problem, because the system columns in the result would be garbage.  It has to receive an already materialized slot that contains the right information beforehand.  The only reason that retrieving ctid can work at all is that when heap_update is called, "tuple" is pointing at the slot's contained tuple and so setting tuple->t_self changes the contents of the slot.  That whole function is completely dependent on the slot contents being physical not virtual, and I don't see anything there that would break it.

			regards, tom lane
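One way to answer that question is a quick catalog check (the schema-qualified table name is taken from the backtrace; these queries are only illustrative):

    -- Triggers defined on the target table:
    SELECT tgname, tgenabled
    FROM pg_trigger
    WHERE tgrelid = 'bill.ctrl_servers'::regclass;

    -- Constraints defined on the target table:
    SELECT conname, pg_get_constraintdef(oid)
    FROM pg_constraint
    WHERE conrelid = 'bill.ctrl_servers'::regclass;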
Hi,

Yes, there is one BEFORE UPDATE trigger on the relation:

Triggers:
    _billing_main_denyaccess_71 BEFORE INSERT OR DELETE OR UPDATE ON ctrl_servers FOR EACH ROW EXECUTE PROCEDURE _billing_main.denyaccess('_billing_main')

However, that trigger should not fire at all, because the stored procedure runs with

    set local session_replication_role to 'replica';

But the tuple becomes "virtual" in these lines of ExecUpdate:

2027                    newtuple = ExecBRUpdateTriggers(estate, resultRelInfo,
2028                                                    tupleid, tuple);
2029
2030                    if (newtuple == NULL)       /* "do nothing" */
2031                        return;
2032
2033                    if (newtuple != tuple)      /* modified by Trigger(s) */
2034                    {
2035                        /*
2036                         * Put the modified tuple into a slot for convenience of routines
2037                         * below.  We assume the tuple was allocated in per-tuple memory
2038                         * context, and therefore will go away by itself.  The tuple table
2039                         * slot should not try to clear it.
2040                         */
2041                        TupleTableSlot *newslot = estate->es_trig_tuple_slot;
2042
2043                        if (newslot->tts_tupleDescriptor != slot->tts_tupleDescriptor)
2044                            ExecSetSlotDescriptor(newslot, slot->tts_tupleDescriptor);
2045                        ExecStoreTuple(newtuple, newslot, InvalidBuffer, false);
2046                        slot = newslot;
2047                        tuple = newtuple;
2048                    }

or, to be more exact, at line 2047:

2047                        tuple = newtuple;

What I didn't understand at first is why this code path fails only once per 1000-100000 calls. My idea is that

2033                    if (newtuple != tuple)      /* modified by Trigger(s) */

can sometimes return true even though no trigger has actually fired.

After more digging through the code, I think I found the reason. It looks like it happens in this part of the ExecBRUpdateTriggers code:

    /*
     * In READ COMMITTED isolation level it's possible that newtuple was
     * changed due to concurrent update. In that case we have a raw subplan
     * output tuple and need to run it through the junk filter.
     */
    if (newSlot != NULL)
        intuple = newtuple = ExecRemoveJunk(relinfo->ri_junkFilter, newSlot);

That would explain the rarity of the situation (it is a very, very rare case for the tuple to be changed that way), and it really does create a virtual tuple. It looks like tuple materialization code needs to be put here as well.

PS: Again, maybe this is just handwaving; I have only just started learning the PostgreSQL code. Any help from more skilled people would be great.

On Mon, Feb 21, 2011 at 10:07 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Maxim Boguk <maxim.boguk@gmail.com> writes:
>> I tracked the virtual tuple from heaptuple.c::slot_getattr down to
>> execMain.c::ExecUpdate, and I think somewhere along that path the virtual
>> tuple needs to be "materialized" into a physical tuple.
>
> That happens at the top of ExecUpdate:
>
>     /*
>      * get the heap tuple out of the tuple table slot, making sure we have a
>      * writable copy
>      */
>     tuple = ExecMaterializeSlot(slot);
>
> I don't see any code path there that could re-virtualize the slot.
>
> Do you have any triggers or constraints on the target table?  Because
> those are about the only things that influence this code path at all...
-- 
Maxim Boguk
Senior Postgresql DBA.

Skype: maxim.boguk
Jabber: maxim.boguk@gmail.com

LinkedIn profile: http://nz.linkedin.com/in/maximboguk
If they can send one man to the moon... why can't they send them all?

МойКруг: http://mboguk.moikrug.ru/
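A minimal sketch of the execution context being described, assuming the table and column names from the thread (the TID array value is only illustrative): with session_replication_role set to 'replica', ordinary (ENABLE ORIGIN) triggers such as the Slony denyaccess trigger are skipped, yet ExecUpdate still enters its BEFORE UPDATE trigger code path because the trigger exists.

    BEGIN;
    SET LOCAL session_replication_role = 'replica';  -- ordinary triggers will not fire
    UPDATE ONLY bill.ctrl_servers
       SET opt = opt
     WHERE ctid = ANY ('{"(21,1)"}'::tid[])
    RETURNING ctid;
    COMMIT;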
Maxim Boguk <maxim.boguk@gmail.com> writes:
> Yes, there is one BEFORE UPDATE trigger on the relation:
> ...
> However, that trigger should not fire at all, because the stored procedure
> runs with set local session_replication_role to 'replica';

Would be nice if you'd provided those sorts of details earlier, instead of us having to resort to interrogation to obtain them.

> My idea is that
>     if (newtuple != tuple)      /* modified by Trigger(s) */
> can sometimes return true even though no trigger has actually fired.

No, you've got that backwards: if that test did succeed, the if-body would guarantee that we have a materialized slot.  I think though that I see what is going on:

1. The slot passed to ExecUpdate is always going to be estate->es_junkFilter->jf_resultSlot, as a consequence of the fact that ExecutePlan will always have invoked ExecFilterJunk just above.

2. ExecUpdate forces the slot into the materialized state and sets its local "tuple" variable to the address of the physical tuple contained in the slot.

3. ExecBRUpdateTriggers runs and tries to lock the target tuple.  Occasionally, there is a concurrent update that's just committed, in which case GetTupleForTrigger runs EvalPlanQual and gives back a new tuple-to-be-stored in a different slot.  When that happens, ExecBRUpdateTriggers cleans the tuple by applying ExecRemoveJunk ... using estate->es_junkFilter.

4. execJunk.c does ExecClearTuple on its result slot.  At this point, ExecUpdate's local tuple variable is a dangling pointer, because we just freed the tuple it is pointing at.

5. ExecFilterJunk's result is a virtual tuple, but ExecRemoveJunk does ExecCopySlotTuple -> heap_form_tuple to produce a copied tuple that isn't owned by the slot.  However, if you're unlucky and the phase of the moon is just right, that copied tuple may get palloc'd in the same place where we just pfree'd the tuple that had been owned by the jf_resultSlot.

6. ExecBRUpdateTriggers falls through (not finding any triggers that will actually fire) and returns the copied tuple to ExecUpdate.  Normally, ExecUpdate would stuff the copied tuple into es_trig_tuple_slot and all would be well, because that slot would surely be in materialized state.  However, if you were sufficiently unlucky above, the "newtuple != tuple" test fails because of pointer equality between the dangling local variable and the live copied tuple.

7. At this point ExecUpdate has a physical tuple that is what it is supposed to store, so that's fine, and furthermore the slot contains the right data values for all the user columns.  So the only visible problem is that if you try to extract ctid or other system columns from the slot, that will fail, because the physical tuple isn't owned by the slot.

I think though that worse problems could ensue if there were active triggers that modified the row --- if you were really *seriously* unlucky, there could be a chance pointer match between the dangling tuple variable and a modified tuple returned by a trigger.  In that case the modified tuple would be what was stored, but the index entries generated for it would be from the pre-trigger slot values.  Ugh.

That quick little "ExecRemoveJunk" is a lot more dangerous than it looks.  I had actually looked at this before, but concluded it was OK because I couldn't reproduce the problem with a trigger in place.  I guess I wasn't unlucky enough to have the chance pointer equality occur.

The narrowest way to fix this would be to add a condition to the "newtuple != tuple" test in ExecUpdate to verify that "tuple" still matches what is stored in the slot.  That's pretty ugly though.  What would likely be better is to rearrange things so that ExecBRUpdateTriggers doesn't change the es_junkFilter's state.  Or we could have ExecBRUpdateTriggers store the tuple into es_trig_tuple_slot for itself and return that slot, thus eliminating the need for a pointer-equality-dependent test to see if something had been done.  Either way seems like a significantly more invasive patch, but it might be less likely to break again in future.

			regards, tom lane
I wrote: > [ theory about cause of bug #5798 ] Attached is a proposed patch against HEAD to fix this problem. It's a bit hard to *prove* that it fixes the problem, since there's no easy way to force the collision of palloc addresses. But I think it will. The patch changes the signature of ExecBRUpdateTriggers so that it takes and returns a TupleTableSlot instead of a HeapTuple. This essentially just moves the troublesome code (the "if (newtuple != tuple)" block) from ExecUpdate to the bottom of ExecBRUpdateTriggers. It solves the problem because in the case where ExecBRUpdateTriggers clobbers the contents of the junkfilter's output slot, it can immediately update the tuple pointer that will later be compared to see if the trigger(s) replaced the tuple. I chose to change ExecBRInsertTriggers, ExecIRInsertTriggers, and ExecIRUpdateTriggers to also take and return TupleTableSlots. These functions don't have a bug AFAICS, but it seems better that they all retain similar APIs. I'm not sure whether to back-patch all four changes or limit the patch to changing ExecBRUpdateTriggers in the back branches. Any thoughts about that? regards, tom lane diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c index cac11a6c64107a2f977d2d7e5dc0d29fce5e35c7..44f568f396414f69485ab0538b0086741e5068c9 100644 *** a/src/backend/commands/copy.c --- b/src/backend/commands/copy.c *************** CopyFrom(CopyState cstate) *** 1836,1842 **** ResultRelInfo *resultRelInfo; EState *estate = CreateExecutorState(); /* for ExecConstraints() */ ExprContext *econtext; ! TupleTableSlot *slot; MemoryContext oldcontext = CurrentMemoryContext; ErrorContextCallback errcontext; CommandId mycid = GetCurrentCommandId(true); --- 1836,1842 ---- ResultRelInfo *resultRelInfo; EState *estate = CreateExecutorState(); /* for ExecConstraints() */ ExprContext *econtext; ! TupleTableSlot *myslot; MemoryContext oldcontext = CurrentMemoryContext; ErrorContextCallback errcontext; CommandId mycid = GetCurrentCommandId(true); *************** CopyFrom(CopyState cstate) *** 1932,1939 **** estate->es_result_relation_info = resultRelInfo; /* Set up a tuple slot too */ ! slot = ExecInitExtraTupleSlot(estate); ! ExecSetSlotDescriptor(slot, tupDesc); /* Prepare to catch AFTER triggers. */ AfterTriggerBeginQuery(); --- 1932,1941 ---- estate->es_result_relation_info = resultRelInfo; /* Set up a tuple slot too */ ! myslot = ExecInitExtraTupleSlot(estate); ! ExecSetSlotDescriptor(myslot, tupDesc); ! /* Triggers might need a slot as well */ ! estate->es_trig_tuple_slot = ExecInitExtraTupleSlot(estate); /* Prepare to catch AFTER triggers. */ AfterTriggerBeginQuery(); *************** CopyFrom(CopyState cstate) *** 1960,1965 **** --- 1962,1968 ---- for (;;) { + TupleTableSlot *slot; bool skip_tuple; Oid loaded_oid = InvalidOid; *************** CopyFrom(CopyState cstate) *** 1983,2014 **** /* Triggers and stuff need to be invoked in query context. */ MemoryContextSwitchTo(oldcontext); skip_tuple = false; /* BEFORE ROW INSERT Triggers */ if (resultRelInfo->ri_TrigDesc && resultRelInfo->ri_TrigDesc->trig_insert_before_row) { ! HeapTuple newtuple; ! ! newtuple = ExecBRInsertTriggers(estate, resultRelInfo, tuple); ! if (newtuple == NULL) /* "do nothing" */ skip_tuple = true; ! else if (newtuple != tuple) /* modified by Trigger(s) */ ! { ! heap_freetuple(tuple); ! tuple = newtuple; ! 
} } if (!skip_tuple) { List *recheckIndexes = NIL; - /* Place tuple in tuple slot */ - ExecStoreTuple(tuple, slot, InvalidBuffer, false); - /* Check the constraints of the tuple */ if (cstate->rel->rd_att->constr) ExecConstraints(resultRelInfo, slot, estate); --- 1986,2013 ---- /* Triggers and stuff need to be invoked in query context. */ MemoryContextSwitchTo(oldcontext); + /* Place tuple in tuple slot --- but slot shouldn't free it */ + slot = myslot; + ExecStoreTuple(tuple, slot, InvalidBuffer, false); + skip_tuple = false; /* BEFORE ROW INSERT Triggers */ if (resultRelInfo->ri_TrigDesc && resultRelInfo->ri_TrigDesc->trig_insert_before_row) { ! slot = ExecBRInsertTriggers(estate, resultRelInfo, slot); ! if (slot == NULL) /* "do nothing" */ skip_tuple = true; ! else /* trigger might have changed tuple */ ! tuple = ExecMaterializeSlot(slot); } if (!skip_tuple) { List *recheckIndexes = NIL; /* Check the constraints of the tuple */ if (cstate->rel->rd_att->constr) ExecConstraints(resultRelInfo, slot, estate); diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c index 8d996a87c771d4210a97e449afdbe512420afa34..dc6ee9c266f2f9fc731a35d4ec18c23f283c477c 100644 *** a/src/backend/commands/trigger.c --- b/src/backend/commands/trigger.c *************** ExecASInsertTriggers(EState *estate, Res *** 1909,1920 **** false, NULL, NULL, NIL, NULL); } ! HeapTuple ExecBRInsertTriggers(EState *estate, ResultRelInfo *relinfo, ! HeapTuple trigtuple) { TriggerDesc *trigdesc = relinfo->ri_TrigDesc; ! HeapTuple newtuple = trigtuple; HeapTuple oldtuple; TriggerData LocTriggerData; int i; --- 1909,1921 ---- false, NULL, NULL, NIL, NULL); } ! TupleTableSlot * ExecBRInsertTriggers(EState *estate, ResultRelInfo *relinfo, ! TupleTableSlot *slot) { TriggerDesc *trigdesc = relinfo->ri_TrigDesc; ! HeapTuple slottuple = ExecMaterializeSlot(slot); ! HeapTuple newtuple = slottuple; HeapTuple oldtuple; TriggerData LocTriggerData; int i; *************** ExecBRInsertTriggers(EState *estate, Res *** 1947,1958 **** relinfo->ri_TrigFunctions, relinfo->ri_TrigInstrument, GetPerTupleMemoryContext(estate)); ! if (oldtuple != newtuple && oldtuple != trigtuple) heap_freetuple(oldtuple); if (newtuple == NULL) ! break; } ! return newtuple; } void --- 1948,1976 ---- relinfo->ri_TrigFunctions, relinfo->ri_TrigInstrument, GetPerTupleMemoryContext(estate)); ! if (oldtuple != newtuple && oldtuple != slottuple) heap_freetuple(oldtuple); if (newtuple == NULL) ! return NULL; /* "do nothing" */ } ! ! if (newtuple != slottuple) ! { ! /* ! * Return the modified tuple using the es_trig_tuple_slot. We assume ! * the tuple was allocated in per-tuple memory context, and therefore ! * will go away by itself. The tuple table slot should not try to ! * clear it. ! */ ! TupleTableSlot *newslot = estate->es_trig_tuple_slot; ! TupleDesc tupdesc = RelationGetDescr(relinfo->ri_RelationDesc); ! ! if (newslot->tts_tupleDescriptor != tupdesc) ! ExecSetSlotDescriptor(newslot, tupdesc); ! ExecStoreTuple(newtuple, newslot, InvalidBuffer, false); ! slot = newslot; ! } ! return slot; } void *************** ExecARInsertTriggers(EState *estate, Res *** 1966,1977 **** true, NULL, trigtuple, recheckIndexes, NULL); } ! HeapTuple ExecIRInsertTriggers(EState *estate, ResultRelInfo *relinfo, ! HeapTuple trigtuple) { TriggerDesc *trigdesc = relinfo->ri_TrigDesc; ! HeapTuple newtuple = trigtuple; HeapTuple oldtuple; TriggerData LocTriggerData; int i; --- 1984,1996 ---- true, NULL, trigtuple, recheckIndexes, NULL); } ! 
TupleTableSlot * ExecIRInsertTriggers(EState *estate, ResultRelInfo *relinfo, ! TupleTableSlot *slot) { TriggerDesc *trigdesc = relinfo->ri_TrigDesc; ! HeapTuple slottuple = ExecMaterializeSlot(slot); ! HeapTuple newtuple = slottuple; HeapTuple oldtuple; TriggerData LocTriggerData; int i; *************** ExecIRInsertTriggers(EState *estate, Res *** 2004,2015 **** relinfo->ri_TrigFunctions, relinfo->ri_TrigInstrument, GetPerTupleMemoryContext(estate)); ! if (oldtuple != newtuple && oldtuple != trigtuple) heap_freetuple(oldtuple); if (newtuple == NULL) ! break; } ! return newtuple; } void --- 2023,2051 ---- relinfo->ri_TrigFunctions, relinfo->ri_TrigInstrument, GetPerTupleMemoryContext(estate)); ! if (oldtuple != newtuple && oldtuple != slottuple) heap_freetuple(oldtuple); if (newtuple == NULL) ! return NULL; /* "do nothing" */ } ! ! if (newtuple != slottuple) ! { ! /* ! * Return the modified tuple using the es_trig_tuple_slot. We assume ! * the tuple was allocated in per-tuple memory context, and therefore ! * will go away by itself. The tuple table slot should not try to ! * clear it. ! */ ! TupleTableSlot *newslot = estate->es_trig_tuple_slot; ! TupleDesc tupdesc = RelationGetDescr(relinfo->ri_RelationDesc); ! ! if (newslot->tts_tupleDescriptor != tupdesc) ! ExecSetSlotDescriptor(newslot, tupdesc); ! ExecStoreTuple(newtuple, newslot, InvalidBuffer, false); ! slot = newslot; ! } ! return slot; } void *************** ExecASUpdateTriggers(EState *estate, Res *** 2257,2288 **** GetModifiedColumns(relinfo, estate)); } ! HeapTuple ExecBRUpdateTriggers(EState *estate, EPQState *epqstate, ResultRelInfo *relinfo, ! ItemPointer tupleid, HeapTuple newtuple) { TriggerDesc *trigdesc = relinfo->ri_TrigDesc; TriggerData LocTriggerData; HeapTuple trigtuple; HeapTuple oldtuple; - HeapTuple intuple = newtuple; TupleTableSlot *newSlot; int i; Bitmapset *modifiedCols; trigtuple = GetTupleForTrigger(estate, epqstate, relinfo, tupleid, &newSlot); if (trigtuple == NULL) ! return NULL; /* ! * In READ COMMITTED isolation level it's possible that newtuple was * changed due to concurrent update. In that case we have a raw subplan ! * output tuple and need to run it through the junk filter. */ if (newSlot != NULL) ! intuple = newtuple = ExecRemoveJunk(relinfo->ri_junkFilter, newSlot); modifiedCols = GetModifiedColumns(relinfo, estate); --- 2293,2336 ---- GetModifiedColumns(relinfo, estate)); } ! TupleTableSlot * ExecBRUpdateTriggers(EState *estate, EPQState *epqstate, ResultRelInfo *relinfo, ! ItemPointer tupleid, TupleTableSlot *slot) { TriggerDesc *trigdesc = relinfo->ri_TrigDesc; + HeapTuple slottuple = ExecMaterializeSlot(slot); + HeapTuple newtuple = slottuple; TriggerData LocTriggerData; HeapTuple trigtuple; HeapTuple oldtuple; TupleTableSlot *newSlot; int i; Bitmapset *modifiedCols; + /* get a copy of the on-disk tuple we are planning to update */ trigtuple = GetTupleForTrigger(estate, epqstate, relinfo, tupleid, &newSlot); if (trigtuple == NULL) ! return NULL; /* cancel the update action */ /* ! * In READ COMMITTED isolation level it's possible that target tuple was * changed due to concurrent update. In that case we have a raw subplan ! * output tuple in newSlot, and need to run it through the junk filter to ! * produce an insertable tuple. ! * ! * Caution: more than likely, the passed-in slot is the same as the ! * junkfilter's output slot, so we are clobbering the original value of ! * slottuple by doing the filtering. This is OK since neither we nor our ! 
  	 * caller have any more interest in the prior contents of that slot.
  	 */
  	if (newSlot != NULL)
! 	{
! 		slot = ExecFilterJunk(relinfo->ri_junkFilter, newSlot);
! 		slottuple = ExecMaterializeSlot(slot);
! 		newtuple = slottuple;
! 	}
  
  	modifiedCols = GetModifiedColumns(relinfo, estate);
*************** ExecBRUpdateTriggers(EState *estate, EPQ
*** 2314,2326 ****
  									   relinfo->ri_TrigFunctions,
  									   relinfo->ri_TrigInstrument,
  									   GetPerTupleMemoryContext(estate));
! 		if (oldtuple != newtuple && oldtuple != intuple)
  			heap_freetuple(oldtuple);
  		if (newtuple == NULL)
! 			break;
  	}
  	heap_freetuple(trigtuple);
! 	return newtuple;
  }
  
  void
--- 2362,2394 ----
  									   relinfo->ri_TrigFunctions,
  									   relinfo->ri_TrigInstrument,
  									   GetPerTupleMemoryContext(estate));
! 		if (oldtuple != newtuple && oldtuple != slottuple)
  			heap_freetuple(oldtuple);
  		if (newtuple == NULL)
! 		{
! 			heap_freetuple(trigtuple);
! 			return NULL;		/* "do nothing" */
! 		}
  	}
  	heap_freetuple(trigtuple);
! 
! 	if (newtuple != slottuple)
! 	{
! 		/*
! 		 * Return the modified tuple using the es_trig_tuple_slot.  We assume
! 		 * the tuple was allocated in per-tuple memory context, and therefore
! 		 * will go away by itself.  The tuple table slot should not try to
! 		 * clear it.
! 		 */
! 		TupleTableSlot *newslot = estate->es_trig_tuple_slot;
! 		TupleDesc	tupdesc = RelationGetDescr(relinfo->ri_RelationDesc);
! 
! 		if (newslot->tts_tupleDescriptor != tupdesc)
! 			ExecSetSlotDescriptor(newslot, tupdesc);
! 		ExecStoreTuple(newtuple, newslot, InvalidBuffer, false);
! 		slot = newslot;
! 	}
! 	return slot;
  }
  
  void
*************** ExecARUpdateTriggers(EState *estate, Res
*** 2342,2355 ****
  	}
  }
  
! HeapTuple
  ExecIRUpdateTriggers(EState *estate, ResultRelInfo *relinfo,
! 					 HeapTuple oldtuple, HeapTuple newtuple)
  {
  	TriggerDesc *trigdesc = relinfo->ri_TrigDesc;
  	TriggerData LocTriggerData;
! 	HeapTuple	intuple = newtuple;
! 	HeapTuple	rettuple;
  	int			i;
  
  	LocTriggerData.type = T_TriggerData;
--- 2410,2424 ----
  	}
  }
  
! TupleTableSlot *
  ExecIRUpdateTriggers(EState *estate, ResultRelInfo *relinfo,
! 					 HeapTuple trigtuple, TupleTableSlot *slot)
  {
  	TriggerDesc *trigdesc = relinfo->ri_TrigDesc;
+ 	HeapTuple	slottuple = ExecMaterializeSlot(slot);
+ 	HeapTuple	newtuple = slottuple;
  	TriggerData LocTriggerData;
! 	HeapTuple	oldtuple;
  	int			i;
  
  	LocTriggerData.type = T_TriggerData;
*************** ExecIRUpdateTriggers(EState *estate, Res
*** 2367,2392 ****
  								  TRIGGER_TYPE_UPDATE))
  			continue;
  		if (!TriggerEnabled(estate, relinfo, trigger, LocTriggerData.tg_event,
! 							NULL, oldtuple, newtuple))
  			continue;
  
! 		LocTriggerData.tg_trigtuple = oldtuple;
! 		LocTriggerData.tg_newtuple = newtuple;
  		LocTriggerData.tg_trigtuplebuf = InvalidBuffer;
  		LocTriggerData.tg_newtuplebuf = InvalidBuffer;
  		LocTriggerData.tg_trigger = trigger;
! 		rettuple = ExecCallTriggerFunc(&LocTriggerData,
  									   i,
  									   relinfo->ri_TrigFunctions,
  									   relinfo->ri_TrigInstrument,
  									   GetPerTupleMemoryContext(estate));
! 		if (newtuple != rettuple && newtuple != intuple)
! 			heap_freetuple(newtuple);
! 		newtuple = rettuple;
  		if (newtuple == NULL)
! 			break;
  	}
! 	return newtuple;
  }
  
  void
--- 2436,2477 ----
  								  TRIGGER_TYPE_UPDATE))
  			continue;
  		if (!TriggerEnabled(estate, relinfo, trigger, LocTriggerData.tg_event,
! 							NULL, trigtuple, newtuple))
  			continue;
  
! 		LocTriggerData.tg_trigtuple = trigtuple;
! 		LocTriggerData.tg_newtuple = oldtuple = newtuple;
  		LocTriggerData.tg_trigtuplebuf = InvalidBuffer;
  		LocTriggerData.tg_newtuplebuf = InvalidBuffer;
  		LocTriggerData.tg_trigger = trigger;
! 		newtuple = ExecCallTriggerFunc(&LocTriggerData,
  									   i,
  									   relinfo->ri_TrigFunctions,
  									   relinfo->ri_TrigInstrument,
  									   GetPerTupleMemoryContext(estate));
! 		if (oldtuple != newtuple && oldtuple != slottuple)
! 			heap_freetuple(oldtuple);
  		if (newtuple == NULL)
! 			return NULL;		/* "do nothing" */
  	}
! 
! 	if (newtuple != slottuple)
! 	{
! 		/*
! 		 * Return the modified tuple using the es_trig_tuple_slot.  We assume
! 		 * the tuple was allocated in per-tuple memory context, and therefore
! 		 * will go away by itself.  The tuple table slot should not try to
! 		 * clear it.
! 		 */
! 		TupleTableSlot *newslot = estate->es_trig_tuple_slot;
! 		TupleDesc	tupdesc = RelationGetDescr(relinfo->ri_RelationDesc);
! 
! 		if (newslot->tts_tupleDescriptor != tupdesc)
! 			ExecSetSlotDescriptor(newslot, tupdesc);
! 		ExecStoreTuple(newtuple, newslot, InvalidBuffer, false);
! 		slot = newslot;
! 	}
! 	return slot;
  }
  
  void
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 42662bdc461edfd2f9313795865577270fd33736..12a5b2a8953e19ade74f9a1c4f668532096b76c9 100644
*** a/src/backend/executor/nodeModifyTable.c
--- b/src/backend/executor/nodeModifyTable.c
*************** ExecInsert(TupleTableSlot *slot,
*** 199,258 ****
  	if (resultRelInfo->ri_TrigDesc &&
  		resultRelInfo->ri_TrigDesc->trig_insert_before_row)
  	{
! 		HeapTuple	newtuple;
! 
! 		newtuple = ExecBRInsertTriggers(estate, resultRelInfo, tuple);
  
! 		if (newtuple == NULL)	/* "do nothing" */
  			return NULL;
  
! 		if (newtuple != tuple)	/* modified by Trigger(s) */
! 		{
! 			/*
! 			 * Put the modified tuple into a slot for convenience of routines
! 			 * below.  We assume the tuple was allocated in per-tuple memory
! 			 * context, and therefore will go away by itself.  The tuple table
! 			 * slot should not try to clear it.
! 			 */
! 			TupleTableSlot *newslot = estate->es_trig_tuple_slot;
! 			TupleDesc	tupdesc = RelationGetDescr(resultRelationDesc);
! 
! 			if (newslot->tts_tupleDescriptor != tupdesc)
! 				ExecSetSlotDescriptor(newslot, tupdesc);
! 			ExecStoreTuple(newtuple, newslot, InvalidBuffer, false);
! 			slot = newslot;
! 			tuple = newtuple;
! 		}
  	}
  
  	/* INSTEAD OF ROW INSERT Triggers */
  	if (resultRelInfo->ri_TrigDesc &&
  		resultRelInfo->ri_TrigDesc->trig_insert_instead_row)
  	{
! 		HeapTuple	newtuple;
! 
! 		newtuple = ExecIRInsertTriggers(estate, resultRelInfo, tuple);
  
! 		if (newtuple == NULL)	/* "do nothing" */
  			return NULL;
  
! 		if (newtuple != tuple)	/* modified by Trigger(s) */
! 		{
! 			/*
! 			 * Put the modified tuple into a slot for convenience of routines
! 			 * below.  We assume the tuple was allocated in per-tuple memory
! 			 * context, and therefore will go away by itself.  The tuple table
! 			 * slot should not try to clear it.
! 			 */
! 			TupleTableSlot *newslot = estate->es_trig_tuple_slot;
! 			TupleDesc	tupdesc = RelationGetDescr(resultRelationDesc);
! 
! 			if (newslot->tts_tupleDescriptor != tupdesc)
! 				ExecSetSlotDescriptor(newslot, tupdesc);
! 			ExecStoreTuple(newtuple, newslot, InvalidBuffer, false);
! 			slot = newslot;
! 			tuple = newtuple;
! 		}
  
  		newId = InvalidOid;
  	}
--- 199,224 ----
  	if (resultRelInfo->ri_TrigDesc &&
  		resultRelInfo->ri_TrigDesc->trig_insert_before_row)
  	{
! 		slot = ExecBRInsertTriggers(estate, resultRelInfo, slot);
  
! 		if (slot == NULL)		/* "do nothing" */
  			return NULL;
  
! 		/* trigger might have changed tuple */
! 		tuple = ExecMaterializeSlot(slot);
  	}
  
  	/* INSTEAD OF ROW INSERT Triggers */
  	if (resultRelInfo->ri_TrigDesc &&
  		resultRelInfo->ri_TrigDesc->trig_insert_instead_row)
  	{
! 		slot = ExecIRInsertTriggers(estate, resultRelInfo, slot);
  
! 		if (slot == NULL)		/* "do nothing" */
  			return NULL;
  
! 		/* trigger might have changed tuple */
! 		tuple = ExecMaterializeSlot(slot);
  
  		newId = InvalidOid;
  	}
*************** ExecUpdate(ItemPointer tupleid,
*** 533,563 ****
  	if (resultRelInfo->ri_TrigDesc &&
  		resultRelInfo->ri_TrigDesc->trig_update_before_row)
  	{
! 		HeapTuple	newtuple;
! 
! 		newtuple = ExecBRUpdateTriggers(estate, epqstate, resultRelInfo,
! 										tupleid, tuple);
  
! 		if (newtuple == NULL)	/* "do nothing" */
  			return NULL;
  
! 		if (newtuple != tuple)	/* modified by Trigger(s) */
! 		{
! 			/*
! 			 * Put the modified tuple into a slot for convenience of routines
! 			 * below.  We assume the tuple was allocated in per-tuple memory
! 			 * context, and therefore will go away by itself.  The tuple table
! 			 * slot should not try to clear it.
! 			 */
! 			TupleTableSlot *newslot = estate->es_trig_tuple_slot;
! 			TupleDesc	tupdesc = RelationGetDescr(resultRelationDesc);
! 
! 			if (newslot->tts_tupleDescriptor != tupdesc)
! 				ExecSetSlotDescriptor(newslot, tupdesc);
! 			ExecStoreTuple(newtuple, newslot, InvalidBuffer, false);
! 			slot = newslot;
! 			tuple = newtuple;
! 		}
  	}
  
  	/* INSTEAD OF ROW UPDATE Triggers */
--- 499,512 ----
  	if (resultRelInfo->ri_TrigDesc &&
  		resultRelInfo->ri_TrigDesc->trig_update_before_row)
  	{
! 		slot = ExecBRUpdateTriggers(estate, epqstate, resultRelInfo,
! 									tupleid, slot);
  
! 		if (slot == NULL)		/* "do nothing" */
  			return NULL;
  
! 		/* trigger might have changed tuple */
! 		tuple = ExecMaterializeSlot(slot);
  	}
  
  	/* INSTEAD OF ROW UPDATE Triggers */
*************** ExecUpdate(ItemPointer tupleid,
*** 565,571 ****
  		resultRelInfo->ri_TrigDesc->trig_update_instead_row)
  	{
  		HeapTupleData oldtup;
- 		HeapTuple	newtuple;
  
  		Assert(oldtuple != NULL);
  		oldtup.t_data = oldtuple;
--- 514,519 ----
*************** ExecUpdate(ItemPointer tupleid,
*** 573,601 ****
  		ItemPointerSetInvalid(&(oldtup.t_self));
  		oldtup.t_tableOid = InvalidOid;
  
! 		newtuple = ExecIRUpdateTriggers(estate, resultRelInfo,
! 										&oldtup, tuple);
  
! 		if (newtuple == NULL)	/* "do nothing" */
  			return NULL;
  
! 		if (newtuple != tuple)	/* modified by Trigger(s) */
! 		{
! 			/*
! 			 * Put the modified tuple into a slot for convenience of routines
! 			 * below.  We assume the tuple was allocated in per-tuple memory
! 			 * context, and therefore will go away by itself.  The tuple table
! 			 * slot should not try to clear it.
! 			 */
! 			TupleTableSlot *newslot = estate->es_trig_tuple_slot;
! 			TupleDesc	tupdesc = RelationGetDescr(resultRelationDesc);
! 
! 			if (newslot->tts_tupleDescriptor != tupdesc)
! 				ExecSetSlotDescriptor(newslot, tupdesc);
! 			ExecStoreTuple(newtuple, newslot, InvalidBuffer, false);
! 			slot = newslot;
! 			tuple = newtuple;
! 		}
  	}
  	else
  	{
--- 521,534 ----
  		ItemPointerSetInvalid(&(oldtup.t_self));
  		oldtup.t_tableOid = InvalidOid;
  
! 		slot = ExecIRUpdateTriggers(estate, resultRelInfo,
! 									&oldtup, slot);
  
! 		if (slot == NULL)		/* "do nothing" */
  			return NULL;
  
! 		/* trigger might have changed tuple */
! 		tuple = ExecMaterializeSlot(slot);
  	}
  	else
  	{
diff --git a/src/include/commands/trigger.h b/src/include/commands/trigger.h
index c213ac7a4efa32ab79e80560a4a040fc4e666636..80a779ed0bdf143f058249843bebdf665925abe6 100644
*** a/src/include/commands/trigger.h
--- b/src/include/commands/trigger.h
*************** extern void ExecBSInsertTriggers(EState
*** 132,147 ****
  					 ResultRelInfo *relinfo);
  extern void ExecASInsertTriggers(EState *estate,
  					 ResultRelInfo *relinfo);
! extern HeapTuple ExecBRInsertTriggers(EState *estate,
  					 ResultRelInfo *relinfo,
! 					 HeapTuple trigtuple);
  extern void ExecARInsertTriggers(EState *estate,
  					 ResultRelInfo *relinfo,
  					 HeapTuple trigtuple,
  					 List *recheckIndexes);
! extern HeapTuple ExecIRInsertTriggers(EState *estate,
  					 ResultRelInfo *relinfo,
! 					 HeapTuple trigtuple);
  extern void ExecBSDeleteTriggers(EState *estate,
  					 ResultRelInfo *relinfo);
  extern void ExecASDeleteTriggers(EState *estate,
--- 132,147 ----
  					 ResultRelInfo *relinfo);
  extern void ExecASInsertTriggers(EState *estate,
  					 ResultRelInfo *relinfo);
! extern TupleTableSlot *ExecBRInsertTriggers(EState *estate,
  					 ResultRelInfo *relinfo,
! 					 TupleTableSlot *slot);
  extern void ExecARInsertTriggers(EState *estate,
  					 ResultRelInfo *relinfo,
  					 HeapTuple trigtuple,
  					 List *recheckIndexes);
! extern TupleTableSlot *ExecIRInsertTriggers(EState *estate,
  					 ResultRelInfo *relinfo,
! 					 TupleTableSlot *slot);
  extern void ExecBSDeleteTriggers(EState *estate,
  					 ResultRelInfo *relinfo);
  extern void ExecASDeleteTriggers(EState *estate,
*************** extern void ExecBSUpdateTriggers(EState
*** 160,179 ****
  					 ResultRelInfo *relinfo);
  extern void ExecASUpdateTriggers(EState *estate,
  					 ResultRelInfo *relinfo);
! extern HeapTuple ExecBRUpdateTriggers(EState *estate,
  					 EPQState *epqstate,
  					 ResultRelInfo *relinfo,
  					 ItemPointer tupleid,
! 					 HeapTuple newtuple);
  extern void ExecARUpdateTriggers(EState *estate,
  					 ResultRelInfo *relinfo,
  					 ItemPointer tupleid,
  					 HeapTuple newtuple,
  					 List *recheckIndexes);
! extern HeapTuple ExecIRUpdateTriggers(EState *estate,
  					 ResultRelInfo *relinfo,
! 					 HeapTuple oldtuple,
! 					 HeapTuple newtuple);
  extern void ExecBSTruncateTriggers(EState *estate,
  					 ResultRelInfo *relinfo);
  extern void ExecASTruncateTriggers(EState *estate,
--- 160,179 ----
  					 ResultRelInfo *relinfo);
  extern void ExecASUpdateTriggers(EState *estate,
  					 ResultRelInfo *relinfo);
! extern TupleTableSlot *ExecBRUpdateTriggers(EState *estate,
  					 EPQState *epqstate,
  					 ResultRelInfo *relinfo,
  					 ItemPointer tupleid,
! 					 TupleTableSlot *slot);
  extern void ExecARUpdateTriggers(EState *estate,
  					 ResultRelInfo *relinfo,
  					 ItemPointer tupleid,
  					 HeapTuple newtuple,
  					 List *recheckIndexes);
! extern TupleTableSlot *ExecIRUpdateTriggers(EState *estate,
  					 ResultRelInfo *relinfo,
! 					 HeapTuple trigtuple,
! 					 TupleTableSlot *slot);
  extern void ExecBSTruncateTriggers(EState *estate,
  					 ResultRelInfo *relinfo);
  extern void ExecASTruncateTriggers(EState *estate,
I wrote:
> Ugh.  That quick little "ExecRemoveJunk" is a lot more dangerous than it
> looks.  I had actually looked at this before, but concluded it was OK
> because I couldn't reproduce the problem with a trigger in place.
> I guess I wasn't unlucky enough to have the chance pointer equality
> occur.

On closer inspection, the palloc collision is actually extremely likely,
because what will happen is we'll pfree the old tuple and immediately
palloc the new one, and if the new one is of sufficiently similar size
that it needs the same size of alloc chunk, it's *guaranteed* to get
that same chunk back, because of the LIFO free-chunk chains in aset.c.

The reason that the problem is hard to reproduce is that most triggers
(certainly those written in plpgsql) will give back newly allocated
tuples even when you return the unmodified NEW tuple.  The only way to
expose the problem is for ExecBRUpdateTriggers' trigger-calling loop to
not replace the "newtuple", and the easiest way for that to happen is if
all the triggers are disabled.  So that's why you're seeing it when
fooling with the replication-role setting.  I was able to reproduce the
failure by creating a trigger with a false WHEN condition, and of course
there are other ways to prevent a trigger from being called too.

			regards, tom lane
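To make that reproduction recipe concrete, a rough sketch of the kind of setup described above follows. The table, function, and trigger names are invented for illustration, the concurrent-update step is only described in comments, and whether a given unpatched server actually fails still depends on the palloc-collision timing described in the message above.

-- Hypothetical setup, not taken from the original report: a BEFORE UPDATE
-- trigger whose WHEN clause is never true, so the trigger-calling loop in
-- ExecBRUpdateTriggers never replaces "newtuple".
CREATE TABLE trig_test (id int PRIMARY KEY, val int);
INSERT INTO trig_test SELECT g, g FROM generate_series(1, 1000) g;

CREATE FUNCTION trig_test_noop() RETURNS trigger LANGUAGE plpgsql AS $$
BEGIN
    -- Never actually runs, because of the WHEN (false) condition below.
    RETURN NEW;
END;
$$;

CREATE TRIGGER trig_test_never
    BEFORE UPDATE ON trig_test
    FOR EACH ROW
    WHEN (false)
    EXECUTE PROCEDURE trig_test_noop();

-- In one READ COMMITTED session run a ctid-based update similar to the
-- compactor's query, e.g.
--   UPDATE ONLY trig_test SET val = val
--   WHERE ctid = ANY ('{"(0,1)","(0,2)"}'::tid[]) RETURNING ctid;
-- while a second session concurrently updates rows on the same page, so
-- that EvalPlanQual has to recheck the tuple.  On an unpatched server this
-- can end in "cannot extract system attribute from virtual tuple".

Any other way of keeping the BEFORE UPDATE trigger from firing, for example ALTER TABLE ... DISABLE TRIGGER or an incompatible session_replication_role setting, should exercise the same code path.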
On Tue, Feb 22, 2011 at 1:31 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wrote:
> > Ugh.  That quick little "ExecRemoveJunk" is a lot more dangerous than it
> > looks.  I had actually looked at this before, but concluded it was OK
> > because I couldn't reproduce the problem with a trigger in place.
> > I guess I wasn't unlucky enough to have the chance pointer equality
> > occur.
>
> On closer inspection, the palloc collision is actually extremely likely,
> because what will happen is we'll pfree the old tuple and immediately
> palloc the new one, and if the new one is of sufficiently similar size
> that it needs the same size of alloc chunk, it's *guaranteed* to get
> that same chunk back, because of the LIFO free-chunk chains in aset.c.
>
> The reason that the problem is hard to reproduce is that most triggers
> (certainly those written in plpgsql) will give back newly allocated
> tuples even when you return the unmodified NEW tuple.  The only way to
> expose the problem is for ExecBRUpdateTriggers' trigger-calling loop to
> not replace the "newtuple", and the easiest way for that to happen is if
> all the triggers are disabled.  So that's why you're seeing it when
> fooling with the replication-role setting.  I was able to reproduce the
> failure by creating a trigger with a false WHEN condition, and of course
> there are other ways to prevent a trigger from being called too.
>
>			regards, tom lane
>

Thank you very much for fixing that issue. I apologize for not providing
some critical details at the start of the discussion (until yesterday they
seemed unimportant to me); I will try to provide all possible details in
the future if I find anything else like this.

Now I can start digging into the next strange problem, described here:
http://archives.postgresql.org/pgsql-admin/2011-01/msg00055.php

--
Maxim Boguk
Senior Postgresql DBA.
On Mon, Feb 21, 2011 at 6:31 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wrote:
>> Ugh.  That quick little "ExecRemoveJunk" is a lot more dangerous than it
>> looks.  I had actually looked at this before, but concluded it was OK
>> because I couldn't reproduce the problem with a trigger in place.
>> I guess I wasn't unlucky enough to have the chance pointer equality
>> occur.
>
> On closer inspection, the palloc collision is actually extremely likely,
> because what will happen is we'll pfree the old tuple and immediately
> palloc the new one, and if the new one is of sufficiently similar size
> that it needs the same size of alloc chunk, it's *guaranteed* to get
> that same chunk back, because of the LIFO free-chunk chains in aset.c.
>
> The reason that the problem is hard to reproduce is that most triggers
> (certainly those written in plpgsql) will give back newly allocated
> tuples even when you return the unmodified NEW tuple.  The only way to
> expose the problem is for ExecBRUpdateTriggers' trigger-calling loop to
> not replace the "newtuple", and the easiest way for that to happen is if
> all the triggers are disabled.  So that's why you're seeing it when
> fooling with the replication-role setting.  I was able to reproduce the
> failure by creating a trigger with a false WHEN condition, and of course
> there are other ways to prevent a trigger from being called too.

I bumped into the 'virtual tuple' error on 9.0.3 on a completely
unrelated query:

create table foo(a int, b int);
insert into foo select 1, 1 where not exists (select 1 from foo where
a = 1 for update);

9.0.4 got rid of the error, but since there are no triggers or anything
like that on foo, I thought I'd post my findings here in case anyone
else bumped into it.

merlin
Merlin Moncure <mmoncure@gmail.com> writes:
> I bumped into the 'virtual tuple' error on 9.0.3 on a completely
> unrelated query:
> create table foo(a int, b int);
> insert into foo select 1, 1 where not exists (select 1 from foo where
> a = 1 for update);
> 9.0.4 got rid of the error, but since there are no triggers or anything
> like that on foo, I thought I'd post my findings here in case anyone
> else bumped into it.

I think most likely this has to do with commit
d5478c3391f8f1a243abbc3d9253aac3d6d3538e, and is unrelated (except in
the error message) to that trigger issue.

			regards, tom lane