Thread: Delete cascade with three levels bug ?
Hi, hackers<br /><br /> I have a problem at PostgreSQL 8.3.5 (Slackware Server and Win 2003 Server) SO independent.<br /><br/>When run the scripts below I receive the error:<br /><br />---------------------------------------------------------------------------<br/>testes=# DELETE FROM pai WHERE co_pai =1;<br /> server closed the connection unexpectedly<br /> This probably means the server terminated abnormally<br/> before or while processing the request.<br />The connection to the server was lost. Attempting reset:Failed.<br /> ---------------------------------------------------------------------------<br /><br />The script createsthe father table, son and grandson. All conected with FK Cascade.<br /><br />I changed the comportment of the pg_triggerto postpone the validation of constraint<br /> (make compatible with 7.4 version because we migrate it).<br /><br/>When I delete the father register the next error ocourr.<br /><br />PS1.: On Windows need reinicialize the service.<br/> PS2.: On Linux all connection down but not is needed reinicialize the service.<br /> PS3.: I make a test on8.4 version and the same error ocourr.<br /><br /><br /> -------Scripts-------<br /> CREATE TABLE father<br /> (<br /> co_father double precision NOT NULL,<br /> no_description character varying(50) NOT NULL,<br /> CONSTRAINT pk_fatherPRIMARY KEY (co_father)<br /> )<br /> WITH (OIDS=TRUE);<br /> ALTER TABLE father OWNER TO postgres;<br /><br /><br/> CREATE TABLE son<br /> (<br /> co_son double precision NOT NULL,<br /> co_father double precision NOT NULL,<br/> no_description character varying(50) NOT NULL,<br /> CONSTRAINT pk_son PRIMARY KEY (co_son)<br /> )<br /> WITH(OIDS=TRUE);<br /> ALTER TABLE son OWNER TO postgres;<br /><br /> CREATE TABLE grandson<br /> (<br /> co_grandson doubleprecision NOT NULL,<br /> co_son double precision NOT NULL,<br /> no_description character varying(50) NOT NULL,<br/> CONSTRAINT pk_grandson PRIMARY KEY (co_grandson)<br /> )<br /> WITH (OIDS=TRUE);<br /> ALTER TABLE son OWNERTO postgres;<br /><br /> ALTER TABLE son<br /> ADD CONSTRAINT fk_son_father FOREIGN KEY (co_father)<br /> REFERENCESfather (co_father) MATCH SIMPLE<br /> ON UPDATE CASCADE ON DELETE CASCADE;<br /><br /> ALTER TABLE grandson<br/> ADD CONSTRAINT fk_grandson_son FOREIGN KEY (co_son)<br /> REFERENCES son (co_son) MATCH SIMPLE<br /> ON UPDATE CASCADE ON DELETE CASCADE;<br /><br /> UPDATE pg_constraint SET condeferred = TRUE, condeferrable = TRUE;<br/> UPDATE pg_trigger SET tgdeferrable = TRUE, tginitdeferred = TRUE;<br /><br /> INSERT INTO father VALUES(1, 'FATHER-1');<br/><br /> INSERT INTO son VALUES(1,1, 'Son FATHER-1');<br /> INSERT INTO son VALUES(2,1, 'Son FATHER-1');<br/> INSERT INTO son VALUES(3,1, 'Son FATHER-1');<br /><br /> INSERT INTO grandson VALUES(1,1, 'Grandson FATHER-1');<br/> INSERT INTO grandson VALUES(2,2, 'Grandson FATHER-1');<br /> INSERT INTO grandson VALUES(3,3, 'GrandsonFATHER-1');<br /><br /> DELETE FROM father WHERE co_father = 1<br /><br /><br />My LOGS -----------------------------------------------------------------------------------------------------------<br/><br />0LOG: 00000: server process (PID 23470) was terminated by signal 11: Segmentation fault<br /> 0LOCATION: LogChildExit,postmaster.c:2529<br />0LOG: 00000: terminating any other active server processes<br />0LOCATION: HandleChildCrash,postmaster.c:2374<br />0FATAL: 57P03: the database system is in recovery mode<br /> 0LOCATION: ProcessStartupPacket,postmaster.c:1648<br /> 0LOG: 00000: all server processes terminated; reinitializing<br />0LOCATION: PostmasterStateMachine, postmaster.c:2690<br />0LOG: 00000: database system was interrupted; last known upat 2009-10-27 11:43:37 BRST<br />0LOCATION: StartupXLOG, xlog.c:4836<br /> 0DEBUG: 00000: checkpoint record is at 1/1C865BD4<br/>0LOCATION: StartupXLOG, xlog.c:4906<br />0DEBUG: 00000: redo record is at 1/1C865BD4; shutdown TRUE<br />0LOCATION: StartupXLOG, xlog.c:4932<br />0DEBUG: 00000: next transaction ID: 0/27113; next OID: 67190<br /> 0LOCATION: StartupXLOG, xlog.c:4936<br />0DEBUG: 00000: next MultiXactId: 1; next MultiXactOffset: 0<br />0LOCATION: StartupXLOG,xlog.c:4939<br />0LOG: 00000: database system was not properly shut down; automatic recovery in progress<br/> 0LOCATION: StartupXLOG, xlog.c:5003<br />0LOG: 00000: record with zero length at 1/1C865C14<br />0LOCATION: ReadRecord, xlog.c:3126<br />0LOG: 00000: redo is not required<br />0LOCATION: StartupXLOG, xlog.c:5146<br/>0DEBUG: 00000: transaction ID wrap limit is 2147484026, limited by database "template1"<br /> 0LOCATION: SetTransactionIdLimit, varsup.c:283<br />0LOG: 00000: database system is ready to accept connections<br />0LOCATION: reaper, postmaster.c:2156<br /><br />-----------------------------------------------------------------------------------------------------------------------------<br clear="all"/><br />Has I know, the deletion cascade with three levels don't give this problem. The same think occourr ifI change the pg_trigger comportment to postpone validation. This is right?<br /><br />Sorry, but what wrap limit do ?<br/><br />Thanks in advanced.<br /><br />Sincerely,<br />-- <br />Marcelo Costa<br /><a href="http://www.marcelocosta.net"target="_blank"></a>-------------------------------------------------<br />“You can't alwaysget what want”, <br /><br />Doctor House in apology to Mike Jagger<br />
Marcelo Costa escreveu: > Hi, hackers > > I have a problem at PostgreSQL 8.3.5 (Slackware Server and Win 2003 > Server) SO independent. > > When run the scripts below I receive the error: > This is not a bug. There are many ways to shoot yourself in the foot; and it is one of them... > UPDATE pg_trigger SET tgdeferrable = TRUE, tginitdeferred = TRUE; > This query will defer all of the triggers, including the cascade ones. If you compiles with --enable-cassert you will see: TRAP: FailedAssertion("!(afterTriggers->query_depth >= 0)", Arquivo: "/a/pgsql/src/backend/commands/trigger.c", Linha: 3893) Why don't you use the syntax (DEFERRABLE and INITIALLY DEFERRED)? That will do the Right Thing (TM). BTW, is it worth preventing such a crash putting an elog message in trigger.c? -- Euler Taveira de Oliveira http://www.timbira.com/
On Tue, Oct 27, 2009 at 10:41 AM, Euler Taveira de Oliveira <euler@timbira.com> wrote: > Marcelo Costa escreveu: >> Hi, hackers >> >> I have a problem at PostgreSQL 8.3.5 (Slackware Server and Win 2003 >> Server) SO independent. >> >> When run the scripts below I receive the error: >> > This is not a bug. There are many ways to shoot yourself in the foot; and it > is one of them... > >> UPDATE pg_trigger SET tgdeferrable = TRUE, tginitdeferred = TRUE; >> > This query will defer all of the triggers, including the cascade ones. If you > compiles with --enable-cassert you will see: > > TRAP: FailedAssertion("!(afterTriggers->query_depth >= 0)", Arquivo: > "/a/pgsql/src/backend/commands/trigger.c", Linha: 3893) > > Why don't you use the syntax (DEFERRABLE and INITIALLY DEFERRED)? That will do > the Right Thing (TM). > > BTW, is it worth preventing such a crash putting an elog message in trigger.c? It doesn't seem right to allow a catalog change that results in an assertion failure. Seems like we should either prevent the catalog change, or have an elog() there rather than Assert(). ...Robert
Robert Haas <robertmhaas@gmail.com> writes: > On Tue, Oct 27, 2009 at 10:41 AM, Euler Taveira de Oliveira >> BTW, is it worth preventing such a crash putting an elog message in trigger.c? > It doesn't seem right to allow a catalog change that results in an > assertion failure. Seems like we should either prevent the catalog > change, or have an elog() there rather than Assert(). We can't prevent a superuser from making stupid catalog changes. I agree though that it's not very nice to Assert or dump core afterwards. Changing the assert to an elog seems like an appropriate answer, assuming that this isn't masking some more significant issue. regards, tom lane
Marcelo Costa <marcelojscosta@gmail.com> writes: > [ trying to defer RI_FKey_cascade_del trigger crashes the backend ] I looked at this a bit more and think that there's more to it than pilot error. The crash occurs because we queue a deferred trigger here: #0 AfterTriggerSaveEvent (relinfo=0x40181778, event=1, row_trigger=1 '\001', oldtup=0x4018c1b8, newtup=0x0, recheckIndexes=0x0,modifiedCols=0x0) at trigger.c:3890 #1 0x2db9c0 in ExecARDeleteTriggers (estate=0x40181778, relinfo=0x40181778, tupleid=0x4018c1b8) at trigger.c:1896 #2 0x317124 in ExecDelete (tupleid=0x7b03c388, planSlot=0x4018b2c0, epqstate=0x4018b664, estate=0x401816a8) at nodeModifyTable.c:368 #3 0x317894 in ExecModifyTable (node=0x4018b618) at nodeModifyTable.c:738 #4 0x2fe01c in ExecProcNode (node=0x4018b618) at execProcnode.c:359 #5 0x2fc034 in ExecutePlan (estate=0x401816a8, planstate=0x4018b618, operation=CMD_DELETE, sendTuples=0 '\000', numberTuples=0,direction=ForwardScanDirection, dest=0x40003a80) at execMain.c:1188 #6 0x2fadf0 in standard_ExecutorRun (queryDesc=0x4017e2a0, direction=ForwardScanDirection, count=0) at execMain.c:278 #7 0x2fac94 in ExecutorRun (queryDesc=0x40181778, direction=ForwardScanDirection, count=1) at execMain.c:227 #8 0x3268c8 in _SPI_pquery (queryDesc=0x4017e2a0, fire_triggers=0 '\000', tcount=0) at spi.c:2009 #9 0x326520 in _SPI_execute_plan (plan=0x40180288, paramLI=0x4017e280, snapshot=0x0, crosscheck_snapshot=0x0, read_only=0,fire_triggers=0 '\000', tcount=0) at spi.c:1831 #10 0x32383c in SPI_execute_snapshot (plan=0x40180288, Values=0x7b03be88, Nulls=0x7b03bf88 " ", snapshot=0x0, crosscheck_snapshot=0x0,read_only=0, fire_triggers=0, tcount=0) at spi.c:441 #11 0x493d08 in ri_PerformCheck (qkey=0x7b03bb20, qplan=0x40180288, fk_rel=0x40133600, pk_rel=0x401262f0, old_tuple=0x7b03b5a8,new_tuple=0x0, detectNewRows=1, expect_OK=8, constrname=0x7b03b8cc "fk_son_father") at ri_triggers.c:3350 #12 0x4901d4 in RI_FKey_cascade_del (fcinfo=0x40181778) at ri_triggers.c:1166 #13 0x2db140 in ExecCallTriggerFunc (trigdata=0x7b03b588, tgindx=0, finfo=0x7b03b850, instr=0x0, per_tuple_context=0x40153b60)at trigger.c:1631 #14 0x2dccc8 in AfterTriggerExecute (event=0x40170678, rel=0x401262f0, trigdesc=0x1, finfo=0x4016a0d0, instr=0x0, per_tuple_context=0x40153b60)at trigger.c:2830 #15 0x2dd038 in afterTriggerInvokeEvents (events=0x40170a58, firing_id=1, estate=0x40169e40, delete_ok=1 '\001') at trigger.c:3009 #16 0x2dd5ac in AfterTriggerFireDeferred () at trigger.c:3247 #17 0x1fd230 in CommitTransaction () at xact.c:1581 #18 0x1fdfe8 in CommitTransactionCommand () at xact.c:2323 #19 0x3ed0e8 in finish_xact_command () at postgres.c:2372 #20 0x3eae20 in exec_simple_query (query_string=0x400625b8 "DELETE FROM father WHERE co_father = 1;") at postgres.c:1039 #21 0x3ef344 in PostgresMain (argc=2, argv=0x51, username=0x4005f9d8 "postgres") at postgres.c:3573 when we are not inside any AfterTriggerBeginQuery/AfterTriggerEndQuery pair. Normally _SPI_pquery() would take care of that detail, but it's been specifically told not to by the RI trigger code (notice the fire_triggers=0 arguments). It is not immediately obvious that an RI trigger query could never cause AFTER triggers to be queued, so I think this is at least a latent bug, even if this particular symptom involves intentional misconfiguration of standard triggers. The two reasonable fixes seem to be to change our minds about not letting SPI fire deferred triggers in this usage, or to add code to AfterTriggerFireDeferred to invoke AfterTriggerBegin/EndQuery. I do not recall what the reasoning was for deferring triggers during ri_triggers calls, and am afraid to monkey with that behavior unless someone can lay out an argument why it's okay to change. So modifying AfterTriggerFireDeferred seems like the right solution. regards, tom lane
I wrote: > The crash occurs because we queue a deferred trigger here ... > when we are not inside any AfterTriggerBeginQuery/AfterTriggerEndQuery > pair. Normally _SPI_pquery() would take care of that detail, but it's > been specifically told not to by the RI trigger code (notice the > fire_triggers=0 arguments). It is not immediately obvious that an > RI trigger query could never cause AFTER triggers to be queued, so > I think this is at least a latent bug, even if this particular symptom > involves intentional misconfiguration of standard triggers. After further study I've gone back to the let's-just-make-the-assert-an-elog camp. The reason we have an issue is exactly that the cascade-delete trigger assumes that it will never be deferred, which is how come it can get away with telling SPI to not deal with any invoked AFTER triggers immediately. There isn't any good reason to support deferring that trigger, nor is there any way to get into the state short of clueless superuser manipulation of the catalogs. I did look at modifying AfterTriggerFireDeferred, but concluded that's not a great solution on two grounds: * there doesn't seem to be any way to do it without taking a noticeable performance hit (in particular, an extra useless scan over the deferred trigger queue to verify everything's been fired); * the crash occurs all the way back to 8.0, and such a fix would be a great deal more trouble to back-port. So elog it is ... regards, tom lane