Re: Breakage in trigger.c - Mailing list pgsql-hackers

From Alvaro Herrera
Subject Re: Breakage in trigger.c
Date
Msg-id 20040907001554.GD25809@dcc.uchile.cl
Whole thread Raw
In response to Breakage in trigger.c  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Breakage in trigger.c  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Mon, Sep 06, 2004 at 05:55:29PM -0400, Tom Lane wrote:
> I can't believe that the coding at trigger.c line 2010 ff is correct:
> 
>         /*
>          * Skip executing cancelled events, and events done by
>          * transactions that are not aborted.
>          */
>         if (!(event->dte_event & TRIGGER_DEFERRED_CANCELED) ||
>             (event->dte_event & TRIGGER_DEFERRED_DONE &&
>              TransactionIdIsValid(event->dte_done_xid) &&
>              !TransactionIdDidAbort(event->dte_done_xid)))
>         {
> 
> Surely the sense of this is backwards, and it should be
> 
>         if (!(event->dte_event & TRIGGER_DEFERRED_CANCELED) &&
>             !(event->dte_event & TRIGGER_DEFERRED_DONE &&
>               TransactionIdIsValid(event->dte_done_xid) &&
>               !TransactionIdDidAbort(event->dte_done_xid)))
>         {

Hmmm ... yes, I think you are right.  I remember staring at that test
for a long time, decomposing it on paper, I wrote the version you
propose and then revert to the other one, etc.


> AFAICT we don't actually use TRIGGER_DEFERRED_CANCELED, so the existing
> coding never "skips" at all here, which makes it just a performance loss
> rather than visible misbehavior.

I think the fact that the items are also checked means that they are not
done more than once.

> I'm also concerned about the fact that the per-item states have
> dti_done_xid values distinct from the whole-event value.  It's
> not obvious that a rollback of the subxact that did one item implies
> a rollback of the subxact that last marked the event as scanned.
> Can anyone offer a proof that that's OK?  If it is OK, is it really
> necessary to have per-item dti_done_xid at all?

Not sure about this one.  What happened is that I saw a "done" flag in
the events, another one in the item, so I added a Xid flag to each to be
able to check them indepently.  All in all I don't think I grok this
code enough to make a judgement either way.

> Finally, surely the "Mark the event done" case should advance
> prev_event?  As-is the code is capable of messing up the list links.

Not sure.  Why doesn't the other arm of that conditional need it?


All in all I think this highlights a big shortcoming of this project.
If there was much more extensive regression testing, this code could be
refactored with much more confidence, and this kind of bugs wouldn't be
there on the first place.  Sadly, it's too hard to do unit testing in
this kind of code, which makes refactoring dangerous and costly.

-- 
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
Management by consensus: I have decided; you concede.
(Leonard Liu)



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Breakage in trigger.c
Next
From: Tom Lane
Date:
Subject: Re: Breakage in trigger.c