"Steve Wolfe" <steve@iboats.com> writes:
> Some time ago, I presented a problem I was having and receiving "ERROR:
> heap_mark4update: (am)invalid tid" on the PG mailing list, and you had
> indicated that you'd need access to a machine with a copy of the data so
> that you could debg it. I've finally gotten a machine ready,
> [snip]
> As a review, the query in question is this:
> update ma_cart set active = 'f' where ma_cart.expires < 'now' and
> ma_cart.active = 't' and ma_cart.cart_id = ma_inv.cart_id;
> ERROR: heap_mark4update: (am)invalid tid
> The error is *sometimes* fixed by a vacuum, but not always.
I have looked into this, and found that the problem occurs when there
are some rows in ma_cart that join to more than one row in ma_inv.
This results in the UPDATE trying to update the same row more than once.
Ordinarily, the first update of a particular row would succeed and the
rest would fail silently (since they'd see the row as already updated
in the current command). However, because you have a BEFORE UPDATE
trigger defined on this table, the trigger code tries to obtain a row
lock (a SELECT FOR UPDATE type lock) on the target row before it runs
the trigger, so as to ensure that the row will hold still while the
trigger is executed. And heap_mark4update barfs with the above error
message. Why would it do that? Because it's been called with the
wrong current command ID :-(.
It turns out that the first time you call the BEFORE UPDATE trigger
in a particular session, plpgsql calls SPI_prepare which does a
CommandCounterIncrement. So later trigger calls in the same command
see GetCurrentCommandId() returning 1. When we reach a case where
we are reprocessing an already-processed row (which will have been
marked with deletion command ID 0), heap_mark4update complains ---
quite properly I think.
It looks to me like ExecBRUpdateTriggers and ExecBRDeleteTriggers need
to be passed the snapshot command ID being used by the main executor,
rather than assuming they can use GetCurrentCommandId. (Even if one
thinks that SPI_prepare shouldn't do CommandCounterIncrement, it's
quite plausible for a BEFORE trigger to do other queries, so we can't
assume that GetCurrentCommandId returns the same value being used by
the outer query.)
I doubt that the apparent dependencies on vacuum and other queries
are real --- the real issue was whether the trigger had been used
previously in the same session. But without a pre-existing reason
to think of internal backend state as a contributing factor, I can
see that you'd not have thought of it (it took me a heck of a long
time to figure out myself, even while watching it with a debugger).
I'll work up a patch against 7.3 that you can apply to verify that
it solves the problem for you.
regards, tom lane