Thread: Error: heap_mark4update: (am)invalid tid
I'm receiving an error message that I'm not sure how to resolve: heap_mark4update: (am)invalid tid . The offending query is: "update a set active = 'f' where a.expires < 'now' and a.active = 't' and a.cart_id = b.cart_id" The error is a bit inconsistent. Doing a select with the saem criteria will work fine - but the select gives a few more rows in the result than the update will when it works. Often (but not always), vacuuming both tables a and b will get the update to work - but again, not always. Table a has ~200,000 rows, and b has ~1,000 rows. Table a changes very frequently. The PG version is 7.3 on a dual-Athlon Linux machine. I would have first upgraded, but I've seen references in the archives to the same (or similar) problems happening with newer versions as well. If this is a known issue, I'd love to hear the workaround. If not, I'll offer any information I can to help track down the problem. steve
"Steve Wolfe" <nw@codon.com> writes: > I'm receiving an error message that I'm not sure how to resolve: > heap_mark4update: (am)invalid tid . The offending query is: > "update a set active = 'f' where a.expires < 'now' and a.active = 't' and > a.cart_id = b.cart_id" Hmm ... a plain UPDATE doesn't invoke heap_mark4update. Do you have any foreign keys involved here? > The error is a bit inconsistent. Doing a select with the saem criteria > will work fine - but the select gives a few more rows in the result than > the update will when it works. Often (but not always), vacuuming both > tables a and b will get the update to work - but again, not always. So once the error happens, it's persistent? That's pretty interesting. I'd have expected it to depend on concurrent activity ... > If this is a > known issue, I'd love to hear the workaround. AFAIK there is no known cause for this in 7.3.*. regards, tom lane
> > I'm receiving an error message that I'm not sure how to resolve: > > heap_mark4update: (am)invalid tid . The offending query is: > > > "update a set active = 'f' where a.expires < 'now' and a.active = 't' and > > a.cart_id = b.cart_id" > > Hmm ... a plain UPDATE doesn't invoke heap_mark4update. Do you have any > foreign keys involved here? Yes, a.cart_id is a primary key, and b.cart_id references a.cart_id. > > The error is a bit inconsistent. Doing a select with the saem criteria > > will work fine - but the select gives a few more rows in the result than > > the update will when it works. Often (but not always), vacuuming both > > tables a and b will get the update to work - but again, not always. > > So once the error happens, it's persistent? That's pretty interesting. > I'd have expected it to depend on concurrent activity ... Once it's failed, it will kep failing until the vacuuming. Once, the vacuum didn't fix the problem. On that occasion, doing the select (with the same criteria) first, then the update *did* fix the problem. The database is fully vacuumed nightly. It's only after rows have been added.modified in a for a while that the problem crops up. Again, if there's any more information I can offer, let me know. steve
"Steve Wolfe" <nw@codon.com> writes: > Again, if there's any more information I can offer, let me know. I think the next step is to dig into it with a debugger. Can you do that, or offer access to your machine to someone who can? The fact that it's reproducible once triggered should make it easier to attack with a debugger... regards, tom lane
Hi, i have the same problem, and i would like to know if there is now a fix ? Here is some info : Message error: heap_mark4update: (am)invalid tid Postgresql version 7.3.2 Short : it happens on an update. Long : it happens when i do : UPDATE prospect SET abandon = 'O', modifie_par='amoal' WHERE id_prospect = '47111'; WARNING: Error occurred while executing PL/pgSQL function abandon_prospect WARNING: line 3 at SQL statement ERROR: heap_mark4update: (am)invalid tid There is a trigger on the table "prospect" which launches the function "abandon_prospect". This function is the bogus one : create function abandon_prospect() returns opaque as 'begin if (old.abandon != ''O'' or old.abandon is null) and new.abandon = ''O'' then update contact set date_reponse = now(), type_reponse = ''abandon'' where id_prospect = new.id_prospect anddate_reponse is null; end if; return new; end;' language 'plpgsql'; If i do directly an update on the table 'contact' in psql, whatever i do, i have the same error. Vacuuming the base doesn't do anything better. Doing a select on the table (as tried in a previous message) doesn't do anything better. Table 'contact' has 51000 rows. I have the same problem on the production machine and on my test machine. Note that the datas on my test machine have been dumped from the production machine 2 weeks ago. As the bogus request is rare, i don't know when this problem has begun. I have a primary key, five indexes and 13 triggers on this table. And about the debugger option, i don't know how to use one, but i might offer an access to my test machine if needed. If there is something i can do to help to debug this, just let me know. Best regards, David Pradier On Mon, Feb 03, 2003 at 05:39:10PM -0500, Tom Lane wrote: > "Steve Wolfe" <nw@codon.com> writes: > > Again, if there's any more information I can offer, let me know. > > I think the next step is to dig into it with a debugger. Can you do > that, or offer access to your machine to someone who can? > > The fact that it's reproducible once triggered should make it easier > to attack with a debugger... > > regards, tom lane > -- dpradier@apartia.fr 01.42.24.72.74
David Pradier <dpradier@apartia.fr> writes: > Hi, i have the same problem, and i would like to know if there is now a > fix ? Hmm. I am not sure if this is really the same problem as Steve Wolfe had, but the patch for his problem is in CVS: 2003-03-27 09:33 tgl * src/: backend/commands/trigger.c, backend/executor/execMain.c, include/commands/trigger.h (REL7_3_STABLE): GetTupleForTrigger must use outer transaction's command counter for time qual checking, not GetCurrentCommandId. Per test case from Steve Wolfe. You could try getting those diffs from our CVS server and applying them locally to see if it fixes the problem. Or you could wait for 7.3.3 to come out --- but if it turns out that you are not seeing the same thing as Steve did, it would be better to find out before we release 7.3.3. regards, tom lane