Thread: attempted to lock invisible tuple - PG 8.4.1
I'm running our products test suite against PostgreSQL 8.4.1. The test suite runs fine against 8.3.7. With 8.4.1, some of our tests are failing with the exception 'attempted to lock invisible tuple'. The failures are repeatable - they crash every time at the same point. They crash no matter if they are being run in isolation or as part of the larger test suite. Anyone know what we could be doing that triggers that? Looking at our statement logs we don't seem to be doing anything unusual. The failing tests I've checked are running under SERIALIZABLE isolation level, and the database will have been recreated a few instants ago using 'createdb --template test_template_db'. One of the statement logs is at http://paste.ubuntu.com/285983/ - I can't see anything unusual going on but it might help diagnose the problem. -- Stuart Bishop <stuart@stuartbishop.net> http://www.stuartbishop.net/
On 5 Oct 2009, at 8:58, Stuart Bishop wrote: > I'm running our products test suite against PostgreSQL 8.4.1. The test > suite runs fine against 8.3.7. > > With 8.4.1, some of our tests are failing with the exception > 'attempted to lock invisible tuple'. The failures are repeatable - > they crash every time at the same point. They crash no matter if they > are being run in isolation or as part of the larger test suite. > > Anyone know what we could be doing that triggers that? Looking at our > statement logs we don't seem to be doing anything unusual. The failing > tests I've checked are running under SERIALIZABLE isolation level, and > the database will have been recreated a few instants ago using > 'createdb --template test_template_db'. A similar issue was discussed just recently here: http://archives.postgresql.org/pgsql-general/2009-09/msg01219.php That issue involved cursors though (and a serializable isolation level, but you have that). Do you have any triggers that use cursors on the table that the update fails for? > One of the statement logs is at http://paste.ubuntu.com/285983/ - I > can't see anything unusual > going on but it might help diagnose the problem. Alban Hertroys -- Screwing up is the best way to attach something to the ceiling. !DSPAM:737,4ac9bad911688389419940!
On Mon, Oct 5, 2009 at 4:22 PM, Alban Hertroys <dalroi@solfertje.student.utwente.nl> wrote: > On 5 Oct 2009, at 8:58, Stuart Bishop wrote: > >> I'm running our products test suite against PostgreSQL 8.4.1. The test >> suite runs fine against 8.3.7. >> >> With 8.4.1, some of our tests are failing with the exception >> 'attempted to lock invisible tuple'. The failures are repeatable - >> they crash every time at the same point. They crash no matter if they >> are being run in isolation or as part of the larger test suite. >> >> Anyone know what we could be doing that triggers that? Looking at our >> statement logs we don't seem to be doing anything unusual. The failing >> tests I've checked are running under SERIALIZABLE isolation level, and >> the database will have been recreated a few instants ago using >> 'createdb --template test_template_db'. > > A similar issue was discussed just recently here: > http://archives.postgresql.org/pgsql-general/2009-09/msg01219.php > > That issue involved cursors though (and a serializable isolation level, but > you have that). Do you have any triggers that use cursors on the table that > the update fails for? There is a trigger on that table, and it is certainly the culprit as can be seen here (different table, same trigger): launchpad_dev=# begin; BEGIN launchpad_dev=# set transaction_isolation to serializable; SET launchpad_dev=# update bug set name='foooz' where id=1; ERROR: attempted to lock invisible tuple launchpad_dev=# abort; ROLLBACK launchpad_dev=# alter table bug disable trigger tsvectorupdate; ALTER TABLE launchpad_dev=# begin; BEGIN launchpad_dev=# set transaction_isolation to serializable; SET launchpad_dev=# update bug set name='foooz' where id=1; UPDATE 1 launchpad_dev=# abort; ROLLBACK I haven't had luck reducing this to a test case though. I'll give it another shot tomorrow. Here are some more details for the audience: launchpad_dev=# \d bug Table "public.bug" Column | Type | Modifiers ------------------------+-----------------------------+------------------------------------------------------------------------------------ id | integer | not null default nextval('bug_id_seq'::regclass) datecreated | timestamp without time zone | not null default timezone('UTC'::text, ('now'::text)::timestamp(6) with time zone) name | text | title | text | not null description | text | not null owner | integer | not null duplicateof | integer | fti | ts2.tsvector | private | boolean | not null default false security_related | boolean | not null default false date_last_updated | timestamp without time zone | not null default timezone('UTC'::text, now()) date_made_private | timestamp without time zone | who_made_private | integer | date_last_message | timestamp without time zone | number_of_duplicates | integer | not null default 0 message_count | integer | not null default 0 users_affected_count | integer | default 0 users_unaffected_count | integer | default 0 hotness | integer | not null default 0 Indexes: "bug_pkey" PRIMARY KEY, btree (id) "bug_name_key" UNIQUE, btree (name) "bug__date_last_message__idx" btree (date_last_message) "bug__date_last_updated__idx" btree (date_last_updated) CLUSTER "bug__datecreated__idx" btree (datecreated) "bug__hotness__idx" btree (hotness) "bug__users_affected_count__idx" btree (users_affected_count) "bug__users_unaffected_count__idx" btree (users_unaffected_count) "bug__who_made_private__idx" btree (who_made_private) WHERE who_made_private IS NOT NULL "bug_duplicateof_idx" btree (duplicateof) "bug_fti" gist (fti) "bug_owner_idx" btree (owner) Check constraints: "notduplicateofself" CHECK (NOT id = duplicateof) "sane_description" CHECK (ltrim(description) <> ''::text AND char_length(description) <= 50000) "valid_bug_name" CHECK (valid_bug_name(name)) Foreign-key constraints: "bug__who_made_private__fk" FOREIGN KEY (who_made_private) REFERENCES person(id) "bug_duplicateof_fk" FOREIGN KEY (duplicateof) REFERENCES bug(id) "bug_owner_fk" FOREIGN KEY (owner) REFERENCES person(id) Referenced by: TABLE "bugactivity" CONSTRAINT "$1" FOREIGN KEY (bug) REFERENCES bug(id) TABLE "bugsubscription" CONSTRAINT "$2" FOREIGN KEY (bug) REFERENCES bug(id) TABLE "bug" CONSTRAINT "bug_duplicateof_fk" FOREIGN KEY (duplicateof) REFERENCES bug(id) TABLE "bugaffectsperson" CONSTRAINT "bugaffectsperson_bug_fkey" FOREIGN KEY (bug) REFERENCES bug(id) TABLE "bugattachment" CONSTRAINT "bugattachment_bug_fk" FOREIGN KEY (bug) REFERENCES bug(id) TABLE "bugbranch" CONSTRAINT "bugbranch_bug_fkey" FOREIGN KEY (bug) REFERENCES bug(id) TABLE "bugcve" CONSTRAINT "bugcve_bug_fk" FOREIGN KEY (bug) REFERENCES bug(id) TABLE "bugmessage" CONSTRAINT "bugmessage__bug__fk" FOREIGN KEY (bug) REFERENCES bug(id) TABLE "bugnomination" CONSTRAINT "bugnomination__bug__fk" FOREIGN KEY (bug) REFERENCES bug(id) TABLE "bugnotification" CONSTRAINT "bugnotification_bug_fkey" FOREIGN KEY (bug) REFERENCES bug(id) TABLE "bugnotificationarchive" CONSTRAINT "bugnotificationarchive__bug__fk" FOREIGN KEY (bug) REFERENCES bug(id) TABLE "bugpackageinfestation" CONSTRAINT "bugpackageinfestation_bug_fk" FOREIGN KEY (bug) REFERENCES bug(id) TABLE "bugproductinfestation" CONSTRAINT "bugproductinfestation_bug_fk" FOREIGN KEY (bug) REFERENCES bug(id) TABLE "bugtask" CONSTRAINT "bugtask__bug__fk" FOREIGN KEY (bug) REFERENCES bug(id) TABLE "bugwatch" CONSTRAINT "bugwatch_bug_fk" FOREIGN KEY (bug) REFERENCES bug(id) TABLE "hwsubmissionbug" CONSTRAINT "hwsubmissionbug_bug_fkey" FOREIGN KEY (bug) REFERENCES bug(id) TABLE "mentoringoffer" CONSTRAINT "mentoringoffer_bug_fkey" FOREIGN KEY (bug) REFERENCES bug(id) TABLE "questionbug" CONSTRAINT "questionbug__bug__fk" FOREIGN KEY (bug) REFERENCES bug(id) TABLE "specificationbug" CONSTRAINT "specificationbug_bug_fk" FOREIGN KEY (bug) REFERENCES bug(id) Triggers: set_bug_number_of_duplicates_t AFTER INSERT OR DELETE OR UPDATE ON bug FOR EACH ROW EXECUTE PROCEDURE set_bug_number_of_duplicates() Disabled triggers: tsvectorupdate BEFORE INSERT OR UPDATE ON bug FOR EACH ROW EXECUTE PROCEDURE ts2.ftiupdate('name', 'a', 'title', 'b', 'description', 'd') launchpad_dev=# \df+ ts2.ftiupdate List of functions -[ RECORD 1 ]-------+----------------------------------------------------------------------------- Schema | ts2 Name | ftiupdate Result data type | trigger Argument data types | Type | trigger Volatility | volatile Owner | stub Language | plpythonu Source code | : new = TD["new"] : args = TD["args"][:] : : # Short circuit if none of the relevant columns have been : # modified and fti is not being set to NULL (setting the fti : # column to NULL is thus how we can force a rebuild of the fti : # column). : if TD["event"] == "UPDATE" and new["fti"] != None: : old = TD["old"] : relevant_modification = False : for column_name in args[::2]: : if new[column_name] != old[column_name]: : relevant_modification = True : break : if not relevant_modification: : return "OK" : : # Generate an SQL statement that turns the requested : # column values into a weighted tsvector : sql = [] : for i in range(0, len(args), 2): : sql.append( : "ts2.setweight(ts2.to_tsvector('default', coalesce(" : "substring(ltrim($%d) from 1 for 2500),''))," : "CAST($%d AS \"char\"))" % (i + 1, i + 2)) : args[i] = new[args[i]] : : sql = "SELECT %s AS fti" % "||".join(sql) : : # Execute and store in the fti column : plan = plpy.prepare(sql, ["text", "char"] * (len(args)/2)) : new["fti"] = plpy.execute(plan, args, 1)[0]["fti"] : : # Tell PostgreSQL we have modified the data : return "MODIFY" : Description | Trigger function that keeps the fti tsvector column up to date. -- Stuart Bishop <stuart@stuartbishop.net> http://www.stuartbishop.net/
Stuart Bishop wrote: > On Mon, Oct 5, 2009 at 4:22 PM, Alban Hertroys > <dalroi@solfertje.student.utwente.nl> wrote: > > A similar issue was discussed just recently here: > > http://archives.postgresql.org/pgsql-general/2009-09/msg01219.php > > > > That issue involved cursors though (and a serializable isolation level, but > > you have that). Do you have any triggers that use cursors on the table that > > the update fails for? > > There is a trigger on that table, and it is certainly the culprit as > can be seen here (different table, same trigger): I don't think the committed patch touches anything involved in what you're testing, but if you could grab CVS tip from the 8.4 branch (or the snapshot from ftp.postgresql.org:/pub/snapshot/stable/8.4 ) and give it a try, that'd be great. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
On Mon, Oct 5, 2009 at 11:00 PM, Alvaro Herrera <alvherre@commandprompt.com> wrote: > Stuart Bishop wrote: >> On Mon, Oct 5, 2009 at 4:22 PM, Alban Hertroys >> <dalroi@solfertje.student.utwente.nl> wrote: > >> > A similar issue was discussed just recently here: >> > http://archives.postgresql.org/pgsql-general/2009-09/msg01219.php >> > >> > That issue involved cursors though (and a serializable isolation level, but >> > you have that). Do you have any triggers that use cursors on the table that >> > the update fails for? >> >> There is a trigger on that table, and it is certainly the culprit as >> can be seen here (different table, same trigger): > > I don't think the committed patch touches anything involved in what > you're testing, but if you could grab CVS tip from the 8.4 branch (or > the snapshot from ftp.postgresql.org:/pub/snapshot/stable/8.4 ) and give > it a try, that'd be great. I trigger the same error with a freshly built snapshot. -- Stuart Bishop <stuart@stuartbishop.net> http://www.stuartbishop.net/
Attachment
Stuart Bishop wrote: > > > On Mon, Oct 5, 2009 at 11:00 PM, Alvaro Herrera <alvherre@commandprompt.com> wrote: > >Stuart Bishop wrote: > >>On Mon, Oct 5, 2009 at 4:22 PM, Alban Hertroys > >><dalroi@solfertje.student.utwente.nl> wrote: > > > >>> A similar issue was discussed just recently here: > >>> http://archives.postgresql.org/pgsql-general/2009-09/msg01219.php > >>> > >>> That issue involved cursors though (and a serializable isolation level, but > >>> you have that). Do you have any triggers that use cursors on the table that > >>> the update fails for? > >> > >>There is a trigger on that table, and it is certainly the culprit as > >>can be seen here (different table, same trigger): > > > >I don't think the committed patch touches anything involved in what > >you're testing, but if you could grab CVS tip from the 8.4 branch (or > >the snapshot from ftp.postgresql.org:/pub/snapshot/stable/8.4 ) and give > >it a try, that'd be great. > > I trigger the same error with a freshly built snapshot. mmkay. So, any luck in constructing a test case? -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
On Tue, Oct 6, 2009 at 8:28 PM, Alvaro Herrera <alvherre@commandprompt.com> wrote: > Stuart Bishop wrote: >> >> >> On Mon, Oct 5, 2009 at 11:00 PM, Alvaro Herrera <alvherre@commandprompt.com> wrote: >> >Stuart Bishop wrote: >> >>On Mon, Oct 5, 2009 at 4:22 PM, Alban Hertroys >> >><dalroi@solfertje.student.utwente.nl> wrote: >> > >> >>> A similar issue was discussed just recently here: >> >>> http://archives.postgresql.org/pgsql-general/2009-09/msg01219.php >> >>> >> >>> That issue involved cursors though (and a serializable isolation level, but >> >>> you have that). Do you have any triggers that use cursors on the table that >> >>> the update fails for? >> >> >> >>There is a trigger on that table, and it is certainly the culprit as >> >>can be seen here (different table, same trigger): >> > >> >I don't think the committed patch touches anything involved in what >> >you're testing, but if you could grab CVS tip from the 8.4 branch (or >> >the snapshot from ftp.postgresql.org:/pub/snapshot/stable/8.4 ) and give >> >it a try, that'd be great. >> >> I trigger the same error with a freshly built snapshot. > > mmkay. So, any luck in constructing a test case? Yes. Just no luck getting it sent to the mailing list - seems to silently drop emails with attachments on me :-P I've managed to get a self contained test case assembled. I'm not sure what to make of this. The test case builds a small database from a dump (one table), and triggers the 'invisible tuple' error. If I touch the table though, such as ALTER TABLE or just updating some data in in, the problem disappears. $ sh invisible.sh [...] BEGIN SET ERROR: attempted to lock invisible tuple ROLLBACK BEGIN UPDATE 1 COMMIT BEGIN SET UPDATE 1 ROLLBACK The test case (invisible.sh) and required dump (foodump.sql - 60k) are at http://www.stuartbishop.net/invisible/ -- Stuart Bishop <stuart@stuartbishop.net> http://www.stuartbishop.net/
Stuart Bishop wrote: > On Tue, Oct 6, 2009 at 8:28 PM, Alvaro Herrera > <alvherre@commandprompt.com> wrote: > > mmkay. So, any luck in constructing a test case? > > Yes. Just no luck getting it sent to the mailing list - seems to > silently drop emails with attachments on me :-P :-( > The test case (invisible.sh) and required dump (foodump.sql - 60k) are > at http://www.stuartbishop.net/invisible/ Got it, thanks, looking. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera wrote: > Stuart Bishop wrote: > > The test case (invisible.sh) and required dump (foodump.sql - 60k) are > > at http://www.stuartbishop.net/invisible/ > > Got it, thanks, looking. Here's a slightly smaller test case; basically I removed the tsearch2 cruft and extra columns in the table. One thing of note is that if the COPY commands is reduced to occupy less than one page in the target table, the problem does not occur. To reproduce, restore the attached file and run BEGIN; SET transaction_isolation TO SERIALIZABLE; UPDATE Bug2 SET name='foobar' WHERE id=1; ABORT; -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Attachment
Alvaro Herrera wrote: > Here's a slightly smaller test case; basically I removed the tsearch2 > cruft and extra columns in the table. One thing of note is that if the > COPY commands is reduced to occupy less than one page in the target > table, the problem does not occur. And here's an even smaller one that doesn't involve plpython. create or replace function upd() returns trigger language plpgsql as $$ declare rec record; begin new.description = 'updated in trigger'; return new; end; $$; create table onetest ( id int, filler text, description text ); create trigger onetest_t before update on onetest for each row execute procedure upd(); insert into onetest select a, repeat('xyzxz', 100), 'new' from generate_series(1, 50) a; BEGIN; SET transaction isolation level SERIALIZABLE; UPDATE onetest SET description = 'no no', id = 1 where id = 1; commit; -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
I think the previous patch to snapmgr.c was mistaken. Instead of fixing a single trouble spot, we're better off fixing PushActiveSnapshot so that any use of it that involves a snapshot that's subject to a future command counter update should create a new copy. This is correct because the 8.3 code used to do CopySnapshot anytime it was setting ActiveSnapshot. So we're not disrupting any behavior here -- we're merely restoring what was the previous customary behavior. The attached patch implements this idea. It reverts the code changes done in the previous patch, because they're obviously no longer necessary. The new regression test that it added still passes with this new patch. I will add a new one for this new problem. (This new patch restores CopySnapshot as a static function too). -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Attachment
Alvaro Herrera wrote: > create trigger onetest_t before update on onetest for each row > execute procedure upd(); > > insert into onetest select a, repeat('xyzxz', 100), 'new' from generate_series(1, 50) a; > > BEGIN; > SET transaction isolation level SERIALIZABLE; > UPDATE onetest SET description = 'no no', id = 1 where id = 1; > commit; What I don't understand is why this works when the update uses the same target page. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera wrote: > I think the previous patch to snapmgr.c was mistaken. Instead of fixing > a single trouble spot, we're better off fixing PushActiveSnapshot so > that any use of it that involves a snapshot that's subject to a future > command counter update should create a new copy. For a while I was thinking this was useless, because surely CurrentSnapshot would always be pointing to static storage, no? However I realized that this is not the case in serializable transactions, because such transaction need to register the current snapshot and thus it creates a copy of it. So the problem is that in a serializable snapshot, ActiveSnapshot may be pointing to the exact same copy that a CommandCounterIncrement would modify. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Stuart Bishop wrote: > >I don't think the committed patch touches anything involved in what > >you're testing, but if you could grab CVS tip from the 8.4 branch (or > >the snapshot from ftp.postgresql.org:/pub/snapshot/stable/8.4 ) and give > >it a try, that'd be great. > > I trigger the same error with a freshly built snapshot. If you're up for a bit of patching, please test with the attached patch applied. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Attachment
On Wed, Oct 7, 2009 at 3:09 AM, Alvaro Herrera <alvherre@commandprompt.com> wrote: > Stuart Bishop wrote: > >> >I don't think the committed patch touches anything involved in what >> >you're testing, but if you could grab CVS tip from the 8.4 branch (or >> >the snapshot from ftp.postgresql.org:/pub/snapshot/stable/8.4 ) and give >> >it a try, that'd be great. >> >> I trigger the same error with a freshly built snapshot. > > If you're up for a bit of patching, please test with the attached patch > applied. The patch is working. I am no longer able to trigger the 'attempted to lock invisible tuple' error. Thanks :-) -- Stuart Bishop <stuart@stuartbishop.net> http://www.stuartbishop.net/