Thread: Partitioning/inherited tables vs FKs
Hi, we came across an interesting problem. =# create table parent (id serial primary key, t text); NOTICE: CREATE TABLE will create implicit sequence "parent_id_seq" for serial column "parent.id" NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "parent_pkey" for table "parent" CREATE TABLE =# create table child () inherits (parent); CREATE TABLE =# create table refer (id serial primary key, parent_id integer references parent (id)); NOTICE: CREATE TABLE will create implicit sequence "refer_id_seq" for serial column "refer.id" NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "refer_pkey" for table "refer" CREATE TABLE =# begin; BEGIN =# insert into child (t) values ('a') returning id;id ---- 1 (1 sor) INSERT 0 1 =# select * from parent;id | t ----+--- 1 | a (1 sor) =# insert into refer (parent_id) values (1); ERROR: insert or update on table "refer" violates foreign key constraint "refer_parent_id_fkey" DETAIL: Key (parent_id)=(1) is not present in table "parent". The use case for this was there were different news items, and there were another table for summaries, that could point to any of the news items table. Another use case could be a large partitioned table with an FK to the main table where the referring table might only contain very few "interesting" data. No matter what are the semantics, the parent table in the inheritance chain cannot be used as and endpoint for FKs. Is it a bug, or intentional? The only solution currently is that the referring table has to be partitioned the same way as the referred table in the FK, and its parent table has to be queried. Best regards, Zoltán Böszörményi -- Bible has answers for everything. Proof: "But let your communication be, Yea, yea; Nay, nay: for whatsoever is more than these cometh of evil." (Matthew 5:37) - basics of digital technology. "May your kingdom come" - superficial description of plate tectonics ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH http://www.postgresql.at/
On May 6, 2010, at 10:52 , Boszormenyi Zoltan wrote: > =# create table parent (id serial primary key, t text); > ... > =# create table child () inherits (parent); > ... > =# create table refer (id serial primary key, parent_id integer > ... > =# insert into child (t) values ('a') returning id; > ... > =# select * from parent; > id | t > ----+--- > 1 | a > (1 sor) > > =# insert into refer (parent_id) values (1); > ERROR: insert or update on table "refer" violates foreign key > constraint "refer_parent_id_fkey" > DETAIL: Key (parent_id)=(1) is not present in table "parent". > > The use case for this was there were different news items, > and there were another table for summaries, that could point > to any of the news items table. Another use case could be > a large partitioned table with an FK to the main table where > the referring table might only contain very few "interesting" data. Yeah, this is a long-standing issue with inheritance. Table inheritance in postgres isn't much more than an implicit UNIONdone on selects plus some logic in ALTER TABLE to keep propagate structural changes. Indices and constraints basicallyalways behave as if ONLY had been specified. I'm not even sure if the ids are globally unique in your example -it might be that each child's "id serial" column gets its very own sequence. One possible workaround is no create a table, say referred_ids, that contains all the ids from parent and all of its children,kept up-to-date via triggers, and point the FK constraint to that table. That also allows for a global unique constrainton the ids by definition a suitable unique or primary key constraint on referred_ids. What lies at the heart of this problem is the lack of multi-table indices and hence multi-table unique constraints in postgres.AFAIK with those in place the rest amounts to the removal of ONLY from the constraint check queries plus some codeto propagate constraint triggers to child tables. best regards, Florian Pflug
2010/5/6 Boszormenyi Zoltan <zb@cybertec.at>: > > =# insert into refer (parent_id) values (1); > ERROR: insert or update on table "refer" violates foreign key > constraint "refer_parent_id_fkey" > DETAIL: Key (parent_id)=(1) is not present in table "parent". > > The use case for this was there were different news items, > and there were another table for summaries, that could point > to any of the news items table. Another use case could be > a large partitioned table with an FK to the main table where > the referring table might only contain very few "interesting" data. > > No matter what are the semantics, the parent table in the > inheritance chain cannot be used as and endpoint for FKs. > > Is it a bug, or intentional? i would call it a bug, but this is a known issue > > The only solution currently is that the referring table has to be > partitioned the same way as the referred table in the FK, and > its parent table has to be queried. > no, you can install a trigger on the child table that verifies the existence of the id on your partitioned parent table, the SELECT you'll use inside that trigger will look at the entire set of tables (as long as you don't use FROM ONLY) also could be useful to put an index (even a PK) on every child to ensure uniqueness and make the SELECT more efficient, and of course a check constraint in every child emulating a partition key -- Jaime Casanova www.2ndQuadrant.com Soporte y capacitación de PostgreSQL
On Thu, May 6, 2010 at 6:37 AM, Jaime Casanova <jaime@2ndquadrant.com> wrote: > i would call it a bug, but this is a known issue > >> >> The only solution currently is that the referring table has to be >> partitioned the same way as the referred table in the FK, and >> its parent table has to be queried. >> > > no, you can install a trigger on the child table that verifies the > existence of the id on your partitioned parent table, the SELECT > you'll use inside that trigger will look at the entire set of tables > (as long as you don't use FROM ONLY) > > also could be useful to put an index (even a PK) on every child to > ensure uniqueness and make the SELECT more efficient, and of course a > check constraint in every child emulating a partition key The referential integrity triggers contain some extra magic that isn't easily simulatable in userland, and that is necessary to make the foreign key constraints airtight. We've discussed this previously but I don't remember which thread it was or the details of when things blow up. I think it's something like this: the parent has a tuple that is not referenced by any child. Transaction 1 begins, deletes the parent tuple (checking that it has no children), and pauses. Transaction 2 begins, adds a child tuple that references the parent tuple (checking that the parent exists, which it does), and commits. Transaction 1 commits. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Florian Pflug <fgp@phlo.org> writes: > What lies at the heart of this problem is the lack of multi-table > indices and hence multi-table unique constraints in postgres. AFAIK > with those in place the rest amounts to the removal of ONLY from the > constraint check queries plus some code to propagate constraint > triggers to child tables. Well, the lack of multi-table indexes certainly is the heart of the problem, but I'm not sure that inventing such a thing is the solution. Quite aside from the implementation difficulties involved in it, doing things that way would destroy some of the major reasons to partition tables at all: * the index grows as the size of the total data set, it's not limited by partition size * can't cheaply drop one partition any more, you have to vacuum the (big) index first * probably some other things I'm not thinking of at the moment. I think the real solution is to upgrade the partitioning infrastructure so that we can understand that columns are unique across the whole partitioned table, when the partitioning is done on that column and each partition has a unique index. regards, tom lane
On May 6, 2010, at 16:38 , Tom Lane wrote: > Florian Pflug <fgp@phlo.org> writes: >> What lies at the heart of this problem is the lack of multi-table >> indices and hence multi-table unique constraints in postgres. AFAIK >> with those in place the rest amounts to the removal of ONLY from the >> constraint check queries plus some code to propagate constraint >> triggers to child tables. > > Well, the lack of multi-table indexes certainly is the heart of the > problem, but I'm not sure that inventing such a thing is the solution. > Quite aside from the implementation difficulties involved in it, > doing things that way would destroy some of the major reasons to > partition tables at all: > > * the index grows as the size of the total data set, it's not limited > by partition size > > * can't cheaply drop one partition any more, you have to vacuum the > (big) index first > > * probably some other things I'm not thinking of at the moment. > > I think the real solution is to upgrade the partitioning infrastructure > so that we can understand that columns are unique across the whole > partitioned table, when the partitioning is done on that column and each > partition has a unique index. True, for partitioned tables multi-table indices reintroduce some of the performance problems that partitioning is supposedto avoid. But OTOH if you use table inheritance as a means to map data models (e.g. EER) more naturally to SQL, then multi-table indiceshave advantages over the partitioning-friendly solution you sketched above. With a multi-table index, SELECT * FROM PARENT WHERE ID=?? has complexity LOG(N*M) where M is the number of tables inheritingfrom PARENT (including PARENT itself), and N the average number of rows in these tables. With one index per child,the complexity is M*LOG(N) which is significantly higher if M is large. Constraint exclusion could reduce that to LOG(N),but only if each child is has it's own private ID range which precludes ID assignment from a global sequence and hencemakes ID assignment much more complex and error-prone. Anyway, I was wondering why we need guaranteed uniqueness for FK relationships anyway. Because if we don't (which I didn'tcheck prior to posting this I must admit), then why can't we simply remove the "ONLY" from the RI queries and let ALTERTABLE attach the RI triggers not only to the parent but also to all children. What am I missing? best regards, Florian Pflug
Florian Pflug <fgp@phlo.org> writes: > Anyway, I was wondering why we need guaranteed uniqueness for FK > relationships anyway. It's required by spec, and the semantics aren't terribly sensible without it. regards, tom lane
> The referential integrity triggers contain some extra magic that isn't > easily simulatable in userland, and that is necessary to make the > foreign key constraints airtight. We've discussed this previously but > I don't remember which thread it was or the details of when things > blow up. I think it's something like this: the parent has a tuple > that is not referenced by any child. Transaction 1 begins, deletes > the parent tuple (checking that it has no children), and pauses. > Transaction 2 begins, adds a child tuple that references the parent > tuple (checking that the parent exists, which it does), and commits. > Transaction 1 commits. Will SELECT ... FOR SHARE not help? Regargs, Dmitry
On Tue, May 11, 2010 at 2:16 AM, Dmitry Fefelov <fozzy@ac-sw.com> wrote: >> The referential integrity triggers contain some extra magic that isn't >> easily simulatable in userland, and that is necessary to make the >> foreign key constraints airtight. We've discussed this previously but >> I don't remember which thread it was or the details of when things >> blow up. I think it's something like this: the parent has a tuple >> that is not referenced by any child. Transaction 1 begins, deletes >> the parent tuple (checking that it has no children), and pauses. >> Transaction 2 begins, adds a child tuple that references the parent >> tuple (checking that the parent exists, which it does), and commits. >> Transaction 1 commits. > > Will SELECT ... FOR SHARE not help? Try it, with the example above. I think you'll find that it doesn't. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
On 2010-05-11 14:29 +0200, Robert Haas wrote: > On Tue, May 11, 2010 at 2:16 AM, Dmitry Fefelov <fozzy@ac-sw.com> wrote: >>> The referential integrity triggers contain some extra magic that isn't >>> easily simulatable in userland, and that is necessary to make the >>> foreign key constraints airtight. We've discussed this previously but >>> I don't remember which thread it was or the details of when things >>> blow up. I think it's something like this: the parent has a tuple >>> that is not referenced by any child. Transaction 1 begins, deletes >>> the parent tuple (checking that it has no children), and pauses. >>> Transaction 2 begins, adds a child tuple that references the parent >>> tuple (checking that the parent exists, which it does), and commits. >>> Transaction 1 commits. >> >> Will SELECT ... FOR SHARE not help? > > Try it, with the example above. I think you'll find that it doesn't. TXA => delete from foo; DELETE 1 TXB => select a from foo for share; -- waits What am I missing? Regards, Marko Tiikkaja
2010/5/11 Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi>: > On 2010-05-11 14:29 +0200, Robert Haas wrote: > >> On Tue, May 11, 2010 at 2:16 AM, Dmitry Fefelov <fozzy@ac-sw.com> wrote: >> >>>> The referential integrity triggers contain some extra magic that isn't >>>> easily simulatable in userland, and that is necessary to make the >>>> foreign key constraints airtight. We've discussed this previously but >>>> I don't remember which thread it was or the details of when things >>>> blow up. I think it's something like this: the parent has a tuple >>>> that is not referenced by any child. Transaction 1 begins, deletes >>>> the parent tuple (checking that it has no children), and pauses. >>>> Transaction 2 begins, adds a child tuple that references the parent >>>> tuple (checking that the parent exists, which it does), and commits. >>>> Transaction 1 commits. >>> >>> Will SELECT ... FOR SHARE not help? >> >> Try it, with the example above. I think you'll find that it doesn't. > > TXA => delete from foo; > DELETE 1 > > TXB => select a from foo for share; -- waits > > What am I missing? Slightly verbose example of what can go wrong: CREATE TABLE a (i int PRIMARY KEY); INSERT INTO a VALUES (1); CREATE TABLE b (a_id int); >>>>>> Start with T1: T1> BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE; BEGIN T1> SELECT i FROM a WHERE i = 1 FOR SHARE; -- Does a with i = 1 exist?i ---1 (1 Zeile) T1> INSERT INTO b VALUES (1); -- Great, it existed, insert row pointing to it in b. INSERT 0 1 >>>>>> Switch to T2: T2> BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE; -- Evil transaction T2 is intervening! BEGIN T2> SELECT i FROM a WHERE i = 1 FOR SHARE; -- Lock a with i = 1 FOR SHARE.i ---1 (1 Zeile) T2> SELECT a_id FROM b WHERE a_id = 1; -- Check whether it's got anything pointing to it.a_id ------ (0 Zeilen) T2> DELETE FROM a WHERE i = 1; -- Nope, so delete a with i = 1 (this blocks, because T1 is still holding the lock). >>>>>> Switch to T1: 1> COMMIT; -- Commit the insertion of a row pointing to a with i = 1 (this releases all locks that T1 is holding). COMMIT >>>>>> T2 continues: DELETE 1 T2> COMMIT; -- Commit the deletion of a with i = 1. COMMIT T2> SELECT * FROM b EXCEPT SELECT * FROM a; -- Check for inconsistencies.a_id ------ 1 (1 Zeile) Woops. Nicolas
This is getting way off topic, but: On 5/11/10 3:55 PM +0300, Nicolas Barbier wrote: > T2> SELECT i FROM a WHERE i = 1 FOR SHARE; -- Lock a with i = 1 FOR SHARE. > i > --- > 1 > (1 Zeile) > > T2> SELECT a_id FROM b WHERE a_id = 1; -- Check whether it's got > anything pointing to it. > a_id > ------ > (0 Zeilen) > > T2> DELETE FROM a WHERE i = 1; -- Nope, so delete a with i = 1 (this > blocks, because T1 is still holding the lock). Obviously you wouldn't delete anything with a SHARE lock. Regards, Marko Tiikkaja
2010/5/11 Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi>: > This is getting way off topic, but: > > On 5/11/10 3:55 PM +0300, Nicolas Barbier wrote: >> >> T2> SELECT i FROM a WHERE i = 1 FOR SHARE; -- Lock a with i = 1 FOR >> SHARE. >> i >> --- >> 1 >> (1 Zeile) >> >> T2> SELECT a_id FROM b WHERE a_id = 1; -- Check whether it's got >> anything pointing to it. >> a_id >> ------ >> (0 Zeilen) >> >> T2> DELETE FROM a WHERE i = 1; -- Nope, so delete a with i = 1 (this >> blocks, because T1 is still holding the lock). > > Obviously you wouldn't delete anything with a SHARE lock. So where would you put a SELECT ... FOR SHARE to fix the problem? (Per "Will SELECT ... FOR SHARE not help?".) I agree that my second FOR SHARE doesn't really make a lot of sense, but that doesn't disprove the fact that the first FOR SHARE fails to ensure consistency. Nicolas
On 5/11/10 4:07 PM +0300, Nicolas Barbier wrote: > 2010/5/11 Marko Tiikkaja<marko.tiikkaja@cs.helsinki.fi>: > >> This is getting way off topic, but: >> >> On 5/11/10 3:55 PM +0300, Nicolas Barbier wrote: >>> >>> T2> SELECT i FROM a WHERE i = 1 FOR SHARE; -- Lock a with i = 1 FOR >>> SHARE. >>> i >>> --- >>> 1 >>> (1 Zeile) >>> >>> T2> SELECT a_id FROM b WHERE a_id = 1; -- Check whether it's got >>> anything pointing to it. >>> a_id >>> ------ >>> (0 Zeilen) >>> >>> T2> DELETE FROM a WHERE i = 1; -- Nope, so delete a with i = 1 (this >>> blocks, because T1 is still holding the lock). >> >> Obviously you wouldn't delete anything with a SHARE lock. > > So where would you put a SELECT ... FOR SHARE to fix the problem? (Per > "Will SELECT ... FOR SHARE not help?".) I agree that my second FOR > SHARE doesn't really make a lot of sense, but that doesn't disprove > the fact that the first FOR SHARE fails to ensure consistency. I took the "SELECT ... FOR SHARE" suggestion in a more general way, suggesting the use of row-level locks. T2 should be holding an exclusive row-level lock (SELECT ... FOR UPDATE) when checking for references. Regards, Marko Tiikkaja
On 5/11/10 4:11 PM +0300, I wrote: > I took the "SELECT ... FOR SHARE" suggestion in a more general way, > suggesting the use of row-level locks. T2 should be holding an > exclusive row-level lock (SELECT ... FOR UPDATE) when checking for > references. Hmm. Right, that transaction wouldn't see the rows in a serializable transaction so this doesn't solve the problem. Regards, Marko Tiikkaja
Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> writes: > On 5/11/10 4:11 PM +0300, I wrote: >> I took the "SELECT ... FOR SHARE" suggestion in a more general way, >> suggesting the use of row-level locks. T2 should be holding an >> exclusive row-level lock (SELECT ... FOR UPDATE) when checking for >> references. > Hmm. Right, that transaction wouldn't see the rows in a serializable > transaction so this doesn't solve the problem. Yeah. The hidden "magic" in the built-in FK code is not locking (it does actually use SELECT FOR SHARE to lock rows). Rather, it's about doing tuple liveness checks using snapshots that aren't available at the SQL level, particularly in serializable transactions. regards, tom lane
Nicolas Barbier <nicolas.barbier@gmail.com> wrote: >>>>>>> Switch to T1: > > 1> COMMIT; -- Commit the insertion... > COMMIT > >>>>>>> T2 continues: > > DELETE 1 > T2> COMMIT; -- Commit the deletion of a with i = 1. > COMMIT > T2> SELECT * FROM b EXCEPT SELECT * FROM a; > a_id > ------ > 1 > (1 Zeile) > > Woops. This is exactly the sort of issue for which true serializable behavior will provide a solution. I will be offering a patch to implement that for 9.1 once 9.0 settles down. FWIW when you commit T1, the patched code rolls back T2 with this message: T2> DELETE FROM a WHERE i = 1; ERROR: could not serialize access due to read/write dependencies among transactions HINT: The transaction might succeed if retried. Thanks for the example; I will it to the others. -Kevin
SHARE locks vs. DELETE in SERIALIZABLE mode (Was: Partitioning/inherited tables vs FKs)
From
Florian Pflug
Date:
On May 11, 2010, at 13:29 , Robert Haas wrote: > On Tue, May 11, 2010 at 2:16 AM, Dmitry Fefelov <fozzy@ac-sw.com> wrote: >>> The referential integrity triggers contain some extra magic that isn't >>> easily simulatable in userland, and that is necessary to make the >>> foreign key constraints airtight. We've discussed this previously but >>> I don't remember which thread it was or the details of when things >>> blow up. I think it's something like this: the parent has a tuple >>> that is not referenced by any child. Transaction 1 begins, deletes >>> the parent tuple (checking that it has no children), and pauses. >>> Transaction 2 begins, adds a child tuple that references the parent >>> tuple (checking that the parent exists, which it does), and commits. >>> Transaction 1 commits. >> >> Will SELECT ... FOR SHARE not help? > > Try it, with the example above. I think you'll find that it doesn't. That example does in fact work. Here is the precise sequence of commands I tested with constraint checking triggers implementedin PL/PGSQL. C1: BEGIN C1: DELETE FROM parent WHERE parent_id = 0 C2: BEGIN C2: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE -- Optional C2: INSERT INTO child (parent_id) VALUES (0) -- Waits for C1 to commit C1: COMMIT -- Now C2 fails either with a constraint_violation or serialization_error The reason this works is that C2's attempt to SHARE-lock the parent row blocks until C1 commits. In READ COMMITTED mode C2will then realize that the parent row is now gone. In SERIALIZABLE mode it won't get that far, because the SHARE-lockingattempt throws a serialization error since the parent row was concurrently modified. The serialization error, however, disappears if the two transactions are swapped. The following sequence of commands succeeds,even though the FK constraint is not satisfied. C1: BEGIN C1: INSERT INTO child (parent_id) VALUES (0) C2: BEGIN C2: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE C2: SELECT TRUE -- Take snapshot *before* C1 commits C1: COMMIT C2: DELETE FROM parent WHERE parent_id = 0 -- Works! C2: COMMIT It seems that while SHARE-locking a concurrently deleted row causes a serialization error, deleting a concurrently SHARE-lockedis allowed. I do wonder if this shouldn't be considered a bug - whether locks conflict or not does not usuallydepend on the other in which they are taken. The build-in constraint triggers avoid the second case by checking not only for rows visible under the transaction's snapshotbut also for rows visible under a freshly taken snapshot in the ri_parent PERFORM statement. I do wonder if the recheckwas still needed if the DELETE in the second case threw a serialization_error also. Does anyone have an example thatproves it necessary? best regards, Florian Pflug Here are the table definitions and trigger functions I used: CREATE TABLE parent (parent_id SERIAL NOT NULL PRIMARY KEY); CREATE TABLE child (child_id SERIAL NOT NULL PRIMARY KEY, parent_id INTEGER NOT NULL); CREATE FUNCTION ri_parent() RETURNS TRIGGER AS $body$ BEGIN PERFORM TRUE FROM child WHERE parent_id = OLD.parent_id; IF FOUND THEN RAISE SQLSTATE '23503' USING MESSAGE = 'Parent' || OLD.parent_id || ' still referenced during ' || TG_OP; END IF; RETURN NULL; END; $body$ LANGUAGE PLPGSQL VOLATILE; CREATE TRIGGER ri_parent AFTER UPDATE OR DELETE ON parent FOR EACH ROW EXECUTE PROCEDURE ri_parent(); CREATE FUNCTION ri_child() RETURNS TRIGGER AS $body$ BEGIN PERFORM TRUE FROM parent WHERE parent_id = NEW.parent_id FOR SHARE OF parent; IF NOT FOUND THEN RAISE SQLSTATE '23503'USING MESSAGE = 'Parent ' || NEW.parent_id || ' does not exist during ' || TG_OP; END IF; RETURN NULL; END; $body$ LANGUAGE PLPGSQL VOLATILE; CREATE TRIGGER ri_child AFTER INSERT OR UPDATE ON child FOR EACH ROW EXECUTE PROCEDURE ri_child();
Re: SHARE locks vs. DELETE in SERIALIZABLE mode (Was: Partitioning/inherited tables vs FKs)
From
Robert Haas
Date:
2010/5/11 Florian Pflug <fgp@phlo.org>: > On May 11, 2010, at 13:29 , Robert Haas wrote: >> On Tue, May 11, 2010 at 2:16 AM, Dmitry Fefelov <fozzy@ac-sw.com> wrote: >>>> The referential integrity triggers contain some extra magic that isn't >>>> easily simulatable in userland, and that is necessary to make the >>>> foreign key constraints airtight. We've discussed this previously but >>>> I don't remember which thread it was or the details of when things >>>> blow up. I think it's something like this: the parent has a tuple >>>> that is not referenced by any child. Transaction 1 begins, deletes >>>> the parent tuple (checking that it has no children), and pauses. >>>> Transaction 2 begins, adds a child tuple that references the parent >>>> tuple (checking that the parent exists, which it does), and commits. >>>> Transaction 1 commits. >>> >>> Will SELECT ... FOR SHARE not help? >> >> Try it, with the example above. I think you'll find that it doesn't. > > That example does in fact work. Here is the precise sequence of commands I tested with constraint checking triggers implementedin PL/PGSQL. [...] > The serialization error, however, disappears if the two transactions are swapped. The following sequence of commands succeeds,even though the FK constraint is not satisfied. Thanks for figuring this out. I thought there was a case like this but I couldn't remember exactly how to reproduce it. > C1: BEGIN > C1: INSERT INTO child (parent_id) VALUES (0) > C2: BEGIN > C2: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE > C2: SELECT TRUE -- Take snapshot *before* C1 commits > C1: COMMIT > C2: DELETE FROM parent WHERE parent_id = 0 -- Works! > C2: COMMIT > > It seems that while SHARE-locking a concurrently deleted row causes a serialization error, deleting a concurrently SHARE-lockedis allowed. I do wonder if this shouldn't be considered a bug - whether locks conflict or not does not usuallydepend on the other in which they are taken. Wait - I'm confused. The DELETE in your example happens after C1 commits, so C1 can't still be holding any locks (nor does C2 take any locks prior to the commit of C1). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Re: SHARE locks vs. DELETE in SERIALIZABLE mode (Was: Partitioning/inherited tables vs FKs)
From
"Kevin Grittner"
Date:
Florian Pflug <fgp@phlo.org> wrote: > The serialization error, however, disappears if the two > transactions are swapped. The following sequence of commands > succeeds, even though the FK constraint is not satisfied. > > C1: BEGIN > C1: INSERT INTO child (parent_id) VALUES (0) > C2: BEGIN > C2: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE > C2: SELECT TRUE -- Take snapshot *before* C1 commits > C1: COMMIT > C2: DELETE FROM parent WHERE parent_id = 0 -- Works! > C2: COMMIT Thanks for another good example. Added to serializable test suite. C2> DELETE FROM parent WHERE parent_id = 0; ERROR: could not serialize access due to read/write dependencies among transactions HINT: The transaction might succeed if retried. CONTEXT: SQL statement "SELECT TRUE FROM child WHERE parent_id = OLD.parent_id" PL/pgSQL function "ri_parent" line 2 at PERFORM By the way, when adding these, I'm taking off the "FOR SHARE" or "FOR UPDATE" clauses; they're not needed with true serializable transactions. Otherwise, examples used as presented. -Kevin
Re: SHARE locks vs. DELETE in SERIALIZABLE mode (Was: Partitioning/inherited tables vs FKs)
From
Florian Pflug
Date:
On May 11, 2010, at 17:04 , Robert Haas wrote: > 2010/5/11 Florian Pflug <fgp@phlo.org>: >> C1: BEGIN >> C1: INSERT INTO child (parent_id) VALUES (0) >> C2: BEGIN >> C2: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE >> C2: SELECT TRUE -- Take snapshot *before* C1 commits >> C1: COMMIT >> C2: DELETE FROM parent WHERE parent_id = 0 -- Works! >> C2: COMMIT >> >> It seems that while SHARE-locking a concurrently deleted row causes a serialization error, deleting a concurrently SHARE-lockedis allowed. I do wonder if this shouldn't be considered a bug - whether locks conflict or not does not usuallydepend on the other in which they are taken. > > Wait - I'm confused. The DELETE in your example happens after C1 > commits, so C1 can't still be holding any locks (nor does C2 take any > locks prior to the commit of C1). I used the word "lock" a bit sloppy there. What I did want to point out is that any UPDATE by a SERIALIZABLE transaction to a row that has been concurrently updatedcauses a serialization error. The same happens when it instead SHARE- or UPDATE-locks the concurrently updated row.This is also independent from the commit-time of the concurrent transaction, as long as it is deemed invisible by theUPDATE/LOCK-ing transaction. In other words, any attempt to UPDATE, SHARE-lock or UPDATE-lock a row from within a SERIALIZABLEtransaction fails if the visible row version isn't the latest row version. If, however, the order of the eventsis the other way around, such that the SHARE-locking or UPDATE-locking happens first, and the UPDATE afterwards, thenno serialization error occurs! That might seem sensible if you view SHARE-locks and UPDATE-locks as locks, and the "taint" that marks a row (the existenceof a newer row version) after it has been updated by a transaction as "something else". After all, as you pointedout, the lock is gone as soon as the transaction commits. If, however, you view that "taint" as a slightly strangekind of lock that a transaction holds on the rows it updated even *after* the transaction committed, then it stopsmaking sense. You now have a "locking" behavior with order-dependent conflicts. Viewing those "taints" as locks is consistent with how that true serializability algorithm Kevin Grittner is working on dealswith those things, I believe - or at least it's probably that paper in the back of my mind that made me call it "lock"in the first place. It would be interesting to formalize this in the language of that paper - unfortunately, I probably lack the time to do thisin the near future :-( To avoid more confusion, here are the sequences of commands I have in mind: No serialization error (and neither with "FOR UPDATE" instead of "FOR SHARE") C1: BEGIN C1: SELECT t.* FROM t WHERE id = 1 FOR SHARE C2: BEGIN C2: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE C2: SELECT TRUE --Take snapshot before c1 commits C1: COMMIT C2: UPDATE t SET id = 2 WHERE id = 1 C2: COMMIT Serialization error (and also with "FOR UPDATE" instead of "FOR SHARE") C1: BEGIN C1: UPDATE t SET id = 2 WHERE id = 1 C2: BEGIN C2: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE C2: SELECT TRUE --Take snapshot before c1 commits C1: COMMIT C2: SELECT t.* FROM t WHERE id = 1 FOR SHARE --serialization error C2: COMMIT best regards, Florian Pflug
Re: SHARE locks vs. DELETE in SERIALIZABLE mode (Was: Partitioning/inherited tables vs FKs)
From
Jan Wieck
Date:
On 5/11/2010 12:39 PM, Florian Pflug wrote: > On May 11, 2010, at 17:04 , Robert Haas wrote: >> 2010/5/11 Florian Pflug <fgp@phlo.org>: >>> C1: BEGIN >>> C1: INSERT INTO child (parent_id) VALUES (0) >>> C2: BEGIN >>> C2: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE >>> C2: SELECT TRUE -- Take snapshot *before* C1 commits >>> C1: COMMIT >>> C2: DELETE FROM parent WHERE parent_id = 0 -- Works! >>> C2: COMMIT >>> >>> It seems that while SHARE-locking a concurrently deleted row causes a serialization error, deleting a concurrently SHARE-lockedis allowed. I do wonder if this shouldn't be considered a bug - whether locks conflict or not does not usuallydepend on the other in which they are taken. >> >> Wait - I'm confused. The DELETE in your example happens after C1 >> commits, so C1 can't still be holding any locks (nor does C2 take any >> locks prior to the commit of C1). > > I used the word "lock" a bit sloppy there. > > What I did want to point out is that any UPDATE by a SERIALIZABLE transaction to a row that has been concurrently updatedcauses a serialization error. The same happens when it instead SHARE- or UPDATE-locks the concurrently updated row.This is also independent from the commit-time of the concurrent transaction, as long as it is deemed invisible by theUPDATE/LOCK-ing transaction. In other words, any attempt to UPDATE, SHARE-lock or UPDATE-lock a row from within a SERIALIZABLEtransaction fails if the visible row version isn't the latest row version. If, however, the order of the eventsis the other way around, such that the SHARE-locking or UPDATE-locking happens first, and the UPDATE afterwards, thenno serialization error occurs! > > That might seem sensible if you view SHARE-locks and UPDATE-locks as locks, and the "taint" that marks a row (the existenceof a newer row version) after it has been updated by a transaction as "something else". After all, as you pointedout, the lock is gone as soon as the transaction commits. If, however, you view that "taint" as a slightly strangekind of lock that a transaction holds on the rows it updated even *after* the transaction committed, then it stopsmaking sense. You now have a "locking" behavior with order-dependent conflicts. > > Viewing those "taints" as locks is consistent with how that true serializability algorithm Kevin Grittner is working ondeals with those things, I believe - or at least it's probably that paper in the back of my mind that made me call it "lock"in the first place. > > It would be interesting to formalize this in the language of that paper - unfortunately, I probably lack the time to dothis in the near future :-( > > To avoid more confusion, here are the sequences of commands I have in mind: > > No serialization error (and neither with "FOR UPDATE" instead of "FOR SHARE") > C1: BEGIN > C1: SELECT t.* FROM t WHERE id = 1 FOR SHARE > C2: BEGIN > C2: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE > C2: SELECT TRUE --Take snapshot before c1 commits > C1: COMMIT > C2: UPDATE t SET id = 2 WHERE id = 1 > C2: COMMIT > > Serialization error (and also with "FOR UPDATE" instead of "FOR SHARE") > C1: BEGIN > C1: UPDATE t SET id = 2 WHERE id = 1 > C2: BEGIN > C2: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE > C2: SELECT TRUE --Take snapshot before c1 commits > C1: COMMIT > C2: SELECT t.* FROM t WHERE id = 1 FOR SHARE --serialization error > C2: COMMIT The problem really is that in the case of deleting a PK row while a concurrent transaction creates such a reference cannot be solved with user level visibility rules in case of a serializable transacton, unless you go really expensive routes. One corner case is that the transaction doing the FK INSERT commits after the serializable transaction doing the PK DELETE got its snapshot and also does the PK check before the PK DELETE got the lock on it. No user level visibility allows it to see that newly created reference. And unless the FK INSERTer actually UPDATE's the PK row (expensive), the PK DELETE will not throw anything. It will wait to get the lock and go ahead with the delete. The PK DELETE needs to be able to do some sort of dirty scan in order to see those new references. That is what I think Tom was referring to. Jan -- Anyone who trades liberty for security deserves neither liberty nor security. -- Benjamin Franklin
Re: SHARE locks vs. DELETE in SERIALIZABLE mode (Was: Partitioning/inherited tables vs FKs)
From
Florian Pflug
Date:
On May 11, 2010, at 20:05 , Jan Wieck wrote: > The problem really is that in the case of deleting a PK row while a concurrent transaction creates such a reference cannotbe solved with user level visibility rules in case of a serializable transacton, unless you go really expensive routes. Yeah. The information to detect this is there, though - the xmax of the PK row will be a multixact in this case, and onemember of that set won't be deemed visible by the deleting transaction. > One corner case is that the transaction doing the FK INSERT commits after the serializable transaction doing the PK DELETEgot its snapshot and also does the PK check before the PK DELETE got the lock on it. No user level visibility allowsit to see that newly created reference. And unless the FK INSERTer actually UPDATE's the PK row (expensive), the PKDELETE will not throw anything. It will wait to get the lock and go ahead with the delete. Exactly. It consciously waits for the lock (knowing that it was held by a concurrent transaction *not* visible to the deletingtransaction), and after obtaining the lock goes on to delete the row. If the concurrent transaction hadn't held amere lock, but had instead UPDATEd the row, this would cause a serialization error. > The PK DELETE needs to be able to do some sort of dirty scan in order to see those new references. That is what I thinkTom was referring to. Yeah. Though the need for that "dirty scan" (it's not actually a scan with DIRTY READ semantics, but rather one with READCOMMITTED semantics) might vanish if a SHARE lock had the same effect (causing a serialization error) on concurrent transactionsthat an UPDATE has. I'm not yet convinced that this is true, nor do I necessarily think that making all SHARE locks behave that way would bea good idea. But if my assertion is in fact true it would allow for robust user-level referential constraints by eithermodifying SHARE-lock behavior or adding a new row-lock type. best regards, Florian Pflug
On May 6, 2010, at 4:31 AM, Florian Pflug wrote: >> The use case for this was there were different news items, >> and there were another table for summaries, that could point >> to any of the news items table. Another use case could be >> a large partitioned table with an FK to the main table where >> the referring table might only contain very few "interesting" data. > > Yeah, this is a long-standing issue with inheritance. Table inheritance in postgres isn't much more than an implicit UNIONdone on selects plus some logic in ALTER TABLE to keep propagate structural changes. Indices and constraints basicallyalways behave as if ONLY had been specified. I'm not even sure if the ids are globally unique in your example -it might be that each child's "id serial" column gets its very own sequence. > > One possible workaround is no create a table, say referred_ids, that contains all the ids from parent and all of its children,kept up-to-date via triggers, and point the FK constraint to that table. That also allows for a global unique constrainton the ids by definition a suitable unique or primary key constraint on referred_ids. > > What lies at the heart of this problem is the lack of multi-table indices and hence multi-table unique constraints in postgres.AFAIK with those in place the rest amounts to the removal of ONLY from the constraint check queries plus some codeto propagate constraint triggers to child tables. FWIW, we use inheritance for something other than partitioning, and I created a trigger that provides a crude form of a foreignkey constraint, as well as one that provides a crude global unique constraint on the PK. Both probably have holesand race conditions, but I figure they're better than just hoping no one screws something up. BTW, my intention is to release all the generic tools we've developed to pgFoundry, it just hasn't happened yet. If enoughpeople find this stuff interesting I can try and up the priority on getting that done. (And if you're *really* wantingthis stuff you could pay 2nd Quadrant or CMD to get it for you.) test_us@workbook.local=# \df+ payment_instruments.tg_payment_instruments_unique List of functions -[ RECORD 1 ]-------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Schema | payment_instruments Name | tg_payment_instruments_unique Result data type | trigger Argument data types | Volatility | volatile Owner | cnuadmin Language | plpgsql Source code | : : DECLARE : name CONSTANT text := 'payment_instruments.tg_payment_instruments_unique'; : c_full_table_name CONSTANT text := TG_TABLE_SCHEMA|| '.' || TG_TABLE_NAME; : BEGIN : PERFORM tools.assert( TG_WHEN ='BEFORE', TG_NAME || ' ON ' || c_full_table_name ||' must be an BEFORE trigger' ); : PERFORM tools.assert(TG_LEVEL = 'ROW', TG_NAME || ' ON ' || c_full_table_name ||' must be a row-level trigger' ); : : -- Deleting would break RI, so don't allow it. Granted, this should probably be a separatetrigger, but... : PERFORM tools.assert( 'payment_instruments__payment_instruments__inherit__no_delete' : , TG_OP != 'DELETE' : , 'DELETEs are not allowed on ' || c_full_table_name || ' (they would break inheritance RI)' : ); : : RAISE DEBUG '%: : TG_OP = % : TG_TABLE_NAME = % : NEW.payment_instrument_id = %' : , name : , TG_OP : , TG_TABLE_NAME : , NEW.payment_instrument_id : ; : : -- Changing the PK wouldbreak RI, so we shouldn't allow it. Granted, this should probably be a separate trigger, but... : IF TG_OP = 'UPDATE' THEN : PERFORM tools.assert( 'payment_instruments__payment_instruments__inherit__pk_no_change' : , NEW.payment_instrument_idIS NOT DISTINCT FROM OLD.payment_instrument_id : , 'Changing payment_instrument_idon ' || c_full_table_name || ' is not allowed (it would break inheritance RI)' : ); : ELSE : -- Only check for dupes on insert, otherwise we'll seeour own ID : PERFORM tools.assert( 'payment_instruments__payment_instruments__inherit__unique' : , NOT EXISTS( SELECT * FROM payment_instruments.payment_instrumentsWHERE payment_instrument_id = NEW.payment_instrument_id ) : , 'duplicate row violation, payment_instrument_id ' || coalesce( NEW.payment_instrument_id::text, '<NULL>' ) || 'already exists' : ); : END IF; : : RETURN NEW; : END; : : Description | Trigger to try and prevent duplicated payment_instrument_ids. This trigger is in no way perfect andhas a huge race condition, but generally these IDs should be getting assigned by a sequence, so we should not normallyhave an issue with duped IDs anyway. test_us@workbook.local=# \df+ payment_instruments.tg_payment_instruments_ri List of functions -[ RECORD 1 ]-------+------------------------------------------------------------------------------------------------------------------------------------------ Schema | payment_instruments Name | tg_payment_instruments_ri Result data type | trigger Argument data types | Volatility | volatile Owner | cnuadmin Language | plpgsql Source code | : : DECLARE : name CONSTANT text := 'payment_instruments.tg_payment_instruments_ri'; : c_full_table_name CONSTANT text := TG_TABLE_SCHEMA|| '.' || TG_TABLE_NAME; : v_payment_instrument_type payment_instruments.payment_instrument_types.payment_instrument_type%TYPE; : v_table_name text; : v_only text := ''; : v_result int; : sql text; : BEGIN : PERFORM tools.assert( TG_WHEN = 'AFTER', TG_NAME || ' ON ' || c_full_table_name ||'must be an AFTER trigger' ); : PERFORM tools.assert( TG_LEVEL = 'ROW', TG_NAME || ' ON ' || c_full_table_name||' must be a row-level trigger' ); : PERFORM tools.assert( TG_OP IN ( 'INSERT', 'UPDATE'), TG_NAME || ' ON ' || c_full_table_name ||' must be on INSERT or UPDATE' ); : : -- Th generally won't be allowed, but the trigger should still support it : IF NEW.payment_instrument_idIS NULL THEN : RAISE DEBUG '%: payment_instrument_id is NULL, skippingcheck', name; : RETURN NULL; : END IF; : : -- If we're updating and we haven't modified payment_instrument_id, just bail : IF TG_OP = 'UPDATE' THEN : IF NEW.payment_instrument_id IS NOT DISTINCT FROM OLD.payment_instrument_idTHEN : RAISE DEBUG '%: payment_instrument_id ( = % ) is unchanged,skipping check', name, NEW.payment_instrument_id; : RETURN NULL; : END IF; : END IF; : : /* : * We want to not only check for existence of the desired row, we also want : * to share-lock it.Unfortunately, sharelocks aren't implemented for : * inherited tables, so we need to find the recordin the correct table. We : * can do this fairly easily if we can find out what "type" of recordit is. : */ : : -- Figure out what type of record thisis (of course, record might not exist here) : SELECT INTO v_payment_instrument_type 'payment_instruments.'|| payment_instrument_type || 's' : FROM payment_instruments.payment_instrument_types__get(( : SELECT payment_instrument_type_id : FROM payment_instruments.payment_instruments pi : WHERE pi.payment_instrument_id = NEW.payment_instrument_id : ) ) : ; : IF NOT FOUND OR v_payment_instrument_type IS NULL THEN : -- There wasn't a record at all : v_only := 'ONLY '; -- FOR SHAREwon't work if we select from both the parent and the children : v_table_name := 'payment_instruments.payment_instruments'; : ELSE : -- We figured out whattype of record this is, now try and lock the row in the right table : v_table_name := v_payment_instrument_type; : END IF; : : sql := 'SELECT 1 FROM' || v_only || v_table_name || ' : WHERE payment_instrument_id = ' || NEW.payment_instrument_id|| ' : FOR SHARE' : ; : RAISEDEBUG '%: Executing SQL %', name, sql; : -- Note that simply using a PERFORM here might get optimizedout. : EXECUTE sql INTO v_result; : : /* : * Normally we should always find an record, but if it was somehow : * put in thewrong table, or if it was deleted after our initial : * select then we wouldn't have one. : */ : IF v_result = 1 THEN : RAISE DEBUG '%: record forpayment_instrument_id = % found in table %', name, NEW.payment_instrument_id, v_table_name; : RETURN NULL; : END IF; : : RAISE EXCEPTION 'insert on updateon table "%.%" violates foreign key; payment_instrument_id=(%) is not present in table "%"' : , TG_TABLE_SCHEMA : , TG_TABLE_NAME : , NEW.payment_instrument_id : , v_table_name : ; : END; : : Description | Enables Refferential Integrity at the payment_instruments level (normal RI on a table that is an inheritanceparent does not really work) -- Jim C. Nasby, Database Architect jim@nasby.net 512.569.9461 (cell) http://jim.nasby.net
On Thu, May 6, 2010 at 10:38 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > * the index grows as the size of the total data set, it's not limited > by partition size > > * can't cheaply drop one partition any more, you have to vacuum the > (big) index first So I wholeheartedly agree with the general sentiment that if you need global indexes then partitioning just isn't really the right tool for you. But it occurs to me that we could defer the vacuum safely. I'm assuming a index heap-tid pointer for a global index would include a relid or some other identifier to specify which partition the tuple is in. If you drop that partition those can all just be left as dangling pointers as long as we don't reuse that id. So all we would need is some way to leave a catalog entry reserving that id. The data files can be truncated and deleted normally and whenever vacuum does run against the index it can clean up the catalog entries for the deleted partitions. But I would rather work on having unique and foreign key constraints that work on keys which include the partition key than work on global indexes. -- greg