Thread: [PATCH] Support for foreign keys with arrays
This patch adds basic support of arrays in foreign keys, by allowing to define a referencing column as an array of elements having the same type as the referenced column in the referenced table. Every NOT NULL element in the referencing array is matched against the referenced table. Example: CREATE TABLE pt ( id INTEGER PRIMARY KEY, ... ); CREATE TABLE ft ( id SERIAL PRIMARY KEY, pids INTEGER[] REFERENCES pt, ... ); This patch is for discussion and has been built against HEAD. It compiles and passes all regressions tests (including specific ones - see the src/test/regress/sql/foreign_key.sql file). Empty arrays, multi-dimensional arrays, duplicate elements and NULL values are allowed. We had to enforce some limitations, due to the lack (yet) of a clear and universally accepted behaviour and strategy. For example, consider the ON DELETE action on the above tables: in case of delete of a record in the 'pt' table, should we remove the whole row or just the values from the array? We hope we can start a discussion from here. Current limitations: * Only arrays of the same type as the primary key in the referenced table are supported * multi-column foreign keys are not supported (only single column) * Only RESTRICT and NO ACTION methods for referential integrity enforcement are currently supported TODO: * Improve check for empty arrays, which might interfere with SSI (see below) * Verify interaction with serializable transactions AUTHORS: * Gabriele Bartolini <gabriele.bartolini@2ndQuadrant.it> * Marco Nenciarini <marco.nenciarini@2ndQuadrant.it> Cheers, Gabriele (and Marco) -- Gabriele Bartolini - 2ndQuadrant Italia PostgreSQL Training, Services and Support gabriele.bartolini@2ndQuadrant.it | www.2ndQuadrant.it
Attachment
Hi Gabriele, On Fri, Nov 04, 2011 at 01:48:02PM +0100, Gabriele Bartolini wrote: > CREATE TABLE pt ( > id INTEGER PRIMARY KEY, > ... > ); > > CREATE TABLE ft ( > id SERIAL PRIMARY KEY, > pids INTEGER[] REFERENCES pt, > ... > ); This seems useful. I'm assuming the SQL spec says nothing about a feature like this? > This patch is for discussion and has been built against HEAD. > It compiles and passes all regressions tests (including specific ones - > see the src/test/regress/sql/foreign_key.sql file). > Empty arrays, multi-dimensional arrays, duplicate elements and NULL > values are allowed. With this patch, RI_Initial_Check does not detect a violation in an array that contains at least one conforming element: BEGIN; CREATE TABLE parent (c int PRIMARY KEY); CREATE TABLE child (c int[]); INSERT INTO parent VALUES (1); INSERT INTO child VALUES ('{3,1,2}'); ALTER TABLE child ADD FOREIGN KEY (c) REFERENCES parent; -- should error INSERT INTO child VALUES ('{3,1,2}'); -- does error, as expected ROLLBACK; The error message DETAIL on constraint violation would benefit from array-FK-specific language. Example of current message: ERROR: insert or update on table "child" violates foreign key constraint "child_c_fkey" DETAIL: Key (c)=({3,1,2}) is not present in table "parent". The patch is missing a change to the code that does FK=FK checks when a user updates the FK side: \set VERBOSITY verbose BEGIN; CREATE TABLE parent (c int PRIMARY KEY); CREATE TABLE child (c int[] REFERENCES parent); INSERT INTO parent VALUES (1); INSERT INTO child VALUES ('{1,1}'); COMMIT; -- ERROR: XX000: no conversion function from integer[] to integer -- LOCATION: ri_HashCompareOp, ri_triggers.c:4097 UPDATE child SET c = '{1,1}'; DROP TABLE parent, child; COMMIT; Please audit each ri_triggers.c entry point for further problems like this. > We had to enforce some limitations, due to the lack (yet) of a clear and > universally accepted behaviour and strategy. > For example, consider the ON DELETE action on the above tables: in case > of delete of a record in the 'pt' table, should we remove the whole row > or just the values from the array? > We hope we can start a discussion from here. Removing values from the array seems best to me. There's no doubt about what ON UPDATE CASCADE should do, and having ON DELETE CASCADE excise individual array elements is consistent with that. It's less clear for SET NULL, but I'd continue with a per-element treatment. I'd continue to forbid SET DEFAULT. However, Jeff Davis did expect ON DELETE CASCADE to remove entire rows: http://archives.postgresql.org/message-id/1288119207.15279.24.camel@jdavis-ux.asterdata.local So, perhaps the behavior needs to be user-selectable. > Current limitations: > > * Only arrays of the same type as the primary key in the referenced > table are supported This is understandable for a WIP, but the final patch will need to use our existing, looser foreign key type match requirement. > * multi-column foreign keys are not supported (only single column) Any particular reason for this? > *** a/doc/src/sgml/ddl.sgml > --- b/doc/src/sgml/ddl.sgml > *************** > *** 764,769 **** CREATE TABLE order_items ( > --- 764,796 ---- > the last table. > </para> > > + <para> > + Another option you have with foreign keys is to use a referencing column > + which is an array of elements with the same type as the referenced column > + in the related table. This feature, also known as <firstterm>foreign key arrays</firstterm>, > + is described in the following example: Please wrap your documentation paragraphs. > *** a/src/backend/commands/tablecmds.c > --- b/src/backend/commands/tablecmds.c > *************** > *** 5705,5710 **** ATAddForeignKeyConstraint(AlteredTableInfo *tab, Relation rel, > --- 5705,5735 ---- > Oid ffeqop; > int16 eqstrategy; > > + /* Check if foreign key is an array of primary key types */ > + const bool is_foreign_key_array = (fktype == get_array_type (pktype)); We don't declare non-pointer, local variables "const". Also, [not positive on this one] when an initial assignment requires a comment, declare the variable with no assignment and no comment. Then, assign it later with the comment. This keeps the per-block declarations packed together. This test wrongly rejects FK types that are domains over the array type: BEGIN; CREATE TABLE parent (c int PRIMARY KEY); CREATE DOMAIN intarrdom AS int[]; CREATE TABLE child (c intarrdom REFERENCES parent); ROLLBACK; > + > + /* Enforce foreign key array restrictions */ > + if (is_foreign_key_array) > + { > + /* > + * Foreign key array must not be part of a multi-column foreign key > + */ > + if (is_foreign_key_array && numpks > 1) > + ereport(ERROR, > + (errcode(ERRCODE_INVALID_FOREIGN_KEY), > + errmsg("foreign key arrays must not be part of a multi-column foreign key"))); > + > + /* > + * We have to restrict foreign key array to NO ACTION and RESTRICT mode > + * until the behaviour triggered by the other actions is clearer and well defined > + */ > + if ((fkconstraint->fk_upd_action != FKCONSTR_ACTION_NOACTION && fkconstraint->fk_upd_action != FKCONSTR_ACTION_RESTRICT) > + || (fkconstraint->fk_del_action != FKCONSTR_ACTION_NOACTION && fkconstraint->fk_del_action != FKCONSTR_ACTION_RESTRICT)) Break these lines to keep things within 78 columns. Audit the remainder of your changes for long lines, and break when in doubt. > + ereport(ERROR, > + (errcode(ERRCODE_INVALID_FOREIGN_KEY), > + errmsg("NO ACTION and RESTRICT are the only supported actions for foreign key arrays"))); Error message constants can remain unbroken, though. > + } > + > /* We need several fields out of the pg_opclass entry */ > cla_ht = SearchSysCache1(CLAOID, ObjectIdGetDatum(opclasses[i])); > if (!HeapTupleIsValid(cla_ht)) > *************** > *** 5766,5772 **** ATAddForeignKeyConstraint(AlteredTableInfo *tab, Relation rel, > Oid target_typeids[2]; > > input_typeids[0] = pktype; > ! input_typeids[1] = fktype; > target_typeids[0] = opcintype; > target_typeids[1] = opcintype; > if (can_coerce_type(2, input_typeids, target_typeids, > --- 5791,5801 ---- > Oid target_typeids[2]; > > input_typeids[0] = pktype; > ! /* When is FKA we must use for FK the same type of PK */ > ! if (is_foreign_key_array) > ! input_typeids[1] = pktype; > ! else > ! input_typeids[1] = fktype; > target_typeids[0] = opcintype; > target_typeids[1] = opcintype; > if (can_coerce_type(2, input_typeids, target_typeids, This is bogus; the can_coerce_type test will always pass (excluding bad cases of catalog inconsistency). ATAddForeignKeyConstraint should choose to make an array foreign key whenever the PK side is a scalar and the FK side is an array. Then, grab the element type of the FK side and feed that through the operator-identification logic. > *** a/src/backend/utils/adt/ri_triggers.c > --- b/src/backend/utils/adt/ri_triggers.c > *************** > *** 460,465 **** RI_FKey_check(PG_FUNCTION_ARGS) > --- 460,466 ---- > char paramname[16]; > const char *querysep; > Oid queryoids[RI_MAX_NUMKEYS]; > + bool is_foreign_key_array = false; > > /* ---------- > * The query string built is > *************** > *** 476,493 **** RI_FKey_check(PG_FUNCTION_ARGS) > { > Oid pk_type = RIAttType(pk_rel, riinfo.pk_attnums[i]); > Oid fk_type = RIAttType(fk_rel, riinfo.fk_attnums[i]); > > quoteOneName(attname, > RIAttName(pk_rel, riinfo.pk_attnums[i])); > sprintf(paramname, "$%d", i + 1); > ! ri_GenerateQual(&querybuf, querysep, > ! attname, pk_type, > ! riinfo.pf_eq_oprs[i], > ! paramname, fk_type); > querysep = "AND"; > queryoids[i] = fk_type; > } > ! appendStringInfo(&querybuf, " FOR SHARE OF x"); > > /* Prepare and save the plan */ > qplan = ri_PlanCheck(querybuf.data, riinfo.nkeys, queryoids, > --- 477,524 ---- > { > Oid pk_type = RIAttType(pk_rel, riinfo.pk_attnums[i]); > Oid fk_type = RIAttType(fk_rel, riinfo.fk_attnums[i]); > + is_foreign_key_array = (fk_type == get_array_type (pk_type)); Drop the extra whitespace before the function argument list. > > quoteOneName(attname, > RIAttName(pk_rel, riinfo.pk_attnums[i])); > sprintf(paramname, "$%d", i + 1); > ! /* > ! * In case of an array foreign key, we check that every > ! * DISTINCT NOT NULL value in the array is present in the PK table. > ! * XXX: This works because the query is executed with LIMIT 1, I found this comment confusing, since the SQL syntax "LIMIT 1" is never used here. I suppose you're referring to the fact that we call into SPI with tcount = 1? > ! * but may not work properly with SSI (a better approach would be > ! * to inspect the array and skip the check in case of empty arrays). Why might serializable transactions be specially affected? > ! */ > ! if (is_foreign_key_array) > ! { > ! appendStringInfo(&querybuf, " %s (SELECT count(*) FROM (SELECT DISTINCT UNNEST(%s)) y WHERE y IS NOTNULL)", querysep, paramname); > ! appendStringInfo(&querybuf, " = (SELECT count(*) FROM (SELECT 1 FROM ONLY %s y", pkrelname); > ! ri_GenerateQual(&querybuf, "WHERE", > ! attname, pk_type, > ! riinfo.pf_eq_oprs[i], > ! paramname, fk_type); > ! /* > ! * We lock for share every row in the pkreltable that is > ! * referenced by the array elements > ! */ > ! appendStringInfo(&querybuf, " FOR SHARE OF y) z)"); The resulting query performs an irrelevant sequential scan on the PK table: SELECT 1 FROM ONLY "public"."parent" x WHERE (SELECT count(*) FROM (SELECT DISTINCT UNNEST($1)) y WHERE y IS NOT NULL) =(SELECT count(*) FROM (SELECT 1 FROM ONLY "public"."parent" y WHERE "c" OPERATOR(pg_catalog.=) ANY ($1) FOR SHARE OF y)z) As you suggested with that comment above, this scan always ends after one row. That places a bound on the actual performance hit. However, we still read the one row, which may mean loading a page for nothing. At a minimum, simplify this query to: SELECT 1 WHERE (SELECT count(*) FROM (SELECT DISTINCT UNNEST($1)) y WHERE y IS NOT NULL) =(SELECT count(*) FROM (SELECT 1 FROM ONLY "public"."parent" y WHERE "c" OPERATOR(pg_catalog.=) ANY ($1) FOR SHARE OF y)z) That also naturally handles empty arrays against empty PK tables, which currently fail for me even at READ COMMITTED: BEGIN; CREATE TABLE parent (c int PRIMARY KEY); CREATE TABLE child (c int[] REFERENCES parent); INSERT INTO child VALUES ('{}'); -- fails wrongly ROLLBACK; > ! } > ! else > ! { > ! ri_GenerateQual(&querybuf, querysep, > ! attname, pk_type, > ! riinfo.pf_eq_oprs[i], > ! paramname, fk_type); > ! } > querysep = "AND"; > queryoids[i] = fk_type; > } > ! /* > ! * We skip locking for share in case of foreign key arrays > ! * as it has been done in the inner subquery > ! */ > ! if (! is_foreign_key_array) Drop the whitespace after the "!". > ! appendStringInfo(&querybuf, " FOR SHARE OF x"); > > /* Prepare and save the plan */ > qplan = ri_PlanCheck(querybuf.data, riinfo.nkeys, queryoids, > *** a/src/test/regress/expected/foreign_key.out > --- b/src/test/regress/expected/foreign_key.out > *************** > *** 968,978 **** drop table pktable; > drop table pktable_base; > -- 2 columns (1 table), mismatched types > create table pktable_base(base1 int not null, base2 int); > - create table pktable(ptest1 inet, ptest2 inet[], primary key(base1, ptest1), foreign key(base2, ptest2) references > - pktable(base1, ptest1)) inherits (pktable_base); > - NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "pktable_pkey" for table "pktable" > - ERROR: foreign key constraint "pktable_base2_fkey" cannot be implemented > - DETAIL: Key columns "ptest2" and "ptest1" are of incompatible types: inet[] and inet. Instead of deleting this test, change the type from inet[] to something unrelated, like float8. Thanks, nm
Hello 2011/11/17 Noah Misch <noah@leadboat.com>: > Hi Gabriele, > > On Fri, Nov 04, 2011 at 01:48:02PM +0100, Gabriele Bartolini wrote: >> CREATE TABLE pt ( >> id INTEGER PRIMARY KEY, >> ... >> ); >> >> CREATE TABLE ft ( >> id SERIAL PRIMARY KEY, >> pids INTEGER[] REFERENCES pt, >> ... >> ); > > This seems useful. > will be supported situation CREATE TABLE main( id int[] PRIMARY KEY, ... ); CREATE TABLE child( main_id int[] REFERENCES main(id), ?? Regards Pavel Stehule > I'm assuming the SQL spec says nothing about a feature like this? > >> This patch is for discussion and has been built against HEAD. >> It compiles and passes all regressions tests (including specific ones - >> see the src/test/regress/sql/foreign_key.sql file). >> Empty arrays, multi-dimensional arrays, duplicate elements and NULL >> values are allowed. > > With this patch, RI_Initial_Check does not detect a violation in an array that > contains at least one conforming element: > > BEGIN; > CREATE TABLE parent (c int PRIMARY KEY); > CREATE TABLE child (c int[]); > INSERT INTO parent VALUES (1); > INSERT INTO child VALUES ('{3,1,2}'); > ALTER TABLE child ADD FOREIGN KEY (c) REFERENCES parent; -- should error > INSERT INTO child VALUES ('{3,1,2}'); -- does error, as expected > ROLLBACK; > > The error message DETAIL on constraint violation would benefit from > array-FK-specific language. Example of current message: > > ERROR: insert or update on table "child" violates foreign key constraint "child_c_fkey" > DETAIL: Key (c)=({3,1,2}) is not present in table "parent". > > > The patch is missing a change to the code that does FK=FK checks when a user > updates the FK side: > > \set VERBOSITY verbose > BEGIN; > CREATE TABLE parent (c int PRIMARY KEY); > CREATE TABLE child (c int[] REFERENCES parent); > INSERT INTO parent VALUES (1); > INSERT INTO child VALUES ('{1,1}'); > COMMIT; > -- ERROR: XX000: no conversion function from integer[] to integer > -- LOCATION: ri_HashCompareOp, ri_triggers.c:4097 > UPDATE child SET c = '{1,1}'; > DROP TABLE parent, child; > COMMIT; > > Please audit each ri_triggers.c entry point for further problems like this. > >> We had to enforce some limitations, due to the lack (yet) of a clear and >> universally accepted behaviour and strategy. >> For example, consider the ON DELETE action on the above tables: in case >> of delete of a record in the 'pt' table, should we remove the whole row >> or just the values from the array? >> We hope we can start a discussion from here. > > Removing values from the array seems best to me. There's no doubt about what > ON UPDATE CASCADE should do, and having ON DELETE CASCADE excise individual > array elements is consistent with that. It's less clear for SET NULL, but I'd > continue with a per-element treatment. I'd continue to forbid SET DEFAULT. > > However, Jeff Davis did expect ON DELETE CASCADE to remove entire rows: > http://archives.postgresql.org/message-id/1288119207.15279.24.camel@jdavis-ux.asterdata.local > So, perhaps the behavior needs to be user-selectable. > >> Current limitations: >> >> * Only arrays of the same type as the primary key in the referenced >> table are supported > > This is understandable for a WIP, but the final patch will need to use our > existing, looser foreign key type match requirement. > >> * multi-column foreign keys are not supported (only single column) > > Any particular reason for this? > >> *** a/doc/src/sgml/ddl.sgml >> --- b/doc/src/sgml/ddl.sgml >> *************** >> *** 764,769 **** CREATE TABLE order_items ( >> --- 764,796 ---- >> the last table. >> </para> >> >> + <para> >> + Another option you have with foreign keys is to use a referencing column >> + which is an array of elements with the same type as the referenced column >> + in the related table. This feature, also known as <firstterm>foreign key arrays</firstterm>, >> + is described in the following example: > > Please wrap your documentation paragraphs. > >> *** a/src/backend/commands/tablecmds.c >> --- b/src/backend/commands/tablecmds.c >> *************** >> *** 5705,5710 **** ATAddForeignKeyConstraint(AlteredTableInfo *tab, Relation rel, >> --- 5705,5735 ---- >> Oid ffeqop; >> int16 eqstrategy; >> >> + /* Check if foreign key is an array of primary key types */ >> + const bool is_foreign_key_array = (fktype == get_array_type (pktype)); > > We don't declare non-pointer, local variables "const". Also, [not positive on > this one] when an initial assignment requires a comment, declare the variable > with no assignment and no comment. Then, assign it later with the comment. > This keeps the per-block declarations packed together. > > > This test wrongly rejects FK types that are domains over the array type: > > BEGIN; > CREATE TABLE parent (c int PRIMARY KEY); > CREATE DOMAIN intarrdom AS int[]; > CREATE TABLE child (c intarrdom REFERENCES parent); > ROLLBACK; > >> + >> + /* Enforce foreign key array restrictions */ >> + if (is_foreign_key_array) >> + { >> + /* >> + * Foreign key array must not be part of a multi-column foreign key >> + */ >> + if (is_foreign_key_array && numpks > 1) >> + ereport(ERROR, >> + (errcode(ERRCODE_INVALID_FOREIGN_KEY), >> + errmsg("foreign key arrays must not be part of a multi-column foreign key"))); >> + >> + /* >> + * We have to restrict foreign key array to NO ACTION and RESTRICT mode >> + * until the behaviour triggered by the other actions is clearer and well defined >> + */ >> + if ((fkconstraint->fk_upd_action != FKCONSTR_ACTION_NOACTION && fkconstraint->fk_upd_action != FKCONSTR_ACTION_RESTRICT) >> + || (fkconstraint->fk_del_action != FKCONSTR_ACTION_NOACTION && fkconstraint->fk_del_action!= FKCONSTR_ACTION_RESTRICT)) > > Break these lines to keep things within 78 columns. Audit the remainder of > your changes for long lines, and break when in doubt. > >> + ereport(ERROR, >> + (errcode(ERRCODE_INVALID_FOREIGN_KEY), >> + errmsg("NO ACTION and RESTRICT are the only supported actions for foreign key arrays"))); > > Error message constants can remain unbroken, though. > >> + } >> + >> /* We need several fields out of the pg_opclass entry */ >> cla_ht = SearchSysCache1(CLAOID, ObjectIdGetDatum(opclasses[i])); >> if (!HeapTupleIsValid(cla_ht)) >> *************** >> *** 5766,5772 **** ATAddForeignKeyConstraint(AlteredTableInfo *tab, Relation rel, >> Oid target_typeids[2]; >> >> input_typeids[0] = pktype; >> ! input_typeids[1] = fktype; >> target_typeids[0] = opcintype; >> target_typeids[1] = opcintype; >> if (can_coerce_type(2, input_typeids, target_typeids, >> --- 5791,5801 ---- >> Oid target_typeids[2]; >> >> input_typeids[0] = pktype; >> ! /* When is FKA we must use for FK the same type of PK */ >> ! if (is_foreign_key_array) >> ! input_typeids[1] = pktype; >> ! else >> ! input_typeids[1] = fktype; >> target_typeids[0] = opcintype; >> target_typeids[1] = opcintype; >> if (can_coerce_type(2, input_typeids, target_typeids, > > This is bogus; the can_coerce_type test will always pass (excluding bad cases > of catalog inconsistency). > > ATAddForeignKeyConstraint should choose to make an array foreign key whenever > the PK side is a scalar and the FK side is an array. Then, grab the element > type of the FK side and feed that through the operator-identification logic. > >> *** a/src/backend/utils/adt/ri_triggers.c >> --- b/src/backend/utils/adt/ri_triggers.c >> *************** >> *** 460,465 **** RI_FKey_check(PG_FUNCTION_ARGS) >> --- 460,466 ---- >> char paramname[16]; >> const char *querysep; >> Oid queryoids[RI_MAX_NUMKEYS]; >> + bool is_foreign_key_array = false; >> >> /* ---------- >> * The query string built is >> *************** >> *** 476,493 **** RI_FKey_check(PG_FUNCTION_ARGS) >> { >> Oid pk_type = RIAttType(pk_rel, riinfo.pk_attnums[i]); >> Oid fk_type = RIAttType(fk_rel, riinfo.fk_attnums[i]); >> >> quoteOneName(attname, >> RIAttName(pk_rel, riinfo.pk_attnums[i])); >> sprintf(paramname, "$%d", i + 1); >> ! ri_GenerateQual(&querybuf, querysep, >> ! attname, pk_type, >> ! riinfo.pf_eq_oprs[i], >> ! paramname, fk_type); >> querysep = "AND"; >> queryoids[i] = fk_type; >> } >> ! appendStringInfo(&querybuf, " FOR SHARE OF x"); >> >> /* Prepare and save the plan */ >> qplan = ri_PlanCheck(querybuf.data, riinfo.nkeys, queryoids, >> --- 477,524 ---- >> { >> Oid pk_type = RIAttType(pk_rel, riinfo.pk_attnums[i]); >> Oid fk_type = RIAttType(fk_rel, riinfo.fk_attnums[i]); >> + is_foreign_key_array = (fk_type == get_array_type (pk_type)); > > Drop the extra whitespace before the function argument list. > >> >> quoteOneName(attname, >> RIAttName(pk_rel, riinfo.pk_attnums[i])); >> sprintf(paramname, "$%d", i + 1); >> ! /* >> ! * In case of an array foreign key, we check that every >> ! * DISTINCT NOT NULL value in the array is present in the PK table. >> ! * XXX: This works because the query is executed with LIMIT 1, > > I found this comment confusing, since the SQL syntax "LIMIT 1" is never used > here. I suppose you're referring to the fact that we call into SPI with > tcount = 1? > >> ! * but may not work properly with SSI (a better approach would be >> ! * to inspect the array and skip the check in case of empty arrays). > > Why might serializable transactions be specially affected? > >> ! */ >> ! if (is_foreign_key_array) >> ! { >> ! appendStringInfo(&querybuf, " %s (SELECT count(*) FROM (SELECT DISTINCT UNNEST(%s)) y WHEREy IS NOT NULL)", querysep, paramname); >> ! appendStringInfo(&querybuf, " = (SELECT count(*) FROM (SELECT 1 FROM ONLY %s y", pkrelname); >> ! ri_GenerateQual(&querybuf, "WHERE", >> ! attname, pk_type, >> ! riinfo.pf_eq_oprs[i], >> ! paramname, fk_type); >> ! /* >> ! * We lock for share every row in the pkreltable that is >> ! * referenced by the array elements >> ! */ >> ! appendStringInfo(&querybuf, " FOR SHARE OF y) z)"); > > The resulting query performs an irrelevant sequential scan on the PK table: > > SELECT 1 FROM ONLY "public"."parent" x WHERE (SELECT count(*) FROM (SELECT DISTINCT UNNEST($1)) y WHERE y IS NOT NULL)= (SELECT count(*) FROM (SELECT 1 FROM ONLY "public"."parent" y WHERE "c" OPERATOR(pg_catalog.=) ANY ($1) FOR SHAREOF y) z) > > As you suggested with that comment above, this scan always ends after one row. > That places a bound on the actual performance hit. However, we still read the > one row, which may mean loading a page for nothing. At a minimum, simplify > this query to: > > SELECT 1 WHERE (SELECT count(*) FROM (SELECT DISTINCT UNNEST($1)) y WHERE y IS NOT NULL)= (SELECT count(*) FROM (SELECT 1 FROM ONLY "public"."parent" y WHERE "c" OPERATOR(pg_catalog.=) ANY ($1) FOR SHAREOF y) z) > > That also naturally handles empty arrays against empty PK tables, which > currently fail for me even at READ COMMITTED: > > BEGIN; > CREATE TABLE parent (c int PRIMARY KEY); > CREATE TABLE child (c int[] REFERENCES parent); > INSERT INTO child VALUES ('{}'); -- fails wrongly > ROLLBACK; > >> ! } >> ! else >> ! { >> ! ri_GenerateQual(&querybuf, querysep, >> ! attname, pk_type, >> ! riinfo.pf_eq_oprs[i], >> ! paramname, fk_type); >> ! } >> querysep = "AND"; >> queryoids[i] = fk_type; >> } >> ! /* >> ! * We skip locking for share in case of foreign key arrays >> ! * as it has been done in the inner subquery >> ! */ >> ! if (! is_foreign_key_array) > > Drop the whitespace after the "!". > >> ! appendStringInfo(&querybuf, " FOR SHARE OF x"); >> >> /* Prepare and save the plan */ >> qplan = ri_PlanCheck(querybuf.data, riinfo.nkeys, queryoids, > >> *** a/src/test/regress/expected/foreign_key.out >> --- b/src/test/regress/expected/foreign_key.out >> *************** >> *** 968,978 **** drop table pktable; >> drop table pktable_base; >> -- 2 columns (1 table), mismatched types >> create table pktable_base(base1 int not null, base2 int); >> - create table pktable(ptest1 inet, ptest2 inet[], primary key(base1, ptest1), foreign key(base2, ptest2) references >> - pktable(base1, ptest1)) inherits (pktable_base); >> - NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "pktable_pkey" for table "pktable" >> - ERROR: foreign key constraint "pktable_base2_fkey" cannot be implemented >> - DETAIL: Key columns "ptest2" and "ptest1" are of incompatible types: inet[] and inet. > > Instead of deleting this test, change the type from inet[] to something > unrelated, like float8. > > Thanks, > nm > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers >
Noah Misch <noah@leadboat.com> writes: > On Fri, Nov 04, 2011 at 01:48:02PM +0100, Gabriele Bartolini wrote: >> CREATE TABLE pt ( >> id INTEGER PRIMARY KEY, >> >> CREATE TABLE ft ( >> id SERIAL PRIMARY KEY, >> pids INTEGER[] REFERENCES pt, > I'm assuming the SQL spec says nothing about a feature like this? I'm pretty certain that the SQL spec flat out forbids this. The least we could do is invent some non-spec syntax that makes the intention clear, rather than having the system assume that an error case was intended to mean something else. Maybe pids INTEGER[] ARRAY REFERENCES pt, or something like that. (ARRAY is a fully reserved word already, so I think this syntax should work, but I've not tried it.) BTW, has anyone thought through whether this is a sane idea at all? It seems to me to be full of cases that will require rather arbitrary decisions, like whether ON DELETE CASCADE should involve deleting the whole row or just one array element. regards, tom lane
> BTW, has anyone thought through whether this is a sane idea at all? > It seems to me to be full of cases that will require rather arbitrary > decisions, like whether ON DELETE CASCADE should involve deleting the > whole row or just one array element. One array element, presumably. Does the patch implement CASCADE? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
2011/11/17 Tom Lane <tgl@sss.pgh.pa.us>: > Noah Misch <noah@leadboat.com> writes: >> On Fri, Nov 04, 2011 at 01:48:02PM +0100, Gabriele Bartolini wrote: >>> CREATE TABLE pt ( >>> id INTEGER PRIMARY KEY, >>> >>> CREATE TABLE ft ( >>> id SERIAL PRIMARY KEY, >>> pids INTEGER[] REFERENCES pt, > >> I'm assuming the SQL spec says nothing about a feature like this? > > I'm pretty certain that the SQL spec flat out forbids this. > > The least we could do is invent some non-spec syntax that makes the > intention clear, rather than having the system assume that an error case > was intended to mean something else. Maybe > > pids INTEGER[] ARRAY REFERENCES pt, > > or something like that. (ARRAY is a fully reserved word already, > so I think this syntax should work, but I've not tried it.) +1 Regards Pavel Stehule > > BTW, has anyone thought through whether this is a sane idea at all? > It seems to me to be full of cases that will require rather arbitrary > decisions, like whether ON DELETE CASCADE should involve deleting the > whole row or just one array element. > > regards, tom lane > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers >
Folks, BTW, I don't want to block this patch. However, it occurs to me that a more generalized FK based on non-equality conditions (i.e. expressions) would be nice if it were possible. Then we could have FKs from all kinds of complex structures. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On Wed, Nov 16, 2011 at 11:28 PM, Noah Misch <noah@leadboat.com> wrote: > > Removing values from the array seems best to me. There's no doubt about what > ON UPDATE CASCADE should do, and having ON DELETE CASCADE excise individual > array elements is consistent with that. It's less clear for SET NULL, but I'd > continue with a per-element treatment. I'd continue to forbid SET DEFAULT. > > However, Jeff Davis did expect ON DELETE CASCADE to remove entire rows: > http://archives.postgresql.org/message-id/1288119207.15279.24.camel@jdavis-ux.asterdata.local > So, perhaps the behavior needs to be user-selectable. > i will agree with Jeff on this... i mean, on the normal case it will delete the row. no? the docs says about the CASCADE action """ CASCADE Delete any rows referencing the deleted row, or update the value of the referencing column to the new value of the referenced column, respectively. """ so, that is what i will expect -- Jaime Casanova www.2ndQuadrant.com Professional PostgreSQL: Soporte 24x7 y capacitación
Josh Berkus <josh@agliodbs.com> writes: >> BTW, has anyone thought through whether this is a sane idea at all? >> It seems to me to be full of cases that will require rather arbitrary >> decisions, like whether ON DELETE CASCADE should involve deleting the >> whole row or just one array element. > One array element, presumably. Um, why? One reasonable interpretation of an array reference is that the row depends on *all* of the referenced pkeys. Also, if you do delete one array element at a time, what do you do when the array becomes empty --- delete the row, or not, and in each case what's your semantic justification for that choice? In short, "presumably" doesn't cut it here. regards, tom lane
Hi Noah, thanks for your unvaluable review, rich of useful and thorough comments and notes. Marco and myself will add your proposed tests as soon as possible (most likely after the Italian PGDay which is this week). However, given the feedback received from other developers too (including Tom), I would first concentrate on defining the syntax and how referential integrity actions should work. Il 17/11/11 05:28, Noah Misch ha scritto: > Removing values from the array seems best to me. There's no doubt > about what ON UPDATE CASCADE should do, and having ON DELETE CASCADE > excise individual array elements is consistent with that. It's less > clear for SET NULL, but I'd continue with a per-element treatment. I'd > continue to forbid SET DEFAULT. However, Jeff Davis did expect ON > DELETE CASCADE to remove entire rows: > http://archives.postgresql.org/message-id/1288119207.15279.24.camel@jdavis-ux.asterdata.local > So, perhaps the behavior needs to be user-selectable. I would agree with what Tom is saying here, given that SQL specs do not say anything about this feature. We could leave standard REFERENCES keyword handling the array value as it is now. If a user wants to take advantage of in-array referential integrity, we could implement the special keyword "ARRAY REFERENCES" as Tom proposes (or a similar keyword). Consequently, we need to agree on what the actions on delete and update operations are. In case of ARRAY REFERENCES, I would be inclined to leave the same meaning of ROW scope actions to CASCADE and SET NULL actions, while disallowing the SET DEFAULT action (as Noah suggests too). At the same time, I would add two actions for ARRAY REFERENCES which will be processing array elements: * ARRAY CASCADE * ARRAY SET NULL (Of course if you are welcome to propose a better naming convention). This table summarises the scope of the actions. --------------- --------- --------- | ON | ON | Action | DELETE | UPDATE | --------------- --------- --------- CASCADE | Row | Element | SET NULL | Row | Row | ARRAY CASCADE | Element | Element | ARRAY SET NULL | Element | Element | SET DEFAULT | Error | Error | NO ACTION | - | - | RESTRICT | - | - | --------------- --------- --------- For instance, with an "ARRAY REFERENCES ... ON DELETE CASCADE", I would expect that the whole row is deleted (as Jeff et al. say). However, if I specify "ARRAY REFERENCES ... ON DELETE ARRAY CASCADE", I would expect that elements in the referencing array are removed. Similary the "ARRAY REFERENCES ... ON DELETE SET NULL" will set the row to NULL, whereas "ARRAY REFERENCES ... ON DELETE ARRAY SET NULL" will set individual elements in the referencing array to NULL. In case of updates, SET NULL and ARRAY SET NULL works the same (updating the whole row or the single elements). CASCADE and ARRAY CASCADE are synonyms, as they would work in individual elements (which is the action that makes more sense anyway). I believe that, before we proceed with one implementation or another, it is important we discuss this sort of things and agree on a possible long-term path (so that we can organise intermediate deliverables). Thanks, Gabriele -- Gabriele Bartolini - 2ndQuadrant Italia PostgreSQL Training, Services and Support gabriele.bartolini@2ndQuadrant.it |www.2ndQuadrant.it
Hi Gabriele and Marco, On Sun, Nov 20, 2011 at 10:36:15AM +0100, Gabriele Bartolini wrote: > --------------- --------- --------- > | ON | ON | > Action | DELETE | UPDATE | > --------------- --------- --------- > CASCADE | Row | Element | > SET NULL | Row | Row | > ARRAY CASCADE | Element | Element | > ARRAY SET NULL | Element | Element | > SET DEFAULT | Error | Error | > NO ACTION | - | - | > RESTRICT | - | - | > --------------- --------- --------- thank you for this very clear and concise summary! I agree with your appeal for a broad discussion on the proposed syntax, and I will use the same language to express my proposal (for clarity and to simplify the discussion): ------------------ --------- --------------- | ON | ON | Action | DELETE | UPDATE | ------------------ --------- --------------- CASCADE | Element | Element | SET NULL | Element | Element | SET DEFAULT | Error | Error | ARRAY CASCADE | Row | Element = Row | ARRAY SET NULL | Row | Row | ARRAY SET DEFAULT | Row | Row | NO ACTION | - | - | RESTRICT | - | - | ------------------ --------- --------------- I have swapped your syntax in the following way which looks cleaner to me: the ARRAY (CASCADE | SET NULL | SET DEFAULT) syntax denote operations that happen on the whole array, and CASCADE | SET NULL | SET DEFAULT denote instead operations that happen on the elements of the array. Associating the "Element" behaviour with the ON DELETE CASCADE syntax is also consistent with the case where the referencing table A is constructed as GROUP BY of another table B, and the array reference on A is built by aggregating a non-array reference on B with ON DELETE CASCADE syntax. In other words, the same syntax (ON DELETE CASCADE) would denote the same behaviour in both the aggregated case ( = one row per object, using array references) and the non-aggregated case (multiple rows for one object, using equality references), which represent two distinct implementations of the same abstraction. The "Row" behaviour would instead be associated to a new syntax (ON DELETE ARRAY CASCADE), which cannot be obtained via the existing syntax in the non-aggregated implementation, on the grounds that it might be useful for some semantics (for instance: if you remove a vertex from a polygon, you can either destroy the polygon or transform it into a polygon with less vertices). The same principle of considering the two implementations as the same abstraction would also confirm your choice to raise an exception on ON (DELETE | UPDATE) SET DEFAULT. It would also suggest to enable ON (DELETE | UPDATE) ARRAY SET DEFAULT. The reasoning is that we can actually specify a default array in the referencing column, but we can't specify a default element. Before I briefly thought to use the referenced column default as a default for the single element, but it seems a bad idea: a default is an expression (possibly non-constant) which is evaluated only when a new row is created in the referenced table, and using it outside of that context looks inappropriate. Regarding ON UPDATE ARRAY CASCADE, I agree to make it a synonym, since updating the whole array to take into account the update on the referenced field is equivalent to updating the single element to take into account the same fact. Finally, ON UPDATE ARRAY SET NULL would still have an use case as a different behaviour than ON UPDATE SET NULL, which we make available to the database designer: instead of replacing the updated element in the array with a NULL, we replace the whole array with a NULL. This is essentially the same difference that we have between ON DELETE ARRAY CASCADE and ON DELETE CASCADE. Thanks, Dr. Gianni Ciolli - 2ndQuadrant Italia PostgreSQL Training, Services and Support gianni.ciolli@2ndquadrant.it | www.2ndquadrant.it
On Sun, Nov 20, 2011 at 10:36:15AM +0100, Gabriele Bartolini wrote: > I would agree with what Tom is saying here, given that SQL specs do not > say anything about this feature. We could leave standard REFERENCES > keyword handling the array value as it is now. If a user wants to take > advantage of in-array referential integrity, we could implement the > special keyword "ARRAY REFERENCES" as Tom proposes (or a similar > keyword). No objection to that. > --------------- --------- --------- > | ON | ON | > Action | DELETE | UPDATE | > --------------- --------- --------- > CASCADE | Row | Element | > SET NULL | Row | Row | > ARRAY CASCADE | Element | Element | > ARRAY SET NULL | Element | Element | > SET DEFAULT | Error | Error | > NO ACTION | - | - | > RESTRICT | - | - | > --------------- --------- --------- I like this. > CASCADE and ARRAY CASCADE are > synonyms, as they would work in individual elements (which is the action > that makes more sense anyway). What about making ON UPDATE CASCADE an error? That way, we can say that ARRAY <action> always applies to array elements, and plain <action> always applies to entire rows. SET DEFAULT should now be fine to allow. It's ARRAY SET DEFAULT, in your new terminology, that wouldn't make sense. Thanks, nm
On Fri, Nov 4, 2011 at 7:48 AM, Gabriele Bartolini <gabriele.bartolini@2ndquadrant.it> wrote: > This patch adds basic support of arrays in foreign keys, by allowing to > define a referencing column as an array of elements having the same type as > the referenced column in the referenced table. > Every NOT NULL element in the referencing array is matched against the > referenced table. I like the idea of being able to define more flexible foreign keys, but are we gilding the lily here? The proposed solution is really quite specific to the nuances of arrays. Perhaps there is a more general expression based syntax that leaves the door open for other types conditions such as dealing fields dependent on other fields? merlin
On mån, 2011-11-21 at 10:30 -0600, Merlin Moncure wrote: > I like the idea of being able to define more flexible foreign keys, > but are we gilding the lily here? The proposed solution is really > quite specific to the nuances of arrays. Perhaps there is a more > general expression based syntax that leaves the door open for other > types conditions such as dealing fields dependent on other fields? Yeah, basically you'd just need a contains and/or is-contained-by operator between the two types.
Hi Noah, thanks for your feedback. Il 20/11/11 14:05, Noah Misch ha scritto: > What about making ON UPDATE CASCADE an error? That way, we can say that ARRAY > <action> always applies to array elements, and plain<action> always applies to > entire rows. > > SET DEFAULT should now be fine to allow. It's ARRAY SET DEFAULT, in your new > terminology, that wouldn't make sense. I have tried to gather your ideas with Gianni's and come to a compromise, which I hope you can both agree on. The reason why I would be inclined to leave CASCADE act on rows (rather than array elements as Gianni suggests) is for backward compatibility (people that are already using referential integrity based on array values). For the same reason, I am not sure whether we should raise an error on update, but will leave this for later. So, here is a summary: --------------- --------- --------- | ON | ON | Action | DELETE | UPDATE | --------------- --------- --------- CASCADE | Row | Error | SET NULL | Row | Row | SET DEFAULT | Row | Row | ARRAY CASCADE | Element | Element | ARRAY SET NULL | Element | Element | NO ACTION | - | - | RESTRICT | - | - | --------------- --------- --------- If that's fine with you guys, Marco and I will refactor the development based on these assumptions. Thanks, Gabriele -- Gabriele Bartolini - 2ndQuadrant Italia PostgreSQL Training, Services and Support gabriele.bartolini@2ndQuadrant.it |www.2ndQuadrant.it
On Sat, Dec 10, 2011 at 09:47:53AM +0100, Gabriele Bartolini wrote: > Il 20/11/11 14:05, Noah Misch ha scritto: >> What about making ON UPDATE CASCADE an error? That way, we can say that ARRAY >> <action> always applies to array elements, and plain<action> always applies to >> entire rows. >> >> SET DEFAULT should now be fine to allow. It's ARRAY SET DEFAULT, in your new >> terminology, that wouldn't make sense. > > I have tried to gather your ideas with Gianni's and come to a > compromise, which I hope you can both agree on. > > The reason why I would be inclined to leave CASCADE act on rows (rather > than array elements as Gianni suggests) is for backward compatibility > (people that are already using referential integrity based on array > values). For the same reason, I am not sure whether we should raise an > error on update, but will leave this for later. Your conclusion is reasonable, but I don't understand this argument for it. The patch does not change the meaning of any SQL that works today. > So, here is a summary: > > --------------- --------- --------- > | ON | ON | > Action | DELETE | UPDATE | > --------------- --------- --------- > CASCADE | Row | Error | > SET NULL | Row | Row | > SET DEFAULT | Row | Row | > ARRAY CASCADE | Element | Element | > ARRAY SET NULL | Element | Element | > NO ACTION | - | - | > RESTRICT | - | - | > --------------- --------- --------- > > If that's fine with you guys, Marco and I will refactor the development > based on these assumptions. Looks fine.
On Thu, Nov 17, 2011 at 12:08:32AM -0500, Tom Lane wrote: > The least we could do is invent some non-spec syntax that makes the > intention clear, rather than having the system assume that an error case > was intended to mean something else. Maybe > > pids INTEGER[] ARRAY REFERENCES pt, +1. Perhaps this for the table_constraint syntax: FOREIGN KEY (ARRAY foo, bar, ARRAY pids) REFERENCES pt
Hello, Il giorno dom, 11/12/2011 alle 19.45 -0500, Noah Misch ha scritto: > On Sat, Dec 10, 2011 at 09:47:53AM +0100, Gabriele Bartolini wrote: > > So, here is a summary: > > > > --------------- --------- --------- > > | ON | ON | > > Action | DELETE | UPDATE | > > --------------- --------- --------- > > CASCADE | Row | Error | > > SET NULL | Row | Row | > > SET DEFAULT | Row | Row | > > ARRAY CASCADE | Element | Element | > > ARRAY SET NULL | Element | Element | > > NO ACTION | - | - | > > RESTRICT | - | - | > > --------------- --------- --------- > > > > If that's fine with you guys, Marco and I will refactor the development > > based on these assumptions. > > Looks fine. > This is our latest version of the patch. Gabriele, Gianni and I have discussed a lot and decided to send an initial patch which uses EACH REFERENCES instead of ARRAY REFERENCES. The reason behind this is that ARRAY REFERENCES generates an ambiguous grammar, and we all agreed that EACH REFERENCES makes sense (and the same time does not introduce any new keyword). This is however open for discussion. The patch now includes the following clauses on the delete/update actions - as per previous emails: --------------- --------- --------- | ON | ON | Action | DELETE | UPDATE | --------------- --------- --------- CASCADE | Row | Error | SET NULL | Row | Row | SET DEFAULT | Row | Row | ARRAY CASCADE | Element | Element | ARRAY SET NULL | Element | Element | NO ACTION | - | - | RESTRICT | - | - | --------------- --------- --------- We will resubmit the patch for the 2012-01 commit fest. Thank you, Marco -- Marco Nenciarini - 2ndQuadrant Italy PostgreSQL Training, Services and Support marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it
Attachment
Hello, Il giorno dom, 11/12/2011 alle 19.45 -0500, Noah Misch ha scritto: > On Sat, Dec 10, 2011 at 09:47:53AM +0100, Gabriele Bartolini wrote: > > So, here is a summary: > > > > --------------- --------- --------- > > | ON | ON | > > Action | DELETE | UPDATE | > > --------------- --------- --------- > > CASCADE | Row | Error | > > SET NULL | Row | Row | > > SET DEFAULT | Row | Row | > > ARRAY CASCADE | Element | Element | > > ARRAY SET NULL | Element | Element | > > NO ACTION | - | - | > > RESTRICT | - | - | > > --------------- --------- --------- > > > > If that's fine with you guys, Marco and I will refactor the development > > based on these assumptions. > > Looks fine. > This is our latest version of the patch. Gabriele, Gianni and I have discussed a lot and decided to send an initial patch which uses EACH REFERENCES instead of ARRAY REFERENCES. The reason behind this is that ARRAY REFERENCES generates an ambiguous grammar, and we all agreed that EACH REFERENCES makes sense (and the same time does not introduce any new keyword). This is however open for discussion. The patch now includes the following clauses on the delete/update actions - as per previous emails: --------------- --------- --------- | ON | ON | Action | DELETE | UPDATE | --------------- --------- --------- CASCADE | Row | Error | SET NULL | Row | Row | SET DEFAULT | Row | Row | ARRAY CASCADE | Element | Element | ARRAY SET NULL | Element | Element | NO ACTION | - | - | RESTRICT | - | - | --------------- --------- --------- We will resubmit the patch for the 2012-01 commit fest. Thank you, Marco -- Marco Nenciarini - System manager @ Devise.IT marco.nenciarini@devise.it | http://www.devise.it
Attachment
On Sat, Jan 14, 2012 at 08:18:48PM +0100, Marco Nenciarini wrote: > This is our latest version of the patch. Gabriele, Gianni and I have > discussed a lot and decided to send an initial patch which uses EACH > REFERENCES instead of ARRAY REFERENCES. The reason behind this is that > ARRAY REFERENCES generates an ambiguous grammar, and we all agreed that > EACH REFERENCES makes sense (and the same time does not introduce any > new keyword). This is however open for discussion. I greatly like that name; it would still make sense for other aggregate types, should we ever expand its use. Please complete the name change: the documentation, catalog entries, etc should all call them something like "each foreign key constraints" (I don't particularly like that exact wording). You currently forbid multi-column EACH FKs. I agree that we should allow only one array column per FK; with more, the set of required PK rows would be something like the Cartesian product of the elements of array columns. However, there are no definitional problems, at least for NO ACTION, around a FK constraint having one array column and N scalar columns. Whether or not you implement that now, let's choose a table_constraint syntax leaving that opportunity open. How about:FOREIGN KEY(col_a, EACH col_b, col_c) REFERENCES pktable (a, b, c) You've identified that we cannot generally implement the ON DELETE ARRAY CASCADE action on multidimensional arrays. This patch chooses to downgrade to ON DELETE ARRAY SET NULL in that case. My initial reaction is that it would be better to forbid multidimensional arrays in the column when the delete action is ON DELETE ARRAY SET NULL. That's not satisfying, either, because it makes the definition of conforming data depend on the ON DELETE action. Do we have other options? > --------------- --------- --------- > | ON | ON | > Action | DELETE | UPDATE | > --------------- --------- --------- > CASCADE | Row | Error | > SET NULL | Row | Row | > SET DEFAULT | Row | Row | > ARRAY CASCADE | Element | Element | > ARRAY SET NULL | Element | Element | > NO ACTION | - | - | > RESTRICT | - | - | > --------------- --------- --------- To complete the ARRAY -> EACH transition, I would suggest names like CASCADE EACH/SET EACH NULL. I like the extensive test cases you have included. There's one more thing they should do: leave tables having EACH REFERENCES relationships in the regression database. This way, pg_dump tests of the regression database will exercise pg_dump with respect to this feature. The patch emits several warnings: heap.c: In function `StoreRelCheck': heap.c:1947: warning: passing argument 17 of `CreateConstraintEntry' makes integer from pointer without a cast index.c: In function `index_constraint_create': index.c:1160: warning: passing argument 17 of `CreateConstraintEntry' makes integer from pointer without a cast In file included from gram.y:13051: scan.c: In function `yy_try_NUL_trans': scan.c:16243: warning: unused variable `yyg' trigger.c: In function `CreateTrigger': trigger.c:454: warning: passing argument 17 of `CreateConstraintEntry' makes integer from pointer without a cast typecmds.c: In function `domainAddConstraint': typecmds.c:2960: warning: passing argument 17 of `CreateConstraintEntry' makes integer from pointer without a cast arrayfuncs.c: In function `array_remove': arrayfuncs.c:5197: warning: unused variable `dimresult' ri_triggers.c: In function `RI_FKey_check': ri_triggers.c:484: warning: too many arguments for format This test case, copied from my previous review except for updating the syntax, still fails: BEGIN; CREATE TABLE parent (c int PRIMARY KEY); CREATE TABLE child (c int[]); INSERT INTO parent VALUES (1); INSERT INTO child VALUES ('{3,1,2}'); ALTER TABLE child ADD FOREIGN KEY (c) EACH REFERENCES parent; -- should error INSERT INTO child VALUES ('{3,1,2}'); -- does error, as expected ROLLBACK; Most of my code comments concern minor matters: > *** a/doc/src/sgml/ddl.sgml > --- b/doc/src/sgml/ddl.sgml > *************** CREATE TABLE order_items ( > *** 852,857 **** > --- 882,931 ---- > </para> > > <para> > + When working with foreign key arrays, you have two more > + options that can be used with your > + <literal>EACH REFERENCES</literal> definition: > + <literal>ARRAY CASCADE</literal> and > + <literal>ARRAY SET NULL</literal>. Depending on > + the triggering action (<command>DELETE</command> or > + <command>UPDATE</command>) on the referenced table, > + every element in the referencing array will be either > + deleted/updated or set to NULL. > + For more detailed information on foreign key arrays > + options and special cases, please refer to the > + documentation for <xref linkend="sql-createtable">. > + </para> > + > + <para> > + For instance, in the example below, a <command>DELETE</command> > + from the <literal>countries</literal> table will remove > + the referencing elements in the <literal>citizenship_ids</literal> > + array. > + > + <programlisting> > + CREATE TABLE countries ( > + country_id integer PRIMARY KEY, > + name text, > + ... > + ); > + > + CREATE TABLE people ( > + person_id integer PRIMARY KEY, > + first_name text, > + last_name text, > + ... > + citizenship_ids integer[] <emphasis>EACH REFERENCES countries > + ON DELETE ARRAY CASCADE ON UPDATE ARRAY CASCADE</emphasis> > + ); > + </programlisting> > + > + Consequently, an <command>UPDATE</command> of > + the <literal>country_id</literal> column will be propagated > + to any element of the <literal>citizenship_ids</literal> > + field in the <literal>people</literal> table. > + </para> > + > + <para> I would leave off this second example. > *************** SELECT NULLIF(value, '(none)') ... > *** 10452,10457 **** > --- 10480,10490 ---- > </note> > > <para> > + When using <function>array_remove</function> with multi-dimensional > + arrays, elements will be set to NULL as fallback measure. > + </para> If we do keep this behavior, I would give this function a less-natural name and not document it. > *** a/doc/src/sgml/ref/create_table.sgml > --- b/doc/src/sgml/ref/create_table.sgml > *************** CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY > *** 51,57 **** > DEFAULT <replaceable>default_expr</replaceable> | > UNIQUE <replaceable class="PARAMETER">index_parameters</replaceable> | > PRIMARY KEY <replaceable class="PARAMETER">index_parameters</replaceable> | > ! REFERENCES <replaceable class="PARAMETER">reftable</replaceable> [ ( <replaceable class="PARAMETER">refcolumn</replaceable>) ] [ MATCH FULL | MATCH PARTIAL | MATCH SIMPLE ] > [ ON DELETE <replaceable class="parameter">action</replaceable> ] [ ON UPDATE <replaceable class="parameter">action</replaceable>] } > [ DEFERRABLE | NOT DEFERRABLE ] [ INITIALLY DEFERRED | INITIALLY IMMEDIATE ] > > --- 51,57 ---- > DEFAULT <replaceable>default_expr</replaceable> | > UNIQUE <replaceable class="PARAMETER">index_parameters</replaceable> | > PRIMARY KEY <replaceable class="PARAMETER">index_parameters</replaceable> | > ! {EACH} REFERENCES <replaceable class="PARAMETER">reftable</replaceable> [ ( <replaceable class="PARAMETER">refcolumn</replaceable>) ] [ MATCH FULL | MATCH PARTIAL | MATCH SIMPLE ] > [ ON DELETE <replaceable class="parameter">action</replaceable> ] [ ON UPDATE <replaceable class="parameter">action</replaceable>] } > [ DEFERRABLE | NOT DEFERRABLE ] [ INITIALLY DEFERRED | INITIALLY IMMEDIATE ] Use square brackets, not curly brackets, around optional terms. > > *************** CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY > *** 62,68 **** > UNIQUE ( <replaceable class="PARAMETER">column_name</replaceable> [, ... ] ) <replaceable class="PARAMETER">index_parameters</replaceable>| > PRIMARY KEY ( <replaceable class="PARAMETER">column_name</replaceable> [, ... ] ) <replaceable class="PARAMETER">index_parameters</replaceable>| > EXCLUDE [ USING <replaceable class="parameter">index_method</replaceable> ] ( <replaceable class="parameter">exclude_element</replaceable>WITH <replaceable class="parameter">operator</replaceable> [, ... ] ) <replaceableclass="parameter">index_parameters</replaceable> [ WHERE ( <replaceable class="parameter">predicate</replaceable>) ] | > ! FOREIGN KEY ( <replaceable class="PARAMETER">column_name</replaceable> [, ... ] ) REFERENCES <replaceable class="PARAMETER">reftable</replaceable>[ ( <replaceable class="PARAMETER">refcolumn</replaceable> [, ... ] ) ] > [ MATCH FULL | MATCH PARTIAL | MATCH SIMPLE ] [ ON DELETE <replaceable class="parameter">action</replaceable> ] [ON UPDATE <replaceable class="parameter">action</replaceable> ] } > [ DEFERRABLE | NOT DEFERRABLE ] [ INITIALLY DEFERRED | INITIALLY IMMEDIATE ] > > --- 62,68 ---- > UNIQUE ( <replaceable class="PARAMETER">column_name</replaceable> [, ... ] ) <replaceable class="PARAMETER">index_parameters</replaceable>| > PRIMARY KEY ( <replaceable class="PARAMETER">column_name</replaceable> [, ... ] ) <replaceable class="PARAMETER">index_parameters</replaceable>| > EXCLUDE [ USING <replaceable class="parameter">index_method</replaceable> ] ( <replaceable class="parameter">exclude_element</replaceable>WITH <replaceable class="parameter">operator</replaceable> [, ... ] ) <replaceableclass="parameter">index_parameters</replaceable> [ WHERE ( <replaceable class="parameter">predicate</replaceable>) ] | > ! FOREIGN KEY ( <replaceable class="PARAMETER">column_name</replaceable> [, ... ] ) {EACH} REFERENCES <replaceable class="PARAMETER">reftable</replaceable>[ ( <replaceable class="PARAMETER">refcolumn</replaceable> [, ... ] ) ] Likewise. > *** a/src/backend/catalog/information_schema.sql > --- b/src/backend/catalog/information_schema.sql > *************** CREATE VIEW referential_constraints AS > *** 1173,1183 **** > > CAST( > CASE con.confdeltype WHEN 'c' THEN 'CASCADE' > WHEN 'n' THEN 'SET NULL' > WHEN 'd' THEN 'SET DEFAULT' > WHEN 'r' THEN 'RESTRICT' > WHEN 'a' THEN 'NO ACTION' END > ! AS character_data) AS delete_rule > > FROM (pg_namespace ncon > INNER JOIN pg_constraint con ON ncon.oid = con.connamespace > --- 1175,1189 ---- > > CAST( > CASE con.confdeltype WHEN 'c' THEN 'CASCADE' > + WHEN 'C' THEN 'ARRAY CASCADE' > WHEN 'n' THEN 'SET NULL' > + WHEN 'N' THEN 'ARRAY SET NULL' > WHEN 'd' THEN 'SET DEFAULT' > WHEN 'r' THEN 'RESTRICT' > WHEN 'a' THEN 'NO ACTION' END > ! AS character_data) AS delete_rule, > ! > ! CAST(con.confisarray AS boolean) AS is_array No need for that cast. > *** a/src/backend/commands/tablecmds.c > --- b/src/backend/commands/tablecmds.c > *************** ATAddForeignKeyConstraint(AlteredTableIn > *** 5688,5693 **** > --- 5689,5728 ---- > (errcode(ERRCODE_INVALID_FOREIGN_KEY), > errmsg("number of referencing and referenced columns for foreign key disagree"))); > > + /* Enforce foreign key array restrictions */ > + if (fkconstraint->fk_array) > + { > + /* > + * Foreign key array must not be part of a multi-column foreign key > + */ > + if (numpks > 1) > + ereport(ERROR, > + (errcode(ERRCODE_INVALID_FOREIGN_KEY), > + errmsg("foreign key arrays must not be part of a multi-column foreign key"))); > + > + /* > + * ON UPDATE CASCADE action is not supported on FKA > + */ > + if (fkconstraint->fk_upd_action == FKCONSTR_ACTION_CASCADE) > + ereport(ERROR, > + (errcode(ERRCODE_INVALID_FOREIGN_KEY), > + errmsg("ON UPDATE CASCADE action is not supported on foreign key arrays"))); Add a HINT about using ARRAY CASCADE. > *************** ATAddForeignKeyConstraint(AlteredTableIn > *** 5736,5775 **** > eqstrategy, opcintype, opcintype, opfamily); > > /* > ! * Are there equality operators that take exactly the FK type? Assume > ! * we should look through any domain here. > */ > ! fktyped = getBaseType(fktype); > > ! pfeqop = get_opfamily_member(opfamily, opcintype, fktyped, > ! eqstrategy); > ! if (OidIsValid(pfeqop)) > ! ffeqop = get_opfamily_member(opfamily, fktyped, fktyped, > eqstrategy); > - else > - ffeqop = InvalidOid; /* keep compiler quiet */ > > ! if (!(OidIsValid(pfeqop) && OidIsValid(ffeqop))) > { > /* > ! * Otherwise, look for an implicit cast from the FK type to the > ! * opcintype, and if found, use the primary equality operator. > ! * This is a bit tricky because opcintype might be a polymorphic > ! * type such as ANYARRAY or ANYENUM; so what we have to test is > ! * whether the two actual column types can be concurrently cast to > ! * that type. (Otherwise, we'd fail to reject combinations such > ! * as int[] and point[].) > */ > ! Oid input_typeids[2]; > ! Oid target_typeids[2]; > > ! input_typeids[0] = pktype; > ! input_typeids[1] = fktype; > ! target_typeids[0] = opcintype; > ! target_typeids[1] = opcintype; > ! if (can_coerce_type(2, input_typeids, target_typeids, > ! COERCION_IMPLICIT)) > ! pfeqop = ffeqop = ppeqop; > } > > if (!(OidIsValid(pfeqop) && OidIsValid(ffeqop))) > --- 5771,5861 ---- > eqstrategy, opcintype, opcintype, opfamily); > > /* > ! * Discover the equality operators > */ > ! if (fkconstraint->fk_array) > ! { > ! /* > ! * Are there equality operators that take exactly the FK element type? > ! * Assume we should look through any domain here. > ! */ > ! Oid fk_element_type=get_base_element_type(fktype); > ! if (!OidIsValid(fk_element_type)) > ! ereport(ERROR, > ! (errcode(ERRCODE_DATATYPE_MISMATCH), > ! errmsg("foreign key constraint \"%s\" " > ! "cannot be implemented", > ! fkconstraint->conname), > ! errdetail("Key columns \"%s\" is not an array type: %s", > ! strVal(list_nth(fkconstraint->fk_attrs, i)), > ! format_type_be(fktype)))); Use a detail message like "Type of key column "%s" is not an array type: %s". > > ! ffeqop = ARRAY_EQ_OP; > ! > ! pfeqop = get_opfamily_member(opfamily, opcintype, fk_element_type, > eqstrategy); > > ! if (!(OidIsValid(pfeqop))) > ! { > ! /* > ! * Otherwise, look for an implicit cast from the FK type to the > ! * opcintype, and if found, use the primary equality operator. > ! * This is a bit tricky because opcintype might be a polymorphic > ! * type such as ANYARRAY or ANYENUM; so what we have to test is > ! * whether the two actual column types can be concurrently cast to > ! * that type. (Otherwise, we'd fail to reject combinations such > ! * as int[] and point[].) > ! */ > ! Oid input_typeids[2]; > ! Oid target_typeids[2]; > ! > ! input_typeids[0] = pktype; > ! input_typeids[1] = fk_element_type; > ! target_typeids[0] = opcintype; > ! target_typeids[1] = opcintype; > ! if (can_coerce_type(2, input_typeids, target_typeids, > ! COERCION_IMPLICIT)) > ! pfeqop = ppeqop; > ! } > ! } > ! else > { > /* > ! * Are there equality operators that take exactly the FK type? Assume > ! * we should look through any domain here. > */ > ! fktyped = getBaseType(fktype); > > ! pfeqop = get_opfamily_member(opfamily, opcintype, fktyped, > ! eqstrategy); > ! if (OidIsValid(pfeqop)) > ! ffeqop = get_opfamily_member(opfamily, fktyped, fktyped, > ! eqstrategy); > ! else > ! ffeqop = InvalidOid; /* keep compiler quiet */ > ! > ! if (!(OidIsValid(pfeqop) && OidIsValid(ffeqop))) > ! { > ! /* > ! * Otherwise, look for an implicit cast from the FK type to the > ! * opcintype, and if found, use the primary equality operator. > ! * This is a bit tricky because opcintype might be a polymorphic > ! * type such as ANYARRAY or ANYENUM; so what we have to test is > ! * whether the two actual column types can be concurrently cast to > ! * that type. (Otherwise, we'd fail to reject combinations such > ! * as int[] and point[].) > ! */ > ! Oid input_typeids[2]; > ! Oid target_typeids[2]; > ! > ! input_typeids[0] = pktype; > ! input_typeids[1] = fktype; > ! target_typeids[0] = opcintype; > ! target_typeids[1] = opcintype; > ! if (can_coerce_type(2, input_typeids, target_typeids, > ! COERCION_IMPLICIT)) > ! pfeqop = ffeqop = ppeqop; > ! } > } Please reduce the level of textual code duplication here. > *** a/src/backend/commands/trigger.c > --- b/src/backend/commands/trigger.c > *************** ConvertTriggerToFK(CreateTrigStmt *stmt, > *** 861,876 **** > --- 862,881 ---- > switch (funcoid) > { > case F_RI_FKEY_CASCADE_UPD: > + case F_RI_FKEY_ARRCASCADE_UPD: > case F_RI_FKEY_RESTRICT_UPD: > case F_RI_FKEY_SETNULL_UPD: > + case F_RI_FKEY_ARRSETNULL_UPD: > case F_RI_FKEY_SETDEFAULT_UPD: > case F_RI_FKEY_NOACTION_UPD: > funcnum = 0; > break; > > case F_RI_FKEY_CASCADE_DEL: > + case F_RI_FKEY_ARRCASCADE_DEL: > case F_RI_FKEY_RESTRICT_DEL: > case F_RI_FKEY_SETNULL_DEL: > + case F_RI_FKEY_ARRSETNULL_DEL: > case F_RI_FKEY_SETDEFAULT_DEL: > case F_RI_FKEY_NOACTION_DEL: > funcnum = 1; We don't need to support these clauses in ConvertTriggerToFK(); no ancient dumps will bear them. A comment would be enough. On the other hand, your changes here are simple enough. Maybe it's better to add the support as you have than to explain why it's absent. > *** a/src/backend/utils/adt/arrayfuncs.c > --- b/src/backend/utils/adt/arrayfuncs.c > *************** array_unnest(PG_FUNCTION_ARGS) > *** 5174,5176 **** > --- 5174,5599 ---- > SRF_RETURN_DONE(funcctx); > } > } > + > + /* > + * Remove any occurrence of an element from an array > + * > + * If used on a multi-dimensional array the matching elements will be replaced with NULLs as fallback. > + * > + */ > + Datum > + array_remove(PG_FUNCTION_ARGS) > + { > + ArrayType *v; > + Datum old_value = PG_GETARG_DATUM(1); > + Oid element_type; > + ArrayType *result; > + Datum *values; > + bool *nulls; > + Datum elt; > + int ndim; > + int *dim; > + int nitems; > + int *dimresult; > + int nresult; > + int i; > + int32 nbytes = 0; > + int32 dataoffset; > + bool hasnulls; > + int typlen; > + bool typbyval; > + char typalign; > + char *s; > + bits8 *bitmap; > + int bitmask; > + Oid collation = PG_GET_COLLATION(); > + TypeCacheEntry *typentry; > + FunctionCallInfoData locfcinfo; > + > + /* > + * If the first argument is null > + * return NULL > + */ > + if (PG_ARGISNULL(0)) > + PG_RETURN_NULL(); > + > + v = PG_GETARG_ARRAYTYPE_P(0); > + > + /* > + * If second argument is NULL, no match is possible > + * so return the first argument unchanged > + */ > + if (PG_ARGISNULL(1)) > + PG_RETURN_ARRAYTYPE_P(v); > + > + ndim = ARR_NDIM(v); > + > + /* > + * If used on a multi-dimensional array the matching elements > + * will be replaced with NULLs as fallback. > + */ > + if (ndim > 1) { > + fcinfo->nargs = 3; > + fcinfo->argnull[2]=true; > + return array_replace(fcinfo); > + } > + > + dim = ARR_DIMS(v); > + element_type = ARR_ELEMTYPE(v); > + nitems = ArrayGetNItems(ndim, dim); > + > + /* Check for empty array */ > + if (nitems <= 0) > + { > + /* Return empty array */ > + PG_RETURN_ARRAYTYPE_P(construct_empty_array(element_type)); > + } > + > + /* > + * We arrange to look up the equality function only once per series of > + * calls, assuming the element type doesn't change underneath us. The > + * typcache is used so that we have no memory leakage when being used > + * as an index support function. > + */ The second sentence of the comment does not apply in this function. > + typentry = (TypeCacheEntry *) fcinfo->flinfo->fn_extra; > + if (typentry == NULL || > + typentry->type_id != element_type) > + { > + typentry = lookup_type_cache(element_type, > + TYPECACHE_EQ_OPR_FINFO); > + if (!OidIsValid(typentry->eq_opr_finfo.fn_oid)) > + ereport(ERROR, > + (errcode(ERRCODE_UNDEFINED_FUNCTION), > + errmsg("could not identify an equality operator for type %s", > + format_type_be(element_type)))); > + fcinfo->flinfo->fn_extra = (void *) typentry; > + } > + typlen = typentry->typlen; > + typbyval = typentry->typbyval; > + typalign = typentry->typalign; > + > + /* > + * apply the operator to each pair of array elements. > + */ > + InitFunctionCallInfoData(locfcinfo, &typentry->eq_opr_finfo, 2, > + collation, NULL, NULL); > + > + /* Allocate temporary arrays for new values */ > + values = (Datum *) palloc(nitems * sizeof(Datum)); > + nulls = (bool *) palloc(nitems * sizeof(bool)); > + > + /* Loop over source data */ > + s = ARR_DATA_PTR(v); > + bitmap = ARR_NULLBITMAP(v); > + bitmask = 1; > + hasnulls = false; > + nresult=0; > + > + for (i = 0; i < nitems; i++) > + { > + bool isNull; > + bool oprresult; > + bool skip; > + > + /* Get source element, checking for NULL */ > + if (bitmap && (*bitmap & bitmask) == 0) > + { > + isNull = true; > + skip = false; > + } > + else > + { > + elt = fetch_att(s, typbyval, typlen); > + s = att_addlength_datum(s, typlen, elt); > + s = (char *) att_align_nominal(s, typalign); > + isNull = false; > + > + /* > + * Apply the operator to the element pair > + */ > + locfcinfo.arg[0] = elt; > + locfcinfo.arg[1] = old_value; > + locfcinfo.argnull[0] = false; > + locfcinfo.argnull[1] = false; > + locfcinfo.isnull = false; > + oprresult = DatumGetBool(FunctionCallInvoke(&locfcinfo)); > + if (!oprresult) > + { > + values[nresult] = elt; > + skip = false; > + } > + else { > + skip = true; > + } Remove braces. > + } > + > + if (!skip) > + { > + nulls[nresult] = isNull; > + if (isNull) > + hasnulls = true; > + else > + { > + /* Ensure data is not toasted */ > + if (typlen == -1) > + values[nresult] = PointerGetDatum(PG_DETOAST_DATUM(values[nresult])); Shouldn't be needed; we just pulled the values from an array. > + /* Update total result size */ > + nbytes = att_addlength_datum(nbytes, typlen, values[nresult]); > + nbytes = att_align_nominal(nbytes, typalign); > + /* check for overflow of total request */ > + if (!AllocSizeIsValid(nbytes)) > + ereport(ERROR, > + (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED), > + errmsg("array size exceeds the maximum allowed (%d)", > + (int) MaxAllocSize))); This test should never fail. Either convert it to an Assert or just add a comment to that effect. > + } > + nresult++; > + } > + > + /* advance bitmap pointer if any */ > + if (bitmap) > + { > + bitmask <<= 1; > + if (bitmask == 0x100) > + { > + bitmap++; > + bitmask = 1; > + } > + } > + } > + > + /* Allocate and initialize the result array */ Is it worth tracking whether we didn't find anything to remove and just returning the old array in that case? > + if (hasnulls) > + { > + dataoffset = ARR_OVERHEAD_WITHNULLS(ndim, nresult); > + nbytes += dataoffset; > + } > + else > + { > + dataoffset = 0; /* marker for no null bitmap */ > + nbytes += ARR_OVERHEAD_NONULLS(ndim); > + } > + result = (ArrayType *) palloc0(nbytes); > + SET_VARSIZE(result, nbytes); > + result->ndim = ndim; > + result->dataoffset = dataoffset; > + result->elemtype = element_type; > + memcpy(ARR_DIMS(result), ARR_DIMS(v), 2 * ndim * sizeof(int)); > + > + /* Adjust the final length */ > + ARR_DIMS(result)[0] = nresult; > + > + /* > + * Note: do not risk trying to pfree the results of the called function > + */ The comment does not apply here; a comparison function has no result to free. > + CopyArrayEls(result, > + values, nulls, nresult, > + typlen, typbyval, typalign, > + false); > + > + pfree(values); > + pfree(nulls); > + > + PG_RETURN_ARRAYTYPE_P(result); > + } > + > + /* > + * Replace any occurrence of an element in an array > + */ > + Datum > + array_replace(PG_FUNCTION_ARGS) > + { > + ArrayType *v; > + Datum old_value = PG_GETARG_DATUM(1); > + Datum new_value = PG_GETARG_DATUM(2); > + bool new_value_isnull = PG_ARGISNULL(2); > + Oid element_type; > + ArrayType *result; > + Datum *values; > + bool *nulls; > + Datum elt; > + int *dim; > + int ndim; > + int nitems; > + int i; > + int32 nbytes = 0; > + int32 dataoffset; > + bool hasnulls; > + int typlen; > + bool typbyval; > + char typalign; > + char *s; > + bits8 *bitmap; > + int bitmask; > + Oid collation = PG_GET_COLLATION(); > + TypeCacheEntry *typentry; > + FunctionCallInfoData locfcinfo; > + > + /* > + * If the first argument is null > + * return NULL > + */ > + if (PG_ARGISNULL(0)) > + PG_RETURN_NULL(); > + > + v = PG_GETARG_ARRAYTYPE_P(0); > + > + /* > + * If second argument is NULL, no match is possible > + * so return the first argument unchanged > + */ > + if (PG_ARGISNULL(1)) > + PG_RETURN_ARRAYTYPE_P(v); > + > + ndim = ARR_NDIM(v); > + dim = ARR_DIMS(v); > + element_type = ARR_ELEMTYPE(v); > + nitems = ArrayGetNItems(ndim, dim); > + > + /* Check for empty array */ > + if (nitems <= 0) > + { > + /* Return empty array */ > + PG_RETURN_ARRAYTYPE_P(construct_empty_array(element_type)); > + } > + > + /* > + * We arrange to look up the equality function only once per series of > + * calls, assuming the element type doesn't change underneath us. The > + * typcache is used so that we have no memory leakage when being used > + * as an index support function. > + */ The second sentence does not apply here, either. > + typentry = (TypeCacheEntry *) fcinfo->flinfo->fn_extra; > + if (typentry == NULL || > + typentry->type_id != element_type) > + { > + typentry = lookup_type_cache(element_type, > + TYPECACHE_EQ_OPR_FINFO); > + if (!OidIsValid(typentry->eq_opr_finfo.fn_oid)) > + ereport(ERROR, > + (errcode(ERRCODE_UNDEFINED_FUNCTION), > + errmsg("could not identify an equality operator for type %s", > + format_type_be(element_type)))); > + fcinfo->flinfo->fn_extra = (void *) typentry; > + } > + typlen = typentry->typlen; > + typbyval = typentry->typbyval; > + typalign = typentry->typalign; > + > + /* > + * apply the operator to each pair of array elements. > + */ > + InitFunctionCallInfoData(locfcinfo, &typentry->eq_opr_finfo, 2, > + collation, NULL, NULL); > + > + /* Allocate temporary arrays for new values */ > + values = (Datum *) palloc(nitems * sizeof(Datum)); > + nulls = (bool *) palloc(nitems * sizeof(bool)); > + > + /* Loop over source data */ > + s = ARR_DATA_PTR(v); > + bitmap = ARR_NULLBITMAP(v); > + bitmask = 1; > + hasnulls = false; > + > + for (i = 0; i < nitems; i++) > + { > + bool isNull; > + bool oprresult; > + > + /* Get source element, checking for NULL */ > + if (bitmap && (*bitmap & bitmask) == 0) > + { > + isNull = true; > + } > + else > + { > + elt = fetch_att(s, typbyval, typlen); > + s = att_addlength_datum(s, typlen, elt); > + s = (char *) att_align_nominal(s, typalign); > + isNull = false; > + > + /* > + * Apply the operator to the element pair > + */ > + locfcinfo.arg[0] = elt; > + locfcinfo.arg[1] = old_value; > + locfcinfo.argnull[0] = false; > + locfcinfo.argnull[1] = false; > + locfcinfo.isnull = false; > + oprresult = DatumGetBool(FunctionCallInvoke(&locfcinfo)); > + if (!oprresult) > + { > + values[i] = elt; > + } > + else if (!new_value_isnull) > + { > + values[i] = new_value; > + } > + else { > + isNull = true; > + } Remove the braces around single-statement blocks. > + } > + > + nulls[i] = isNull; > + if (isNull) > + hasnulls = true; > + else > + { > + /* Ensure data is not toasted */ > + if (typlen == -1) > + values[i] = PointerGetDatum(PG_DETOAST_DATUM(values[i])); The only value that might need a detoast is new_value. Do so once at the top of the function. > + /* Update total result size */ > + nbytes = att_addlength_datum(nbytes, typlen, values[i]); > + nbytes = att_align_nominal(nbytes, typalign); > + /* check for overflow of total request */ > + if (!AllocSizeIsValid(nbytes)) > + ereport(ERROR, > + (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED), > + errmsg("array size exceeds the maximum allowed (%d)", > + (int) MaxAllocSize))); > + } > + > + /* advance bitmap pointer if any */ > + if (bitmap) > + { > + bitmask <<= 1; > + if (bitmask == 0x100) > + { > + bitmap++; > + bitmask = 1; > + } > + } > + } > + > + /* Allocate and initialize the result array */ > + if (hasnulls) > + { > + dataoffset = ARR_OVERHEAD_WITHNULLS(ndim, nitems); > + nbytes += dataoffset; > + } > + else > + { > + dataoffset = 0; /* marker for no null bitmap */ > + nbytes += ARR_OVERHEAD_NONULLS(ndim); > + } > + result = (ArrayType *) palloc0(nbytes); > + SET_VARSIZE(result, nbytes); > + result->ndim = ndim; > + result->dataoffset = dataoffset; > + result->elemtype = element_type; > + memcpy(ARR_DIMS(result), ARR_DIMS(v), 2 * ndim * sizeof(int)); > + > + /* > + * Note: do not risk trying to pfree the results of the called function > + */ The comment does not apply here; a comparison function has no result to free. > + CopyArrayEls(result, > + values, nulls, nitems, > + typlen, typbyval, typalign, > + false); > + > + pfree(values); > + pfree(nulls); > + > + PG_RETURN_ARRAYTYPE_P(result); > + } > diff --git a/src/backend/utils/adt/ri_triggers.c b/src/backend/utils/adt/ri_triggers.c > index 03a974a..58a92e1 100644 > *** a/src/backend/utils/adt/ri_triggers.c > --- b/src/backend/utils/adt/ri_triggers.c > *************** typedef struct RI_ConstraintInfo > *** 106,111 **** > --- 110,116 ---- > NameData conname; /* name of the FK constraint */ > Oid pk_relid; /* referenced relation */ > Oid fk_relid; /* referencing relation */ > + bool confisarray; Comment that member. > + Datum > + RI_FKey_arrcascade_upd(PG_FUNCTION_ARGS) > + { These belong earlier in the file, adjacent to the other PK trigger functions. > + /* ---------- > + * The query string built is > + * UPDATE ONLY <fktable> SET fkatt1 = array_replace(fkatt1, $n, $1) [, ...] > + * WHERE $n = fkatt1 [AND ...] > + * The type id's for the $ parameters are those of the > + * corresponding PK attributes. > + * ---------- > + */ RI_FKey_cascade_upd() has this to say at the corresponding juncture: /* ---------- * The query string built is * UPDATE ONLY <fktable> SET fkatt1 = $1[, ...] * WHERE $n = fkatt1 [AND ...] * The type id's for the $ parameters are thoseof the * corresponding PK attributes. Note that we are assuming * there is an assignment castfrom the PK to the FK type; * else the parser will fail. * ---------- */ Since we're matching a polymorphic function here, the types of the second and third arguments must precisely match the element type of the first argument. Even when an implicit cast exists, we must insert cast syntax. Test case: BEGIN; CREATE TABLE parent (c smallint PRIMARY KEY); CREATE TABLE child (c int[] EACH REFERENCES parentON UPDATE ARRAY CASCADEON DELETE ARRAY CASCADE); INSERT INTO parent VALUES (1), (2); INSERT INTO child VALUES ('{1,2}'); UPDATE parent SET c = 3 WHERE c = 2; DELETE FROM parent WHERE c = 1; ROLLBACK; > + initStringInfo(&querybuf); > + initStringInfo(&qualbuf); > + quoteRelationName(fkrelname, fk_rel); > + appendStringInfo(&querybuf, "UPDATE ONLY %s SET", fkrelname); > + querysep = ""; > + qualsep = "WHERE"; > + for (i = 0, j = riinfo.nkeys; i < riinfo.nkeys; i++, j++) > + { > + Oid pk_type = RIAttType(pk_rel, riinfo.pk_attnums[i]); > + Oid fk_type = RIAttType(fk_rel, riinfo.fk_attnums[i]); > + > + quoteOneName(attname, > + RIAttName(fk_rel, riinfo.fk_attnums[i])); > + appendStringInfo(&querybuf, > + "%s %s = array_replace(%s, $%d, $%d)", > + querysep, attname, attname, j + 1, i + 1); Break this line to keep it within 78 columns. Otherwise, pgindent will reverse-indent it. There are some other examples in your patch. Please run your patch through this command, inspect the output, and fix any that don't belong: <patchfile expand -t4 | awk 'length > 78' > *** a/src/include/catalog/pg_constraint.h > --- b/src/include/catalog/pg_constraint.h > *************** CATALOG(pg_constraint,2606) > *** 78,83 **** > --- 78,84 ---- > * constraint. Otherwise confrelid is 0 and the char fields are spaces. > */ > Oid confrelid; /* relation referenced by foreign key */ > + bool confisarray; /* true if an EACH REFERENCE foreign key */ > char confupdtype; /* foreign key's ON UPDATE action */ > char confdeltype; /* foreign key's ON DELETE action */ > char confmatchtype; /* foreign key's match type */ Putting the field at this location adds three bytes of padding. Please locate it elsewhere to avoid that. > *** a/src/include/catalog/pg_proc.h > --- b/src/include/catalog/pg_proc.h > *************** DESCR("referential integrity ON DELETE N > *** 1976,1981 **** > --- 1981,1995 ---- > DATA(insert OID = 1655 ( RI_FKey_noaction_upd PGNSP PGUID 12 1 0 0 0 f f f t f v 0 0 2279 "" _null_ _null_ _null_ _null_RI_FKey_noaction_upd _null_ _null_ _null_ )); > DESCR("referential integrity ON UPDATE NO ACTION"); > > + DATA(insert OID = 3144 ( RI_FKey_arrcascade_del PGNSP PGUID 12 1 0 0 0 f f f t f v 0 0 2279 "" _null_ _null_ _null__null_ RI_FKey_arrcascade_del _null_ _null_ _null_ )); > + DESCR("referential integrity ON DELETE CASCADE"); > + DATA(insert OID = 3145 ( RI_FKey_arrcascade_upd PGNSP PGUID 12 1 0 0 0 f f f t f v 0 0 2279 "" _null_ _null_ _null__null_ RI_FKey_arrcascade_upd _null_ _null_ _null_ )); > + DESCR("referential integrity ON UPDATE CASCADE"); > + DATA(insert OID = 3146 ( RI_FKey_arrsetnull_del PGNSP PGUID 12 1 0 0 0 f f f t f v 0 0 2279 "" _null_ _null_ _null__null_ RI_FKey_arrsetnull_del _null_ _null_ _null_ )); > + DESCR("referential integrity ON DELETE SET NULL"); > + DATA(insert OID = 3147 ( RI_FKey_arrsetnull_upd PGNSP PGUID 12 1 0 0 0 f f f t f v 0 0 2279 "" _null_ _null_ _null__null_ RI_FKey_arrsetnull_upd _null_ _null_ _null_ )); > + DESCR("referential integrity ON UPDATE SET NULL"); Those descriptions need to reflect the actual clauses involved. Thanks, nm
On Sat, Jan 21, 2012 at 8:42 PM, Noah Misch <noah@leadboat.com> wrote: > You currently forbid multi-column EACH FKs. I agree that we should allow only > one array column per FK; with more, the set of required PK rows would be > something like the Cartesian product of the elements of array columns. > However, there are no definitional problems, at least for NO ACTION, around a > FK constraint having one array column and N scalar columns. Whether or not > you implement that now, let's choose a table_constraint syntax leaving that > opportunity open. How about: > FOREIGN KEY(col_a, EACH col_b, col_c) REFERENCES pktable (a, b, c) I don't think we should be trying to cover every possible combination of arrays, non-arrays and all the various options. The number of combinations is making this patch larger than it needs to be and as a result endangers its being committed in this release just on committer time to cope with the complexity. We have a matter of weeks to get this rock solid. Yes, lets keep syntax open for future additions, but lets please focus/edit this down to a solid, useful patch for 9.2. For me, one array column, no other non-array columns and delete restrict would cover 90+% of use cases. Bearing in mind you can cover other cases by writing your own triggers, I don't think solving every problem makes sense in a single release. Once we have a solid base we can fill in the rare cases later. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Sun, Jan 22, 2012 at 09:06:49PM +0000, Simon Riggs wrote: > On Sat, Jan 21, 2012 at 8:42 PM, Noah Misch <noah@leadboat.com> wrote: > > > You currently forbid multi-column EACH FKs. ?I agree that we should allow only > > one array column per FK; with more, the set of required PK rows would be > > something like the Cartesian product of the elements of array columns. > > However, there are no definitional problems, at least for NO ACTION, around a > > FK constraint having one array column and N scalar columns. ?Whether or not > > you implement that now, let's choose a table_constraint syntax leaving that > > opportunity open. ?How about: > > ? ? ? ?FOREIGN KEY(col_a, EACH col_b, col_c) REFERENCES pktable (a, b, c) > > > I don't think we should be trying to cover every possible combination > of arrays, non-arrays and all the various options. The number of > combinations is making this patch larger than it needs to be and as a > result endangers its being committed in this release just on committer > time to cope with the complexity. We have a matter of weeks to get > this rock solid. > > Yes, lets keep syntax open for future additions, but lets please > focus/edit this down to a solid, useful patch for 9.2. +1. Let's change the syntax to leave that door open but not use the flexibility at this time.
Hi Noah, Il 21/01/12 21:42, Noah Misch ha scritto: > On Sat, Jan 14, 2012 at 08:18:48PM +0100, Marco Nenciarini wrote: > I greatly like that name; it would still make sense for other > aggregate types, should we ever expand its use. Please complete the > name change: the documentation, catalog entries, etc should all call > them something like "each foreign key constraints" (I don't > particularly like that exact wording). Ok, we'll go with "EACH Foreign Key Constraints" but I would allow the synonym "Foreign Key Array", especially in the documentation. > How about: FOREIGN KEY(col_a, EACH col_b, col_c) REFERENCES pktable > (a, b, c) We really like this syntax. However, as also Simon suggested, we'd go for switching to this syntax, but stick to a simpler implementation for 9.2. We will then be able to expand the functionality, by keeping the same syntax, from 9.3. > To complete the ARRAY -> EACH transition, I would suggest names like > CASCADE EACH/SET EACH NULL. Sounds perfect. Marco will go through all your comments and will send version 3 shortly. Thank you, Gabriele -- Gabriele Bartolini - 2ndQuadrant Italia PostgreSQL Training, Services and Support gabriele.bartolini@2ndQuadrant.it |www.2ndQuadrant.it
Hi guys, Please find attached version 3 of our patch. We thoroughly followed your suggestions and were able to implement "EACH foreign key constraints" with multi-column syntax as well. "EACH foreign key constraints" represent PostgreSQL implementation of what are also known as Foreign Key Arrays. Some limitations occur in this release, but as previously agreed these can be refined and defined in future release implementations. This patch adds: * support for EACH REFERENCES column constraint on array types - e.g. c1 INT[] EACH REFERENCES t1 * support for EACH foreign key table constraints - e.g. FOREIGN KEY (EACH c1) REFERENCES t1 * support for EACH foreign keys in multi-column foreign key table constraints - e.g. FOREIGN KEY (c1, EACH c2) REFERENCES t1 (u1, u2) * support for two new referential actions on update/delete operations for single-column only EACH foreign keys: ** EACH CASCADE (deletes or updates elements inside the referencing array) ** EACH SET NULL (sets to NULL referencing element inside the foreign array) * support for array_replace and array_remove functions as required by the above actions As previously said, we preferred to keep this patch simple for 9.2 and forbid EACH CASCADE and EACH SET NULL on multi-column foreign keys. After all, majority of use cases is represented by EACH foreign key column constraints (single-column foreign key arrays), and more complicated use cases can be discussed for 9.3 - should this patch make it. :) We can use multi-dimensional arrays as well as referencing columns. In that case though, ON DELETE EACH CASCADE will behave like ON DELETE EACH SET NULL. This is a safe way of implementing the action. We have some ideas on how to implement this, but we feel it is better to limit the behaviour for now. As far as documentation is concerned, we: * added actions and constraint info in the catalog * added an entire section on "EACH foreign key constraints" in the data definition language chapter (we've simplified the second example, greatly following Noah's advice - let us know if this is ok with you) * added array_remove (currently limited to single-dimensional arrays) and array_replace in the array functions chapter * modified REFERENCES/FOREIGN KEY section in the CREATE TABLE command's documentation and added a special section on the EACH REFERENCES clause (using square braces as suggested) Here follows a short list of notes for Noah: * You proposed these changes: ARRAY CASCADE -> EACH CASCADE and ARRAY SET NULL -> EACH SET NULL. We stack with EACH CASCADE and decided to prepend the "EACH" keyword to standard's CASCADE and SET NULL. Grammar is simpler and the emphasis is on the EACH keyword. * Multi-dimensional arrays: ON DELETE EACH CASCADE -> ON DELETE EACH SET NULL. We cannot determine the array's number of dimensions at definition time as it depends on the actual values. As anticipated above, we have some ideas on multi-dimensional element removal, but not for this patch for the aforementioned reasons. * Support of EACH CASCADE/SET NULL in ConvertTriggerToFK(): we decided to leave it. Regards, Marco -- Marco Nenciarini - 2ndQuadrant Italy PostgreSQL Training, Services and Support marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it
Attachment
(I reply to an older message but I did use the newest patch, version 3) I wanted to have a look at v3 of this patch today, but it seems it won't apply and compile anymore. Here are the protestations of patch: patching file src/include/catalog/pg_proc.h Hunk #1 FAILED at 868. Hunk #2 FAILED at 1985. 2 out of 2 hunks FAILED -- saving rejects to file src/include/catalog/pg_proc.h.rej and in case it's any use, a cat of src/include/catalog/pg_proc.h.rej: *************** *** 868,873 **** DATA(insert OID = 2335 ( array_agg PGNSP PGUID 12 1 0 0 0 t f f f f i 1 0 2277 "2283" _null_ _null_ _null_ _null_ aggregate_dummy _null_ _null_ _null_ )); DESCR("concatenate aggregate input into an array"); DATA(insert OID = 760 ( smgrin PGNSP PGUID 12 1 0 0 0 f f f t f s 1 0 210 "2275" _null_ _null_ _null_ _null_ smgrin _null_ _null_ _null_ )); DESCR("I/O"); DATA(insert OID = 761 ( smgrout PGNSP PGUID12 1 0 0 0 f f f t f s 1 0 2275 "210" _null_ _null_ _null_ _null_ smgrout _null_ _null_ _null_ )); --- 868,878 ---- DATA(insert OID = 2335 ( array_agg PGNSP PGUID 12 1 0 0 0 t f f f f i 1 0 2277 "2283" _null_ _null_ _null_ _null_ aggregate_dummy _null_ _null_ _null_ )); DESCR("concatenate aggregate input into an array"); + DATA(insert OID = 3157 ( array_remove PGNSP PGUID 12 1 0 0 0 f f f f f i 2 0 2277 "2277 2283" _null_ _null_ _null_ _null_ array_remove _null_ _null_ _null_ )); + DESCR("remove any occurrence of an element from an array"); + DATA(insert OID = 3158 ( array_replace PGNSP PGUID 12 1 0 0 0 f f f f f i 3 0 2277 "2277 2283 2283" _null_ _null_ _null_ _null_ array_replace _null_ _null_ _null_ )); + DESCR("replace any occurrence of an element in an array"); + DATA(insert OID = 760 ( smgrin PGNSP PGUID 12 1 0 0 0 f f f t f s 1 0 210 "2275" _null_ _null_ _null_ _null_ smgrin _null_ _null_ _null_ )); DESCR("I/O"); DATA(insert OID = 761 ( smgrout PGNSP PGUID12 1 0 0 0 f f f t f s 1 0 2275 "210" _null_ _null_ _null_ _null_ smgrout _null_ _null_ _null_ )); *************** *** 1980,1985 **** DATA(insert OID = 1655 ( RI_FKey_noaction_upd PGNSP PGUID 12 1 0 0 0 f f f t f v 0 0 2279 "" _null_ _null_ _null_ _null_ RI_FKey_noaction_upd _null_ _null_ _null_ )); DESCR("referential integrity ON UPDATE NO ACTION"); DATA(insert OID = 1666 ( varbiteq PGNSP PGUID 12 1 0 0 0 f f f t f i 2 0 16 "1562 1562" _null_ _null_ _null_ _null_ biteq _null_ _null_ _null_ )); DATA(insert OID = 1667 ( varbitne PGNSP PGUID 12 10 0 0 f f f t f i 2 0 16 "1562 1562" _null_ _null_ _null_ _null_ bitne _null_ _null_ _null_ )); DATA(insert OID = 1668 ( varbitge PGNSP PGUID 12 10 0 0 f f f t f i 2 0 16 "1562 1562" _null_ _null_ _null_ _null_ bitge _null_ _null_ _null_ )); --- 1985,1999 ---- DATA(insert OID = 1655 ( RI_FKey_noaction_upd PGNSP PGUID 12 1 0 0 0 f f f t f v 0 0 2279 "" _null_ _null_ _null_ _null_ RI_FKey_noaction_upd _null_ _null_ _null_ )); DESCR("referential integrity ON UPDATE NO ACTION"); + DATA(insert OID = 3159 ( RI_FKey_eachcascade_del PGNSP PGUID 12 1 0 0 0 f f f t f v 0 0 2279 "" _null_ _null_ _null_ _null_ RI_FKey_eachcascade_del _null_ _null_ _null_ )); + DESCR("referential integrity ON DELETE EACH CASCADE"); + DATA(insert OID = 3160 ( RI_FKey_eachcascade_upd PGNSP PGUID 12 1 0 0 0 f f f t f v 0 0 2279 "" _null_ _null_ _null_ _null_ RI_FKey_eachcascade_upd _null_ _null_ _null_ )); + DESCR("referential integrity ON UPDATE EACH CASCADE"); + DATA(insert OID = 3161 ( RI_FKey_eachsetnull_del PGNSP PGUID 12 1 0 0 0 f f f t f v 0 0 2279 "" _null_ _null_ _null_ _null_ RI_FKey_eachsetnull_del _null_ _null_ _null_ )); + DESCR("referential integrity ON DELETE EACH SET NULL"); + DATA(insert OID = 3162 ( RI_FKey_eachsetnull_upd PGNSP PGUID 12 1 0 0 0 f f f t f v 0 0 2279 "" _null_ _null_ _null_ _null_ RI_FKey_eachsetnull_upd _null_ _null_ _null_ )); + DESCR("referential integrity ON UPDATE EACH SET NULL"); + DATA(insert OID = 1666 ( varbiteq PGNSP PGUID 12 1 0 0 0 f f f t f i 2 0 16 "1562 1562" _null_ _null_ _null_ _null_ biteq _null_ _null_ _null_ )); DATA(insert OID = 1667 ( varbitne PGNSP PGUID 12 10 0 0 f f f t f i 2 0 16 "1562 1562" _null_ _null_ _null_ _null_ bitne _null_ _null_ _null_ )); DATA(insert OID = 1668 ( varbitge PGNSP PGUID 12 10 0 0 f f f t f i 2 0 16 "1562 1562" _null_ _null_ _null_ _null_ bitge _null_ _null_ _null_ )); I'd like to try this out a bit; could you see if you can fix it? thanks, Erik Rijkers
On 21 February 2012 15:22, Erik Rijkers <er@xs4all.nl> wrote: > (I reply to an older message but I did use the newest patch, version 3) > > I wanted to have a look at v3 of this patch today, but it seems it won't apply and compile anymore. That's just because a new column has been added to pg_proc - proleakproof. Generalise from the example of commit cd30728fb2ed7c367d545fc14ab850b5fa2a4850 - it modified ever pg_proc entry, and this patch needs the same treatment: http://git.postgresql.org/gitweb/?p=postgresql.git;a=blobdiff;f=src/include/catalog/pg_proc.h;h=fb2923f94db06cd528d588fa52ca3294e025f9f6;hp=d926a88ff8468a43e7d2982273709fbc34058ade;hb=cd30728fb2ed7c367d545fc14ab850b5fa2a4850;hpb=2bbd88f8f841b01efb073972b60d4dc1ff1f6fd0 -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
Hello Erik, Il 21/02/12 16:22, Erik Rijkers ha scritto: > (I reply to an older message but I did use the newest patch, version 3) > > I wanted to have a look at v3 of this patch today, but it seems it won't apply and compile anymore. As Peter pointed out, it is due to a new Boolean field added in the pg_proc catalog. I have updated our patch to set this value to false by default. > I'd like to try this out a bit; could you see if you can fix it? Thank you so much for dedicating your time on reviewing this patch. I have attached version 3b (noting that is just a very small modification/change to major 3 version of the patch), which passes all tests. Cheers, Gabriele -- Gabriele Bartolini - 2ndQuadrant Italia PostgreSQL Training, Services and Support gabriele.bartolini@2ndQuadrant.it | www.2ndQuadrant.it
Attachment
Hi Marco, This version fixes everything I noted in my last review. Apart from corner cases I note, the patch is technically sound. I pose a few policy-type questions; comments from a broader audience would help those. On Mon, Feb 06, 2012 at 07:04:42PM +0100, Marco Nenciarini wrote: > Please find attached version 3 of our patch. We thoroughly followed your > suggestions and were able to implement "EACH foreign key constraints" > with multi-column syntax as well. [actual review based on v3b] > * support for EACH foreign keys in multi-column foreign key table > constraints > - e.g. FOREIGN KEY (c1, EACH c2) REFERENCES t1 (u1, u2) You support, and have test cases for, constraints with multiple EACH columns. The documentation and comments do not explain their semantics. On reviewing the behavior and code, you have implemented it in terms of a Cartesian product. That's logical, but when is such a constraint actually useful? If someone can think of an application, great; let's document his example. Otherwise, let's reject such constraints at DDL time. > As previously said, we preferred to keep this patch simple for 9.2 and > forbid EACH CASCADE and EACH SET NULL on multi-column foreign keys. > After all, majority of use cases is represented by EACH foreign key > column constraints (single-column foreign key arrays), and more > complicated use cases can be discussed for 9.3 - should this patch make > it. :) Good call. > We can use multi-dimensional arrays as well as referencing columns. In > that case though, ON DELETE EACH CASCADE will behave like ON DELETE EACH > SET NULL. This is a safe way of implementing the action. > We have some ideas on how to implement this, but we feel it is better to > limit the behaviour for now. This still feels awfully unprincipled to me. How about just throwing an error when we need to remove an element from a multidimensional array? Essentially, have ON DELETE EACH CASCADE downgrade to ON DELETE RESTRICT when it encounters a multidimensional array. That's strictly less code than what you have now, and it keeps our options open. We can always change from error to set-null later, but we can't so readily change from set-null to anything else. As I pondered this patch again, I came upon formal hazards around non-default operator classes. Today, ATAddForeignKeyConstraint() always chooses pfeqop, ffeqop and ppeqop in the same operator family. If it neglected this, we would get wrong behavior when one of the operators is sensitive to a change to which another is insensitive. For EACH FK constraints, this patch sets ffeqop = ARRAY_EQ_OP unconditionally. array_eq() uses the TYPECACHE_EQ_OPR (usually from the default B-tree operator class). That operator may not exist at all, let alone share an operator family with pfeqop. Shall we deal with this by retrieving TYPECACHE_EQ_OPR in ATAddForeignKeyConstraint() and rejecting the constraint creation if it does not share an operator family with pfeqop? The same hazard affects ri_triggers.c use of array_remove() and array_replace(), and the same change mitigates it. Let me say again how much I like the test coverage from this patch. It's great to think of something worth verifying and find existing coverage for it in the submitted test cases. > *** a/doc/src/sgml/catalogs.sgml > --- b/doc/src/sgml/catalogs.sgml > *************** > *** 2102,2107 **** > --- 2106,2118 ---- > <entry></entry> > <entry>If a check constraint, a human-readable representation of the expression</entry> > </row> > + > + <row> > + <entry><structfield>confisarray</structfield></entry> Name is now confiseach. pg_constraint.confeach needs documentation. > + <entry><type>bool</type></entry> > + <entry></entry> > + <entry>If a foreign key, true if a EACH REFERENCE foreign key</entry> > + </row> > </tbody> > </tgroup> > </table> > *** a/doc/src/sgml/ddl.sgml > --- b/doc/src/sgml/ddl.sgml > + <para> > + Another option you have with foreign keys is to use a > + referencing column which is an array of elements with > + the same type (or a compatible one) as the referenced > + column in the related table. This feature, commonly known > + as <firstterm>foreign key arrays</firstterm>, is implemented Is it indeed "commonly known" by that name? My web searching did not turn up any independent use of the term. In any event, this should be the only place where we mention multiple names for the feature. Pick one preferred term and use it for all other mentions. The documentation, comments and messages currently have a mix of "each foreign key", "EACH foreign key" and "foreign key array". > + <para> > + Even though the most common use case for foreign key arrays > + is on a single column key, you can define an <quote>each foreign > + key constraint</quote> on a group of columns. As the following > + contrived example shows, it needs to be written in table constraint form: > + <programlisting> > + CREATE TABLE t1 ( > + a integer PRIMARY KEY, > + b integer, > + c integer[], > + <emphasis>FOREIGN KEY (b, EACH c) REFERENCES other_table (c1, c2)</emphasis> > + ); > + </programlisting> Rather than merely showing an abstract syntax example, some words about the semantics are in order. You might have a parent table with primary key (category, value). Then your child table(s) have a category column and an array of values. The constraint then ensures that all the entries in the array exist in the parent table *and* belong to the proper category. > *** a/src/backend/commands/tablecmds.c > --- b/src/backend/commands/tablecmds.c > *************** > *** 5736,5741 **** ATAddForeignKeyConstraint(AlteredTableInfo *tab, Relation rel, > --- 5759,5807 ---- > (errcode(ERRCODE_INVALID_FOREIGN_KEY), > errmsg("number of referencing and referenced columns for foreign key disagree"))); > > + /* Enforce each foreign key restrictions */ > + if (fkconstraint->fk_is_each) > + { > + /* > + * ON UPDATE CASCADE action is not supported on EACH foreign keys > + */ > + if (fkconstraint->fk_upd_action == FKCONSTR_ACTION_CASCADE) > + ereport(ERROR, > + (errcode(ERRCODE_INVALID_FOREIGN_KEY), > + errmsg("ON UPDATE CASCADE action is not supported on " > + "EACH foreign keys"), > + errhint("Use ON UPDATE EACH CASCADE, instead"))); Note from the Error Message Style Guide that errhint and errdetail messages shall be complete sentences. > + > + /* > + * EACH CASCADE and EACH SET NULL actions are only available > + * in single-column EACH foreign keys > + */ > + if (numpks > 1 && > + (fkconstraint->fk_upd_action == FKCONSTR_ACTION_ARRCASCADE > + || fkconstraint->fk_upd_action == FKCONSTR_ACTION_ARRSETNULL > + || fkconstraint->fk_del_action == FKCONSTR_ACTION_ARRCASCADE > + || fkconstraint->fk_del_action == FKCONSTR_ACTION_ARRSETNULL)) > + ereport(ERROR, > + (errcode(ERRCODE_INVALID_FOREIGN_KEY), > + errmsg("EACH CASCADE and EACH SET NULL actions are only " > + "available on single-column EACH foreign keys"))); > + } > + else > + { > + /* > + * EACH CASCADE and EACH SET NULL actions are only available > + * in EACH foreign keys > + */ > + if (fkconstraint->fk_upd_action == FKCONSTR_ACTION_ARRCASCADE > + || fkconstraint->fk_upd_action == FKCONSTR_ACTION_ARRSETNULL > + || fkconstraint->fk_del_action == FKCONSTR_ACTION_ARRCASCADE > + || fkconstraint->fk_del_action == FKCONSTR_ACTION_ARRSETNULL) > + ereport(ERROR, > + (errcode(ERRCODE_INVALID_FOREIGN_KEY), > + errmsg("EACH CASCADE and EACH SET NULL actions are only " > + "available on EACH foreign keys"))); I would just reuse the previous error message. It will be fine, maybe even better, for everyone to see the phrase "single-column". > *************** > *** 5812,5823 **** ATAddForeignKeyConstraint(AlteredTableInfo *tab, Relation rel, > Oid target_typeids[2]; > > input_typeids[0] = pktype; > ! input_typeids[1] = fktype; > target_typeids[0] = opcintype; > target_typeids[1] = opcintype; > if (can_coerce_type(2, input_typeids, target_typeids, > COERCION_IMPLICIT)) > ! pfeqop = ffeqop = ppeqop; > } > > if (!(OidIsValid(pfeqop) && OidIsValid(ffeqop))) > --- 5905,5925 ---- > Oid target_typeids[2]; > > input_typeids[0] = pktype; > ! input_typeids[1] = fktyped; By sending the base type instead of the domain type to can_coerce_type(), this changes behavior ever so slightly for ordinary foreign keys. It would take something fairly baroque to get a different result. Perhaps a polymorphic opcintype and an FK column of a domain over an enum. Still, please don't change that part of the logic. > target_typeids[0] = opcintype; > target_typeids[1] = opcintype; > if (can_coerce_type(2, input_typeids, target_typeids, > COERCION_IMPLICIT)) > ! { > ! pfeqop = ppeqop; > ! > ! /* > ! * In caso of an EACH FK the ffeqop must be left untouched Typo. > *** a/src/backend/parser/gram.y > --- b/src/backend/parser/gram.y > *************** > *** 2900,2917 **** ConstraintElem: > yyscanner); > $$ = (Node *)n; > } > ! | FOREIGN KEY '(' columnList ')' REFERENCES qualified_name > opt_column_list key_match key_actions ConstraintAttributeSpec > { > Constraint *n = makeNode(Constraint); > n->contype = CONSTR_FOREIGN; > n->location = @1; > n->pktable = $7; > ! n->fk_attrs = $4; > n->pk_attrs = $8; > n->fk_matchtype = $9; > n->fk_upd_action = (char) ($10 >> 8); > n->fk_del_action = (char) ($10 & 0xFF); > processCASbits($11, @11, "FOREIGN KEY", > &n->deferrable, &n->initdeferred, > &n->skip_validation, > --- 2918,2959 ---- > yyscanner); > $$ = (Node *)n; > } > ! | FOREIGN KEY '(' foreignKeyColumnList ')' REFERENCES qualified_name > opt_column_list key_match key_actions ConstraintAttributeSpec > { > Constraint *n = makeNode(Constraint); > n->contype = CONSTR_FOREIGN; > n->location = @1; > n->pktable = $7; > ! n->fk_attrs = $4; > n->pk_attrs = $8; > n->fk_matchtype = $9; > n->fk_upd_action = (char) ($10 >> 8); > n->fk_del_action = (char) ($10 & 0xFF); > + > + /* > + * Split the content of foreignKeyColumnList > + * in two separate list. One lis of fileds Two typos. > + * and one list of boolean values. > + */ > + { > + ListCell *i; > + > + n->fk_attrs = NIL; > + n->fk_is_each = false; > + n->fk_each_attrs = NIL; > + foreach (i, $4) > + { > + ForeignKeyColumnElem *elem = > + (ForeignKeyColumnElem *)lfirst(i); > + > + n->fk_attrs = lappend(n->fk_attrs, elem->name); > + n->fk_is_each |= elem->each; > + n->fk_each_attrs = lappend(n->fk_each_attrs, > + makeString(elem->each?"t":"f")); > + } > + } This processing should happen in parse_utilcmd.c rather than gram.y. Magic values "t" and "f" won't do. Make fk_attrs a list of ForeignKeyColumnElem or else at least use an int list of true/false. Either way, you would no longer need the decode loop in tablecmds.c. > *** a/src/backend/utils/adt/arrayfuncs.c > --- b/src/backend/utils/adt/arrayfuncs.c > + Datum > + array_replace(PG_FUNCTION_ARGS) > + { > + for (i = 0; i < nitems; i++) > + { > + bool isNull; > + bool oprresult; > + > + /* Get source element, checking for NULL */ > + if (bitmap && (*bitmap & bitmask) == 0) > + { > + if (old_value_isnull) > + { > + values[i] = new_value; > + isNull = false; > + changed = true; > + } > + else > + isNull = true; > + } This does the wrong thing if old_value_isnull == new_value_isnull == true: [local] regression=# select array_replace('{4,null,5}'::int[], null, null);array_replace ---------------{4,0,5} (With text[], that yields a crash.) > *** a/src/backend/utils/adt/ri_triggers.c > --- b/src/backend/utils/adt/ri_triggers.c > ! Datum > ! RI_FKey_eachcascade_upd(PG_FUNCTION_ARGS) > { > ! /* ---------- > ! * The query string built is > ! * UPDATE ONLY <fktable> > ! * SET fkatt1 = array_replace(fkatt1, $n, $1) [, ...] > ! * WHERE $n = fkatt1 [AND ...] > ! * The type id's for the $ parameters are those of the > ! * corresponding PK attributes. > ! * ---------- > ! */ > ! initStringInfo(&querybuf); > ! initStringInfo(&qualbuf); > ! quoteRelationName(fkrelname, fk_rel); > ! appendStringInfo(&querybuf, "UPDATE ONLY %s SET", fkrelname); > ! querysep = ""; > ! qualsep = "WHERE"; An elog(ERROR) when riinfo.nkeys != 1 would make sense here and for other functions where we forbid multiple columns at creation time. If nothing else, this needs a comment that the looping is vestigial and the function only supports single-column constraints. > *************** > *** 2678,2695 **** RI_Initial_Check(Trigger *trigger, Relation fk_rel, Relation pk_rel) > * The query string built is: > * SELECT fk.keycols FROM ONLY relname fk > * LEFT OUTER JOIN ONLY pkrelname pk > ! * ON (pk.pkkeycol1=fk.keycol1 [AND ...]) > * WHERE pk.pkkeycol1 IS NULL AND > * For MATCH unspecified: > * (fk.keycol1 IS NOT NULL [AND ...]) > * For MATCH FULL: > * (fk.keycol1 IS NOT NULL [OR ...]) > * > * We attach COLLATE clauses to the operators when comparing columns > * that have different collations. > *---------- > */ > initStringInfo(&querybuf); > appendStringInfo(&querybuf, "SELECT "); > sep = ""; > for (i = 0; i < riinfo.nkeys; i++) > --- 3527,3557 ---- > * The query string built is: > * SELECT fk.keycols FROM ONLY relname fk > * LEFT OUTER JOIN ONLY pkrelname pk > ! * ON (pk.pkkeycol1=fk.keycol1 [AND ...] > ! * [AND NOT EXISTS(<RECHECK_SUBQUERY>)]) > * WHERE pk.pkkeycol1 IS NULL AND > * For MATCH unspecified: > * (fk.keycol1 IS NOT NULL [AND ...]) > * For MATCH FULL: > * (fk.keycol1 IS NOT NULL [OR ...]) > * > + * In case of an EACH foreign key, a recheck subquery is added to > + * the join condition in order to check that every combination of keys > + * is actually referenced. > + * The RECHECK_SUBQUERY is > + * SELECT 1 FROM > + * unnest(fk.keycol1) x1(x1) [CROSS JOIN ...] > + * LEFT OUTER JOIN ONLY pkrelname pk > + * ON (pk.pkkeycol1=x1.x1 [AND ...]) > + * WHERE pk.pkkeycol1 IS NULL AND > + * (fk.keycol1 IS NOT NULL [AND ...]) What is a worst-case query plan for a constraint with n ordinary keys and m EACH keys? > + * > * We attach COLLATE clauses to the operators when comparing columns > * that have different collations. > *---------- > */ > *** a/src/include/nodes/parsenodes.h > --- b/src/include/nodes/parsenodes.h > *************** > *** 570,575 **** typedef struct DefElem > --- 570,589 ---- > } DefElem; > > /* > + * ForeignKeyColumnElem - foreign key column (used in foreign key constaint) Typo. > + * > + * For a foreign key attribute, 'name' is the name of the table column to > + * index, and each is true if it is an EACH fk. > + */ > + typedef struct ForeignKeyColumnElem > + { > + NodeTag type; > + Node *name; /* name of the column, or NULL */ > + bool each; /* true if an EACH foreign key */ > + > + } ForeignKeyColumnElem; > + > + /* > * LockingClause - raw representation of FOR UPDATE/SHARE options > * > * Note: lockedRels == NIL means "all relations in query". Otherwise it > *************** > *** 1510,1515 **** typedef enum ConstrType /* types of constraints */ > --- 1524,1531 ---- > #define FKCONSTR_ACTION_CASCADE 'c' > #define FKCONSTR_ACTION_SETNULL 'n' > #define FKCONSTR_ACTION_SETDEFAULT 'd' > + #define FKCONSTR_ACTION_ARRCASCADE 'C' > + #define FKCONSTR_ACTION_ARRSETNULL 'N' These names need update for the array -> each terminology change. I consider these the core changes needed to reach Ready for Committer: - Fix crash in array_replace(arr, null, null). - Don't look through the domain before calling can_coerce_type(). - Compare operator family of pfeqop and TYPECACHE_EQ_OPR at creation. - Move post-processing from gram.y and remove "t"/"f" magic values. Thanks, nm
On Fri, Feb 24, 2012 at 9:01 PM, Noah Misch <noah@leadboat.com> wrote: > I consider these the core changes needed to reach Ready for Committer: > > - Fix crash in array_replace(arr, null, null). > - Don't look through the domain before calling can_coerce_type(). > - Compare operator family of pfeqop and TYPECACHE_EQ_OPR at creation. > - Move post-processing from gram.y and remove "t"/"f" magic values. So, is someone working on making these changes? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Il giorno gio, 08/03/2012 alle 08.11 -0500, Robert Haas ha scritto: > On Fri, Feb 24, 2012 at 9:01 PM, Noah Misch <noah@leadboat.com> wrote: > > I consider these the core changes needed to reach Ready for Committer: > > > > - Fix crash in array_replace(arr, null, null). > > - Don't look through the domain before calling can_coerce_type(). > > - Compare operator family of pfeqop and TYPECACHE_EQ_OPR at creation. > > - Move post-processing from gram.y and remove "t"/"f" magic values. > > So, is someone working on making these changes? > We are working on it and I hope we can send the v4 version during the upcoming weekend. Regards, Marco -- Marco Nenciarini - 2ndQuadrant Italy PostgreSQL Training, Services and Support marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it
Hi, please find attached v4 of the EACH Foreign Key patch (formerly known also as Foreign Key Array). All the open issues have been fixed, including the crucial ones; see below for details. Regards, Marco -- Marco Nenciarini - 2ndQuadrant Italy PostgreSQL Training, Services and Support marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it On Fri, Feb 24, 2012 at 09:01:35PM -0500, Noah Misch wrote: > You support, and have test cases for, constraints with multiple EACH columns. > The documentation and comments do not explain their semantics. On reviewing > the behavior and code, you have implemented it in terms of a Cartesian > product. That's logical, but when is such a constraint actually useful? If > someone can think of an application, great; let's document his example. > Otherwise, let's reject such constraints at DDL time. Now multiple EACH column are rejected. > > > We can use multi-dimensional arrays as well as referencing columns. In > > that case though, ON DELETE EACH CASCADE will behave like ON DELETE EACH > > SET NULL. This is a safe way of implementing the action. > > We have some ideas on how to implement this, but we feel it is better to > > limit the behaviour for now. > > This still feels awfully unprincipled to me. How about just throwing an error > when we need to remove an element from a multidimensional array? Essentially, > have ON DELETE EACH CASCADE downgrade to ON DELETE RESTRICT when it encounters > a multidimensional array. That's strictly less code than what you have now, > and it keeps our options open. We can always change from error to set-null > later, but we can't so readily change from set-null to anything else. That seems reasonable to me. Implemented and documented. > As I pondered this patch again, I came upon formal hazards around non-default > operator classes. Today, ATAddForeignKeyConstraint() always chooses pfeqop, > ffeqop and ppeqop in the same operator family. If it neglected this, we would > get wrong behavior when one of the operators is sensitive to a change to which > another is insensitive. For EACH FK constraints, this patch sets ffeqop = > ARRAY_EQ_OP unconditionally. array_eq() uses the TYPECACHE_EQ_OPR (usually > from the default B-tree operator class). That operator may not exist at all, > let alone share an operator family with pfeqop. Shall we deal with this by > retrieving TYPECACHE_EQ_OPR in ATAddForeignKeyConstraint() and rejecting the > constraint creation if it does not share an operator family with pfeqop? The > same hazard affects ri_triggers.c use of array_remove() and array_replace(), > and the same change mitigates it. Check added. As a consequence of this stricter policy, certain previously allowed cases are now unacceptable (e.g pk FLOAT and fk INT[]). Regression tests have been added. > Let me say again how much I like the test coverage from this patch. It's > great to think of something worth verifying and find existing coverage for it > in the submitted test cases. > > > > *** a/doc/src/sgml/catalogs.sgml > > --- b/doc/src/sgml/catalogs.sgml > > > *************** > > *** 2102,2107 **** > > --- 2106,2118 ---- > > <entry></entry> > > <entry>If a check constraint, a human-readable representation of the expression</entry> > > </row> > > + > > + <row> > > + <entry><structfield>confisarray</structfield></entry> > > Name is now confiseach. Fixed. > pg_constraint.confeach needs documentation. Most of pg_constraint columns, including all the boolean ones, are documented only in the "description" column of http://www.postgresql.org/docs/9.1/static/catalog-pg-constraint.html#AEN85760 it seems that confiseach should not be an exception, since it just indicates whether the constraint is of a certain kind or not. > > > + <entry><type>bool</type></entry> > > + <entry></entry> > > + <entry>If a foreign key, true if a EACH REFERENCE foreign key</entry> > > + </row> > > </tbody> > > </tgroup> > > </table> > > *** a/doc/src/sgml/ddl.sgml > > --- b/doc/src/sgml/ddl.sgml > > > + <para> > > + Another option you have with foreign keys is to use a > > + referencing column which is an array of elements with > > + the same type (or a compatible one) as the referenced > > + column in the related table. This feature, commonly known > > + as <firstterm>foreign key arrays</firstterm>, is implemented > > Is it indeed "commonly known" by that name? My web searching did not turn up > any independent use of the term. > > In any event, this should be the only place where we mention multiple names > for the feature. Pick one preferred term and use it for all other mentions. > The documentation, comments and messages currently have a mix of "each foreign > key", "EACH foreign key" and "foreign key array". In the end we uniformly adopted "EACH foreign key" as the unique name of this feature; documentation, comments and messages have been updated accordingly. > > + <para> > > + Even though the most common use case for foreign key arrays > > + is on a single column key, you can define an <quote>each foreign > > + key constraint</quote> on a group of columns. As the following > > + contrived example shows, it needs to be written in table constraint form: > > + <programlisting> > > + CREATE TABLE t1 ( > > + a integer PRIMARY KEY, > > + b integer, > > + c integer[], > > + <emphasis>FOREIGN KEY (b, EACH c) REFERENCES other_table (c1, c2)</emphasis> > > + ); > > + </programlisting> > > Rather than merely showing an abstract syntax example, some words about the > semantics are in order. You might have a parent table with primary key > (category, value). Then your child table(s) have a category column and an > array of values. The constraint then ensures that all the entries in the > array exist in the parent table *and* belong to the proper category. That example has been replaced by a concrete one. > > *** a/src/backend/commands/tablecmds.c > > --- b/src/backend/commands/tablecmds.c > > > *************** > > *** 5736,5741 **** ATAddForeignKeyConstraint(AlteredTableInfo *tab, Relation rel, > > --- 5759,5807 ---- > > (errcode(ERRCODE_INVALID_FOREIGN_KEY), > > errmsg("number of referencing and referenced columns for foreign key disagree"))); > > > > + /* Enforce each foreign key restrictions */ > > + if (fkconstraint->fk_is_each) > > + { > > + /* > > + * ON UPDATE CASCADE action is not supported on EACH foreign keys > > + */ > > + if (fkconstraint->fk_upd_action == FKCONSTR_ACTION_CASCADE) > > + ereport(ERROR, > > + (errcode(ERRCODE_INVALID_FOREIGN_KEY), > > + errmsg("ON UPDATE CASCADE action is not supported on " > > + "EACH foreign keys"), > > + errhint("Use ON UPDATE EACH CASCADE, instead"))); > > Note from the Error Message Style Guide that errhint and errdetail messages > shall be complete sentences. Fixed. > > + > > + /* > > + * EACH CASCADE and EACH SET NULL actions are only available > > + * in single-column EACH foreign keys > > + */ > > + if (numpks > 1 && > > + (fkconstraint->fk_upd_action == FKCONSTR_ACTION_ARRCASCADE > > + || fkconstraint->fk_upd_action == FKCONSTR_ACTION_ARRSETNULL > > + || fkconstraint->fk_del_action == FKCONSTR_ACTION_ARRCASCADE > > + || fkconstraint->fk_del_action == FKCONSTR_ACTION_ARRSETNULL)) > > + ereport(ERROR, > > + (errcode(ERRCODE_INVALID_FOREIGN_KEY), > > + errmsg("EACH CASCADE and EACH SET NULL actions are only " > > + "available on single-column EACH foreign keys"))); > > + } > > + else > > + { > > + /* > > + * EACH CASCADE and EACH SET NULL actions are only available > > + * in EACH foreign keys > > + */ > > + if (fkconstraint->fk_upd_action == FKCONSTR_ACTION_ARRCASCADE > > + || fkconstraint->fk_upd_action == FKCONSTR_ACTION_ARRSETNULL > > + || fkconstraint->fk_del_action == FKCONSTR_ACTION_ARRCASCADE > > + || fkconstraint->fk_del_action == FKCONSTR_ACTION_ARRSETNULL) > > + ereport(ERROR, > > + (errcode(ERRCODE_INVALID_FOREIGN_KEY), > > + errmsg("EACH CASCADE and EACH SET NULL actions are only " > > + "available on EACH foreign keys"))); > > I would just reuse the previous error message. It will be fine, maybe even > better, for everyone to see the phrase "single-column". Fixed. > > *************** > > *** 5812,5823 **** ATAddForeignKeyConstraint(AlteredTableInfo *tab, Relation rel, > > Oid target_typeids[2]; > > > > input_typeids[0] = pktype; > > ! input_typeids[1] = fktype; > > target_typeids[0] = opcintype; > > target_typeids[1] = opcintype; > > if (can_coerce_type(2, input_typeids, target_typeids, > > COERCION_IMPLICIT)) > > ! pfeqop = ffeqop = ppeqop; > > } > > > > if (!(OidIsValid(pfeqop) && OidIsValid(ffeqop))) > > --- 5905,5925 ---- > > Oid target_typeids[2]; > > > > input_typeids[0] = pktype; > > ! input_typeids[1] = fktyped; > > By sending the base type instead of the domain type to can_coerce_type(), this > changes behavior ever so slightly for ordinary foreign keys. It would take > something fairly baroque to get a different result. Perhaps a polymorphic > opcintype and an FK column of a domain over an enum. Still, please don't > change that part of the logic. Fixed. > > target_typeids[0] = opcintype; > > target_typeids[1] = opcintype; > > if (can_coerce_type(2, input_typeids, target_typeids, > > COERCION_IMPLICIT)) > > ! { > > ! pfeqop = ppeqop; > > ! > > ! /* > > ! * In caso of an EACH FK the ffeqop must be left untouched > > Typo. Fixed. > > > *** a/src/backend/parser/gram.y > > --- b/src/backend/parser/gram.y > > > *************** > > *** 2900,2917 **** ConstraintElem: > > yyscanner); > > $$ = (Node *)n; > > } > > ! | FOREIGN KEY '(' columnList ')' REFERENCES qualified_name > > opt_column_list key_match key_actions ConstraintAttributeSpec > > { > > Constraint *n = makeNode(Constraint); > > n->contype = CONSTR_FOREIGN; > > n->location = @1; > > n->pktable = $7; > > ! n->fk_attrs = $4; > > n->pk_attrs = $8; > > n->fk_matchtype = $9; > > n->fk_upd_action = (char) ($10 >> 8); > > n->fk_del_action = (char) ($10 & 0xFF); > > processCASbits($11, @11, "FOREIGN KEY", > > &n->deferrable, &n->initdeferred, > > &n->skip_validation, > > --- 2918,2959 ---- > > yyscanner); > > $$ = (Node *)n; > > } > > ! | FOREIGN KEY '(' foreignKeyColumnList ')' REFERENCES qualified_name > > opt_column_list key_match key_actions ConstraintAttributeSpec > > { > > Constraint *n = makeNode(Constraint); > > n->contype = CONSTR_FOREIGN; > > n->location = @1; > > n->pktable = $7; > > ! n->fk_attrs = $4; > > n->pk_attrs = $8; > > n->fk_matchtype = $9; > > n->fk_upd_action = (char) ($10 >> 8); > > n->fk_del_action = (char) ($10 & 0xFF); > > + > > + /* > > + * Split the content of foreignKeyColumnList > > + * in two separate list. One lis of fileds > > Two typos. Fixed. > > + * and one list of boolean values. > > + */ > > + { > > + ListCell *i; > > + > > + n->fk_attrs = NIL; > > + n->fk_is_each = false; > > + n->fk_each_attrs = NIL; > > + foreach (i, $4) > > + { > > + ForeignKeyColumnElem *elem = > > + (ForeignKeyColumnElem *)lfirst(i); > > + > > + n->fk_attrs = lappend(n->fk_attrs, elem->name); > > + n->fk_is_each |= elem->each; > > + n->fk_each_attrs = lappend(n->fk_each_attrs, > > + makeString(elem->each?"t":"f")); > > + } > > + } > > This processing should happen in parse_utilcmd.c rather than gram.y. Magic > values "t" and "f" won't do. Make fk_attrs a list of ForeignKeyColumnElem or > else at least use an int list of true/false. Either way, you would no longer > need the decode loop in tablecmds.c. Fixed. We have removed the magic values and moved decoding of the ForeignKeyColumnElem list inside parse_utilcmd.c. > > *** a/src/backend/utils/adt/arrayfuncs.c > > --- b/src/backend/utils/adt/arrayfuncs.c > > > + Datum > > + array_replace(PG_FUNCTION_ARGS) > > + { > > > + for (i = 0; i < nitems; i++) > > + { > > + bool isNull; > > + bool oprresult; > > + > > + /* Get source element, checking for NULL */ > > + if (bitmap && (*bitmap & bitmask) == 0) > > + { > > + if (old_value_isnull) > > + { > > + values[i] = new_value; > > + isNull = false; > > + changed = true; > > + } > > + else > > + isNull = true; > > + } > > This does the wrong thing if old_value_isnull == new_value_isnull == true: > > [local] regression=# select array_replace('{4,null,5}'::int[], null, null); > array_replace > --------------- > {4,0,5} > > (With text[], that yields a crash.) Fixed. > > *** a/src/backend/utils/adt/ri_triggers.c > > --- b/src/backend/utils/adt/ri_triggers.c > > > ! Datum > > ! RI_FKey_eachcascade_upd(PG_FUNCTION_ARGS) > > { > > > ! /* ---------- > > ! * The query string built is > > ! * UPDATE ONLY <fktable> > > ! * SET fkatt1 = array_replace(fkatt1, $n, $1) [, ...] > > ! * WHERE $n = fkatt1 [AND ...] > > ! * The type id's for the $ parameters are those of the > > ! * corresponding PK attributes. > > ! * ---------- > > ! */ > > ! initStringInfo(&querybuf); > > ! initStringInfo(&qualbuf); > > ! quoteRelationName(fkrelname, fk_rel); > > ! appendStringInfo(&querybuf, "UPDATE ONLY %s SET", fkrelname); > > ! querysep = ""; > > ! qualsep = "WHERE"; > > An elog(ERROR) when riinfo.nkeys != 1 would make sense here and for other > functions where we forbid multiple columns at creation time. If nothing else, > this needs a comment that the looping is vestigial and the function only > supports single-column constraints. Logic implemented and comment added. > > *************** > > *** 2678,2695 **** RI_Initial_Check(Trigger *trigger, Relation fk_rel, Relation pk_rel) > > * The query string built is: > > * SELECT fk.keycols FROM ONLY relname fk > > * LEFT OUTER JOIN ONLY pkrelname pk > > ! * ON (pk.pkkeycol1=fk.keycol1 [AND ...]) > > * WHERE pk.pkkeycol1 IS NULL AND > > * For MATCH unspecified: > > * (fk.keycol1 IS NOT NULL [AND ...]) > > * For MATCH FULL: > > * (fk.keycol1 IS NOT NULL [OR ...]) > > * > > * We attach COLLATE clauses to the operators when comparing columns > > * that have different collations. > > *---------- > > */ > > initStringInfo(&querybuf); > > appendStringInfo(&querybuf, "SELECT "); > > sep = ""; > > for (i = 0; i < riinfo.nkeys; i++) > > --- 3527,3557 ---- > > * The query string built is: > > * SELECT fk.keycols FROM ONLY relname fk > > * LEFT OUTER JOIN ONLY pkrelname pk > > ! * ON (pk.pkkeycol1=fk.keycol1 [AND ...] > > ! * [AND NOT EXISTS(<RECHECK_SUBQUERY>)]) > > * WHERE pk.pkkeycol1 IS NULL AND > > * For MATCH unspecified: > > * (fk.keycol1 IS NOT NULL [AND ...]) > > * For MATCH FULL: > > * (fk.keycol1 IS NOT NULL [OR ...]) > > * > > + * In case of an EACH foreign key, a recheck subquery is added to > > + * the join condition in order to check that every combination of keys > > + * is actually referenced. > > + * The RECHECK_SUBQUERY is > > + * SELECT 1 FROM > > + * unnest(fk.keycol1) x1(x1) [CROSS JOIN ...] > > + * LEFT OUTER JOIN ONLY pkrelname pk > > + * ON (pk.pkkeycol1=x1.x1 [AND ...]) > > + * WHERE pk.pkkeycol1 IS NULL AND > > + * (fk.keycol1 IS NOT NULL [AND ...]) > > What is a worst-case query plan for a constraint with n ordinary keys and m > EACH keys? The subquery complexity can be compared with the one of the main query. Anyway, it is a one-off cost, paid at constraint creation, which I believe is acceptable. The fact that we now implement only single-column EACH foreign keys might suggest that we postpone this discussion at a later time. > > > + * > > * We attach COLLATE clauses to the operators when comparing columns > > * that have different collations. > > *---------- > > */ > > > *** a/src/include/nodes/parsenodes.h > > --- b/src/include/nodes/parsenodes.h > > *************** > > *** 570,575 **** typedef struct DefElem > > --- 570,589 ---- > > } DefElem; > > > > /* > > + * ForeignKeyColumnElem - foreign key column (used in foreign key constaint) > > Typo. Fixed. > > > + * > > + * For a foreign key attribute, 'name' is the name of the table column to > > + * index, and each is true if it is an EACH fk. > > + */ > > + typedef struct ForeignKeyColumnElem > > + { > > + NodeTag type; > > + Node *name; /* name of the column, or NULL */ > > + bool each; /* true if an EACH foreign key */ > > + > > + } ForeignKeyColumnElem; > > + > > + /* > > * LockingClause - raw representation of FOR UPDATE/SHARE options > > * > > * Note: lockedRels == NIL means "all relations in query". Otherwise it > > *************** > > *** 1510,1515 **** typedef enum ConstrType /* types of constraints */ > > --- 1524,1531 ---- > > #define FKCONSTR_ACTION_CASCADE 'c' > > #define FKCONSTR_ACTION_SETNULL 'n' > > #define FKCONSTR_ACTION_SETDEFAULT 'd' > > + #define FKCONSTR_ACTION_ARRCASCADE 'C' > > + #define FKCONSTR_ACTION_ARRSETNULL 'N' > > These names need update for the array -> each terminology change. Fixed. > I consider these the core changes needed to reach Ready for Committer: > > - Fix crash in array_replace(arr, null, null). > - Don't look through the domain before calling can_coerce_type(). > - Compare operator family of pfeqop and TYPECACHE_EQ_OPR at creation. > - Move post-processing from gram.y and remove "t"/"f" magic values. All done.
Attachment
Hello, Il 15/03/12 05:03, Marco Nenciarini ha scritto: > please find attached v4 of the EACH Foreign Key patch (formerly known > also as Foreign Key Array). Please find attached version v4b which replaces v4 and fixes a bug in array_replace() and adds further regression tests on array_replace() and fixes a few typos in the documentation. Thank you, Gabriele -- Gabriele Bartolini - 2ndQuadrant Italia PostgreSQL Training, Services and Support gabriele.bartolini@2ndQuadrant.it | www.2ndQuadrant.it
Attachment
[used followup EACH-foreign-key.v4b.patch.bz2 for review] On Wed, Mar 14, 2012 at 07:03:08PM +0100, Marco Nenciarini wrote: > please find attached v4 of the EACH Foreign Key patch (formerly known > also as Foreign Key Array). > On Fri, Feb 24, 2012 at 09:01:35PM -0500, Noah Misch wrote: > > > We can use multi-dimensional arrays as well as referencing columns. In > > > that case though, ON DELETE EACH CASCADE will behave like ON DELETE EACH > > > SET NULL. This is a safe way of implementing the action. > > > We have some ideas on how to implement this, but we feel it is better to > > > limit the behaviour for now. > > > > This still feels awfully unprincipled to me. How about just throwing an error > > when we need to remove an element from a multidimensional array? Essentially, > > have ON DELETE EACH CASCADE downgrade to ON DELETE RESTRICT when it encounters > > a multidimensional array. That's strictly less code than what you have now, > > and it keeps our options open. We can always change from error to set-null > > later, but we can't so readily change from set-null to anything else. > > That seems reasonable to me. Implemented and documented. I like the semantics now. You implemented this by doing two scans per PK delete, the first to detect multidimensional dependants of the PK row and the second to fix 1-dimensional dependants. We don't require an index to support the scan, so this can mean two seq scans. Currently, the only benefit to doing it this way is a change of error message. Here is the current error message when we would need to remove a multidimensional array element: ERROR: update or delete on table "pktableforarray" violates foreign key constraint "fktableforarray_ftest1_fkey" on table"fktableforarray" DETAIL: Key (ptest1)=(2) is still referenced from table "fktableforarray". If I just rip out the first scan, we get the same outcome with a different error message: ERROR: removing elements from multidimensional arrays is not supported CONTEXT: SQL statement "UPDATE ONLY "public"."fktableforarray"SET "ftest1" = array_remove("ftest1", $1) WHERE $1 OPERATOR(pg_catalog.=) ANY ("ftest1")" That has less polish, but I think it's actually more useful. The first message gives no indication that a multidimensional array foiled your ON DELETE EACH CASCADE. The second message hints at that cause. I recommend removing the new block of code in RI_FKey_eachcascade_del() and letting array_remove() throw the error. If you find a way to throw a nicer error without an extra scan, by all means submit that to a future CF as an improvement. I don't think it's important enough to justify cycles at this late hour of the current CF. > > As I pondered this patch again, I came upon formal hazards around non-default > > operator classes. Today, ATAddForeignKeyConstraint() always chooses pfeqop, > > ffeqop and ppeqop in the same operator family. If it neglected this, we would > > get wrong behavior when one of the operators is sensitive to a change to which > > another is insensitive. For EACH FK constraints, this patch sets ffeqop = > > ARRAY_EQ_OP unconditionally. array_eq() uses the TYPECACHE_EQ_OPR (usually > > from the default B-tree operator class). That operator may not exist at all, > > let alone share an operator family with pfeqop. Shall we deal with this by > > retrieving TYPECACHE_EQ_OPR in ATAddForeignKeyConstraint() and rejecting the > > constraint creation if it does not share an operator family with pfeqop? The > > same hazard affects ri_triggers.c use of array_remove() and array_replace(), > > and the same change mitigates it. > > Check added. As a consequence of this stricter policy, certain > previously allowed cases are now unacceptable (e.g pk FLOAT and fk > INT[]). > Regression tests have been added. Ah, good. Not so formal after all. > > pg_constraint.confeach needs documentation. > > Most of pg_constraint columns, including all the boolean ones, are > documented only in the "description" column of > > http://www.postgresql.org/docs/9.1/static/catalog-pg-constraint.html#AEN85760 > > it seems that confiseach should not be an exception, since it just > indicates whether the constraint is of a certain kind or not. Your patch adds two columns to pg_constraint, confiseach and confeach, but it only mentions confiseach in documentation. Just add a similar minimal mention of confeach. > > > *************** > > > *** 5736,5741 **** ATAddForeignKeyConstraint(AlteredTableInfo *tab, Relation rel, > > > --- 5759,5807 ---- > > > (errcode(ERRCODE_INVALID_FOREIGN_KEY), > > > errmsg("number of referencing and referenced columns for foreign key disagree"))); > > > > > > + /* Enforce each foreign key restrictions */ > > > + if (fkconstraint->fk_is_each) > > > + { > > > + /* > > > + * ON UPDATE CASCADE action is not supported on EACH foreign keys > > > + */ > > > + if (fkconstraint->fk_upd_action == FKCONSTR_ACTION_CASCADE) > > > + ereport(ERROR, > > > + (errcode(ERRCODE_INVALID_FOREIGN_KEY), > > > + errmsg("ON UPDATE CASCADE action is not supported on " > > > + "EACH foreign keys"), > > > + errhint("Use ON UPDATE EACH CASCADE, instead"))); > > > > Note from the Error Message Style Guide that errhint and errdetail messages > > shall be complete sentences. > > Fixed. That is to say, they start with a capital letter and end with a period. Your old text was fine apart from the lack of a period. (Your new text also lacks a period.) > > This processing should happen in parse_utilcmd.c rather than gram.y. Magic > > values "t" and "f" won't do. Make fk_attrs a list of ForeignKeyColumnElem or > > else at least use an int list of true/false. Either way, you would no longer > > need the decode loop in tablecmds.c. > > Fixed. We have removed the magic values and moved decoding of the > ForeignKeyColumnElem list inside parse_utilcmd.c. When I said "int list", I meant a T_IntList (via lappend_int(), lfirst_int(), etc.). You used a T_List of T_Integer. > > > *************** > > > *** 2678,2695 **** RI_Initial_Check(Trigger *trigger, Relation fk_rel, Relation pk_rel) > > > * The query string built is: > > > * SELECT fk.keycols FROM ONLY relname fk > > > * LEFT OUTER JOIN ONLY pkrelname pk > > > ! * ON (pk.pkkeycol1=fk.keycol1 [AND ...]) > > > * WHERE pk.pkkeycol1 IS NULL AND > > > * For MATCH unspecified: > > > * (fk.keycol1 IS NOT NULL [AND ...]) > > > * For MATCH FULL: > > > * (fk.keycol1 IS NOT NULL [OR ...]) > > > * > > > * We attach COLLATE clauses to the operators when comparing columns > > > * that have different collations. > > > *---------- > > > */ > > > initStringInfo(&querybuf); > > > appendStringInfo(&querybuf, "SELECT "); > > > sep = ""; > > > for (i = 0; i < riinfo.nkeys; i++) > > > --- 3527,3557 ---- > > > * The query string built is: > > > * SELECT fk.keycols FROM ONLY relname fk > > > * LEFT OUTER JOIN ONLY pkrelname pk > > > ! * ON (pk.pkkeycol1=fk.keycol1 [AND ...] > > > ! * [AND NOT EXISTS(<RECHECK_SUBQUERY>)]) > > > * WHERE pk.pkkeycol1 IS NULL AND > > > * For MATCH unspecified: > > > * (fk.keycol1 IS NOT NULL [AND ...]) > > > * For MATCH FULL: > > > * (fk.keycol1 IS NOT NULL [OR ...]) > > > * > > > + * In case of an EACH foreign key, a recheck subquery is added to > > > + * the join condition in order to check that every combination of keys > > > + * is actually referenced. > > > + * The RECHECK_SUBQUERY is > > > + * SELECT 1 FROM > > > + * unnest(fk.keycol1) x1(x1) [CROSS JOIN ...] > > > + * LEFT OUTER JOIN ONLY pkrelname pk > > > + * ON (pk.pkkeycol1=x1.x1 [AND ...]) > > > + * WHERE pk.pkkeycol1 IS NULL AND > > > + * (fk.keycol1 IS NOT NULL [AND ...]) > > > > What is a worst-case query plan for a constraint with n ordinary keys and m > > EACH keys? > > The subquery complexity can be compared with the one of the main query. > Anyway, it is a one-off cost, paid at constraint creation, which I > believe is acceptable. The fact that we now implement only single-column > EACH foreign keys might suggest that we postpone this discussion at a > later time. If the cost doesn't exceed O(F log P), where F is the size of the FK table and P is the size of the PK table, I'm not worried. If it can be O(F^2), we would have a problem to be documented, if not fixed. Your change to array_replace() made me look at array_remove() again and realize that it needs the same treatment. This command yields a segfault: SELECT array_remove('{a,null}'::text[], null); This latest version introduces two calls to get_element_type(), both of which should be get_base_element_type(). Patch "Avoid FK validations for no-rewrite ALTER TABLE ALTER TYPE", committed between v3b and v4 of this patch, added code to ATAddForeignKeyConstraint() requiring an update in this patch. Run this in the regression DB: [local] regression=# alter table fktableforeachfk alterftest1 type int[]; ERROR: could not find cast from 1007 to 23 RI_PLAN_EACHCASCADE_DEL_DOUPDATE should be RI_PLAN_EACHCASCADE_DEL_DODELETE. I think the patch has reached the stage where a committer can review it without wasting much time on things that might change radically. So, I'm marking it Ready for Committer. Please still submit an update correcting the above; I'm sure your committer will appreciate it. Thanks, nm
Hi Noah, thank you again for your thorough review, which is much appreciated. > I think the patch has reached the stage where a committer can review > it without wasting much time on things that might change radically. > So, I'm marking it Ready for Committer. Please still submit an update > correcting the above; I'm sure your committer will appreciate it. Attached is v5, which should address all the remaining issues. Regards, Marco -- Marco Nenciarini - 2ndQuadrant Italy PostgreSQL Training, Services and Support marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it On Fri, Mar 16, 2012 at 11:33:12PM -0400, Noah Misch wrote: > I recommend removing the new block of code in RI_FKey_eachcascade_del() and > letting array_remove() throw the error. If you find a way to throw a nicer > error without an extra scan, by all means submit that to a future CF as an > improvement. I don't think it's important enough to justify cycles at this > late hour of the current CF. It makes sense; we have removed the block of code and updated the error message following your suggestion. Now the error is thrown by array_remove() itself. We'll keep an eye on this for further improvements in the future. > > > pg_constraint.confeach needs documentation. > > > > Most of pg_constraint columns, including all the boolean ones, are > > documented only in the "description" column of > > > > http://www.postgresql.org/docs/9.1/static/catalog-pg-constraint.html#AEN85760 > > > > it seems that confiseach should not be an exception, since it just > > indicates whether the constraint is of a certain kind or not. > > Your patch adds two columns to pg_constraint, confiseach and confeach, but it > only mentions confiseach in documentation. Just add a similar minimal mention > of confeach. Sorry, our mistake; a mention for confeach has now been added, and both entries have been reordered to reflect the column position in pg_constraint). > That is to say, they start with a capital letter and end with a period. Your > old text was fine apart from the lack of a period. (Your new text also lacks > a period.) Fixed, it should be fine now (another misunderstanding on our side, apologies). > If the cost doesn't exceed O(F log P), where F is the size of the FK table and > P is the size of the PK table, I'm not worried. If it can be O(F^2), we would > have a problem to be documented, if not fixed. We have rewritten the old query in a simpler way; now its cost is O(F log P). Here F must represent the size of the "flattened" table, that is, the total number of values that must be checked, which seems a reasonable assumption in any case. > Your change to array_replace() made me look at array_remove() again and > realize that it needs the same treatment. This command yields a segfault: > SELECT array_remove('{a,null}'::text[], null); Fixed. > > This latest version introduces two calls to get_element_type(), both of which > should be get_base_element_type(). Fixed. > Patch "Avoid FK validations for no-rewrite ALTER TABLE ALTER TYPE", committed > between v3b and v4 of this patch, added code to ATAddForeignKeyConstraint() > requiring an update in this patch. Run this in the regression DB: > [local] regression=# alter table fktableforeachfk alter ftest1 type int[]; > ERROR: could not find cast from 1007 to 23 Thanks for pointing it out. We have added a regression test and then fixed the issue. > > RI_PLAN_EACHCASCADE_DEL_DOUPDATE should be RI_PLAN_EACHCASCADE_DEL_DODELETE. Fixed.
Attachment
On Mon, Mar 19, 2012 at 06:41:39PM +0100, Marco Nenciarini wrote: > Attached is v5, which should address all the remaining issues. Looks clean to me. > On Fri, Mar 16, 2012 at 11:33:12PM -0400, Noah Misch wrote: > > If the cost doesn't exceed O(F log P), where F is the size of the FK table and > > P is the size of the PK table, I'm not worried. If it can be O(F^2), we would > > have a problem to be documented, if not fixed. > > We have rewritten the old query in a simpler way; now its cost is O(F log P). > Here F must represent the size of the "flattened" table, that is, the total > number of values that must be checked, which seems a reasonable assumption > in any case. Great; I find that approach easier to reason about.
Marco Nenciarini <marco.nenciarini@2ndQuadrant.it> writes: > Attached is v5, which should address all the remaining issues. I started to look at this patch a bit. I'm quite confused by the fact that some, but not all, of the possible FK action types now come in an EACH variant. This makes no sense at all to me. ISTM that EACH is a property of the FK constraint as a whole, that is that it says the constraint is from array elements on the referencing side to column values on the referenced side, rather than the normal case of column values to column values. Why would the possible actions be affected, and why only these? The patch documentation is extremely unclear about this. It's even less clear about what the semantics are in multi-key cases. Right offhand I would say that multi-key cases are nonsensical and should be forbidden outright, because there is no way to figure out which collections of elements of different arrays should be considered to be a referencing item. Could we see a specification of what the referencing semantics are intended to be, please? regards, tom lane
On Thu, Mar 22, 2012 at 03:02:45PM -0400, Tom Lane wrote: > It's even less clear about what the semantics are in multi-key > cases. Right offhand I would say that multi-key cases are > nonsensical and should be forbidden outright, because there is no > way to figure out which collections of elements of different arrays > should be considered to be a referencing item. Currently multi-column keys with more than one EACH column are unsupported, mainly because it's unclear how they should work (and I agree that they might not work at all). > Could we see a specification of what the referencing semantics are > intended to be, please? You are right, the discussion has never been put together in a single place, as it should have. Please find below an updated version of the specification, which Marco and I put together from the discussion in this list, and taking into account the changes happened in the review phase. Some comments have also been added to explain why some choices have been forbidden. Best regards, Dr. Gianni Ciolli - 2ndQuadrant Italia PostgreSQL Training, Services and Support gianni.ciolli@2ndquadrant.it | www.2ndquadrant.it ---8<------8<------8<------8<------8<------8<------8<------8<------8<--- ON (DELETE | UPDATE) actions for EACH foreign keys ================================================== ------------------ ----------- ----------- | ON | ON | Action | DELETE | UPDATE | ------------------ ----------- ----------- CASCADE | Row | Forbidden | SET NULL | Row | Row | SET DEFAULT | Row | Row | EACH CASCADE | Element | Element | EACH SET NULL | Element | Element | EACH SET DEFAULT | Forbidden | Forbidden | NO ACTION | - | - | RESTRICT | - | - | ------------------ --------- ------------- Example 1. Table C references table B via a (non-array) foreign key. Example 2. The referencing table A is constructed as GROUP BY of table C in Example 1. There is an EACH foreign key on A which references B, representing the same relationship as the foreign key in Example 1. Remark 3. Examples 1 and 2 are related, because they represent the same model; in making choices about a certain action on Example 2 we will considering its relationship with Example 1. Example 4. Assume that the FK in Example 1 has a ON DELETE CASCADE action. Deleting one row on table B will delete all the referencing rows in table A. The state that we get after the DELETE is the same obtained by Example 2 with the ON DELETE EACH CASCADE action after removing the same row. Example 4 suggests to associate the "Element" behaviour to the ON DELETE EACH CASCADE action. The user can choose between two different options for a CASCADE-style action when a referenced row is deleted; both of them have use cases, as the following Example 5 shows. Example 5. If you remove a vertex from a polygon (represented as an array of vertices), you can either destroy the polygon (ON DELETE CASCADE) or transform it into a polygon with less vertices (ON DELETE EACH CASCADE). ON UPDATE SET NULL has its own purpose as a different behaviour than ON UPDATE EACH SET NULL; again, both options are provided to the user, essentially like with ON DELETE CASCADE and ON DELETE EACH CASCADE. ON (UPDATE | DELETE) EACH SET DEFAULT is forbidden, because table A does not carry a default value for an array element. In theory the default value could be retrieved from the referenced table B, but that would be unusual and in any case different from the corresponding case of Example 1 with ON (UPDATE | DELETE) SET DEFAULT. ON UPDATE CASCADE is forbidden because, as far as we can see, the only meaningful action to propagate updated values is ON UPDATE EACH CASCADE.
Hello Tom,<br /><br /><div class="ace-line" id="magicdomid393"><span class="">> I started to look at this patch a bit. I'm quite confused by the fact</span></div><div class="ace-line" id="magicdomid394"><span class="">> that some, butnot all, of the possible FK action types now come in an</span></div><div class="ace-line" id="magicdomid395"><span class="">>EACH variant. This makes no sense at all to me. ISTM that EACH is a</span></div><div class="ace-line" id="magicdomid396"><spanclass="">> property of the FK constraint as a whole, that is that it says the</span></div><divclass="ace-line" id="magicdomid397"><span class="">> constraint is from array elements on the referencingside to column</span></div><div class="ace-line" id="magicdomid398"><span class="">> values on the referencedside, rather than the normal case of column</span></div><div class="ace-line" id="magicdomid399"><span class="">>values to column values.</span></div><div class="ace-line" id="magicdomid400"><br /></div><div class="ace-line"id="magicdomid401"><span class="">The specification that Gianni posted applies only to v5 of the patch.<br/> The original idea was indeed to have the whole foreign key to be defined with an EACH property (initially wewere actually thinking of the ARRAY keyword following your advice, then for grammar reasons we opted for EACH).</span></div><divclass="ace-line" id="magicdomid402"><span class="">However, during the actual development we facedsome difficulties with multi-column foreign keys.<br /> Through discussions on this list and with the reviewer we optedto allow the EACH keyword at column level.</span></div><div class="ace-line" id="magicdomid403"><span class="">We startedwith the case where at most one column is EACH, which is easier to understand.<br /> The case of two or more EACHcolumns in the same foreign key has been left open for future development.</span></div><div class="ace-line" id="magicdomid404"><br/></div><div class="ace-line" id="magicdomid405"><span class="">> Why would the possible actionsbe affected, and why only these?</span></div><div class="ace-line" id="magicdomid406"><br /></div><div class="ace-line"id="magicdomid407"><span class="">We had to add the EACH variant to two actions (EACH CASCADE and EACH SET NULL), in order to leave users the flexibility to choose the operation to be performed in case of delete or update ofone or more elements from the referenced table.</span></div><div class="ace-line" id="magicdomid408"><span class="">Some users indeed might prefer that, in case a referenced row is deleted, the whole row is deleted (therefore they'duse the standard CASCADE action). Others mights simply require that references to that row is removed from the referencingarray (therefore they'd use the variant EACH CASCADE action). The same concept applies for SET NULL (the whole array is set to NULL) and EACH SET NULL (referencing elements are set to NULL).</span></div><div class="ace-line"id="magicdomid390"><br /></div> Thank you.<br /><br /> Cheers,<br /> Gabriele<br /><pre class="moz-signature"cols="72">-- Gabriele Bartolini - 2ndQuadrant ItaliaPostgreSQL Training, Services and Support<a class="moz-txt-link-abbreviated"href="mailto:gabriele.bartolini@2ndQuadrant.it">gabriele.bartolini@2ndQuadrant.it</a> | <aclass="moz-txt-link-abbreviated" href="http://www.2ndQuadrant.it">www.2ndQuadrant.it</a> </pre>
Il giorno lun, 19/03/2012 alle 18.41 +0100, Marco Nenciarini ha scritto: > > Attached is v5, which should address all the remaining issues. Please find attached v6 of the EACH Foreign Key patch. From v5 only cosmetic changes to the documentation were made. Regards, Marco -- Marco Nenciarini - 2ndQuadrant Italy PostgreSQL Training, Services and Support marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it
Attachment
On lör, 2012-03-24 at 10:01 +0000, Gianni Ciolli wrote: > ON (DELETE | UPDATE) actions for EACH foreign keys > ================================================== > > ------------------ ----------- ----------- > | ON | ON | > Action | DELETE | UPDATE | > ------------------ ----------- ----------- > CASCADE | Row | Forbidden | > SET NULL | Row | Row | > SET DEFAULT | Row | Row | > EACH CASCADE | Element | Element | > EACH SET NULL | Element | Element | > EACH SET DEFAULT | Forbidden | Forbidden | > NO ACTION | - | - | > RESTRICT | - | - | > ------------------ --------- ------------- > I took another fresh look at this feature after not having looked for a month or two. I think the functionality is probably OK, but I find the interfaces somewhat poorly named. Consider, "PostgreSQL adds EACH foreign keys" -- huh? I think they key word ELEMENT would be more descriptive and precise, and it also leaves the door open to other kind of non-atomic foreign key relationships outside of arrays. EACH has no relationship with arrays. It might as well refer to each row. On the matter of the above chart, there has been a long back and forth about whether the row or the element case should be the default. Both cases are probably useful, but unfortunately you have now settled on making maximum destruction the default. Additionally, we would now have the case that sometimes, depending on some configuration elsewhere, an ON DELETE CASCADE deletes more than what was actually involved in the foreign key. What I'd suggest is to make both cases explicit. That is, forbid ON DELETE CASCADE altogether and make people write ON DELETE CASCADE ROW or ON DELETE CASCADE ELEMENT. In addition to making things more explicit and safer, it would again leave the door open to other kinds of relationships later.
On Fri, Apr 06, 2012 at 09:21:17AM +0300, Peter Eisentraut wrote: > On l??r, 2012-03-24 at 10:01 +0000, Gianni Ciolli wrote: > > ON (DELETE | UPDATE) actions for EACH foreign keys > > ================================================== > > > > ------------------ ----------- ----------- > > | ON | ON | > > Action | DELETE | UPDATE | > > ------------------ ----------- ----------- > > CASCADE | Row | Forbidden | > > SET NULL | Row | Row | > > SET DEFAULT | Row | Row | > > EACH CASCADE | Element | Element | > > EACH SET NULL | Element | Element | > > EACH SET DEFAULT | Forbidden | Forbidden | > > NO ACTION | - | - | > > RESTRICT | - | - | > > ------------------ --------- ------------- > > > I took another fresh look at this feature after not having looked for a > month or two. I think the functionality is probably OK, but I find the > interfaces somewhat poorly named. Consider, "PostgreSQL adds EACH > foreign keys" -- huh? I think they key word ELEMENT would be more > descriptive and precise, and it also leaves the door open to other kind > of non-atomic foreign key relationships outside of arrays. EACH has no > relationship with arrays. It might as well refer to each row. Good points. Your proposed naming works for me. > On the matter of the above chart, there has been a long back and forth > about whether the row or the element case should be the default. Both > cases are probably useful, but unfortunately you have now settled on > making maximum destruction the default. Additionally, we would now have > the case that sometimes, depending on some configuration elsewhere, an > ON DELETE CASCADE deletes more than what was actually involved in the > foreign key. What I'd suggest is to make both cases explicit. That is, > forbid ON DELETE CASCADE altogether and make people write ON DELETE > CASCADE ROW or ON DELETE CASCADE ELEMENT. In addition to making things > more explicit and safer, it would again leave the door open to other > kinds of relationships later. I'm ambivalent on this one. ON DELETE CASCADE truly does the same thing regardless of whether the FK incorporates an EACH column. The current syntax arises from that symmetry rather than a decision to dub its behavior as a default. That said, when a user wants CASCADE ELEMENT, your proposal would more-rapidly divert him from wrongly using CASCADE ROW. Thanks, nm
On Wed, Mar 28, 2012 at 04:25:06PM +0200, Marco Nenciarini wrote: > Il giorno lun, 19/03/2012 alle 18.41 +0100, Marco Nenciarini ha scritto: > > > > Attached is v5, which should address all the remaining issues. > > Please find attached v6 of the EACH Foreign Key patch. From v5 only > cosmetic changes to the documentation were made. This has bitrotted; please refresh. Also, please evaluate Peter's feedback: http://archives.postgresql.org/message-id/1333693277.32606.9.camel@vanquo.pezone.net Thanks, nm
Hi Noah, Il 10/06/12 22:53, Noah Misch ha scritto: > This has bitrotted; please refresh. > > Also, please evaluate Peter's feedback: > http://archives.postgresql.org/message-id/1333693277.32606.9.camel@vanquo.pezone.net Our goal is to work on this patch from the next commit fest. What we are about to do for this commit fest is to split the previous patch and send a small one just for the array_remove() and array_replace() functions. Then we will sit down and organise the development of the feature according to Peter's feedback. It is important indeed that we find a commonly accepted terminology and syntax for this feature. I hope this sounds like a reasonable plan. Thank you. Cheers, Gabriele -- Gabriele Bartolini - 2ndQuadrant Italia PostgreSQL Training, Services and Support gabriele.bartolini@2ndQuadrant.it |www.2ndQuadrant.it
On Wed, Jun 13, 2012 at 10:12:18PM +0200, Gabriele Bartolini wrote: > Our goal is to work on this patch from the next commit fest. > > What we are about to do for this commit fest is to split the previous > patch and send a small one just for the array_remove() and > array_replace() functions. > > Then we will sit down and organise the development of the feature > according to Peter's feedback. It is important indeed that we find a > commonly accepted terminology and syntax for this feature. > > I hope this sounds like a reasonable plan. Thank you. Sounds fine. The 2012-01 CF entry for this patch has been moved to the 2012-06 CF. Please mark that entry Returned with Feedback and create a new entry for the subset you're repackaging for this CommitFest. Thanks, nm
Il giorno mer, 13/06/2012 alle 18.07 -0400, Noah Misch ha scritto: > On Wed, Jun 13, 2012 at 10:12:18PM +0200, Gabriele Bartolini wrote: > > Our goal is to work on this patch from the next commit fest. > > > > What we are about to do for this commit fest is to split the previous > > patch and send a small one just for the array_remove() and > > array_replace() functions. > > > > Then we will sit down and organise the development of the feature > > according to Peter's feedback. It is important indeed that we find a > > commonly accepted terminology and syntax for this feature. > > > > I hope this sounds like a reasonable plan. Thank you. > > Sounds fine. The 2012-01 CF entry for this patch has been moved to the > 2012-06 CF. Please mark that entry Returned with Feedback and create a new > entry for the subset you're repackaging for this CommitFest. > > Thanks, > nm > Dear Noah, I have added a new patch to the commitfest (miscellaneous category) entitled "Support for array_remove and array_replace functions". I have also marked the "Foreign Key Array" patch as " Returned with Feedback" as per your request. We will start working again on this patch in the next commitfest as anticipated by Gabriele, hoping that the array functions will be committed by then. Thank you. Cheers, Marco -- Marco Nenciarini - 2ndQuadrant Italy PostgreSQL Training, Services and Support marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it
On 6 April 2012 07:21, Peter Eisentraut <peter_e@gmx.net> wrote: > On lör, 2012-03-24 at 10:01 +0000, Gianni Ciolli wrote: >> ON (DELETE | UPDATE) actions for EACH foreign keys >> ================================================== >> >> ------------------ ----------- ----------- >> | ON | ON | >> Action | DELETE | UPDATE | >> ------------------ ----------- ----------- >> CASCADE | Row | Forbidden | >> SET NULL | Row | Row | >> SET DEFAULT | Row | Row | >> EACH CASCADE | Element | Element | >> EACH SET NULL | Element | Element | >> EACH SET DEFAULT | Forbidden | Forbidden | >> NO ACTION | - | - | >> RESTRICT | - | - | >> ------------------ --------- ------------- >> > I took another fresh look at this feature after not having looked for a > month or two. I think the functionality is probably OK, but I find the > interfaces somewhat poorly named. Consider, "PostgreSQL adds EACH > foreign keys" -- huh? I think they key word ELEMENT would be more > descriptive and precise, and it also leaves the door open to other kind > of non-atomic foreign key relationships outside of arrays. EACH has no > relationship with arrays. It might as well refer to each row. > > On the matter of the above chart, there has been a long back and forth > about whether the row or the element case should be the default. Both > cases are probably useful, but unfortunately you have now settled on > making maximum destruction the default. Additionally, we would now have > the case that sometimes, depending on some configuration elsewhere, an > ON DELETE CASCADE deletes more than what was actually involved in the > foreign key. What I'd suggest is to make both cases explicit. That is, > forbid ON DELETE CASCADE altogether and make people write ON DELETE > CASCADE ROW or ON DELETE CASCADE ELEMENT. In addition to making things > more explicit and safer, it would again leave the door open to other > kinds of relationships later. +1 -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Fri, 2011-11-04 at 13:48 +0100, Gabriele Bartolini wrote: > This patch adds basic support of arrays in foreign keys, by allowing to > define a referencing column as an array of elements having the same type > as the referenced column in the referenced table. > Every NOT NULL element in the referencing array is matched against the > referenced table. I'm trying to find commonalities between this feature and my future RANGE FOREIGN KEY feature (not past the hand-waving stage yet). The first thing I observe is that my idea for range foreign keys is almost the opposite of your idea for array FKs. I was imagining a range FK to mean that the referencing side is contained by the referenced side. This is the common definition in the temporal world, because the valid period for the referencing row must be within the valid period for the row it references (same for transaction time). The referenced side must be a range, and the referencing side must be either a range of the same type or the subtype of the range. Other similar definitions exist by replacing "contained by" with some other operator, though the use cases for those aren't as clear to me. This definition works for arrays and possibly many other types (geometry?) as well. It looks like this is orthogonal from your work, but it does seem like it has potential for confusion in the future. Thoughts? Regards,Jeff Davis
IMO, both approaches make sense... From temporal point no doubt, referencing should be contained by referenced table.... From other side could be useful if in master table are elements with simple data type, but for some set of elements there could be common properties in another table.. What today is doable on the way to in another table have the same data type and repeat the same properties for each element...That would be possible with Range data type, though it does not mean always data are in range so array is probably better option... However I am not sure from maintaining point of view, i,e when an element should be removed from that common properties set - but it is different topic :) Kind Regards, Misa Sent from my Windows Phone -----Original Message----- From: Jeff Davis Sent: 17/06/2012 08:55 To: Gabriele Bartolini Cc: PostgreSQL-development; Marco Nenciarini Subject: Re: [HACKERS] [PATCH] Support for foreign keys with arrays
On 17 June 2012 19:16, Misa Simic <misa.simic@gmail.com> wrote: > IMO, both approaches make sense... Agreed. It's also a good reason to do as Peter suggests and come up with a better description than just EACH. Do we need something like Exclusion FKs? i.e. the FK partner of Exclusion Constraints? -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Simon Riggs wrote: Misa Simic wrote: >> IMO, both approaches make sense... > > Agreed. Can someone provide a practical example of a "foreign key with array" use case? The only situations I'm able to think of right now are the same cases where you would now use a table with primary keys of two tables to provide a many-to-many linkage. Does this proposed feature handle other cases or handle this type of case better? The "referencing value is contained by the referenced value" has many obvious uses. For example, in our courts data we have a statute table which effectively has a statute cite and effective date range for the primary key, and we have a charge table with a statute cite and an offense date used to match it to a statute row. -Kevin
2012/6/17 Simon Riggs <simon@2ndquadrant.com>
Do we need something like Exclusion FKs? i.e. the FK partner of
Exclusion Constraints?
+1
Definatelly it would be something usefull... Today's workaround to achieve that with additional table, and additional column in Key is a bit awkward...
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
2012/6/17 Kevin Grittner <Kevin.Grittner@wicourts.gov>
Can someone provide a practical example of a "foreign key with array"
use case? The only situations I'm able to think of right now are the
same cases where you would now use a table with primary keys of two
tables to provide a many-to-many linkage. Does this proposed feature
handle other cases or handle this type of case better?
I can't imagine either other usablity... Just many-to-one linkage... or to have many-to-many link with less rows in middle table... What is better - I think should be measured...
On Sun, Jun 17, 2012 at 09:58:17AM -0500, Kevin Grittner wrote: > Simon Riggs wrote: > Misa Simic wrote: > > >> IMO, both approaches make sense... > > > > Agreed. > > Can someone provide a practical example of a "foreign key with array" > use case? The only situations I'm able to think of right now are the > same cases where you would now use a table with primary keys of two > tables to provide a many-to-many linkage. Does this proposed feature > handle other cases or handle this type of case better? The way I think about "array foreign keys" is that they represent the "aggregated" form of a classical foreign key. In the aggregated form, each row in the referencing side represents a group of rows in the non-aggregated form. One advantage is that constraints on each group of rows as a whole are now possible, because they become constraints on a single row in the aggregated form. Example. If you have a table of points, then you can have a table of polygons where each polygon contains an array of points. The non-aggregated model would instead require an additional point_polygon table which references both the point and the polygon table, because the point <-> polygon relationship is many-to-many. In the aggregated model, you can easily specify a CHECK constraint that requires each polygon to have at least three points, while the corresponding condition cannot be specified in the non-aggregated model. Cheers, Dr. Gianni Ciolli - 2ndQuadrant Italia PostgreSQL Training, Services and Support gianni.ciolli@2ndquadrant.it | www.2ndquadrant.it
On Sun, 2012-06-17 at 21:10 +0800, Simon Riggs wrote: > Do we need something like Exclusion FKs? i.e. the FK partner of > Exclusion Constraints? Yes, "Inclusion Constraints". I've known we need something like that since I did Exclusion Constraints, but I haven't gotten further than that. Regards,Jeff Davis
IMO, both approaches make sense... From temporal point no doubt, referencing should be contained by referenced table.... From other side could be useful if in master table are elements with simple data type, but for some set of elements there could be common properties in another table.. What today is doable on the way to in another table have the same data type and repeat the same properties for each element...That would be possible with Range data type, though it does not mean always data are in range so array is probably better option... However I am not sure from maintaining point of view, i,e when an element should be removed from that common properties set - but it is different topic :) Kind Regards, Misa Sent from my Windows Phone -----Original Message----- From: Jeff Davis Sent: 17/06/2012 08:55 To: Gabriele Bartolini Cc: PostgreSQL-development; Marco Nenciarini Subject: Re: [HACKERS] [PATCH] Support for foreign keys with arrays
Misa Simic <misa.simic@gmail.com> wrote: > 2012/6/17 Kevin Grittner <Kevin.Grittner@wicourts.gov> >> Can someone provide a practical example of a "foreign key with >> array" use case? The only situations I'm able to think of right >> now are the same cases where you would now use a table with >> primary keys of two tables to provide a many-to-many linkage. >> Does this proposed feature handle other cases or handle this type >> of case better? > > I can't imagine either other usablity... Just many-to-one > linkage... or to have many-to-many link with less rows in middle > table... The many-to-one case seems like it is better handled in the other direction -- with the referenced table holding the set of valid keys and the referencing table holding the single key. (I believe the general case of this is what Jeff called an "inclusion constraint" -- a feature he wants to add at some point.) I can't think of a use case where that would not be better for this type of relationship than putting the array on the referencing side. The many-to-many relationship does seem like a potentially useful feature, at least in some cases. For example, a message board where a limited number of tags can be attached to each message -- the message could contain an array of tag IDs. Clearly this could be done as a table holding message_id and tag_id, but it seems potentially more convenient from a programming perspective to have an array of tag_id values in the message table. Logically, you are associating each of these with the primary key of the row it is in. Are there other obvious use-cases? If nobody can put one forward, this seems like a good "reality test" for proposed behaviors in any corner cases where "correct" behavior isn't obvious -- what would you want to do to the data if it were in the separate table with just the primary keys to link the two tables? > What is better - I think should be measured... It would make an interesting test to compare performance of a suitably sized data set and reasonable workload for both techniques. Of course, that's a non-trivial amount of work. It seems like people may be able to justify this just on ease-of-use, versus claiming that it's an optimization. -Kevin
2012/6/18 Kevin Grittner <Kevin.Grittner@wicourts.gov>
direction -- with the referenced table holding the set of valid keysThe many-to-one case seems like it is better handled in the other
and the referencing table holding the single key. (I believe the
general case of this is what Jeff called an "inclusion constraint"
-- a feature he wants to add at some point.) I can't think of a use
case where that would not be better for this type of relationship
than putting the array on the referencing side.
Hi Kevin,
Well, from my point of view "many-to-one" or "one-to-many" is more related from which point we are looking to the thing... But let me better explain what I thought...
Example 1)
If we have one table with master data TableA(ID, other properties...) and TableB(tableAIDFK, propA, propB, propC) -PK of TableB is irrelavant in this point... and let say a lot of TableA tuples could have the same TableB properties... So we can have how many common TableA tuples, that many tuples in TableB with the same values in PropA, PropB, and PropC with FK the same type as in Table A, or to have 1 tuple in TableB with Array type as FK field.... So it (1 tuple in TableB) can point many tuples in TableC... And in the same time simple element can exist in TableA, but could or doesn't have to exist in TableB...
What test would show is there any gain in this approach - I don't know... but think it should - especially if propA,PropB, and C should be updated for all of them...
Example 2)
From other side, what Jeff propose, and what is also usefull, but different thing is... to have the main data in TableA, but key field is an range datatype... what later each element what belong to the range, could have related tuple in TableB (Also, as the same range datatype - but smaller... contained by Master one... or simple datatype subtype of the range) - which is other way around... Opposite from exmaple 1 - but differnet from functional point of view... Depending what is the Master... Also, for example 1 - data in FK do not need to be in range.. so basicaly ID [1, 2 ,4 ,7] could have 1 tuple with its properties, and [3,5,6] in second tuple with different properties...
I am not sure Example 2) Jeff called "Inclusion Constraint" - Jeff can explain it better :)
Based on
Simon Riggs wrote:
>> Do we need something like Exclusion FKs? i.e. the FK partner of
>> Exclusion Constraints?
>Yes, "Inclusion Constraints". I've known we need something like that
>since I did Exclusion Constraints, but I haven't gotten further than
>that.
>Regards,
> Jeff Davis
I have understood it as:
TableA(ID, properties...)
TableB(ID, properties...)
Now if we define FK on TableB to TableA... It means that row inserted in TableB, must have already row with the same ID value in TableA...
But what would be usefull, to define Exclude FK to table A, to we prevent insert new row in Table B with ID value what already exist in TableA...
btw, if anyone is happy to point me in right direction, and there is common feeling it is usefull feature, I am happy to code it... Actually that is something what I will code anyway for "in-house" solution - but would be good to do it under Postgres standards...
Kind Regards,
Misa
Hi guys,<br /><br /> it is time to give another go to this patch. I would like to thank everyone for suggestions andideas expressed through this list.<br /><br /> We are happy that "Part 0" of the patch has been committed (array_remove()and array_replace() functions), which will be useful in "Part 2" (too early now to talk about it). Let's notrush though and focus on "Part 1" of the patch. :)<br /><blockquote cite="mid:CA+U5nMKvmv_c35D5W=O4boW1AAZFg+2qHKLbvXV3nJVdYgff8w@mail.gmail.com"type="cite"></blockquote> First, I wouldlike to find a unique and general term for this feature. We started with "Foreign keys with arrays" and ended up with"EACH foreign keys". Following Peter's suggestion, we will use the "ELEMENT" keyword (so that maybe in the future wecan extend the usage). Our proposals are:<br /><br /> * Array Foreign Key<br /> * Foreign Key Arrays<br /> * ELEMENT ForeignKeys<br /> * ...<br /><br /> Which one is your favourite?<br /><br /> Secondly, we have decided to split thepatch we proposed back in March in two smaller patches. The most important goal of "Part 1" is to find a generally acceptedsyntax. By removing ACTION handling from "Part 1" (see limitations below), we believe that the community will beable to contribute more to driving future directions and requirements. Based on Peter's comments, we would like to proposethe use of the "ELEMENT" keyword, rather than the "EACH" keyword proposed in March. You can find three examples atthe bottom of this email.<br /><br /> Finally, "Part 1" of this patch will have these limitations:<br /><br /> * Onlyone <code>ELEMENT</code> column allowed in a multi-column key (same as the proposed patch in March)<br /> * Supportedactions<code>:<br /> * NO ACTION</code><code><br /> * RESTRICT</code><br /><br /> Cheers,<br /> Gabriele<br /><br/><br /> Example 1: inline usage<br /><br /><tt>CREATE TABLE drivers (<br /> driver_id integer PRIMARY KEY,<br /> first_name text,<br /> last_name text,<br /> ...<br /> );<br /><br /> CREATE TABLE races (<br /> race_idinteger PRIMARY KEY,<br /> title text,<br /> race_day DATE,<br /> ...<br /> final_positions integer[]ELEMENT REFERENCES drivers<br /> );</tt><br /><br /><br /><br /> Example 2: with FOREIGN KEY<br /><br /><tt>CREATETABLE races (<br /> race_id integer PRIMARY KEY,<br /> title text,<br /> race_day DATE,<br /> ...<br /> final_positions integer[],<br /> FOREIGN KEY (ELEMENT final_positions) REFERENCES drivers<br /> );<br/></tt><br /><br /><br /> Example 3: with ALTER TABLE<br /><br /><tt>CREATE TABLE races (<br /> race_id integerPRIMARY KEY,<br /> title text,<br /> race_day DATE,<br /> ...<br /> final_positions integer[]<br />);<br /><br /> ALTER TABLE races ADD FOREIGN KEY (ELEMENT final_positions) REFERENCES drivers;<br /></tt><br /><pre class="moz-signature"cols="72">-- Gabriele Bartolini - 2ndQuadrant ItaliaPostgreSQL Training, Services and Support<a class="moz-txt-link-abbreviated"href="mailto:gabriele.bartolini@2ndQuadrant.it">gabriele.bartolini@2ndQuadrant.it</a> | <aclass="moz-txt-link-abbreviated" href="http://www.2ndQuadrant.it">www.2ndQuadrant.it</a> </pre>
On 30 July 2012 16:12, Gabriele Bartolini <gabriele.bartolini@2ndquadrant.it> wrote: > * Array Foreign Key > * Foreign Key Arrays > * ELEMENT Foreign Keys > * ... > > Which one is your favourite? Array Element Foreign Key -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Jul 30, 2012 at 11:12 AM, Gabriele Bartolini <gabriele.bartolini@2ndquadrant.it> wrote: > Hi guys, > > it is time to give another go to this patch. I would like to thank > everyone for suggestions and ideas expressed through this list. > > We are happy that "Part 0" of the patch has been committed > (array_remove() and array_replace() functions), which will be useful in > "Part 2" (too early now to talk about it). Let's not rush though and focus > on "Part 1" of the patch. :) > > First, I would like to find a unique and general term for this feature. > We started with "Foreign keys with arrays" and ended up with "EACH foreign > keys". Following Peter's suggestion, we will use the "ELEMENT" keyword (so > that maybe in the future we can extend the usage). Our proposals are: > > * Array Foreign Key > * Foreign Key Arrays > * ELEMENT Foreign Keys > * ... > > Which one is your favourite? I think having the word "element" in there makes it a lot clearer. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Excerpts from Simon Riggs's message of lun jul 30 11:21:46 -0400 2012: > On 30 July 2012 16:12, Gabriele Bartolini > <gabriele.bartolini@2ndquadrant.it> wrote: > > > * Array Foreign Key > > * Foreign Key Arrays > > * ELEMENT Foreign Keys > > * ... > > > > Which one is your favourite? > > Array Element Foreign Key I was going to say the same, except I had ELEMENT as a capitalized word in my mind (and in the docs it'd be within <literal>). -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Il 30/07/12 19:11, Alvaro Herrera ha scritto: > I was going to say the same, except I had ELEMENT as a capitalized > word in my mind (and in the docs it'd be within <literal>). So: "Array ELEMENT Foreign Key" +1 for me And it can be also interchanged with "Array element Foreign Key". -- Gabriele Bartolini - 2ndQuadrant Italia PostgreSQL Training, Services and Support gabriele.bartolini@2ndQuadrant.it |www.2ndQuadrant.it
Il 30/07/12 19:16, Gabriele Bartolini ha scritto: > And it can be also interchanged with "Array element Foreign Key". > As promised, we have sent a patch for the "Array ELEMENT foreign key" support. We are discontinuing this thread here and continue discussing the former "Foreign Keys with Arrays"/"EACH Foreign Key" feature support from here: http://archives.postgresql.org/pgsql-hackers/2012-08/msg00011.php Thank you, Gabriele -- Gabriele Bartolini - 2ndQuadrant Italia PostgreSQL Training, Services and Support gabriele.bartolini@2ndQuadrant.it |www.2ndQuadrant.it