Re: Allow foreign keys to reference a superset of unique columns - Mailing list pgsql-hackers

From Kaiting Chen
Subject Re: Allow foreign keys to reference a superset of unique columns
Date
Msg-id CA+CLzG-U4Rf4mPr64hCsPMRA4bMx8PVVCCbT5+ES9M1rCDWgHw@mail.gmail.com
Whole thread Raw
In response to Re: Allow foreign keys to reference a superset of unique columns  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Allow foreign keys to reference a superset of unique columns
List pgsql-hackers
> For one example of where the semantics get fuzzy, it's not
> very clear how the extra-baggage columns ought to participate in
> CASCADE updates.  Currently, if we have
>    CREATE TABLE foo (a integer PRIMARY KEY, b integer);
> then an update that changes only foo.b doesn't need to update
> referencing tables, and I think we even have optimizations that
> assume that if no unique-key columns are touched then RI checks
> need not be made.  But if you did
>    CREATE TABLE bar (x integer, y integer,
>                      FOREIGN KEY (x, y) REFERENCES foo(a, b) ON UPDATE CASCADE);
> then perhaps you expect bar.y to be updated ... or maybe you don't?

I'd expect bar.y to be updated. In my mind, the FOREIGN KEY constraint should
behave the same, regardless of whether the underlying unique index on the
referenced side is an equivalent set to, or a strict subset of, the referenced
columns.

> Another example is that I think the idea is only well-defined when
> the subset column(s) are a primary key, or at least all marked NOT NULL.
> Otherwise they're not as unique as you're claiming.  But then the FK
> constraint really has to be dependent on a PK constraint not just an
> index definition, since indexes in themselves don't enforce not-nullness.
> That gets back to Peter's complaint that referring to an index isn't
> good enough.

I think that uniqueness should be guaranteed enough even if the subset columns
are nullable:

  CREATE TABLE foo (a integer UNIQUE, b integer);

  CREATE TABLE bar (
    x integer,
    y integer,
    FOREIGN KEY (x, y) REFERENCES foo(a, b)
  );

The unique index underlying foo.a guarantees that (foo.a, foo.b) is unique if
foo.a isn't NULL. That is, there can be multiple rows (NULL, 1) in foo. However,
such a row can't be the target of the foreign key constraint anyway. So, I'm
fairly certain that, where it matters, a unique index on a nullable subset of
the referenced columns guarantees a distinct referenced row.

> It's also unclear to me how this ought to interact with the
> information_schema views concerning foreign keys.  We generally
> feel that we don't want to present any non-SQL-compatible data
> in information_schema, for fear that it will confuse applications
> that expect to see SQL-spec behavior there.  So do we leave such
> FKs out of the views altogether, or show only the columns involving
> the associated unique constraint?  Neither answer seems pleasant.

Here's the information_schema output for this example:

  CREATE TABLE foo (a integer, b integer);

  CREATE UNIQUE INDEX ON foo (a, b);

  CREATE TABLE bar (
    x integer,
    y integer,
    FOREIGN KEY (x, y) REFERENCES foo(a, b)
  );

  # SELECT * FROM information_schema.referential_constraints
    WHERE constraint_name = 'bar_x_y_fkey';

  -[ RECORD 1 ]-------------+----------------------------------------------
  constraint_catalog        | kaitingc
  constraint_schema         | public
  constraint_name           | bar_x_y_fkey
  unique_constraint_catalog |
  unique_constraint_schema  |
  unique_constraint_name    |
  match_option              | NONE
  update_rule               | NO ACTION
  delete_rule               | NO ACTION

  # SELECT * FROM information_schema.key_column_usage
    WHERE constraint_name = 'bar_x_y_fkey';

  -[ RECORD 173 ]---------------+----------------------------------------------
  constraint_catalog            | kaitingc
  constraint_schema             | public
  constraint_name               | bar_x_y_fkey
  table_catalog                 | kaitingc
  table_schema                  | public
  table_name                    | bar
  column_name                   | x
  ordinal_position              | 1
  position_in_unique_constraint | 1
  -[ RECORD 174 ]---------------+----------------------------------------------
  constraint_catalog            | kaitingc
  constraint_schema             | public
  constraint_name               | bar_x_y_fkey
  table_catalog                 | kaitingc
  table_schema                  | public
  table_name                    | bar
  column_name                   | y
  ordinal_position              | 2
  position_in_unique_constraint | 2

It appears that currently in PostgreSQL, the unique_constraint_catalog, schema,
and name are NULL in referential_constraints when a unique index (without an
associated unique constraint) underlies the referenced columns. The behaviour
I'm proposing would have the same behavior vis-a-vis referential_constraints.

As for key_column_usage, I propose that position_in_unique_constraint be NULL if
the referenced column isn't indexed.

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: longfin and tamandua aren't too happy but I'm not sure why
Next
From: Jacob Champion
Date:
Subject: Re: [PoC] Federated Authn/z with OAUTHBEARER