Thread: Allow composite foreign keys to reference a superset of unique constraint columns?
Allow composite foreign keys to reference a superset of unique constraint columns?
From
Paul Martinez
Date:
Hey, hackers! While working with some foreign keys, I noticed some mildly unexpected behavior. The columns referenced by a unique constraint must naturally have a unique constraint on them: CREATE TABLE foo (a integer); CREATE TABLE bar (x integer REFERENCES foo(a)); > ERROR: there is no unique constraint matching given keys for referenced table "foo" But Postgres doesn't allow a foreign key to reference a set of columns without a unique constraint, even if there a unique constraint on a subset of those columns (i.e., it doesn't allow referencing a superset of a unique constraint). CREATE TABLE foo (a integer PRIMARY KEY, b integer); CREATE TABLE bar (x integer, y integer, FOREIGN KEY (x, y) REFERENCES foo(a, b)); > ERROR: there is no unique constraint matching given keys for referenced table "foo" It seems to me like there would be nothing wrong in this case to allow this foreign key constraint to exist. Because there is a unique constraint on foo(a), foo(a, b) will also be unique. And it doesn't seem like it would be too complex to implement. Neither MATCH SIMPLE nor MATCH FULL constraints would have any issues with this. MATCH PARTIAL may, but, alas, it's not implemented. (I've had a few ideas about foreign keys, and MATCH PARTIAL seems to always come up, and I still don't understand what its use case is.) A real-world use case that uses denormalization could run into this. Imagine a naive music database that has a list of artists, albums, and songs, where each album is by one artist and each song is on one album, but we still store a reference to the artist on each song: CREATE TABLE artists (id serial PRIMARY KEY, name text); CREATE TABLE albums (id serial PRIMARY KEY, artist_id REFERENCES artists(id) name text); CREATE TABLE songs ( id serial PRIMARY KEY, artist_id REFERENCES artists(id) ON DELETE CASCADE, album_id REFERENCES albums(id) ON DELETE CASCADE, name text, ); To ensure that artist deletions are fast, we need to create an index on songs(artist_id) and songs(album_id). But, suppose we wanted to save on index space, and we never needed to query JUST by album_id. We could then do: CREATE TABLE songs ( id serial PRIMARY KEY, artist_id REFERENCES artists(id) ON DELETE CASCADE, album_id integer, name text, FOREIGN KEY (artist_id, album_id) REFERENCES albums(artist_id, id) ON DELETE CASCADE ); And then we could have a single index on songs(artist_id, album_id) that would serve both ON CASCADE DELETE triggers: -- Delete artist DELETE FROM songs WHERE artist_id = <artist.id>; -- Delete artist DELETE FROM songs WHERE artist_id = <album.artist_id> AND album_id = <album.id>; But Postgres wouldn't let us create the composite foreign key described. It seems like a somewhat useful feature. If people think it would be useful to implement, I might take a stab at it when I have time. - Paul
Re: Allow composite foreign keys to reference a superset of unique constraint columns?
From
"David G. Johnston"
Date:
On Mon, Aug 16, 2021 at 4:37 PM Paul Martinez <hellopfm@gmail.com> wrote:
It seems like a somewhat useful feature. If people think it would be useful to
implement, I might take a stab at it when I have time.
This doesn't seem useful enough for us to be the only implementation to go above and beyond the SQL Standard's specification for the references feature (I assume that is what this proposal suggests).
This example does a good job of explaining but its assumptions aren't that impactful and thus isn't that good at inducing desirability.
David J.
Re: Allow composite foreign keys to reference a superset of unique constraint columns?
From
Jack Christensen
Date:
On Mon, Aug 16, 2021 at 7:01 PM David G. Johnston <david.g.johnston@gmail.com> wrote:
On Mon, Aug 16, 2021 at 4:37 PM Paul Martinez <hellopfm@gmail.com> wrote:
It seems like a somewhat useful feature. If people think it would be useful to
implement, I might take a stab at it when I have time.This doesn't seem useful enough for us to be the only implementation to go above and beyond the SQL Standard's specification for the references feature (I assume that is what this proposal suggests).This example does a good job of explaining but its assumptions aren't that impactful and thus isn't that good at inducing desirability.
I have no opinion on the broader concerns about this proposed feature, but speaking simply as a user I have wanted this on multiple occasions. In my case, it is usually because of the need to maintain consistency in a diamond table relationship. For example:
create table tenants (
id serial primary key
);
create table users (
id serial primary key,
tenant_id int references tenants
);
create table user_groups (
id serial primary key,
tenant_id int references tenants
);
create table user_group_memberships (
tenant_id int,
user_id int,
user_group_id,
primary key (user_id, user_group_id),
foreign key (user_id, tenant_id) references users (id, tenant_id),
foreign key (user_group_id, tenant_id) references user_groups (id, tenant_id)
);
The only way to ensure a user can only be a member of a group in the same tenant is to user_group_memberships.tenant_id be part of the foreign key. And that will only work with a unique key on id and tenant_id in both users and user_groups. It's a bit inelegant to create multiple extra indexes to ensure consistency when existing indexes are enough to ensure uniqueness.
Jack
Re: Allow composite foreign keys to reference a superset of unique constraint columns?
From
Paul Martinez
Date:
On Tue, Aug 17, 2021 at 8:41 AM Jack Christensen <jack@jncsoftware.com> wrote: > > The only way to ensure a user can only be a member of a group in the same > tenant is to user_group_memberships.tenant_id be part of the foreign key. And > that will only work with a unique key on id and tenant_id in both users and > user_groups. It's a bit inelegant to create multiple extra indexes to ensure > consistency when existing indexes are enough to ensure uniqueness. > > Jack You could accomplish this by using composite primary keys on the users and user_groups tables: CREATE TABLE users ( id serial, tenant_id int REFERENCES tenants(id), PRIMARY KEY (tenant_id, id) ); This approach works pretty well for multi-tenant databases, because then your indexes all start with tenant_id, which should help with performance, and, in theory, would make your database easier to shard. But then it requires including a tenant_id in *every* query (and subquery!), which may be difficult to enforce in a codebase. One downside of the composite primary/foreign key approach is that ON DELETE SET NULL foreign keys no longer work properly because they try to set both columns to NULL, the true foreign key id, AND the shared tenant_id that is part of the referencing table's primary key. I have a patch [1] out to add new functionality to solve this problem though. - Paul [1]: https://www.postgresql.org/message-id/flat/CACqFVBZQyMYJV%3DnjbSMxf%2BrbDHpx%3DW%3DB7AEaMKn8dWn9OZJY7w%40mail.gmail.com
Re: Allow composite foreign keys to reference a superset of unique constraint columns?
From
Laurenz Albe
Date:
On Tue, 2021-08-17 at 10:45 -0700, Paul Martinez wrote: > On Tue, Aug 17, 2021 at 8:41 AM Jack Christensen <jack@jncsoftware.com> wrote: > > The only way to ensure a user can only be a member of a group in the same > > tenant is to user_group_memberships.tenant_id be part of the foreign key. And > > that will only work with a unique key on id and tenant_id in both users and > > user_groups. It's a bit inelegant to create multiple extra indexes to ensure > > consistency when existing indexes are enough to ensure uniqueness. > > You could accomplish this by using composite primary keys on the users and > user_groups tables: > > CREATE TABLE users ( > id serial, > tenant_id int REFERENCES tenants(id), > PRIMARY KEY (tenant_id, id) > ); That is not a proper solution, because it does not guarantee uniqueness of the "id" column, which is typically what you want. So I think Jack's example illustrates the benefit of this proposal well. On the other hand, the SQL standard requires that a foreign key references a unique constraint, see chapter 11.8 <referential constraint definition>, Syntax Rules 3) a): "If the <referenced table and columns> specifies a <reference column list>, then there shall be a one-to-one correspondence between the set of <column name>s contained in that <reference column list> and the set of <column name>s contained in the <unique column list> of a unique constraint of the referenced table such that corresponding <column name>s are equivalent." So while I personally agree that the proposed feature is useful, I am not sure if it is useful enough to break the standard in a way that may be incompatible with future extensions of the standard. Yours, Laurenz Albe