Re: Referential integrity (foreign keys) across multiple tables - Mailing list pgsql-sql
From | Richard Huxton |
---|---|
Subject | Re: Referential integrity (foreign keys) across multiple tables |
Date | |
Msg-id | 44C52065.8070209@archonet.com Whole thread Raw |
In response to | Referential integrity (foreign keys) across multiple tables (Richard Jones <rich@annexia.org>) |
List | pgsql-sql |
Richard Jones wrote: > Simplified schema: > > create table hosts ( > id serial primary key, > hostname text not null > ); > > create table pages ( > id serial primary key, > hostid int not null references hosts (id), > url text not null, > unique (hostid, url) > ); > > create table page_contents ( > pageid int not null references pages (id), > section text not null > ); > > (There are many hosts, many pages per host, and many page_contents > sections per page). > > Now I want to add a column to page_contents, say called link_name, > which is going to reference the pages.url column for the particular > host that this page belongs to. Ah! It's only from reading later messages that I realise you're trying to ensure that a link in page_contents can only reference pages on the same host as itself. > Something like: > > alter table page_contents add link_name text; > alter table page_contents > add constraint foo foreign key (p.hostid, link_name) > references pages (hostid, url) > where p.id = pageid; > > Obviously that second statement isn't going to compile. > > I don't want to add the hostid column to page_contents table because I > have a lot of old code accessing the database which would be hard to > change (the old code would no longer be able to insert page_contents > rows). I'm sure you know, but for the benefit of the list the problem here is that the surrogate primary-key on "pages" has concealed valuable information (i.e. the hostid) from the "page_contents" table. Unfortunate that you can't update the application [* see rant below] > Is this possible somehow? Perhaps by adding a second table? Do I > have to use triggers, and if so is that as robust as referential > integrity? Write triggers to enforce the host dependency. Just make a table of all the columns involved on all three tables and decide what (if anything) should happen when a value is inserted/updated/deleted. Write your triggers then write a short test script. Oh, test it with concurrent transactions too - just to make sure it's doing what you'd expect. The only reason hand-built triggers would be less reliable than built-in foreign-keys is because they'll have had less testing. The other alternative would be to add the hostid column to page_contents and use a BEFORE trigger to set it based on the pageid the application inserts. Then you can have a straightforward foreign-key. If your application can't cope with having a column added to the table you might have to mask this with a view. [*minor rant follows: Why is it that when changes to the database schema break applications because the column-names are hard-wired, it is the RDBMS that is at fault? I'm prepared to bet good money (up to the value of say, a whole pound) that the primary key is just being used anonymously in the application and that if the database interface had been able to refer to it as "column(s) referencing pages primary-key" then Richard wouldn't have a problem to post about. And it's relational databases that aren't "agile" enough for the modern world, not most of your programming languages. ] -- Richard Huxton Archonet Ltd