Home > mailing lists

Re: surrogate key or not? - Mailing list pgsql-sql

From	Iain
Subject	Re: surrogate key or not?
Date	July 21, 2004 23:43:31
Msg-id	00ad01c46f95$94ebad20$7201a8c0@mst1x5r347kymb Whole thread
In response to	surrogate key or not? (Markus Bertheau <twanger@bluetwanger.de>)
List	pgsql-sql

Tree view

Hi Josh,

Considering that I generally agree with your comments (in this and your
later posts)
I'd say I didn't make myself clear in my brief comment.

Of the considerations: performance, convenience and business logic, I
personally rate performance as the lowest priority. The convenience part has
been well covered in other posts and is second most in my opinion..

By business logic I was thinking of a situation we had on an old db where
employees were idenitfied by codes. These codes were transmitted all over
the database (as you can imagine) which meant that the codes couldn't be
recycled as employees came and left. The answer was to introduce an integer
based auto-generated key. That way the data associated with an ex-employee
is associated with that employee, not with the code. It may be that this
really a convenience issue as opposed to business logic, but in my thinking
the code was data, not a
primary key - it is not even an alternate key (if ex-employees have their
code
nulled it can be made "unique" though). Perhaps the term surrogate key
doesn't apply in this case, I have to admit that I'm not 100% on terminalogy
here.

In the end, accurately representing the business logic is most importent,
and use of a contrived sequential key shouldnt preclude the use of unique
constraints where needed so the problem you described of duplicate events
should never have happened anyway.

This is a great topic though, I find it really interesting.

Regards
Iain




----- Original Message ----- 
From: "Josh Berkus" <josh@agliodbs.com>
To: "Iain" <iain@mst.co.jp>; "Markus Bertheau" <twanger@bluetwanger.de>;
<pgsql-sql@postgresql.org>
Sent: Wednesday, July 21, 2004 4:00 PM
Subject: Re: [SQL] surrogate key or not?


> Markus, Iain,
>
> > Thus I see it more as an issue of business logic than performance. There
are
> > of course many other considerations with relational theory and stuff
like
> > that which you could debate endlessly. I expect that googling on
"surrogate
> > keys" would yeild interesting results.
>
> Frankly, I couldn't disagree more.   This is why it was so problematic for
the
> SQL committee to enshrine "primary keys" and sequences in the standard; it
> mis-educates database designers into believing that surrogate keys are
> somehow part of the data model.  They are most decidely NOT.
>
> Given:  Surrogate keys, by definition, represent no real data;
> Given:  Only items which represent real data have any place in
> a data model
> Conclusion:  Surrogate keys have no place in the data model
>
> There are, in fact, three very good reasons to use surrogate keys, all of
> which are strictly due to limitations of technology; that is,
implementation
> and performance issues, NOT business logic.  They are:
>
> 1) Convenience:  It's very annoying to have to refer to a 4-column foriegn
key
> whenever you do a join in queries or want to delete a record, as well as
> tracking a 4-element composite in your client software.
>
> 2) Performance:  INT and BIGINT data types are among the most compact and
> efficient stored in most RDBMSs.   So using anything else as a key would
> likely result in a loss of performance on large-table joins.
>
> 3) Mutability:  Most RDBMSs are very inefficient about CASCADE deletes and
> updates.   Some RDBMSs do not support CASCADE, forcing the client software
to
> fix all the dependant rows.  This means that DBAs are very reluctant to
use
> columns which change frequently as join keys.
>
> All three of these implementation issues are, at least in theory,
> surmountable.   For example, Sybase overcame problems (1) and (3) by
creating
> an automated, system-controlled hash key based on the table's real key.
This
> was a solution endorsed by E.F. Codd in the mid-90's when he came to
regret
> his promotion of the "Primary Key" idea in the SQL standard.
>
> Now, you're probably wondering "why does this guy regard surrogate keys as
a
> problem?"   I'll tell you:  I absolutely cannot count the number of "bad
> databases" I've encountered which contained tables with a surrogate key,
and
> NO REAL KEY of any kind.   This makes data normalization impossible, and
> cleanup of the database becomes a labor-intensive process requiring
> hand-examination of each row.
>
> -- 
> -Josh Berkus
>  Aglio Database Solutions
>  San Francisco

pgsql-sql by date:

From: Josh Berkus
Date: 21 July 2004, 20:19:55
Subject: Re: ? on announcement of 7.5

From: azah azah
Date: 21 July 2004, 23:44:04
Subject: Re: Please help me.. problem in to_char

Re: surrogate key or not? - Mailing list pgsql-sql

Previous

Next