Re: Normal vs Surrogate Primary Keys... - Mailing list pgsql-general

From rlee0001
Subject Re: Normal vs Surrogate Primary Keys...
Date
Msg-id 1159824571.772571.259860@e3g2000cwe.googlegroups.com
Whole thread Raw
In response to Re: Normal vs Surrogate Primary Keys...  (Martijn van Oosterhout <kleptog@svana.org>)
List pgsql-general
Martijn van Oosterhout wrote:
> On Sun, Oct 01, 2006 at 07:48:14PM -0700, rlee0001 wrote:
> > <snip> For example, if I key "employee" by Last Name, First Name, Date
> > of Hire and Department, I would need to store copies of all this data
> > in any entity that relates to an employee (e.g. payroll, benefits and
> > so on). In addition, if any of these fields change in value, that
> > update would need to cascade to any related entities, which might be
> > perceived as a performance issue if there are many related records.
>
> Err, those fields don't make a natural key since they have no guarentee
> of uniqueness. You've simply decided that the chance of collision is
> low enough that you don't care, but for me that's not really good
> enough for use as a key.

Oh look mommy, a usenet troll. Sweet. I'm bored, so...

Those fields were a contrived example of a key that might be perceived
to be too large to use as a key for performance reasons. Are you
suggesting that because they are not guaranteed to be unique that no
perforance problem would exist in using such large and complex fields
as keys? Or do you acknowledge that my example holds regardless?

The fact of the matter is, non-abstract (natural) entities have only
one perfect candidate key, which is the compound of all their natural
attributes. For these entities, a decision must be made by the data
modeler after gathering the requirements of the application as to what
the minimum subset of attributes are that would never be duplicated
(again: within the context of the application). In my employee example,
I, as the data modeler, have decided that those four fields constitute
a reasonable candidate key based on the requirements of the
application.

> Secondly, three of the four fields you suggest are subject to change,
> so that indeed makes them a bad choice. My definition of "key" includes
> "unchanged for the lifetime of the tuple".

There is no such rule of normalization or good database logic. You are
refering to a technical limitation in some obsolete system that lack
cascading update support.

> In that situation your idea may work well, but that's just a surrogate
> key in disguise...

I know. But not just in disguise -- invisible. An internal peice of the
database, like an index. This is where perforance hacks belong, not
mixed in with business logic (or in this case business data). Basically
I'm introducing the concept of a hidden-psudo-sub-primary-key. The
index of relationships. Additionally the ID could be extracted and used
by the application for other uses such as transmitting a record pointer
via a query-string and other internal/technical/non-business-logic
activities.

> Have a nice day,

Which one?

> --
> Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> > From each according to his ability. To each according to his ability to litigate.
>
> --0ntfKIWw70PvrIHh
> Content-Type: application/pgp-signature
> Content-Disposition: inline;
>     filename="signature.asc"
> Content-Description: Digital signature
> X-Google-AttachSize: 190


pgsql-general by date:

Previous
From: Martijn van Oosterhout
Date:
Subject: Re:    Hi,&nbs
Next
From: Eberhard Lisse
Date:
Subject: Re: Potentially annoying question about date ranges (part 2)