Re: Normal vs Surrogate Primary Keys... - Mailing list pgsql-general

From	rlee0001
Subject	Re: Normal vs Surrogate Primary Keys...
Date	October 4, 2006 19:26:49
Msg-id	1159824571.772571.259860@e3g2000cwe.googlegroups.com
In response to	Re: Normal vs Surrogate Primary Keys... (Martijn van Oosterhout <kleptog@svana.org>)
List	pgsql-general

Martijn van Oosterhout wrote:
> On Sun, Oct 01, 2006 at 07:48:14PM -0700, rlee0001 wrote:
> > <snip> For example, if I key "employee" by Last Name, First Name, Date
> > of Hire and Department, I would need to store copies of all this data
> > in any entity that relates to an employee (e.g. payroll, benefits and
> > so on). In addition, if any of these fields change in value, that
> > update would need to cascade to any related entities, which might be
> > perceived as a performance issue if there are many related records.
>
> Err, those fields don't make a natural key since they have no guarantee
> of uniqueness. You've simply decided that the chance of collision is
> low enough that you don't care, but for me that's not really good
> enough for use as a key.

Oh look mommy, a usenet troll. Sweet. I'm bored, so...

Those fields were a contrived example of a key that might be perceived
to be too large to use as a key for performance reasons. Are you
suggesting that, because they are not guaranteed to be unique, no
performance problem would exist in using such large and complex fields
as keys? Or do you acknowledge that my example holds regardless?

The fact of the matter is, non-abstract (natural) entities have only
one perfect candidate key, which is the compound of all their natural
attributes. For these entities, the data modeler must decide, after
gathering the requirements of the application, which minimal subset of
attributes would never be duplicated (again: within the context of the
application). In my employee example, I, as the data modeler, have
decided that those four fields constitute a reasonable candidate key
based on the requirements of the application.
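
To make "minimal subset of attributes that would never be duplicated"
concrete, here is a rough sketch in Python. The sample rows, the column
names and both helper functions are made up for illustration. Note that
sample data can only rule a subset out; whether a subset really is a key
is still a call the modeler makes from the requirements.

```python
from itertools import combinations


def is_unique(rows, columns):
    """Return True if no two rows share the same values in `columns`."""
    seen = set()
    for row in rows:
        key = tuple(row[c] for c in columns)
        if key in seen:
            return False
        seen.add(key)
    return True


def minimal_unique_subsets(rows, columns):
    """List column subsets that are unique in `rows` and have no unique proper subset."""
    found = []
    for size in range(1, len(columns) + 1):
        for subset in combinations(columns, size):
            # Skip supersets of a subset that is already unique: they are not minimal.
            if any(set(f) <= set(subset) for f in found):
                continue
            if is_unique(rows, subset):
                found.append(subset)
    return found


# Hypothetical sample data, not taken from any real system.
rows = [
    {"last": "Smith", "first": "Jane", "hired": "2006-01-15", "dept": "Sales"},
    {"last": "Smith", "first": "Jane", "hired": "2006-01-15", "dept": "Support"},
    {"last": "Smith", "first": "John", "hired": "2006-01-15", "dept": "Sales"},
    {"last": "Jones", "first": "Jane", "hired": "2005-03-01", "dept": "Sales"},
]
columns = ["last", "first", "hired", "dept"]

candidates = minimal_unique_subsets(rows, columns)
print(candidates)
```

On this sample no single column or pair is unique, which is the
situation where a wide compound key gets chosen.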

> Secondly, three of the four fields you suggest are subject to change,
> so that indeed makes them a bad choice. My definition of "key" includes
> "unchanged for the lifetime of the tuple".

There is no such rule of normalization or good database logic. You are
referring to a technical limitation in obsolete systems that lack
cascading update support.
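
As a rough sketch of what cascading update support buys you: the snippet
below uses SQLite through Python's sqlite3 module rather than PostgreSQL,
purely so that it is self-contained. The table and column names are
hypothetical. The foreign key on the four-column natural key is declared
ON UPDATE CASCADE, so changing the key in "employee" rewrites the
matching "payroll" rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys (and their cascades) when asked to.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE employee (
    last_name  TEXT NOT NULL,
    first_name TEXT NOT NULL,
    hire_date  TEXT NOT NULL,
    department TEXT NOT NULL,
    PRIMARY KEY (last_name, first_name, hire_date, department)
);
CREATE TABLE payroll (
    last_name  TEXT NOT NULL,
    first_name TEXT NOT NULL,
    hire_date  TEXT NOT NULL,
    department TEXT NOT NULL,
    amount     INTEGER NOT NULL,
    FOREIGN KEY (last_name, first_name, hire_date, department)
        REFERENCES employee (last_name, first_name, hire_date, department)
        ON UPDATE CASCADE
);
""")

conn.execute(
    "INSERT INTO employee VALUES ('Smith', 'Jane', '2006-01-15', 'Sales')"
)
conn.execute(
    "INSERT INTO payroll VALUES ('Smith', 'Jane', '2006-01-15', 'Sales', 4200)"
)

# Change two parts of the natural key; the cascade updates payroll as well.
conn.execute(
    "UPDATE employee SET last_name = 'Jones', department = 'Marketing' "
    "WHERE last_name = 'Smith'"
)

payroll_rows = conn.execute(
    "SELECT last_name, first_name, hire_date, department, amount FROM payroll"
).fetchall()
print(payroll_rows)
```

The cost mentioned earlier is visible here too: every related row
carries a copy of all four key columns, and each one has to be rewritten
when the key changes.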

> In that situation your idea may work well, but that's just a surrogate
> key in disguise...

I know. But not just in disguise -- invisible. An internal piece of the
database, like an index. This is where performance hacks belong, not
mixed in with business logic (or in this case business data). Basically
I'm introducing the concept of a hidden-pseudo-sub-primary-key: an
index of relationships. Additionally, the ID could be extracted and used
by the application for other purposes, such as transmitting a record
pointer via a query-string and other internal/technical/non-business-logic
activities.
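
A minimal sketch of the idea, emulated by hand since no database is
assumed to do this for you: again SQLite through Python's sqlite3 module,
with hypothetical table, column and view names. The natural key stays
declared as UNIQUE, relationships are stored through an internal integer
id, and a view presents payroll in terms of the natural key so the id
never has to appear in business data.

```python
import sqlite3
from urllib.parse import urlencode

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE employee (
    id         INTEGER PRIMARY KEY,  -- internal id, used only for relationships
    last_name  TEXT NOT NULL,
    first_name TEXT NOT NULL,
    hire_date  TEXT NOT NULL,
    department TEXT NOT NULL,
    UNIQUE (last_name, first_name, hire_date, department)  -- the natural key
);
CREATE TABLE payroll_rows (
    employee_id INTEGER NOT NULL REFERENCES employee (id),
    amount      INTEGER NOT NULL
);
-- What users query: payroll expressed through the natural key, no id in sight.
CREATE VIEW payroll AS
SELECT e.last_name, e.first_name, e.hire_date, e.department, p.amount
FROM payroll_rows AS p
JOIN employee AS e ON e.id = p.employee_id;
""")

conn.execute(
    "INSERT INTO employee (last_name, first_name, hire_date, department) "
    "VALUES ('Smith', 'Jane', '2006-01-15', 'Sales')"
)
conn.execute(
    "INSERT INTO payroll_rows (employee_id, amount) "
    "SELECT id, 4200 FROM employee WHERE last_name = 'Smith'"
)

# A natural-key change touches one employee row; payroll_rows is not rewritten.
conn.execute("UPDATE employee SET last_name = 'Jones' WHERE last_name = 'Smith'")
view_rows = conn.execute("SELECT * FROM payroll").fetchall()
print(view_rows)

# The id can still be pulled out for technical uses such as a record pointer.
emp_id = conn.execute(
    "SELECT id FROM employee WHERE last_name = 'Jones'"
).fetchone()[0]
query_string = urlencode({"employee": emp_id})
print(query_string)
```

The design choice is that the id exists only to join rows; everything a
user reads or writes goes through the natural key.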

> Have a nice day,

Which one?

> --
> Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> > From each according to his ability. To each according to his ability to litigate.
