wsheldah@lexmark.com wrote:
> Choosing an artificial key is the ideal, according to everything I've
> heard. In one of my database classes, I remember I had a classmate who had
> worked with some very large datasets of U.S. citizens, and found that there
> were actually duplicate social security numbers assigned to different
> people. Not many, and I don't recall whether the first person had died
> before the SSN was reused, but it really goes to show that they only to
> _guarantee_ a unique primary key is to generate it yourself. Yes, you may
> want to put a unique index on your SSN field or other candidate key fields
> that ought to be unique.
I think the desire to have an artificial numeric key is founded in the
manner in which SQL has implemented the relational model. Logically,
artificial candidate keys have no business in a relation. But I agree
in their convenience in throwing around keys of a consistent size and
type in client applications and middleware. So, IMHO, I think the
modeler should first design the database to be *logically consistent*:
1) Each relation has a unique, natural candidate key (the x of the
relation) - relations are sets, not bags.
2) Each relation's non-key attributes (the f(x), g(x), ... of the
relation) should be dependent upon the natural key, the whole key, and
nothing but the natural key - that's 3NF at a minimum.
Then, once the model complies with the RM wrt constraints on the
domains, relations, and a database as a whole, one could go back and
add the artificial keys for convenience purposes. It's the modeller's
job to design a database that ensures logical consistency *first* in
the face of a users, programmers, dbas, etc. that will attempt to
break it. The database should be, logically speaking, unbreakable. But
the whole point of the RM is that it is *provably* logically
consistent if its prescriptions and proscriptions are followed. SQL
doesn't force that on you, which is probably a mistake...
IMHO,
Mike Mascari
mascarm@mascari.com