Re: counting query - Mailing list pgsql-general

From Chad Wagner
Subject Re: counting query
Date
Msg-id 81961ff50701280802v681c5731t9014de27761d8d08@mail.gmail.com
Whole thread Raw
In response to Re: counting query  (Ron Johnson <ron.l.johnson@cox.net>)
List pgsql-general
On 1/28/07, Ron Johnson <ron.l.johnson@cox.net> wrote:
This is the great synthetic-vs-natural key debate.

Truly.  But what the heck!

Surrogate keys are not evil, and they do have value.  I see no value in proclaiming "surrogate keys are evil, do not use them".

Surrogate keys do have advantages:
- Disassociation of natural data to other referential tables (which can also be "confusing")
    Imagine a social security number, drivers license number, or any other natural key.  Now imagine that
    key value has changed for a specific person, and you have used it as a natural key throughout your data
    structures.  (and they do change)
- Reduced storage requirements (yields better performance)
    It is cheaper to store a 50 byte field + a 4 byte surrogate key once, then it is to store it a million times:
    (surrogate key)   54 bytes + (4 bytes * 1 million) = 4MB
    vs.
    (natural key)   50 bytes * 1 million = 50 MB


Natural keys are not evil either, and they have their own advantages.  But when your modeling very large databases (around 10TB range) then you absolutely have to consider every single decision, and natural keys (in my opinion) is not always a good one as a single natural key could result in another 100GB of storage requirements.

There should be some thought when you are modeling and these are some of the things to consider.  I don't see a 10 table join being a major performance penalty, especially when 8 of the tables may be a few MB in size.


--
Chad
http://www.postgresqlforums.com/

pgsql-general by date:

Previous
From: Furface
Date:
Subject: Re: Limit on number of users in postgresql?
Next
From: Ron Johnson
Date:
Subject: virtual (COMPUTED BY) columns?