Thread: Primary Keys

Primary Keys

From

"Farid Khan"

Date:

09 November 2003, 16:14:48

I quote from the PG Docs:

PRIMARY KEY (column constraint)
PRIMARY KEY ( column_name [, ... ] ) (table constraint)
The primary key constraint specifies that a column or columns of a table may
contain only unique (non-duplicate), non-NULL values. Technically, PRIMARY
KEY is merely a combination of UNIQUE and NOT NULL, but identifying a set of
columns as primary key also provides meta-data about the design of the
schema, as a primary key implies that other tables may rely on this set of
columns as a unique identifier for rows.
Only one primary key can be specified for a table, whether as a column
constraint or a table constraint.
The primary key constraint should name a set of columns that is different
from other sets of columns named by any unique constraint defined for the
same table.

My question is this, I want to create a very large table, 1M records or
more.  However, as I was reading some of the recent posts, namely:

From: Antonios Christofides <anthony ( at ) itia ( dot ) ntua ( dot ) gr>
To: pgsql-novice ( at ) postgresql ( dot ) org
Subject: Surrogate vs natural keys (Was: Almost relational PostgreSQL (was:
one-to-one))
Date: Tue, 4 Nov 2003 20:51:25 +0200

The table I am creating would be based on a X,Y point.  We have serious
doubts on what this would do to the performance of the system we are
developing.  I am assuming, that a table can have a primary key consisting
of 2 fields, which together would be unique to the entire table.  Would
indexing this table be the nightmare we think it would be?  Has anyone seen
anything remotely similar out there with 1M+ rows?

Thanks.

Farid

Re: Primary Keys

From

Nabil Sayegh

Date:

09 November 2003, 16:36:30

Farid Khan wrote:
> The table I am creating would be based on a X,Y point.  We have serious
> doubts on what this would do to the performance of the system we are
> developing.  I am assuming, that a table can have a primary key consisting
> of 2 fields, which together would be unique to the entire table.  Would
> indexing this table be the nightmare we think it would be?  Has anyone seen
> anything remotely similar out there with 1M+ rows?

Personally I would always choose an extra SERIAL column.
I don't really like the concept of having multiple columns as a PK.

1M doesn't sound like a big problem.

--
  e-Trolley Sayegh & John, Nabil Sayegh
  Tel.: 0700 etrolley /// 0700 38765539
  Fax.: +49 69 8299381-8
  PGP : http://www.e-trolley.de

Re: Primary Keys

From

Josh Berkus

Date:

09 November 2003, 17:34:13

Nabil, Farid,

> Personally I would always choose an extra SERIAL column.
> I don't really like the concept of having multiple columns as a PK.

I really have to disagree with you; if the real key of the table is two
numeric columns, then make that the primary key unless you can demonstrate
significant performance or application development problems.

For one thing, a serial column adds another 4 bytes per row to the table, and
overhead on inserts.   It also raises the possibility of duplicate columns
X,Y; and if you put a unique index on X,Y, why bother with the surrogate key
at all?

I've a lot of tables with SERIAL surrogate keys, but those exist only for 3
reasons:
1) The real key involves several columns, including text and dates, that would
be a real query-writing hassle and/or potential performance-killer on joins;
2) Or where my web programmer refuses to deal with non-numeric keys (which he
frequently does ...)
3) The values of the key in existing rows is expected to change frequently and
there are FK-dependant tables (cascading updates are another
performance-killer).

Where none of those conditions apply, I use real keys.

--
Josh Berkus
Aglio Database Solutions
San Francisco