Thread: Primary Keys
I quote from the PG Docs: PRIMARY KEY (column constraint) PRIMARY KEY ( column_name [, ... ] ) (table constraint) The primary key constraint specifies that a column or columns of a table may contain only unique (non-duplicate), non-NULL values. Technically, PRIMARY KEY is merely a combination of UNIQUE and NOT NULL, but identifying a set of columns as primary key also provides meta-data about the design of the schema, as a primary key implies that other tables may rely on this set of columns as a unique identifier for rows. Only one primary key can be specified for a table, whether as a column constraint or a table constraint. The primary key constraint should name a set of columns that is different from other sets of columns named by any unique constraint defined for the same table. My question is this, I want to create a very large table, 1M records or more. However, as I was reading some of the recent posts, namely: From: Antonios Christofides <anthony ( at ) itia ( dot ) ntua ( dot ) gr> To: pgsql-novice ( at ) postgresql ( dot ) org Subject: Surrogate vs natural keys (Was: Almost relational PostgreSQL (was: one-to-one)) Date: Tue, 4 Nov 2003 20:51:25 +0200 The table I am creating would be based on a X,Y point. We have serious doubts on what this would do to the performance of the system we are developing. I am assuming, that a table can have a primary key consisting of 2 fields, which together would be unique to the entire table. Would indexing this table be the nightmare we think it would be? Has anyone seen anything remotely similar out there with 1M+ rows? Thanks. Farid
Farid Khan wrote: > The table I am creating would be based on a X,Y point. We have serious > doubts on what this would do to the performance of the system we are > developing. I am assuming, that a table can have a primary key consisting > of 2 fields, which together would be unique to the entire table. Would > indexing this table be the nightmare we think it would be? Has anyone seen > anything remotely similar out there with 1M+ rows? Personally I would always choose an extra SERIAL column. I don't really like the concept of having multiple columns as a PK. 1M doesn't sound like a big problem. -- e-Trolley Sayegh & John, Nabil Sayegh Tel.: 0700 etrolley /// 0700 38765539 Fax.: +49 69 8299381-8 PGP : http://www.e-trolley.de
Nabil, Farid, > Personally I would always choose an extra SERIAL column. > I don't really like the concept of having multiple columns as a PK. I really have to disagree with you; if the real key of the table is two numeric columns, then make that the primary key unless you can demonstrate significant performance or application development problems. For one thing, a serial column adds another 4 bytes per row to the table, and overhead on inserts. It also raises the possibility of duplicate columns X,Y; and if you put a unique index on X,Y, why bother with the surrogate key at all? I've a lot of tables with SERIAL surrogate keys, but those exist only for 3 reasons: 1) The real key involves several columns, including text and dates, that would be a real query-writing hassle and/or potential performance-killer on joins; 2) Or where my web programmer refuses to deal with non-numeric keys (which he frequently does ...) 3) The values of the key in existing rows is expected to change frequently and there are FK-dependant tables (cascading updates are another performance-killer). Where none of those conditions apply, I use real keys. -- Josh Berkus Aglio Database Solutions San Francisco