Primary Key Column Size? - Mailing list pgsql-general

From Mike Schroepfer
Subject Primary Key Column Size?
Date
Msg-id 50D1DD22A3646047A6282C10311585123D0B48@mail01.raplix.com
Whole thread Raw
List pgsql-general
Hi All,

First off - thanks too all whom have answered previous
questions and the developers for a really great RDBMS.

Our system has ~50 tables - one of which in particular
can get decently sized (~1-10 million rows).  This table
has a large number of foreign key references into the
rest of the schema.  About 6 of these columns are commonly
used and thus indexed.  We often need very fast access to
the data in this table.

We are currently using VARCHAR as the primary key columns
for everything and due too some external application needs
we are using 16Byte (yes byte) GUID's.  Simply encoded as a
numeric (0-9) string this works out to 40 characters.  Best you
can do with a string column is a Base64 encoding which gets you down to
24 characters.  Finally, we can switch to a pair of IN8 columns
to store the actual 16 bytes of the data.

I'm exploring mapping the GUID's into a smaller datatype (i.e.
an INT4 or INT8) so that our primary keys are much smaller.   It's
quite a deal of work to rekey our
schema with a different primary key type - so I was wondering
if anyone had any experience with the impacts of the size
of the primary key on performance.   I'm particularly concerned
about the size of the index as it relates to the key size.

So I have a few questions for the experts in the group:

1) Is there a general rule of thumb about performance and key size
   (i.e. doubling the key size has a 4x impact on queries?)
2) Do different datatypes have radically different
   performance (i.e. IN4, INT8, VARCHAR) when used as primary
   keys?   Does it matter if I'm on a 64 bit (Solaris) or 32 Bit
   (windows/linux) platform?
3) If you have a VARCHAR(40) column as a primary key - but the
   data contained in that column never exceeds 10 or so characters
   do the indicies take that into account - or are they allocating
   fixed storage?
4) What is the easiest way to lookup the table and index size?
5) What configuration settings (is it shared_buffers?) affects the
   memory allocated that would let the backends load the indicies
   entirely into RAM?

Any help or experience here would be greatly appreciated.

Cheers,

Mike


pgsql-general by date:

Previous
From: Francisco Reyes
Date:
Subject: Re: Postgres not starting at boot(FreeBSD) - startup script
Next
From: Stephan Szabo
Date:
Subject: Re: Referential Integrity Triggers