Thread: string PK vs. interger PK
Can anyone offer some insight as to what should be considered when choosing between a primary key that is an integer vs. a primary key that is a string value? Does one or the other affect indexing speed? update speed?
Is a serial integer value better than using a PK that has some "value" in its meaning but is a string?
Thanks
Jodi
_______________________________
Jodi L Kanter
BioInformatics Database Administrator
University of Virginia
(434) 924-2846
jkanter@virginia.edu
As a general rule, a primary key (or any relationship key) should not contain a value that means something beyond its use in relating entities. The problem is that if a field describes an object, the day may come when you want to change the description, but you *never* want to change the field that identifies the record and relates it to other records. I can't offer insider's knowledge on speed of int vs string, but it seems very reasonable to think that an int would be faster. On the other hand, I'm forced to use strings in my app, and the performance doesn't suffer much so they must be close. -Nick -------------------------------------------------------------------------- Nick Fankhauser nickf@ontko.com Phone 1.765.935.4283 Fax 1.765.962.9788 Ray Ontko & Co. Software Consulting Services http://www.ontko.com/ -----Original Message----- From: pgsql-admin-owner@postgresql.org [mailto:pgsql-admin-owner@postgresql.org]On Behalf Of Jodi Kanter Sent: Monday, April 15, 2002 1:56 PM To: Postgres Admin List Subject: [ADMIN] string PK vs. interger PK Can anyone offer some insight as to what should be considered when choosing between a primary key that is an integer vs. a primary key that is a string value? Does one or the other affect indexing speed? update speed? Is a serial integer value better than using a PK that has some "value" in its meaning but is a string? Thanks Jodi _______________________________ Jodi L Kanter BioInformatics Database Administrator University of Virginia (434) 924-2846 jkanter@virginia.edu
Jodi: This opinion has less to do with Postgres and more to do will a general philosophy but we always use a 4 byte integer column as the primary key for almost every table. None of the following reasons means it has to be that way, but all support Ints over strings: 1) Strings would normally be bigger than 4 byte integers. While I have no experience to say this size issue could contribute to a perceived performance issue, theoretically speaking, a smaller, more compact value would be better. 2) Int processing of equality and inequality comparisons should be much faster than string compares. However, again I have no experience to say this size issue could contribute to a perceived performance issue. 3) Ints make it easier to employ Postgres Sequences in generating PK values. These are very nice and quick. 4) Lastly, I'll just agree with the other replies that I've already seen that you should avoid putting any kind of meaningful data into PK columns. That fact that you want to consider strings almost implies meaning to the values. For example, if these values are dictated by an outside source which happened to use an alphanumeric value, then they carry meaning to an outside source even if that meaning is that it is simply their own unique identifier. We've been faced with this question many a time. We almost always chose an int PK and more often than not when we've strayed from this course, it's come back to haunt us. Hope this is helpful. Marc Mitchell - Senior Application Architect Enterprise Information Solutions, Inc. 4910 Main Street Downers Grove, IL 60515 (630) 512-0570 marcm@eisolution.com ----- Original Message ----- From: "Jodi Kanter" <jkanter@virginia.edu> To: "Postgres Admin List" <pgsql-admin@postgresql.org> Sent: Monday, April 15, 2002 1:56 PM Subject: [ADMIN] string PK vs. interger PK Can anyone offer some insight as to what should be considered when choosing between a primary key that is an integer vs. a primary key that is a string value? Does one or the other affect indexing speed? update speed? Is a serial integer value better than using a PK that has some "value" in its meaning but is a string? Thanks Jodi _______________________________ Jodi L Kanter BioInformatics Database Administrator University of Virginia (434) 924-2846 jkanter@virginia.edu
On 15 Apr 2002 at 14:20, Nick Fankhauser wrote: > As a general rule, a primary key (or any relationship key) should not > contain a value that means something beyond its use in relating entities. > The problem is that if a field describes an object, the day may come when > you want to change the description, but you *never* want to change the > field that identifies the record and relates it to other records. FWIW, I would recommend not using business values as a primary key (and by extension, as a foreign key). Business values are just that: business values. Sure, put unique constraints on business values if you want. Just don't make them a primary key or a foreign key. -- Dan Langille The FreeBSD Diary - http://freebsddiary.org/ - practical examples