Thread: IS it a good practice to use SERIAL as Primary Key?
Hi all, I am wonderring if it is a good practice to use SERIAL index as primary key, as it is only available up to 9999999? Currently i am dealing with storing LDAP users into Postgres and i am looking for a better way to make use of the DN as primary key instead of SERIAL index. Any advice or suggestion is appreciated. Thanks. Regards, Carter _________________________________________________________________ Find singles online in your area with MSN Dating and Match.com! http://match.sg.msn.com/match/mt.cfm?pg=channel&tcid=281203
On Nov 22, 2006, at 6:23 PM, carter ck wrote: > Hi all, > > I am wonderring if it is a good practice to use SERIAL index as > primary key, as it is only available up to 9999999? Where did you get that idea? A serial should be good up to at least 2,000,000,000 or so, and if that's not enough there's always bigserial. > > Currently i am dealing with storing LDAP users into Postgres and i > am looking for a better way to make use of the DN as primary key > instead of SERIAL index. > > Any advice or suggestion is appreciated. If you want a synthetic primary key then a serial field is the easiest way to create one. Cheers, Steve
"carter ck" <carterck32@hotmail.com> writes: > I am wonderring if it is a good practice to use SERIAL index as primary key, > as it is only available up to 9999999? Where in the world did you get that idea? SERIAL goes up to 2^31 (2 billion); if you need more use BIGSERIAL. regards, tom lane
> I am wonderring if it is a good practice to use SERIAL index as primary key, > as it is only available up to 9999999? That isn't true. It is much larger that that. If yor need more than that there is always bigserial. serial = int4 bigserial = int8 "" The type names serial and serial4 are equivalent: both create integer columns. The type names bigserial and serial8 work just the same way, except that they create a bigint column. bigserial should be used if you anticipate the use of more than 231 identifiers over the lifetime of the table. "" http://www.postgresql.org/docs/8.2/interactive/datatype-numeric.html#DATATYPE-SERIAL > Currently i am dealing with storing LDAP users into Postgres and i am > looking for a better way to make use of the DN as primary key instead of > SERIAL index. > > Any advice or suggestion is appreciated. Here is a similar discussion that you may be enterested in: http://archives.postgresql.org/pgsql-general/2006-10/msg00024.php Regards, Richard Broersma Jr.
Richard Broersma Jr <rabroersma@yahoo.com> writes: > "" The type names serial and serial4 are equivalent: both create integer > columns. The type names bigserial and serial8 work just the same way, except > that they create a bigint column. bigserial should be used if you anticipate > the use of more than 231 identifiers over the lifetime of the table. "" > http://www.postgresql.org/docs/8.2/interactive/datatype-numeric.html#DATATYPE-SERIAL What would be those "231 identifiers"? -- Jorge Godoy <jgodoy@gmail.com>
> > "" The type names serial and serial4 are equivalent: both create integer > > columns. The type names bigserial and serial8 work just the same way, except > > that they create a bigint column. bigserial should be used if you anticipate > > the use of more than 231 identifiers over the lifetime of the table. "" > > http://www.postgresql.org/docs/8.2/interactive/datatype-numeric.html#DATATYPE-SERIAL > > What would be those "231 identifiers"? oops, when I copied that text from the 8.2 docs I didn't catch that format error. 231 should read 2^31. Regards, Richard Broersma Jr.
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 11/22/06 20:23, carter ck wrote: > Hi all, > > I am wonderring if it is a good practice to use SERIAL index as primary > key, as it is only available up to 9999999? > > Currently i am dealing with storing LDAP users into Postgres and i am > looking for a better way to make use of the DN as primary key instead of > SERIAL index. > > Any advice or suggestion is appreciated. I'm one of those who thinks that a (possibly multisegment) natural key *does* exist, and that if you think it doesn't, your design is wrong. For those times when and that when numeric sequences *are* needed (employee_id and account_number for example) they should include a check digit, to ensure that you don't mis-type a number and charge the wrong account. [I'm old enough to have worked in a Service Bureau where lots women keypunched form data into Mohawk key-to-tape machines, and check digits, which are also in credit cards and SSNs, are a perfect way to protect against typos.] - -- Ron Johnson, Jr. Jefferson LA USA Is "common sense" really valid? For example, it is "common sense" to white-power racists that whites are superior to blacks, and that those with brown skins are mud people. However, that "common sense" is obviously wrong. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (GNU/Linux) iD8DBQFFZcsbS9HxQb37XmcRAmtYAJ44k15B2bX8GQ6MegaEFGxeWm9q6gCgoVAT w+exLaR8symCHDzKwSgp5q0= =uIq6 -----END PGP SIGNATURE-----
On Thu, Nov 23, 2006 at 10:23:55AM -0600, Ron Johnson wrote: > For those times when and that when numeric sequences *are* needed > (employee_id and account_number for example) they should include a > check digit, to ensure that you don't mis-type a number and charge > the wrong account. Sure, but the check digit does not need to be stored, as it can be regenerated on demand. The user interface just verifies the check digit, then throws it away. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
Attachment
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 11/23/06 10:49, Martijn van Oosterhout wrote: > On Thu, Nov 23, 2006 at 10:23:55AM -0600, Ron Johnson wrote: >> For those times when and that when numeric sequences *are* needed >> (employee_id and account_number for example) they should include a >> check digit, to ensure that you don't mis-type a number and charge >> the wrong account. > > Sure, but the check digit does not need to be stored, as it can be > regenerated on demand. The user interface just verifies the check > digit, then throws it away. $ SET GEEZER $ WRITE SYS$OUTPUT "THAT'S JUST EXTRA CYCLES WASTED BY THE" $ WRITE SYS$OUTPUT "CLIENT. BETTER TO USE THEM FOR SOME OTHER" $ WRITE SYS$OUTPUT "MORE PRODUCTIVE PURPOSE." $ SET NOGEEZER That's the VAX/VMS in me oozing out. But seriously, regenerate it on demand??? That's not how it works. This isn't a CRC or hash function. - -- Ron Johnson, Jr. Jefferson LA USA Is "common sense" really valid? For example, it is "common sense" to white-power racists that whites are superior to blacks, and that those with brown skins are mud people. However, that "common sense" is obviously wrong. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (GNU/Linux) iD8DBQFFZd+xS9HxQb37XmcRAgB9AJ99CR+I7NvxMWtJkFENPP/IRJUq7QCg709/ eC2gtd+QLk1+PiTU/ub5WOU= =VuM+ -----END PGP SIGNATURE-----
Hi, >> Sure, but the check digit does not need to be stored, as it can be >> regenerated on demand. The user interface just verifies the check >> digit, then throws it away. > > $ SET GEEZER > $ WRITE SYS$OUTPUT "THAT'S JUST EXTRA CYCLES WASTED BY THE" > $ WRITE SYS$OUTPUT "CLIENT. BETTER TO USE THEM FOR SOME OTHER" > $ WRITE SYS$OUTPUT "MORE PRODUCTIVE PURPOSE." > $ SET NOGEEZER > > That's the VAX/VMS in me oozing out. But seriously, regenerate it > on demand??? That's not how it works. This isn't a CRC or hash > function. Well, a check digit _is_ a kind of CRC. It is redundant information. For every number there is only one correct check digit, which means that the check digit does not add extra information to the number. So why store it? You will need to add the check digit on most (all?) output that is interpreted by humans. The software itself can just use the number itself (assuming you don't need to check the integrity of the software). If you store the number in the database, I would suggest making the db check the number on all input too. Otherwise you might end up with invalid data in the database. - Sander
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 11/23/06 12:38, Sander Steffann wrote: > Hi, > >>> Sure, but the check digit does not need to be stored, as it can be >>> regenerated on demand. The user interface just verifies the check >>> digit, then throws it away. >> >> $ SET GEEZER >> $ WRITE SYS$OUTPUT "THAT'S JUST EXTRA CYCLES WASTED BY THE" >> $ WRITE SYS$OUTPUT "CLIENT. BETTER TO USE THEM FOR SOME OTHER" >> $ WRITE SYS$OUTPUT "MORE PRODUCTIVE PURPOSE." >> $ SET NOGEEZER >> >> That's the VAX/VMS in me oozing out. But seriously, regenerate it >> on demand??? That's not how it works. This isn't a CRC or hash >> function. > > Well, a check digit _is_ a kind of CRC. It is redundant information. For > every number there is only one correct check digit, which means that the > check digit does not add extra information to the number. So why store it? Because it's *part of* the id number. The way we implement it, it's the one's digit. c = f(n) n' = n*10 + c n' is what is stored in id column of the relevant table. This way, if "you" {mistype an id number, it gets garbled in transmission, etc}, it can be algorithmically determined whether or not that is a valid number or not, and only if it is a valid number do you hit the database. Bottom line: check digits are in SSNs and credit card numbers, for a good reason. > You will need to add the check digit on most (all?) output that is > interpreted by humans. The software itself can just use the number > itself (assuming you don't need to check the integrity of the software). > > If you store the number in the database, I would suggest making the db > check the number on all input too. Otherwise you might end up with > invalid data in the database. - -- Ron Johnson, Jr. Jefferson LA USA Is "common sense" really valid? For example, it is "common sense" to white-power racists that whites are superior to blacks, and that those with brown skins are mud people. However, that "common sense" is obviously wrong. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (GNU/Linux) iD8DBQFFZiVUS9HxQb37XmcRAhogAKCPp6s995Lm84tUG9TunRvvaFbD8gCcDsv5 18fDxZwV6PrPskRym7hPzHs= =ON0M -----END PGP SIGNATURE-----
> Bottom line: check digits are in SSNs Uhm, no they're not. And this is of course one of the huge problems with SSNs. (Although not quite as bad as the fact that they're not strictly unique. Yes, really, duplicates have been issued in the past.) -- Scott Ribe scott_ribe@killerbytes.com http://www.killerbytes.com/ (303) 722-0567 voice
On Thu, 2006-11-23 at 10:23, Ron Johnson wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On 11/22/06 20:23, carter ck wrote: > > Hi all, > > > > I am wonderring if it is a good practice to use SERIAL index as primary > > key, as it is only available up to 9999999? > > > > Currently i am dealing with storing LDAP users into Postgres and i am > > looking for a better way to make use of the DN as primary key instead of > > SERIAL index. > > > > Any advice or suggestion is appreciated. > > I'm one of those who thinks that a (possibly multisegment) natural > key *does* exist, and that if you think it doesn't, your design is > wrong. Spend some time in the travel industry... The tax category ZO means Passenger Service Chareg in Denmark. Or Greenland, or Faroe Islands. And can be entered more than once. And the travel agent has to look at the context of the travel itinerary to know which one(s) it is. Sadly, the real world has many data problems created by idiots in suits 30 years ago that aren't going to go away any time soon.
On Mon, 2006-11-27 at 11:47 -0600, Scott Marlowe wrote: > On Thu, 2006-11-23 at 10:23, Ron Johnson wrote: > > -----BEGIN PGP SIGNED MESSAGE----- > > Hash: SHA1 > > > > On 11/22/06 20:23, carter ck wrote: > > > Hi all, > > > > > > I am wonderring if it is a good practice to use SERIAL index as primary > > > key, as it is only available up to 9999999? > > > > > > Currently i am dealing with storing LDAP users into Postgres and i am > > > looking for a better way to make use of the DN as primary key instead of > > > SERIAL index.' Bigserial? Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 11/27/06 11:26, Scott Ribe wrote: >> Bottom line: check digits are in SSNs > > Uhm, no they're not. And this is of course one of the huge problems with > SSNs. (Although not quite as bad as the fact that they're not strictly > unique. Yes, really, duplicates have been issued in the past.) Hmm, you're right. Other kinds of important numbers have check digits, though. http://www.cs.nmsu.edu/~cssem/DickOct18.pdf - -- Ron Johnson, Jr. Jefferson LA USA Is "common sense" really valid? For example, it is "common sense" to white-power racists that whites are superior to blacks, and that those with brown skins are mud people. However, that "common sense" is obviously wrong. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (GNU/Linux) iD8DBQFFay20S9HxQb37XmcRAvh/AJ9q2mgWBGCgR7/IY2lB2TJVheq/DwCgiJkP MLhLb6Au0HOL3Iruk0ZrCSk= =vGcr -----END PGP SIGNATURE-----
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 11/27/06 11:47, Scott Marlowe wrote: > On Thu, 2006-11-23 at 10:23, Ron Johnson wrote: >> -----BEGIN PGP SIGNED MESSAGE----- >> Hash: SHA1 >> >> On 11/22/06 20:23, carter ck wrote: >>> Hi all, >>> >>> I am wonderring if it is a good practice to use SERIAL index as primary >>> key, as it is only available up to 9999999? >>> >>> Currently i am dealing with storing LDAP users into Postgres and i am >>> looking for a better way to make use of the DN as primary key instead of >>> SERIAL index. >>> >>> Any advice or suggestion is appreciated. >> I'm one of those who thinks that a (possibly multisegment) natural >> key *does* exist, and that if you think it doesn't, your design is >> wrong. > > Spend some time in the travel industry... The tax category ZO means > Passenger Service Chareg in Denmark. Or Greenland, or Faroe Islands. > And can be entered more than once. And the travel agent has to look at > the context of the travel itinerary to know which one(s) it is. > > Sadly, the real world has many data problems created by idiots in suits > 30 years ago that aren't going to go away any time soon. Yes, that's the point. They are legacy designs, and that portion of the design is wrong. - -- Ron Johnson, Jr. Jefferson LA USA Is "common sense" really valid? For example, it is "common sense" to white-power racists that whites are superior to blacks, and that those with brown skins are mud people. However, that "common sense" is obviously wrong. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (GNU/Linux) iD8DBQFFay6ES9HxQb37XmcRAly9AKC5qEpO9Z9Oscf5Bp3nbuSgIswPJwCg0dh3 FoDu81i4pndNwIQ88Bl2SsU= =WCzo -----END PGP SIGNATURE-----
>Yes, that's the point. They are legacy designs, and that portion of >the design is wrong. > > > I'll weigh in my my .02 on this subject. After much pain and agony in the real world, I have taken the stance that every table in my database must have an arbitrary, numeric primary key (generally autogenerated). I have found that this gets me into a lot of arguments with other database guys, but never seems to cause any problems for me. Conversely, I have seen innumerable problems in the real world caused by the usage of actual data as primary keys. Perhaps I am amazingly ignorant, but I have yet to find a case where my approach causes any real problems. What does using "real" data as a primary key buy you? The only real advantages I can see are that an individual record's data will be somewhat more human-readable without joining to other tables, and that your search queries can be simpler because they don't have to join against other tables. On the (many) occasions that I have worked on databases with "real" data as primary keys, I just saw so many problems arise. In the real world, data changes, even supposedly unchangeable data. When using arbitrary primary keys, all you have to do is change the data in the one table where it lives. If you are using real data as your keys, you have write complex queries or code to "fix" your data when the supposedly unchangeable data changes. Anyway, I'm sure this is a huge argument, but that's my 0.2
Simply put, it doesn't scale as well. If a table already has candidate keys, then you've presumably got unique indices on them. A surrogate primary key adds another segment of data to pass through I/O and another index to maintain. Under high loads, those extra cycles will cost you transactions per minute. If you're able to throw hardware at the problem to compensate for performance and data size issues, it's not a problem. Most databases are run on systems that are overkill already. If, OTOH, you're running a system that needs to be able to process billions of transactions with exabytes data (say, for example, a comprehensive multi-national health record database) then you're going to be as interested in SQL tuning as it's possible to be because no amount of hardware will be enough. The other argument is that it's redundant data with no real meaning to the domain, meaning using surrogate keys technically violates low-order normal forms. As far as data changing, if you're using foreign key constraints properly you should never need to issue more than one UPDATE command. ON UPDATE CASCADE is your friend. It is always possible to design a domain model which perfectly captures business logic. However, it is *not* always possible to actually implement that domain in a computerized RDBMS, nor is it always practical. Just as the domain model represents an estimated implementation of the real world information, an RDBMS is just an estimated implementation of the relational model. -- Brandon Aiken CS/IT Systems Engineer -----Original Message----- From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of John McCawley Sent: Monday, November 27, 2006 1:53 PM To: Ron Johnson Cc: pgsql-general@postgresql.org Subject: Re: [GENERAL] IS it a good practice to use SERIAL as Primary Key? >Yes, that's the point. They are legacy designs, and that portion of >the design is wrong. > > > I'll weigh in my my .02 on this subject. After much pain and agony in the real world, I have taken the stance that every table in my database must have an arbitrary, numeric primary key (generally autogenerated). I have found that this gets me into a lot of arguments with other database guys, but never seems to cause any problems for me. Conversely, I have seen innumerable problems in the real world caused by the usage of actual data as primary keys. Perhaps I am amazingly ignorant, but I have yet to find a case where my approach causes any real problems. What does using "real" data as a primary key buy you? The only real advantages I can see are that an individual record's data will be somewhat more human-readable without joining to other tables, and that your search queries can be simpler because they don't have to join against other tables. On the (many) occasions that I have worked on databases with "real" data as primary keys, I just saw so many problems arise. In the real world, data changes, even supposedly unchangeable data. When using arbitrary primary keys, all you have to do is change the data in the one table where it lives. If you are using real data as your keys, you have write complex queries or code to "fix" your data when the supposedly unchangeable data changes. Anyway, I'm sure this is a huge argument, but that's my 0.2 ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend
John,
I feel the same.
In the "real world" there is no such thing as a primary key. At least not over time. Not enough people understand the concept of a primary key to make those things existent in the real world.
So we take an artificially primary key - and most reliable way is to create it yourself.
Harald
--
GHUM Harald Massa
persuadere et programmare
Harald Armin Massa
Reinsburgstraße 202b
70197 Stuttgart
0173/9409607
-
Python: the only language with more web frameworks than keywords.
I'll weigh in my my .02 on this subject. After much pain and agony in
the real world, I have taken the stance that every table in my database
must have an arbitrary, numeric primary key (generally autogenerated).
I feel the same.
In the "real world" there is no such thing as a primary key. At least not over time. Not enough people understand the concept of a primary key to make those things existent in the real world.
So we take an artificially primary key - and most reliable way is to create it yourself.
Harald
--
GHUM Harald Massa
persuadere et programmare
Harald Armin Massa
Reinsburgstraße 202b
70197 Stuttgart
0173/9409607
-
Python: the only language with more web frameworks than keywords.
> In the real world, > data changes, even supposedly unchangeable data. And that's the crux of it. All of the "that design is wrong" arguments in the world won't stop data that has been constant & unique for decades from changing when some manager, rightly or wrongly, sees a "business need" for it. User-provided data can never, ever be counted on to forever and always have the attributes (uniqueness, stability, whatever) described by the users, no matter how many times you are reassured on the subject. The data will very often behave exactly as described, but to base a database design on that assumption is to create a brittle design--as you described. -- Scott Ribe scott_ribe@killerbytes.com http://www.killerbytes.com/ (303) 722-0567 voice
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 11/27/06 13:23, Harald Armin Massa wrote: > John, > > I'll weigh in my my .02 on this subject. After much pain and agony in >> the real world, I have taken the stance that every table in my database >> must have an arbitrary, numeric primary key (generally autogenerated). > > > I feel the same. > > In the "real world" there is no such thing as a primary key. At least not > over time. Not enough people understand the concept of a primary key to > make those things existent in the real world. > > So we take an artificially primary key - and most reliable way is to create > it yourself. There are lots of numbers. Credit card numbers, account numbers sales ticket numbers, etc, etc ad nauseum. Julian day and Julian date, even. You can't have lived in "the west" in the past 30 years without being surrounded by them. It's their blind use in *every* table which I take issue with, *most especially* when they are the only unique key. - -- Ron Johnson, Jr. Jefferson LA USA Is "common sense" really valid? For example, it is "common sense" to white-power racists that whites are superior to blacks, and that those with brown skins are mud people. However, that "common sense" is obviously wrong. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (GNU/Linux) iD8DBQFFaz8HS9HxQb37XmcRAjbWAKCT2yN84OiemgPnCH2uc0JNMsykxACeOrfI vNKts9gWRqq7U3hIyEVNKY8= =mGfl -----END PGP SIGNATURE-----
> There are lots of numbers. Credit card numbers, account numbers > sales ticket numbers, etc, etc ad nauseum. Julian day and Julian > date, even. You can't have lived in "the west" in the past 30 years > without being surrounded by them. > > It's their blind use in *every* table which I take issue with, *most > especially* when they are the only unique key. There is also *almost* always a way to have a natural and artificial key. Joshua D. Drake > > - -- > Ron Johnson, Jr. > Jefferson LA USA > > Is "common sense" really valid? > For example, it is "common sense" to white-power racists that > whites are superior to blacks, and that those with brown skins > are mud people. > However, that "common sense" is obviously wrong. > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.5 (GNU/Linux) > > iD8DBQFFaz8HS9HxQb37XmcRAjbWAKCT2yN84OiemgPnCH2uc0JNMsykxACeOrfI > vNKts9gWRqq7U3hIyEVNKY8= > =mGfl > -----END PGP SIGNATURE----- > > ---------------------------(end of broadcast)--------------------------- > TIP 3: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faq > -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 11/27/06 13:37, Scott Ribe wrote: >> In the real world, >> data changes, even supposedly unchangeable data. > > And that's the crux of it. All of the "that design is wrong" arguments in > the world won't stop data that has been constant & unique for decades from > changing when some manager, rightly or wrongly, sees a "business need" for > it. User-provided data can never, ever be counted on to forever and always > have the attributes (uniqueness, stability, whatever) described by the > users, no matter how many times you are reassured on the subject. The data > will very often behave exactly as described, but to base a database design > on that assumption is to create a brittle design--as you described. If I base a master sales table on account_number and date/time, then every CPA in the country will descend on me with calculators sharpened if I decide to update the SALE_DATE column. - -- Ron Johnson, Jr. Jefferson LA USA Is "common sense" really valid? For example, it is "common sense" to white-power racists that whites are superior to blacks, and that those with brown skins are mud people. However, that "common sense" is obviously wrong. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (GNU/Linux) iD8DBQFFa0BiS9HxQb37XmcRAmQ2AJ9hYFVgXmW9ls5uvhrQkvxqvV7KWwCgoaOi fa9fMXbjOK4ZDzsd3JH67xs= =j0Rb -----END PGP SIGNATURE-----
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 11/27/06 12:53, John McCawley wrote: > >> Yes, that's the point. They are legacy designs, and that portion of >> the design is wrong. >> >> >> > I'll weigh in my my .02 on this subject. After much pain and agony in > the real world, I have taken the stance that every table in my database > must have an arbitrary, numeric primary key (generally autogenerated). > I have found that this gets me into a lot of arguments with other > database guys, but never seems to cause any problems for me. > Conversely, I have seen innumerable problems in the real world caused by > the usage of actual data as primary keys. > > Perhaps I am amazingly ignorant, but I have yet to find a case where my > approach causes any real problems. What does using "real" data as a > primary key buy you? The only real advantages I can see are that an > individual record's data will be somewhat more human-readable without > joining to other tables, and that your search queries can be simpler > because they don't have to join against other tables. Joshua Drake wrote an interesting blog article earlier this year about what happened when a firm he was contracting for dropped the natural key on an important table and retained only the synthetic key. Unfortunately I can't find it... :( > On the (many) occasions that I have worked on databases with "real" data > as primary keys, I just saw so many problems arise. In the real world, > data changes, even supposedly unchangeable data. When using arbitrary > primary keys, all you have to do is change the data in the one table > where it lives. If you are using real data as your keys, you have write > complex queries or code to "fix" your data when the supposedly > unchangeable data changes. > > Anyway, I'm sure this is a huge argument, but that's my 0.2 - -- Ron Johnson, Jr. Jefferson LA USA Is "common sense" really valid? For example, it is "common sense" to white-power racists that whites are superior to blacks, and that those with brown skins are mud people. However, that "common sense" is obviously wrong. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (GNU/Linux) iD8DBQFFa0FoS9HxQb37XmcRAt8ZAJ40c0KtB8zHifTymcsa/PBLwWkEuACfQ+dD K+4NNU7m0RNTQLgeP9pNaqQ= =SH10 -----END PGP SIGNATURE-----
It's an arbitrary identifier that only has meaning within the context of the database. The domain model isn't supposed to model data in a database. It's supposed to model data which coincidentally is going to be stored in a database. As far as your bank's poor software design, I can't help you there. That's simply poor planning. Look, I'm not denying the benefits of surrogate keys. There are many cases where it makes the most sense to use them. My only point is that it *does* violate the relational model. The fact is that's nothing special or new for a DBA. The SQL standard itself violates the relational model by allowing you to create tables without primary keys. -- Brandon Aiken CS/IT Systems Engineer -----Original Message----- From: David Morton [mailto:mortonda@dgrmm.net] Sent: Monday, November 27, 2006 2:30 PM To: Brandon Aiken Cc: pgsql-general@postgresql.org Subject: Re: [GENERAL] IS it a good practice to use SERIAL as Primary Key? -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Nov 27, 2006, at 1:21 PM, Brandon Aiken wrote: > The other argument is that it's redundant data with no real meaning to > the domain, meaning using surrogate keys technically violates low- > order > normal forms. It has real meaning in the sense that it is an internal identifier that doesn't change. My bank set my online login to a stupid 5 letters of my name plus last four digits of SSN, and they "can not change" it. Most likely, it is the primary key used for as a foreign key to all the financial data. Dumb, dumb, dumb. If, OTOH, they would go with an internal id, it would be trivial to change the login id. David Morton Maia Mailguard http://www.maiamailguard.com mortonda@dgrmm.net -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (Darwin) iD8DBQFFazzQUy30ODPkzl0RAs/sAJ9rBTbXPNN/T4eQ9zjJFMAKFpfrPACdHcLj pVtAZhjxk24vgRm/ScNfuyw= =mLTC -----END PGP SIGNATURE-----
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Nov 27, 2006, at 1:21 PM, Brandon Aiken wrote: > The other argument is that it's redundant data with no real meaning to > the domain, meaning using surrogate keys technically violates low- > order > normal forms. It has real meaning in the sense that it is an internal identifier that doesn't change. My bank set my online login to a stupid 5 letters of my name plus last four digits of SSN, and they "can not change" it. Most likely, it is the primary key used for as a foreign key to all the financial data. Dumb, dumb, dumb. If, OTOH, they would go with an internal id, it would be trivial to change the login id. David Morton Maia Mailguard http://www.maiamailguard.com mortonda@dgrmm.net -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (Darwin) iD8DBQFFazzQUy30ODPkzl0RAs/sAJ9rBTbXPNN/T4eQ9zjJFMAKFpfrPACdHcLj pVtAZhjxk24vgRm/ScNfuyw= =mLTC -----END PGP SIGNATURE-----
> > Perhaps I am amazingly ignorant, but I have yet to find a case where my > > approach causes any real problems. What does using "real" data as a > > primary key buy you? The only real advantages I can see are that an > > individual record's data will be somewhat more human-readable without > > joining to other tables, and that your search queries can be simpler > > because they don't have to join against other tables. > > Joshua Drake wrote an interesting blog article earlier this year > about what happened when a firm he was contracting for dropped the > natural key on an important table and retained only the synthetic > key. Unfortunately I can't find it... :( I did? hmmm.... I just reviewed my blog entries and don't see one but I can certainly point out billions of reasons why that is a bad idea. Actually now that I think about it.. you are probably thinking of the other illustrious Josh.. Berkus, he does have a blog discussing the topic: http://blogs.ittoolbox.com/database/soup/archives/primary-keyvil-part-i-7327 Here is a simple example: id, first_name, last_name ========================= 1 | Joshua | Drake 2 | Joshua | Drake Which one is the Joshua Drake you want? But: id, first_name, last_name, address_id ===================================== 1 | Joshua | Drake | 1 2 | Joshua | Drake | 2 primary key (first_name,last_name,address_id) One will enable you to have a distinct Joshua Drake, one in portland oregon and one in portland maine. The other, you will have more than one Joshua Drake and not know if you are applying your sales to the correct one (without a whole bunch of work). Besides, I am sure that many in this community will agree that one Joshua Drake is certainly enough. Sincerely, Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
> ...*most especially* when they are the only unique key. There are usually other keys which should be unique, and this should certainly be reflected in the db design. On the other hand, designers should not strive to find and enforce combinations that won't actually necessarily be unique, such as the above-cited example of first 5 letters of last name + last 4 of SSN. (There are certainly more than 10,000 Smiths in the US. In fact: there will be more than 10,000 Smiths in each of most of the 50 states!) > If I base a master sales table on account_number and date/time, then > every CPA in the country will descend on me with calculators > sharpened if I decide to update the SALE_DATE column. But if the company is sold/merged, it is likely that accounts will get new account numbers, and even possible that account numbers will not be unique across the union of the (formerly) two companies' accounts thus absolutely requiring account number changes. This is exactly the kind of thing I'm talking about, and why I think account # + date/time would be a lousy primary key. It's fine to treat it as a key, but certainly not the primary. -- Scott Ribe scott_ribe@killerbytes.com http://www.killerbytes.com/ (303) 722-0567 voice
> primary key (first_name,last_name,address_id) > > One will enable you to have a distinct Joshua Drake, one in portland > oregon and one in portland maine. What happens when you move? Do we treat you as a different person? Or do we pretend that you've always lived in the same place? -- Scott Ribe scott_ribe@killerbytes.com http://www.killerbytes.com/ (303) 722-0567 voice
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 11/27/06 14:34, Scott Ribe wrote: >> ...*most especially* when they are the only unique key. > > There are usually other keys which should be unique, and this should > certainly be reflected in the db design. On the other hand, designers should > not strive to find and enforce combinations that won't actually necessarily > be unique, such as the above-cited example of first 5 letters of last name + > last 4 of SSN. (There are certainly more than 10,000 Smiths in the US. In > fact: there will be more than 10,000 Smiths in each of most of the 50 > states!) Just because your (bank?) creates a lame username, doesn't mean that there shouldn't be one. >> If I base a master sales table on account_number and date/time, then >> every CPA in the country will descend on me with calculators >> sharpened if I decide to update the SALE_DATE column. > > But if the company is sold/merged, it is likely that accounts will get new > account numbers, and even possible that account numbers will not be unique > across the union of the (formerly) two companies' accounts thus absolutely > requiring account number changes. This is exactly the kind of thing I'm > talking about, and why I think account # + date/time would be a lousy > primary key. It's fine to treat it as a key, but certainly not the primary. OK, let's use a synthetic key on the sales master table. In fact, *both* companies have a synthetic key on their sales master tables. OMG, conflicting/overlapping synthetic keys!!!!! - -- Ron Johnson, Jr. Jefferson LA USA Is "common sense" really valid? For example, it is "common sense" to white-power racists that whites are superior to blacks, and that those with brown skins are mud people. However, that "common sense" is obviously wrong. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (GNU/Linux) iD8DBQFFa1AUS9HxQb37XmcRAt0fAJsFPJfjMUEv+2E2XELq6Av6ZFZ98gCfXnkf sJeeyjr3Bq2T9N5Sd0ca7SY= =vedL -----END PGP SIGNATURE-----
On Mon, 2006-11-27 at 13:38 -0700, Scott Ribe wrote: > > primary key (first_name,last_name,address_id) > > > > One will enable you to have a distinct Joshua Drake, one in portland > > oregon and one in portland maine. > > What happens when you move? Do we treat you as a different person? Or do we > pretend that you've always lived in the same place? You would update the address, the address id wouldn't change. If you want to keep track of old addresses you would keep an archive table associated with the user.id. Sincerely, Joshua D. Drake > -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
> You would update the address, the address id wouldn't change. If you > want to keep track of old addresses you would keep an archive table > associated with the user.id. But what about historical data that referenced the address? If you move today, I still want to know where I shipped last week's orders. -- Scott Ribe scott_ribe@killerbytes.com http://www.killerbytes.com/ (303) 722-0567 voice
> OK, let's use a synthetic key on the sales master table. In fact, > *both* companies have a synthetic key on their sales master tables. > > OMG, conflicting/overlapping synthetic keys!!!!! Which are not exposed in any UI and therefore easier to change? Heck, I'd use UUIDs everywhere if it weren't such a pain in the neck to type them into queries. -- Scott Ribe scott_ribe@killerbytes.com http://www.killerbytes.com/ (303) 722-0567 voice
Right, but when you write your script (or whatever) that cleans these keys up and does the merge, you're where you started -- arbitrary integer keys with no meaning. If you merge databases where the keys are *supposed* to have meaning, you then have to mangle *real* data to make them merge. Ron Johnson wrote: >OK, let's use a synthetic key on the sales master table. In fact, >*both* companies have a synthetic key on their sales master tables. > >OMG, conflicting/overlapping synthetic keys!!!!! > > > >
> Right, but when you write your script (or whatever) that cleans these > keys up and does the merge, you're where you started -- arbitrary > integer keys with no meaning. If you merge databases where the keys are > *supposed* to have meaning, you then have to mangle *real* data to make > them merge. Also, there are plenty of circumstances besides the sale/merge one I mentioned which would require account # changes but not require changes to the synthetic keys. -- Scott Ribe scott_ribe@killerbytes.com http://www.killerbytes.com/ (303) 722-0567 voice
On Mon, 2006-11-27 at 14:00 -0700, Scott Ribe wrote: > > You would update the address, the address id wouldn't change. If you > > want to keep track of old addresses you would keep an archive table > > associated with the user.id. > > But what about historical data that referenced the address? If you move > today, I still want to know where I shipped last week's orders. As I mentioned, you can use an archive table, or you could have a column in the order table that references the sale_to or ship_to address id. There are actually many different ways to do this. For example: insert a new address, and update the users table to the new address_id. The new address has a column called active(boolean). Thus all your sales orders would reference an inactive, but valid address for that sale. Sincerely, Joshua D. Drake > > -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
> insert a new address, and update the users table to the new address_id Which changes the user's "primary key". My point was that having the address id be part of the primary key is wrong. Having it be a part of a key may be fine for many uses. But it's contrary to the notion of primary key that something that not only can, but will, change for many records should be part of the primary key. "Unique" and "primary" are *not* synonyms. -- Scott Ribe scott_ribe@killerbytes.com http://www.killerbytes.com/ (303) 722-0567 voice
Shenanigans! That problem occurs regardless of whether or not you use surrogate keys. You have exceeded the scope of the example. -- Brandon Aiken CS/IT Systems Engineer -----Original Message----- From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Scott Ribe Sent: Monday, November 27, 2006 4:01 PM To: Joshua D. Drake; pgsql-general@postgresql.org Subject: Re: [GENERAL] IS it a good practice to use SERIAL as Primary Key? > You would update the address, the address id wouldn't change. If you > want to keep track of old addresses you would keep an archive table > associated with the user.id. But what about historical data that referenced the address? If you move today, I still want to know where I shipped last week's orders. -- Scott Ribe scott_ribe@killerbytes.com http://www.killerbytes.com/ (303) 722-0567 voice ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings
> Shenanigans! > > That problem occurs regardless of whether or not you use surrogate keys. > You have exceeded the scope of the example. Yes the "problem occurs" in that this is something that needs to be tracked, but the suggested schema presents peculiar problems for what otherwise is a pretty simple thing. Reasonable solution: every address is kept, so each address is a row in an address table, with its own id. Shipment rows simply record the id of the address current at the time of shipment. The user row simply records the id of the most-recent address as the now-current one. That's rational. But if you use the address id as part of the user's primary key, well now you have a primary key that keeps changing as users move. On the other hand, if you don't change the address key but the contents of the address row itself, in order to preserve this bizarre notion of primary key, you have to keep a copy somewhere of the same address with a different id in order to use that for shipments. >> You would update the address, the address id wouldn't change. If you >> want to keep track of old addresses you would keep an archive table >> associated with the user.id. > > But what about historical data that referenced the address? If you move > today, I still want to know where I shipped last week's orders. -- Scott Ribe scott_ribe@killerbytes.com http://www.killerbytes.com/ (303) 722-0567 voice
On Mon, 2006-11-27 at 14:36 -0700, Scott Ribe wrote: > > insert a new address, and update the users table to the new address_id > > Which changes the user's "primary key". My point was that having the address > id be part of the primary key is wrong. As I said, you don't *have* to do it that way. I was just giving an example. You could just as easily grab the address id, insert that into an archive table with a date stamp and then just update the address itself. Thus *not* changing the "Primary Key". Joshua D. Drake > Having it be a part of a key may be > fine for many uses. But it's contrary to the notion of primary key that > something that not only can, but will, change for many records should be > part of the primary key. "Unique" and "primary" are *not* synonyms. > -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
>>> insert a new address, and update the users table to the new address_id >> >> Which changes the user's "primary key". My point was that having the address >> id be part of the primary key is wrong. > > As I said, you don't *have* to do it that way. I was just giving an > example. You could just as easily grab the address id, insert that into > an archive table with a date stamp and then just update the address > itself. Thus *not* changing the "Primary Key". Thus making it more difficult to deal with historical data, and also reducing the "address id" in the "user" row to nothing more than an additional auto-generated number referencing address data that might as well just be put into the user row, because that would be no less normalized anyway than this single address row whose contents keep changing to represent different addresses over time. Using the address id as part of the user primary key forces this choice between a user primary key which changes, or funkiness in tracking addresses. Whether you change the id stored in the user row, or whether you update the address row, you are still constructing a primary key from data that is expected to change. Even the justification for it is thin. The address is certainly one good way to figure out which of two users with the same name is being referred to, but that doesn't make it part of a reasonable primary key. Phone numbers, credit card used, order history--all of these could help discriminate between users and not one of them has any place in the users' primary key. -- Scott Ribe scott_ribe@killerbytes.com http://www.killerbytes.com/ (303) 722-0567 voice
On Mon, 2006-11-27 at 15:47 -0700, Scott Ribe wrote: > >>> insert a new address, and update the users table to the new address_id > >> > >> Which changes the user's "primary key". My point was that having the address > >> id be part of the primary key is wrong. > > > > As I said, you don't *have* to do it that way. I was just giving an > > example. You could just as easily grab the address id, insert that into > > an archive table with a date stamp and then just update the address > > itself. Thus *not* changing the "Primary Key". > > Thus making it more difficult to deal with historical data, and also > reducing the "address id" in the "user" row to nothing more than an > additional auto-generated number referencing address data that might as well > just be put into the user row, because that would be no less normalized > anyway than this single address row whose contents keep changing to > represent different addresses over time. O.k., do you make it a point of over analyzing everything? I gave a very simple example of how to not use an artificial key and why it could be bad. I wasn't meant to be a golden parachute. Of course there are problems with the example. It was a 10 second example with zero business or data requirements qualified around it. Please... find something more productive to do. Sincerely, Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
This list represents the most informed database admins I know, and while the conversation can easily devolve into minutae, I am genuinely interested in everyone's opinions on this subject. Primary keys are a pretty central aspect of database design, and most everyone on this list has unique experiences with the pros and cons of each approach...I'm pretty interested in the discussion. Joshua D. Drake wrote: >Please... find something more productive to do. > >Sincerely, > >Joshua D. Drake > > > >
On Mon, 2006-11-27 at 17:09 -0600, John McCawley wrote: > This list represents the most informed database admins I know, and while > the conversation can easily devolve into minutae, I am genuinely > interested in everyone's opinions on this subject. Primary keys are a > pretty central aspect of database design, and most everyone on this list > has unique experiences with the pros and cons of each approach...I'm > pretty interested in the discussion. Well sure, but the conversation was no longer productive. People were getting stuck on what amounted to a very trivial example. There were huge problems with that example and it in no way could represent an legitimate use in production without a whole lot of caveats. If we want to have a discussion about artificial versus natural keys, rock on.. but the answer is simple: A artificial does not protect against duplication. That's it, in a nut shell. There is no argument there. That is why you don't use artificial keys. That said... pretty much every table I create will have an artificial key... because it makes managing data easy. An example (to reuse the simple example): users ===== id serial unique, first_name text, last_name text, primary key (first_name,last_name) Yes there are problems with the above, namely you will likely have more than one joshua drake. Sincerely, Joshua D. Drake > > Joshua D. Drake wrote: > > >Please... find something more productive to do. > > > >Sincerely, > > > >Joshua D. Drake > > > > > > > > > > ---------------------------(end of broadcast)--------------------------- > TIP 1: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly > -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
I promise I'm not trying to be a pain in the butt ;) Do you then use your serial id as your foreign key in other tables, or the firstname/lastname primary key? Joshua D. Drake wrote: >users >===== >id serial unique, >first_name text, >last_name text, >primary key (first_name,last_name) > >Yes there are problems with the above, namely you will likely have more >than one joshua drake. > >Sincerely, > >Joshua D. Drake > > >> >>
> A artificial does not protect against duplication. > > That's it, in a nut shell. There is no argument there. That is why you > don't use artificial keys. Sure, but in many cases natural primary keys simply do not exist. (People being the prime example.) Many examples of what are proposed as natural primary keys are actually not--they are very often non-unique or prone to change, or both. Relational theory is quite powerful, but its dogmatic application often conflicts with the mess that is the real world. Given a choice between a synthetic primary key and a sloppy approximation of a natural one (or even an accurate-looking one handed to me by a clean-shaven guy in an expensive suit), I'll take the synthetic one because my experience has been that over the long term it will cause fewer problems by far. -- Scott Ribe scott_ribe@killerbytes.com http://www.killerbytes.com/ (303) 722-0567 voice
On Mon, 2006-11-27 at 17:31 -0600, John McCawley wrote: > I promise I'm not trying to be a pain in the butt ;) Do you then use > your serial id as your foreign key in other tables, or the > firstname/lastname primary key? Now that is a good question. I would use the id, but that is not technically proper :). Joshua D. Drake > > Joshua D. Drake wrote: > > >users > >===== > >id serial unique, > >first_name text, > >last_name text, > >primary key (first_name,last_name) > > > >Yes there are problems with the above, namely you will likely have more > >than one joshua drake. > > > >Sincerely, > > > >Joshua D. Drake > > > > > >> > >> > -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 11/27/06 17:25, Joshua D. Drake wrote: > On Mon, 2006-11-27 at 17:09 -0600, John McCawley wrote: [snip] > users > ===== > id serial unique, > first_name text, > last_name text, > primary key (first_name,last_name) > > Yes there are problems with the above, namely you will likely have more > than one joshua drake. Right, and then the question gets to: how do you create a "good" user id? Many prefer serial types; I prefer something that is not a monotonically incrementing scalar. - -- Ron Johnson, Jr. Jefferson LA USA Is "common sense" really valid? For example, it is "common sense" to white-power racists that whites are superior to blacks, and that those with brown skins are mud people. However, that "common sense" is obviously wrong. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (GNU/Linux) iD8DBQFFa39KS9HxQb37XmcRAm+ZAKCAcpky09a2eTYMZX6HZiPIMECfcgCgrkXf 9oj3C45XcamX8avi5lYVn98= =f0Ct -----END PGP SIGNATURE-----
>> I promise I'm not trying to be a pain in the butt ;) Do you then use >> your serial id as your foreign key in other tables, or the >> firstname/lastname primary key? > > Now that is a good question. I would use the id, but that is not > technically proper :). But firstname/lastname is *NOT* a primary key. Merely calling it one in the db schema does not make it so. -- Scott Ribe scott_ribe@killerbytes.com http://www.killerbytes.com/ (303) 722-0567 voice
On Nov 28, 2006, at 9:02 , Joshua D. Drake wrote: > On Mon, 2006-11-27 at 17:31 -0600, John McCawley wrote: >> I promise I'm not trying to be a pain in the butt ;) Do you then use >> your serial id as your foreign key in other tables, or the >> firstname/lastname primary key? > > Now that is a good question. I would use the id, but that is not > technically proper :). If you have both a surrogate key (the serial column) as well as a natural key (e.g., the (first_name, last_name) composite key), what difference does it make? You can get to the first_name, last_name data via a join on the surrogate key. I'm currently loosely in the surrogate key + natural key camp. One advantage of this is that if there *is* a change to the natural key, it can be changed in one table rather than cascading throughout the database schema. Michael Glaesemann grzm seespotcode net (Can we talk about NULL next? :P)
On Mon, 2006-11-27 at 18:14 -0600, Ron Johnson wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On 11/27/06 17:25, Joshua D. Drake wrote: > > On Mon, 2006-11-27 at 17:09 -0600, John McCawley wrote: > [snip] > > users > > ===== > > id serial unique, > > first_name text, > > last_name text, > > primary key (first_name,last_name) > > > > Yes there are problems with the above, namely you will likely have more > > than one joshua drake. > > Right, and then the question gets to: how do you create a "good" > user id? Many prefer serial types; I prefer something that is not a > monotonically incrementing scalar. Well what is wrong with using just a serial or bigserial? In that case specifically? You could... if you really wanted to... select md5(id||first_name||last_name) from id; ;) Joshua D. Drake > > - -- > Ron Johnson, Jr. > Jefferson LA USA > > Is "common sense" really valid? > For example, it is "common sense" to white-power racists that > whites are superior to blacks, and that those with brown skins > are mud people. > However, that "common sense" is obviously wrong. > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.5 (GNU/Linux) > > iD8DBQFFa39KS9HxQb37XmcRAm+ZAKCAcpky09a2eTYMZX6HZiPIMECfcgCgrkXf > 9oj3C45XcamX8avi5lYVn98= > =f0Ct > -----END PGP SIGNATURE----- > > ---------------------------(end of broadcast)--------------------------- > TIP 3: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faq > -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
On Mon, 2006-11-27 at 17:22 -0700, Scott Ribe wrote: > >> I promise I'm not trying to be a pain in the butt ;) Do you then use > >> your serial id as your foreign key in other tables, or the > >> firstname/lastname primary key? > > > > Now that is a good question. I would use the id, but that is not > > technically proper :). > > But firstname/lastname is *NOT* a primary key. Merely calling it one in the > db schema does not make it so. It certainly could be a primary key it depends on the data requirements. Joshua D. Drake > -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
On Tue, 2006-11-28 at 09:29 +0900, Michael Glaesemann wrote: > On Nov 28, 2006, at 9:02 , Joshua D. Drake wrote: > > > On Mon, 2006-11-27 at 17:31 -0600, John McCawley wrote: > >> I promise I'm not trying to be a pain in the butt ;) Do you then use > >> your serial id as your foreign key in other tables, or the > >> firstname/lastname primary key? > > > > Now that is a good question. I would use the id, but that is not > > technically proper :). > > If you have both a surrogate key (the serial column) as well as a > natural key (e.g., the (first_name, last_name) composite key), what > difference does it make? You can get to the first_name, last_name > data via a join on the surrogate key. The point is easy data management... SELECT * FROM names; UPDATE names set first_name = 'foo' WHERE id = 6 vs SELECT * FROM names; UPDATE names set first_name = 'foo' WHERE first_name = 'Joshua' AND last_name = 'Drake'; Or did I misunderstand your question? Joshua D. Drake > > I'm currently loosely in the surrogate key + natural key camp. One > advantage of this is that if there *is* a change to the natural key, > it can be changed in one table rather than cascading throughout the > database schema. > > Michael Glaesemann > grzm seespotcode net > > (Can we talk about NULL next? :P) > > > ---------------------------(end of broadcast)--------------------------- > TIP 3: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faq > -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
On Nov 28, 2006, at 9:55 , Joshua D. Drake wrote: > On Tue, 2006-11-28 at 09:29 +0900, Michael Glaesemann wrote: >> On Nov 28, 2006, at 9:02 , Joshua D. Drake wrote: >> >>> On Mon, 2006-11-27 at 17:31 -0600, John McCawley wrote: >>>> I promise I'm not trying to be a pain in the butt ;) Do you >>>> then use >>>> your serial id as your foreign key in other tables, or the >>>> firstname/lastname primary key? >>> >>> Now that is a good question. I would use the id, but that is not >>> technically proper :). >> >> If you have both a surrogate key (the serial column) as well as a >> natural key (e.g., the (first_name, last_name) composite key), what >> difference does it make? You can get to the first_name, last_name >> data via a join on the surrogate key. > > The point is easy data management... > > SELECT * FROM names; > UPDATE names set first_name = 'foo' WHERE id = 6 > vs > SELECT * FROM names; > UPDATE names set first_name = 'foo' WHERE first_name = 'Joshua' AND > last_name = 'Drake'; > > Or did I misunderstand your question? question was more rhetorical than anything else. In the case of having both a surrogate key and a natural key, you have the advantage of both. You can perform the update just as you describe on the names table. Where I see an advantage of natural keys is when you're updating a table that references "names". Then, all you've got is an integer if you're using the surrogate key. UPDATE ... FROM and DELETE ... USING are very helpful in this situation. Though I'd rather build an app to perform such updates than enter them via direct SQL. Michael Glaesemann grzm seespotcode net
On Nov 23, 2006, at 10:23 AM, Ron Johnson wrote: > I'm one of those who thinks that a (possibly multisegment) natural > key *does* exist, and that if you think it doesn't, your design is > wrong. I agree, but that doesn't mean you want to be spreading that multi- field key throughout your database. For example, origination, destination, flight_number, and date together make a unique key for a segment of a flight. But that doesn't mean you want to be storing all that info in every other table in the database that references a segment. Not only does moving all that data around become non- trivial, writing all that stuff into all the joins is error-prone (not to mention a PITA). -- Jim Nasby jim@nasby.net EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
Ron,
Yes. There are a lot of numbers, but I the concept of a "primary key" is totally different from the concept of "number".
a "Primary Key" is supposed to identify <something< across time and space.
Some time people thought that an IP-Number will do this... soonly after there was NAT.
Again, somebody thougt, "every microsecond is unique", and then there was daylight saving.
Who gives guarantees that credit card numbers will not get reused after years of inactivity?
Yes, there are natural keys, and it's good to use them to identify things for humans. But I got bitten many times by using them as primary. Many things that "will not change, ever", just were changed on the next meeting.
Harald
--
GHUM Harald Massa
persuadere et programmare
Harald Armin Massa
Reinsburgstraße 202b
70197 Stuttgart
0173/9409607
-
Python: the only language with more web frameworks than keywords.
> In the "real world" there is no such thing as a primary key. At least not
> over time. Not enough people understand the concept of a primary key to
> make those things existent in the real world.
There are lots of numbers. Credit card numbers, account numbers
sales ticket numbers, etc, etc ad nauseum. Julian day and Julian
date, even. You can't have lived in "the west" in the past 30 years
without being surrounded by them.
Yes. There are a lot of numbers, but I the concept of a "primary key" is totally different from the concept of "number".
a "Primary Key" is supposed to identify <something< across time and space.
Some time people thought that an IP-Number will do this... soonly after there was NAT.
Again, somebody thougt, "every microsecond is unique", and then there was daylight saving.
Who gives guarantees that credit card numbers will not get reused after years of inactivity?
Yes, there are natural keys, and it's good to use them to identify things for humans. But I got bitten many times by using them as primary. Many things that "will not change, ever", just were changed on the next meeting.
Harald
--
GHUM Harald Massa
persuadere et programmare
Harald Armin Massa
Reinsburgstraße 202b
70197 Stuttgart
0173/9409607
-
Python: the only language with more web frameworks than keywords.
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 11/27/06 23:58, Harald Armin Massa wrote: > Ron, > >> >> > In the "real world" there is no such thing as a primary key. At least >> not >> > over time. Not enough people understand the concept of a primary key to >> > make those things existent in the real world. >> >> There are lots of numbers. Credit card numbers, account numbers >> sales ticket numbers, etc, etc ad nauseum. Julian day and Julian >> date, even. You can't have lived in "the west" in the past 30 years >> without being surrounded by them. > > > Yes. There are a lot of numbers, but I the concept of a "primary key" is > totally different from the concept of "number". > > a "Primary Key" is supposed to identify <something< across time and space. > > Some time people thought that an IP-Number will do this... soonly after > there was NAT. > > Again, somebody thougt, "every microsecond is unique", and then there was > daylight saving. Until we get to velocities near c, every microsecond (ticks in OpenVMS are 100ns intervals, 0.1us, but that's just getting picky) *is* unique. But since DST is a valid concern, that's why you don't store the string 'YYYY-AA-DD HH:MM:SS.mmmm', you store a bigint with the ticks since epoch, and let the app interpret into environment-appropriate form. > Who gives guarantees that credit card numbers will not get reused after > years of inactivity? > > Yes, there are natural keys, and it's good to use them to identify things > for humans. But I got bitten many times by using them as primary. Many > things that "will not change, ever", just were changed on the next meeting. - -- Ron Johnson, Jr. Jefferson LA USA Is "common sense" really valid? For example, it is "common sense" to white-power racists that whites are superior to blacks, and that those with brown skins are mud people. However, that "common sense" is obviously wrong. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (GNU/Linux) iD8DBQFFa+QjS9HxQb37XmcRAv6OAJ42fE2FSdhCdaySH1GUczpJlP1B1gCg1TpE OLTs3bhPVqT4YaFaNvMEbI0= =ktpr -----END PGP SIGNATURE-----
"Joshua D. Drake" wrote: <snip/> > > That's it, in a nut shell. There is no argument there. That is why you > don't use artificial keys. That said... pretty much every table I create > will have an artificial key... because it makes managing data easy. An > example (to reuse the simple example): > > users > ===== > id serial unique, > first_name text, > last_name text, > primary key (first_name,last_name) Why not users ===== id serial primary key, first_name text, last_name text, unique (first_name, last_name) ? This way, you can use the "id" as foreign key (more efficient), allows you to "UPDATE first_name" if needed, and if you find out you need to add another Joshua Drake to the DB, all you need is to drop the unique constraint?