Thread: PostgreSQL Developer Best Practices
so I've started my own. At the risk of stirring up a storm of controversy, I would
appreciate additional suggestions and feedback.
As a way of clarifying, generally, a DBA is someone that is responsible for maintaining the integrity of the database, while a developer is someone that writes code and SQL statements to update the data.
I've attached a file with a few starters that although are numbered, are in no special order.
Please keep in mind the attached are purely based on my years of experience working with developers that are not familiar with PostgreSQL and are not hard and fast rules, but general guidelines.
Hopefully this will result in something that brings about harmony between PostgreSQL DBA's and Developers.
--
Attachment
On Sat, Aug 22, 2015 at 11:15:07AM -0400, Melvin Davidson wrote: > PostgreSQL Developer Best Practices > > 1. Prefix ALL literals with an Escape > EG: SELECT E'This is a \'quoted literal \''; > SELECT E'This is an unquoted literal'; > > Doing so will prevent the annoying "WARNING: nonstandard use of escape in a string literal" This is certainly not Best Practice as the warning is annoying for a reason. Best Practice would rather be something along the lines: Avoid coding in a way that triggers "WARNING: nonstandard use of escape in a string literal". If you cannot comply with this rule document your reasons. > Good example: > CREATE TABLE accounts > ( accout_id bigint NOT NULL , Typo. Karsten -- GPG key ID E4071346 @ eu.pool.sks-keyservers.net E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
>annoying for a reason.
>Best Practice would rather be something along the lines:
> Avoid coding in a way that triggers "WARNING:
> nonstandard use of escape in a string literal". If you
> cannot comply with this rule document your reasons.
> Good example:
> CREATE TABLE accounts
> ( accout_id bigint NOT NULL ,
Typo.
On Sat, Aug 22, 2015 at 11:15:07AM -0400, Melvin Davidson wrote:
> PostgreSQL Developer Best Practices
>
> 1. Prefix ALL literals with an Escape
> EG: SELECT E'This is a \'quoted literal \'';
> SELECT E'This is an unquoted literal';
>
> Doing so will prevent the annoying "WARNING: nonstandard use of escape in a string literal"
This is certainly not Best Practice as the warning is
annoying for a reason.
Best Practice would rather be something along the lines:
Avoid coding in a way that triggers "WARNING:
nonstandard use of escape in a string literal". If you
cannot comply with this rule document your reasons.
> Good example:
> CREATE TABLE accounts
> ( accout_id bigint NOT NULL ,
Typo.
Karsten
--
GPG key ID E4071346 @ eu.pool.sks-keyservers.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
--
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.
Melvin Davidson <melvin6925@gmail.com> writes: >> Best Practice would rather be something along the lines: >>>> Avoid coding in a way that triggers "WARNING: >>>> nonstandard use of escape in a string literal". If you >>>> cannot comply with this rule document your reasons. > Thanks for the suggestion. For the past few months I've been dealing with > an error log that is filled with these warnings simply because > the developers do not comprehend how to use ( or the requirement to use) > an escape clause. IMO best practice in this area is "run with standard_conforming_strings = ON". If you're seeing this warning at all, it's because you aren't doing that, which means your code is unnecessarily unportable to other DBMSes. Adopting a coding policy of always using E'' would make that worse. regards, tom lane
Melvin Davidson <melvin6925@gmail.com> writes:
>> Best Practice would rather be something along the lines:
>>>> Avoid coding in a way that triggers "WARNING:
>>>> nonstandard use of escape in a string literal". If you
>>>> cannot comply with this rule document your reasons.
> Thanks for the suggestion. For the past few months I've been dealing with
> an error log that is filled with these warnings simply because
> the developers do not comprehend how to use ( or the requirement to use)
> an escape clause.
IMO best practice in this area is "run with standard_conforming_strings = ON".
If you're seeing this warning at all, it's because you aren't doing that,
which means your code is unnecessarily unportable to other DBMSes.
Adopting a coding policy of always using E'' would make that worse.
regards, tom lane
--
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.
> On Sat, Aug 22, 2015 at 1:16 PM, Tom Lane <tgl@sss.pgh.pa.us <mailto:tgl@sss.pgh.pa.us>> wrote: > > Melvin Davidson <melvin6925@gmail.com <mailto:melvin6925@gmail.com>> writes: > >> Best Practice would rather be something along the lines: > >>>> Avoid coding in a way that triggers "WARNING: > >>>> nonstandard use of escape in a string literal". If you > >>>> cannot comply with this rule document your reasons. > > > Thanks for the suggestion. For the past few months I've been dealing with > > an error log that is filled with these warnings simply because > > the developers do not comprehend how to use ( or the requirement to use) > > an escape clause. > > IMO best practice in this area is "run with standard_conforming_strings = ON". > If you're seeing this warning at all, it's because you aren't doing that, > which means your code is unnecessarily unportable to other DBMSes. > Adopting a coding policy of always using E'' would make that worse. > > regards, tom lane > On 08/22/2015 02:40 PM, Melvin Davidson wrote: > Tom, > > Thank you for pointing out "run with standard_conforming_strings = ON".. > However, that is NOT the problem. > What is occurring is that the developers are sending strings like 'Mr. M\'vey', > which, if we set standard_conforming_strings = ON, would, and does, result in errors and the statement failing, > which is a lot less desirable that a simple warning. > > Therefore, I am trying to educate the developers in the proper method of escaping strings, > instead of loading up the error log with annoying warnings. > Please dont top post. But you are not educating them correctly. Using E'' isnt right. The correct way to escape a quote is to double quote it: 'Mr. M''vey' -Andy
On Sat, Aug 22, 2015 at 1:16 PM, Tom Lane <tgl@sss.pgh.pa.us <mailto:tgl@sss.pgh.pa.us>> wrote:
Melvin Davidson <melvin6925@gmail.com <mailto:melvin6925@gmail.com>> writes:
>> Best Practice would rather be something along the lines:
>>>> Avoid coding in a way that triggers "WARNING:
>>>> nonstandard use of escape in a string literal". If you
>>>> cannot comply with this rule document your reasons.
> Thanks for the suggestion. For the past few months I've been dealing with
> an error log that is filled with these warnings simply because
> the developers do not comprehend how to use ( or the requirement to use)
> an escape clause.
IMO best practice in this area is "run with standard_conforming_strings = ON".
If you're seeing this warning at all, it's because you aren't doing that,
which means your code is unnecessarily unportable to other DBMSes.
Adopting a coding policy of always using E'' would make that worse.
regards, tom lane
On 08/22/2015 02:40 PM, Melvin Davidson wrote:Tom,
Thank you for pointing out "run with standard_conforming_strings = ON"..
However, that is NOT the problem.
What is occurring is that the developers are sending strings like 'Mr. M\'vey',
which, if we set standard_conforming_strings = ON, would, and does, result in errors and the statement failing,
which is a lot less desirable that a simple warning.
Therefore, I am trying to educate the developers in the proper method of escaping strings,
instead of loading up the error log with annoying warnings.
Please dont top post.
But you are not educating them correctly. Using E'' isnt right. The correct way to escape a quote is to double quote it: 'Mr. M''vey'
-Andy
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
--
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.
beyond our control at this point. Therefore, the best solution is to use ESCAPE E.>The correct way to escape a quote is to double quote it: 'Mr. M''vey'That is a matter of opinion. However, the real problem is the enclosed backslashes, which is
On 8/22/2015 12:40 PM, Melvin Davidson wrote: > What is occurring is that the developers are sending strings like 'Mr. > M\'vey', > which, if we set standard_conforming_strings = ON, would, and does, > result in errors and the statement failing, > which is a lot less desirable that a simple warning. if your developers were writing invalid C code (or whatever other language they program in), would you modify the compiler to accept their invalid syntax? or would you tell the developer to fix their code properly ? if the developers refused, why would you not fire them on the spot for incompetence ? -- john r pierce, recycling bits in santa cruz
On 8/22/2015 12:40 PM, Melvin Davidson wrote:What is occurring is that the developers are sending strings like 'Mr. M\'vey',
which, if we set standard_conforming_strings = ON, would, and does, result in errors and the statement failing,
which is a lot less desirable that a simple warning.
if your developers were writing invalid C code (or whatever other language they program in), would you modify the compiler to accept their invalid syntax? or would you tell the developer to fix their code properly ? if the developers refused, why would you not fire them on the spot for incompetence ?
--
john r pierce, recycling bits in santa cruz
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
--
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.
Unfortunately, so far, people seem to fixate on item one of my guidelines and I've had no additional suggestions.John,I believe you and I think alike. The truth is, I was brought on as a consultant to help this client, so I do not have the authority to fire the developers. Rather, I am trying to help them fix the absolute worst designed DB and coding I have seen in 15 years of working with PostgreSQL. So I've asked for input on additional guidelines to help try to help them understand the right way to do things.
Melvin Davidson schrieb am 22.08.2015 um 21:40: > Thank you for pointing out "run with standard_conforming_strings = ON".. > However, that is NOT the problem. > What is occurring is that the developers are sending strings like 'Mr. M\'vey', > which, if we set standard_conforming_strings = ON, would, and does, result in errors and the statement failing, > which is a lot less desirable that a simple warning. > > Therefore, I am trying to educate the developers in the proper method of escaping strings, > instead of loading up the error log with annoying warnings. I strongly disagree: the error is not "annoying" and the statement _should_ fail. The only way you can make the developers stop using that non-standard syntax is to make the satement fail. Thomas
Melvin Davidson schrieb am 22.08.2015 um 17:15: > I've attached a file with a few starters that although are numbered, > are in no special order. > 2. End ALL queries with a semi-colon (;) > EG: SELECT some_column FROM a_table; > > Although autocommit is on by default, it is always a good idea to signal the query processor that a statement is completewith the semicolon. > Failure to do so could result in <IDLE IN TRANSACTION>, which will > hold locks on the tables involved and prevent other queries from being processed. Terminating a statement with ; has nothing to do with "<idle in transaction>" connections. It is a mere syntax thing to make the SQL client (e.g. psql) recognize the end of the statement. If you don't use it, your statement won't be executed in the first place - at least with psql as it will wait indefinitely until you finish typing the statement. A GUI client might simply send the wrong statement to the backend. If you run with autocommit disabled, ending each statement with a semicolon, will not prevent your connection from getting into that "<idle in transaction>" state. You have to end the _transaction_ using commit or rollback to avoid that. I do agree with the "end all queries with a semi-colon" rule, but the explanation is wrong. You should have another rule that says: End all transactions as soon as possible using commit or rollback. Thomas
Melvin Davidson schrieb am 22.08.2015 um 17:15:
> I've attached a file with a few starters that although are numbered,
> are in no special order.
> 2. End ALL queries with a semi-colon (;)
> EG: SELECT some_column FROM a_table;
>
> Although autocommit is on by default, it is always a good idea to signal the query processor that a statement is complete with the semicolon.
> Failure to do so could result in <IDLE IN TRANSACTION>, which will
> hold locks on the tables involved and prevent other queries from being processed.
Terminating a statement with ; has nothing to do with "<idle in transaction>" connections.
It is a mere syntax thing to make the SQL client (e.g. psql) recognize the end of the statement.
If you don't use it, your statement won't be executed in the first place - at least with psql
as it will wait indefinitely until you finish typing the statement. A GUI client might simply send
the wrong statement to the backend.
If you run with autocommit disabled, ending each statement with a semicolon, will not prevent your connection
from getting into that "<idle in transaction>" state. You have to end the _transaction_ using commit or
rollback to avoid that.
I do agree with the "end all queries with a semi-colon" rule, but the explanation is wrong.
You should have another rule that says:
End all transactions as soon as possible using commit or rollback.
Thomas
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
--
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.
> 1. Prefix ALL literals with an Escape
> EG: SELECT E'This is a \'quoted literal \'';
> SELECT E'This is an unquoted literal';
>
> Doing so will prevent the annoying "WARNING: nonstandard use of escape in a string literal"
I'd be concerned that what is missing here is the bigger issue of Best Practice #0: Use Bound Variables.
9. Do NOT arbitrarily assign an "id" column to a table as a primary key when other columns
are perfectly suited as a unique primary key.
...
Good example:
CREATE TABLE accounts
( accout_id bigint NOT NULL ,
On Sat, Aug 22, 2015 at 11:46 AM, Karsten Hilbert <Karsten.Hilbert@gmx.net> wrote:> 1. Prefix ALL literals with an Escape
> EG: SELECT E'This is a \'quoted literal \'';
> SELECT E'This is an unquoted literal';
>
> Doing so will prevent the annoying "WARNING: nonstandard use of escape in a string literal"
I'd be concerned that what is missing here is the bigger issue of Best Practice #0: Use Bound Variables.The only way I've seen invalid literals show up in SQL queries is through the dynamic generation of SQL Statements vs. using bound variables.Not using bound variables is your doorway to SQL injection exploits.9. Do NOT arbitrarily assign an "id" column to a table as a primary key when other columns
are perfectly suited as a unique primary key....Good example:
CREATE TABLE accounts
( accout_id bigint NOT NULL ,I would not consider the general use of natural primary keys to be best practice.Let's assume your account_id field is used as a foreign key in a dozen other tables.1) What happens if someone mis-types the account-id?To correct that, you also need to correct the FK field in the other dozen tables.2) What happens when your company starts a new project (or buys a competitor) and all the new account numbers are alpha-numeric?3) Your example shows the id as a bigint, but your rule is not limited to integers.What if your table is country populations and the primary key is country name?Now, you have quite large foreign keys (and a country changing its name is not unheard of).(and let's not even get started on case-sensitivity or character encodings).--Raymond Cote, Presidentvoice: +1.603.924.6079 email: rgacote@AppropriateSolutions.com skype: ray.cote
--
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.
9.I would reply that in good applications, the user DOES NOT type the key, but rather selects from a drop down list, or the app looks it up / enters it for them. Besides, it's just as easy to miskey an integer as it is an aplha numeric. The point is, do not create two primary pkey's when one will do.>1) What happens if someone mis-types the account-id?> To correct that, you also need to correct the FK field in the other dozen tables.>2) What happens when your company starts a new project (or buys a competitor) >and all the new account numbers are alpha-numeric?
On Sat, Aug 22, 2015 at 11:46 AM, Karsten Hilbert <Karsten.Hilbert@gmx.net> wrote:> 1. Prefix ALL literals with an Escape
> EG: SELECT E'This is a \'quoted literal \'';
> SELECT E'This is an unquoted literal';
>
> Doing so will prevent the annoying "WARNING: nonstandard use of escape in a string literal"
I'd be concerned that what is missing here is the bigger issue of Best Practice #0: Use Bound Variables.The only way I've seen invalid literals show up in SQL queries is through the dynamic generation of SQL Statements vs. using bound variables.Not using bound variables is your doorway to SQL injection exploits.
9.I would reply that in good applications, the user DOES NOT type the key, but rather selects from a drop down list, or the app looks it up / enters it for them. Besides, it's just as easy to miskey an integer as it is an aplha numeric. The point is, do not create two primary pkey's when one will do.>1) What happens if someone mis-types the account-id?> To correct that, you also need to correct the FK field in the other dozen tables.>2) What happens when your company starts a new project (or buys a competitor) >and all the new account numbers are alpha-numeric?Your missing the point. The existing "Account ID" that you refer to is apparently externally defined. Pretend it is a social security number. How would one create a new user in your system, and record their account_id/social-security-number, without typing it in. What then if it is discovered that the keyed in value was mis-typed?The "point" is to not introduce redundant information. Creating your own surrogate identifier in order to avoid using a surrogate identifier value created by another system does not introduce redundancy but rather provides the system using the primary key control over its generation and, more importantly, format. The highly situational nature of this is why "data modelling" is not something I'd incorporate in a "usage" document.David J.
--
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.
it can be fixed, be it one method of another.>What then if it is discovered that the keyed in value was mis-typed?That is why SQL has UPDATE and DELETE statements. If a primary key is incorrect,
9. Do NOT arbitrarily assign an "id" column to a table as a primary key when other columns
are perfectly suited as a unique primary key....Good example:
CREATE TABLE accounts
( accout_id bigint NOT NULL ,I would not consider the general use of natural primary keys to be best practice.Let's assume your account_id field is used as a foreign key in a dozen other tables.1) What happens if someone mis-types the account-id?To correct that, you also need to correct the FK field in the other dozen tables.2) What happens when your company starts a new project (or buys a competitor) and all the new account numbers are alpha-numeric?
On 08/24/2015 07:58 AM, John Turner wrote: > On Mon, 24 Aug 2015 09:15:27 -0400, Ray Cote > > Point 9 is well-intentioned, but perhaps needs to be > clarified/rephrased: Developers should not be creating production-grade > tables devoid of well-defined business keys, period. That would be > regardless of whether they're used as de facto primary keys or simply as > unique keys. Although I appreciate your argument, I think we need a little foundation in reality. The "serial" key is the default primary key amongst every single web development environment in existence. We can make an argument within the doc to why that can be bad, but to state that it is "wrong" is just not going to get you anywhere. JD -- Command Prompt, Inc. - http://www.commandprompt.com/ 503-667-4564 PostgreSQL Centered full stack support, consulting and development. Announcing "I'm offended" is basically telling the world you can't control your own emotions, so everyone else should do it for you.
On 08/24/2015 07:58 AM, John Turner wrote:On Mon, 24 Aug 2015 09:15:27 -0400, Ray Cote
Point 9 is well-intentioned, but perhaps needs to be
clarified/rephrased: Developers should not be creating production-grade
tables devoid of well-defined business keys, period. That would be
regardless of whether they're used as de facto primary keys or simply as
unique keys.
Although I appreciate your argument, I think we need a little foundation in reality. The "serial" key is the default primary key amongst every single web development environment in existence.
We can make an argument within the doc to why that can be bad, but to state that it is "wrong" is just not going to get you anywhere.
JD
--
Command Prompt, Inc. - http://www.commandprompt.com/ 503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Announcing "I'm offended" is basically telling the world you can't
control your own emotions, so everyone else should do it for you.
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
--
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.
On 08/24/2015 08:56 AM, Melvin Davidson wrote: > >The "serial" key is the default primary key amongst every single web > development environment in existence. > > Methinks thou doest take too much for granted. > > Yes, serial has it's purpose, but I sincerely doubt it is "the default > primary key amongst every single web development environment in existence" > I am not sure where you get your stats from. Probably you are referring > to "Ruby on Rails". IMHO, RoR is something which has made it easier to > code Web apps, at the cost of developers not needing to use brain power. > In any case, the idea is to develop good database design. not web apps. Well I think the broader definition is that surrogate key use is a byproduct of ORM use. Web development frameworks tend to use an ORM as the default way of interacting with the database so you get surrogate keys. > > On Mon, Aug 24, 2015 at 11:46 AM, Joshua D. Drake <jd@commandprompt.com > <mailto:jd@commandprompt.com>> wrote: > > On 08/24/2015 07:58 AM, John Turner wrote: > > On Mon, 24 Aug 2015 09:15:27 -0400, Ray Cote > > > > Point 9 is well-intentioned, but perhaps needs to be > clarified/rephrased: Developers should not be creating > production-grade > tables devoid of well-defined business keys, period. That would be > regardless of whether they're used as de facto primary keys or > simply as > unique keys. > > > Although I appreciate your argument, I think we need a little > foundation in reality. The "serial" key is the default primary key > amongst every single web development environment in existence. > > We can make an argument within the doc to why that can be bad, but > to state that it is "wrong" is just not going to get you anywhere. > > JD > > > -- > Command Prompt, Inc. - http://www.commandprompt.com/ 503-667-4564 > <tel:503-667-4564> > PostgreSQL Centered full stack support, consulting and development. > Announcing "I'm offended" is basically telling the world you can't > control your own emotions, so everyone else should do it for you. > > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org > <mailto:pgsql-general@postgresql.org>) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general > > > > > -- > *Melvin Davidson* > I reserve the right to fantasize. Whether or not you > wish to share my fantasy is entirely up to you. -- Adrian Klaver adrian.klaver@aklaver.com
On 08/24/2015 08:56 AM, Melvin Davidson wrote: > >The "serial" key is the default primary key amongst every single web > development environment in existence. > > Methinks thou doest take too much for granted. > > Yes, serial has it's purpose, but I sincerely doubt it is "the default > primary key amongst every single web development environment in existence" > I am not sure where you get your stats from. Probably you are referring > to "Ruby on Rails". IMHO, Rails Anything that uses Hibernate (Java) Django Every PHP framework Pyramid Anything that uses sql-alchemy I can go on for miles with this. It is true that a lot of these support non-serial keys. It is also true that is not the default. JD -- Command Prompt, Inc. - http://www.commandprompt.com/ 503-667-4564 PostgreSQL Centered full stack support, consulting and development. Announcing "I'm offended" is basically telling the world you can't control your own emotions, so everyone else should do it for you.
On 08/24/2015 08:56 AM, Melvin Davidson wrote:>The "serial" key is the default primary key amongst every single web
development environment in existence.
Methinks thou doest take too much for granted.
Yes, serial has it's purpose, but I sincerely doubt it is "the default
primary key amongst every single web development environment in existence"
I am not sure where you get your stats from. Probably you are referring
to "Ruby on Rails". IMHO,
Rails
Anything that uses Hibernate (Java)
Django
Every PHP framework
Pyramid
Anything that uses sql-alchemy
I can go on for miles with this. It is true that a lot of these support non-serial keys. It is also true that is not the default.
JD
--
Command Prompt, Inc. - http://www.commandprompt.com/ 503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Announcing "I'm offended" is basically telling the world you can't
control your own emotions, so everyone else should do it for you.
--
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.
On 08/24/2015 09:34 AM, Melvin Davidson wrote: > And again, I am talking about _database_ design, not Web apps. Letting > Web developers design a database to work with their app, is a very, > Very, VERY bad idea. Again this is not restricted to Web apps. Anything that touches a database via an ORM is fair game. > It is far better to let DBA's and "database develeopers" design a good > database, then to let those apps mold a db into a non-optimum design. Aah, the perfect world I do not live in. > > On Mon, Aug 24, 2015 at 12:26 PM, Joshua D. Drake <jd@commandprompt.com > <mailto:jd@commandprompt.com>> wrote: > > On 08/24/2015 08:56 AM, Melvin Davidson wrote: > > >The "serial" key is the default primary key amongst every > single web > development environment in existence. > > Methinks thou doest take too much for granted. > > Yes, serial has it's purpose, but I sincerely doubt it is "the > default > primary key amongst every single web development environment in > existence" > I am not sure where you get your stats from. Probably you are > referring > to "Ruby on Rails". IMHO, > > > Rails > Anything that uses Hibernate (Java) > Django > Every PHP framework > Pyramid > Anything that uses sql-alchemy > > I can go on for miles with this. It is true that a lot of these > support non-serial keys. It is also true that is not the default. > > > JD > > > -- > Command Prompt, Inc. - http://www.commandprompt.com/ 503-667-4564 > <tel:503-667-4564> > PostgreSQL Centered full stack support, consulting and development. > Announcing "I'm offended" is basically telling the world you can't > control your own emotions, so everyone else should do it for you. > > > > > -- > *Melvin Davidson* > I reserve the right to fantasize. Whether or not you > wish to share my fantasy is entirely up to you. -- Adrian Klaver adrian.klaver@aklaver.com
On 08/24/2015 09:34 AM, Melvin Davidson wrote: > And again, I am talking about _database_ design, not Web apps. Letting > Web developers design a database to work with their app, is a very, > Very, VERY bad idea. And I don't argue that but we also live in a world based on reality. DBAs are rare, web developers who think they know how to DBA (and are wrong) are anywhere from 8 to 40 bucks an hour depending on where you are paying them. > It is far better to let DBA's and "database develeopers" design a good > database, then to let those apps mold a db into a non-optimum design. It is far better to recognize the realities of the market place and document why/how to do something that understands those realities than to put in place a pedantic document that will only let DBAs sit there and feel good about themselves. In short, I agree with you, we just don't live in that world. I am not even suggesting that you remove your very good work. I am suggesting that you allow for the fact that what you are suggesting is probably < 20% of designs out there and therefore we have to compromise perfect to good enough. JD -- Command Prompt, Inc. - http://www.commandprompt.com/ 503-667-4564 PostgreSQL Centered full stack support, consulting and development. Announcing "I'm offended" is basically telling the world you can't control your own emotions, so everyone else should do it for you.
On Mon, 24 Aug 2015 12:40:37 -0400, Joshua D. Drake <jd@commandprompt.com> wrote: > On 08/24/2015 09:34 AM, Melvin Davidson wrote: >> And again, I am talking about _database_ design, not Web apps. Letting >> Web developers design a database to work with their app, is a very, >> Very, VERY bad idea. > > And I don't argue that but we also live in a world based on reality. > DBAs are rare, web developers who think they know how to DBA (and are > wrong) are anywhere from 8 to 40 bucks an hour depending on where you > are paying them. > >> It is far better to let DBA's and "database develeopers" design a good >> database, then to let those apps mold a db into a non-optimum design. > > It is far better to recognize the realities of the market place and > document why/how to do something that understands those realities than > to put in place a pedantic document that will only let DBAs sit there > and feel good about themselves. > > In short, I agree with you, we just don't live in that world. I am not > even suggesting that you remove your very good work. I am suggesting > that you allow for the fact that what you are suggesting is probably < > 20% of designs out there and therefore we have to compromise perfect to > good enough. > > JD > I'm all for development via ORMs, but alas, the ORMs would've done well to have given more consideration to key conventions... I would only add that if the app is of any significance for an LOB, then at some point the schema would need to be validated against real business keys. - John
On 8/24/2015 9:34 AM, Melvin Davidson wrote: > And again, I am talking about _database_ design, not Web apps. Letting > Web developers design a database to work with their app, is a very, > Very, VERY bad idea. > It is far better to let DBA's and "database develeopers" design a good > database, then to let those apps mold a db into a non-optimum design. if you let the app drive the database design, you tend to end up with a database which is only useful to that single app, and likely breaks when that app changes. -- john r pierce, recycling bits in santa cruz
On 8/24/2015 9:34 AM, Melvin Davidson wrote:And again, I am talking about _database_ design, not Web apps. Letting Web developers design a database to work with their app, is a very, Very, VERY bad idea.
It is far better to let DBA's and "database develeopers" design a good database, then to let those apps mold a db into a non-optimum design.
if you let the app drive the database design, you tend to end up with a database which is only useful to that single app, and likely breaks when that app changes.
--
john r pierce, recycling bits in santa cruz
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
--
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.
I believe he's talking about triggers.
Melvin Davidson wrote:9.
>1) What happens if someone mis-types the account-id?
> To correct that, you also need to correct the FK field in the
other dozen tables.
>2) What happens when your company starts a new project (or buys a
I would not consider the general use of natural primary keys to be
best practice.
Let's assume your account_id field is used as a foreign key in a
dozen other tables.
1) What happens if someone mis-types the account-id?
To correct that, you also need to correct the FK field in the
other dozen tables.
... ON UPDATE CASCADE ?
--
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.
On Mon, Aug 24, 2015 at 08:22:17PM -0400, Melvin Davidson wrote: > On Mon, Aug 24, 2015 at 8:00 PM, Berend Tober <btober@computer.org> wrote: > >> 1) What happens if someone mis-types the account-id? > >> To correct that, you also need to correct the FK field in the > >> other dozen tables. > > > > ... ON UPDATE CASCADE ? > > I believe he's talking about triggers. Huh? Why would you use a trigger when FOREIGN KEY has ON UPDATE CASCADE? -- "A search of his car uncovered pornography, a homemade sex aid, women's stockings and a Jack Russell terrier." - http://www.dailytelegraph.com.au/news/wacky/indeed/story-e6frev20-1111118083480
On Mon, Aug 24, 2015 at 08:22:17PM -0400, Melvin Davidson wrote:
> On Mon, Aug 24, 2015 at 8:00 PM, Berend Tober <btober@computer.org> wrote:
> >> 1) What happens if someone mis-types the account-id?
> >> To correct that, you also need to correct the FK field in the
> >> other dozen tables.
> >
> > ... ON UPDATE CASCADE ?
>
> I believe he's talking about triggers.
Huh? Why would you use a trigger when FOREIGN KEY has ON UPDATE CASCADE?
--
"A search of his car uncovered pornography, a homemade sex aid, women's
stockings and a Jack Russell terrier."
- http://www.dailytelegraph.com.au/news/wacky/indeed/story-e6frev20-1111118083480
--
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.
On 25/08/15 01:15, Ray Cote wrote: > On Sat, Aug 22, 2015 at 11:46 AM, Karsten Hilbert > <Karsten.Hilbert@gmx.net <mailto:Karsten.Hilbert@gmx.net>> wrote: > [...] > > 9. Do NOT arbitrarily assign an "id" column to a table as a > primary key when other columns > are perfectly suited as a unique primary key. > > ... > > Good example: > CREATE TABLE accounts > ( accout_id bigint NOT NULL , > > > I would not consider the general use of natural primary keys to be > best practice. [...] Neither would I. The database has primary keys that are often foreign keys for other tables. So if the primary key is a natural key, then if the external world redefines the nature of the natural key, for example changing its type or format, then this would have unnecessary invasive changes to multiple tables within the database. Also you are at the mercy of external control of what constitutes uniqueness, for example the American Social Security Number is not unique! Also the best practice is to make the primary key name 'id' as you do know the table it is in, so prepending the table name is redundant - so you can clearly identify foreign keys because the suffix '_id 'is prepended by the table name of the referenced table. Hence 'id' is a primary key, and account_id is a foreign key pointing into the account table. I have had to deal with databases were a child table's primary key is the parent table's primary key with extra characters appended, so you can have a child table's primary key exceeding 45 characters. The child table only need to know the primary key of it direct parent, so using int, or bigint, would be a far better solution! Having said the above, there may well be valid reasons to use a natural key for the primary key - so it should NOT be an absolute rule to disallow it. Cheers, Gavin
On 25/08/15 02:58, John Turner wrote: [...] > Conversely, if synthetic keys are chosen as Primary, they must be > accompanied by a legitimate Unique natural key. Agreed, but only where appropriate. Cheers, Gavin
On 25/08/15 04:26, Joshua D. Drake wrote: > On 08/24/2015 08:56 AM, Melvin Davidson wrote: >> >The "serial" key is the default primary key amongst every single web >> development environment in existence. >> >> Methinks thou doest take too much for granted. >> >> Yes, serial has it's purpose, but I sincerely doubt it is "the default >> primary key amongst every single web development environment in >> existence" >> I am not sure where you get your stats from. Probably you are referring >> to "Ruby on Rails". IMHO, > > Rails > Anything that uses Hibernate (Java) > Django > Every PHP framework > Pyramid > Anything that uses sql-alchemy > > I can go on for miles with this. It is true that a lot of these > support non-serial keys. It is also true that is not the default. > > JD > > I came to the idea of using surrogate primary keys long before I knew anything about the software on the above list or anything similar! Cheers, Gavin
On Mon, Aug 24, 2015 at 08:53:43PM -0400, Melvin Davidson wrote: > You are right, he was probably talking about FK's. I was just so frustrated > about people insisting that using "ID" as the primary key in every table is > a "good" idea, > I didn't bother to reply previously. I stand firm on my belief that the > primary key should be something meaningful and NOT "id" just for the sake > of having a unique numeric key. I think there is a case to be made for pragmatism. Perhaps you should split your points up into "ideal" and "pragmatic" requirements. ie you should aim for the "ideal" but, with good justification, "pragmatic" will be acceptable. Get those doing the work to think about their decisions. -- "A search of his car uncovered pornography, a homemade sex aid, women's stockings and a Jack Russell terrier." - http://www.dailytelegraph.com.au/news/wacky/indeed/story-e6frev20-1111118083480
Also the best practice is to make the primary key name 'id' as you do know the table it is in, so prepending the table name is redundant - so you can clearly identify foreign keys because the suffix '_id 'is prepended by the table name of the referenced table. Hence 'id' is a primary key, and account_id is a foreign key pointing into the account table.
> On Aug 24, 2015, at 6:53 PM, Melvin Davidson <melvin6925@gmail.com> wrote: > > You are right, he was probably talking about FK's. I was just so frustrated about people insisting that using "ID" as theprimary key in every table is a "good" idea, > I didn't bother to reply previously. I stand firm on my belief that the primary key should be something meaningful andNOT "id" just for the sake of having a unique numeric key. > What, pray tell, is the unique natural key of person in any meaningfully large domain such as state? Certainly not name +birthdate. Current address isn’t guaranteed. Social isn’t reliable and actually not truly unique. Even given that there are models which are made of entities with legitimate attributes which per force define a unique instance,I see no benefit in avoiding the convenience of an arbitrary and simple value for the key. Is it the overhead ofgenerating and storing one more value per tuple that you can’t abide?
On 25/08/15 14:45, David G. Johnston wrote: > On Mon, Aug 24, 2015 at 10:02 PM, Gavin Flower > <GavinFlower@archidevsys.co.nz > <mailto:GavinFlower@archidevsys.co.nz>>wrote: > > Also the best practice is to make the primary key name 'id' as you > do know the table it is in, so prepending the table name is > redundant - so you can clearly identify foreign keys because the > suffix '_id 'is prepended by the table name of the referenced > table. Hence 'id' is a primary key, and account_id is a foreign > key pointing into the account table. > > > I would much rather be able to write: > > SELECT parent_id, child_id, [...] > FROM parent > JOIN child USING (parent_id) > > instead of > > SELECT parent.id <http://parent.id> AS parent_id, child.id > <http://child.id> AS child_id, [...] > FROM parent > JOIN child ON (parent.id <http://parent.id> = child.parent_id) > > Yes, looking at the parent table it is obvious that the id you are > looking at is the "parent" id. But as soon as you join two or more > tables you are guaranteed to have multiple columns with the name "id" > that you now need to disambiguate. > > > The column name "table_id" refers to the primary identifier for that > entity no matter where it appears. I'd rather have one redundant > situation than one exception to the rule. > > David J. > Hmm... I consider it good practice to always give an alias for each table used, especially for non trivial SQL statements. So I think the above would look better (using slightly more realistic table names) as: SELECT c.id, s.id, [...] FROM company c JOIN shop s USING (s.company_id = c.id); Which is I think a lot clearer (it is obvious that you are joining a foreign key with a primary key), and you can add more stuff without it suddenly becoming ambiguous. I once wrote a Sybase stored proc with over 3000 lines of SQL (not practical to split it up, unfortunately), individual selects were often over half a page. It interrogated 17 tables from two different databases and needed 5 temporary tables. Cheers, Gavin P.S. the use of '[...]' was started by me way back in the heyday of usenet, in the beginning of the 1990's! Previously people used '[ omitted ]'.
On Tue, Aug 25, 2015 at 02:02:17PM +1200, Gavin Flower wrote: > On 25/08/15 01:15, Ray Cote wrote: > >On Sat, Aug 22, 2015 at 11:46 AM, Karsten Hilbert <Karsten.Hilbert@gmx.net > ><mailto:Karsten.Hilbert@gmx.net>> wrote: > > > [...] > > > > 9. Do NOT arbitrarily assign an "id" column to a table as a > > primary key when other columns > > are perfectly suited as a unique primary key. > > > > ... > > > > Good example: > > CREATE TABLE accounts > > ( accout_id bigint NOT NULL , > > > > > >I would not consider the general use of natural primary keys to be best > >practice. Gavin, Ray, I certainly didn't write any of the above. Karsten -- GPG key ID E4071346 @ eu.pool.sks-keyservers.net E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
On 25/08/15 19:04, Karsten Hilbert wrote: > On Tue, Aug 25, 2015 at 02:02:17PM +1200, Gavin Flower wrote: > >> On 25/08/15 01:15, Ray Cote wrote: >>> On Sat, Aug 22, 2015 at 11:46 AM, Karsten Hilbert <Karsten.Hilbert@gmx.net >>> <mailto:Karsten.Hilbert@gmx.net>> wrote: >>> >> [...] >>> 9. Do NOT arbitrarily assign an "id" column to a table as a >>> primary key when other columns >>> are perfectly suited as a unique primary key. >>> >>> ... >>> >>> Good example: >>> CREATE TABLE accounts >>> ( accout_id bigint NOT NULL , >>> >>> >>> I would not consider the general use of natural primary keys to be best >>> practice. > Gavin, Ray, > > I certainly didn't write any of the above. > > Karsten Hi Karsten, It took me a couple of minutes, but I traced "9. ..." to melvin6925@gmail.com who opened the thread Looks like Ray misquoted back in the entry that can be identified by (using the 'source' option on my mail client) From: Ray Cote <rgacote@appropriatesolutions.com> Date: Mon, 24 Aug 2015 09:15:27 -0400 Message-ID: <CAG5tnzqTausEhFtRpfWCunx4YNFuGTFyUZyTkn5f2E7RaYKE=g@mail.gmail.com> which was On Sat, Aug 22, 2015 at 11:46 AM, Karsten Hilbert <Karsten.Hilbert@gmx.net> wrote: > > 1. Prefix ALL literals with an Escape > > EG: SELECT E'This is a \'quoted literal \''; > > SELECT E'This is an unquoted literal'; > > > > Doing so will prevent the annoying "WARNING: nonstandard use of > escape in a string literal" > I'd be concerned that what is missing here is the bigger issue of Best Practice #0: Use Bound Variables. The only way I've seen invalid literals show up in SQL queries is through the dynamic generation of SQL Statements vs. using bound variables. Not using bound variables is your doorway to SQL injection exploits. 9. Do NOT arbitrarily assign an "id" column to a table as a primary key > when other columns > are perfectly suited as a unique primary key. ... Good example: > CREATE TABLE accounts > ( accout_id bigint NOT NULL , I would not consider the general use of natural primary keys to be best practice. Let's assume your account_id field is used as a foreign key in a dozen other tables. 1) What happens if someone mis-types the account-id? To correct that, you also need to correct the FK field in the other dozen tables. 2) What happens when your company starts a new project (or buys a competitor) and all the new account numbers are alpha-numeric? 3) Your example shows the id as a bigint, but your rule is not limited to integers. What if your table is country populations and the primary key is country name? Now, you have quite large foreign keys (and a country changing its name is not unheard of). (and let's not even get started on case-sensitivity or character encodings). Cheers, Gavin
> On Aug 22, 2015, at 10:15 AM, Melvin Davidson <melvin6925@gmail.com> wrote: > 6. Although it is legal to use the form "column TYPE PRIMARY KEY," It is best to specify as a CONSTRAINT, > that way YOU get to choose the name, otherwise postgres assigns a default name which may not be to your liking. > EG: , CONSTRAINT accounts_pk PRIMARY KEY (account_id) > 10. Standardize Index names with the form table_name + col(s) + “idx” > EG: For accounts table: > accounts_name_idx > accounts_city_state_idx > I really like the standardization that PostgreSQL uses in auto generating default names. The rule I use is to always usethe auto generated names unless the object is referenced routinely in code. In most cases developers don’t care aboutindex, unique, foreign key, or primary key names (from a coding standpoint) so why should they be creating the names.Since the postgresql standard uses auto generated names with ‘_pkey’ for PRIMARY KEY ‘_fkey’ for FOREIGN KEY, and‘_key’ for UNIQUE, why not use the same rules for consistency? So I disagree with 6 and would extend 10 to include theseother names if they are manually generated. interestingly enough, when I searched 9.5 docs I could not find a description of these postgreSQL naming convention. Probablybecause the developers consider it an internal detail that could change which is fine, since the names usually don’tmatter, until they do. I would say use “column TYPE PRIMARY KEY”, “column TYPE UNIQUE”, and ‘column TYPE REFERENCES …’ every place you can and onlycreate manual names when absolutely necessary. When you do create manual names follow the standard PostgreSQL convention. Now I have worked on mostly smaller installations so maybe someone should chime in if this is a bad best practice. Neil
On 08/24/2015 08:44 PM, Rob Sargent wrote: > >> On Aug 24, 2015, at 6:53 PM, Melvin Davidson <melvin6925@gmail.com> wrote: >> >> You are right, he was probably talking about FK's. I was just so frustrated about people insisting that using "ID" asthe primary key in every table is a "good" idea, >> I didn't bother to reply previously. I stand firm on my belief that the primary key should be something meaningful andNOT "id" just for the sake of having a unique numeric key. >> > What, pray tell, is the unique natural key of person in any meaningfully large domain such as state? Certainly not name+ birthdate. Current address isn’t guaranteed. Social isn’t reliable and actually not truly unique. To add: 1) Who determined that a number is not natural? 2) One of the older unique natural keys (genus, species) is not so unique. I am a fisheries biologist by training and in my time the 'unique' identifier for various fishes has changed. Now that ichthyologists have discovered DNA testing, it can be expected there will be even more changes. This is even more apparent when you go back in in history. As an example: https://en.wikipedia.org/wiki/Rainbow_trout Rainbow trout Current Oncorhynchus mykiss Past Salmo mykiss Walbaum, 1792 Parasalmo mykiss (Walbaum, 1792) Salmo purpuratus Pallas, 1814 Salmo penshinensis Pallas, 1814 Parasalmo penshinensis (Pallas, 1814) Salmo gairdnerii Richardson, 1836 <--The one I learned. Fario gairdneri (Richardson, 1836) Oncorhynchus gairdnerii (Richardson, 1836) Salmo gairdnerii gairdnerii Richardson, 1836 Salmo rivularis Ayres, 1855 Salmo iridea Gibbons, 1855 Salmo gairdnerii irideus Gibbons, 1855 Salmo irideus Gibbons, 1855 Trutta iridea (Gibbons, 1855) Salmo truncatus Suckley, 1859 Salmo masoni Suckley, 1860 Oncorhynchus kamloops Jordan, 1892 Salmo kamloops (Jordan, 1892) Salmo rivularis kamloops (Jordan, 1892) Salmo gairdneri shasta Jordan, 1894 Salmo gilberti Jordan, 1894 Salmo nelsoni Evermann, 1908 All the above point to the same fish and have appeared and appear in articles and reports about said fish. Lets not even get into the common name situation:). > > Even given that there are models which are made of entities with legitimate attributes which per force define a uniqueinstance, I see no benefit in avoiding the convenience of an arbitrary and simple value for the key. Is it the overheadof generating and storing one more value per tuple that you can’t abide? > > > > -- Adrian Klaver adrian.klaver@aklaver.com
Stop being so technical. When we/I speak of natural keys, we are talking about the column
that would NATURALly lend itself as the primary key.
No one ever said a number is not natural. just that there is no need to duplicate uniqueness
with a separate number.
IOW: If we have an account table, then the account_id or account_no
would be the primary key. There is no need to have a separate serial id as the primary key.
Likewise, if we have a car table, then registration (or vehicle_id) is preferred.
EG: Good
CREATE TABLE car
(
registration_no varchar(30) not null,
car_make varchar(25) not null,
model varchar(15) not null;
build_year date not null;
owner varchar(50),
CONSTRAINT car_pk PRIMARY KEY (registration_no)
);
bad
CREATE TABLE car
(
id serial not null,
registration_no varchar(30) not null,
car_make varchar(25) not null,
model varchar(15) not null;
build_year date not null;
owner varchar(50),
CONSTRAINT car_pk PRIMARY KEY (id)
);
The benefit in avoiding arbitrary and simple values for the key is that it makes
the database design much more logical.
Consider:
SELECT c.registration_no,
c.car_make,
p.part_no
FROM car c
JOIN parts p ON ( p.registration_no = c.registration_no)
WHERE registration_no = <some_var>;
versus:
SELECT c.registration_no,
c.car_make,
p.part_no
FROM car c
JOIN parts p ON ( p.id = c.id)
WHERE registration_no = <some_var>;
Why join on id when registration_no is better?
On 08/24/2015 08:44 PM, Rob Sargent wrote:On Aug 24, 2015, at 6:53 PM, Melvin Davidson <melvin6925@gmail.com> wrote:What, pray tell, is the unique natural key of person in any meaningfully large domain such as state? Certainly not name + birthdate. Current address isn’t guaranteed. Social isn’t reliable and actually not truly unique.
You are right, he was probably talking about FK's. I was just so frustrated about people insisting that using "ID" as the primary key in every table is a "good" idea,
I didn't bother to reply previously. I stand firm on my belief that the primary key should be something meaningful and NOT "id" just for the sake of having a unique numeric key.
To add:
1) Who determined that a number is not natural?
2) One of the older unique natural keys (genus, species) is not so unique. I am a fisheries biologist by training and in my time the 'unique' identifier for various fishes has changed. Now that ichthyologists have discovered DNA testing, it can be expected there will be even more changes. This is even more apparent when you go back in in history. As an example:
https://en.wikipedia.org/wiki/Rainbow_trout
Rainbow trout
Current
Oncorhynchus mykiss
Past
Salmo mykiss Walbaum, 1792
Parasalmo mykiss (Walbaum, 1792)
Salmo purpuratus Pallas, 1814
Salmo penshinensis Pallas, 1814
Parasalmo penshinensis (Pallas, 1814)
Salmo gairdnerii Richardson, 1836 <--The one I learned.
Fario gairdneri (Richardson, 1836)
Oncorhynchus gairdnerii (Richardson, 1836)
Salmo gairdnerii gairdnerii Richardson, 1836
Salmo rivularis Ayres, 1855
Salmo iridea Gibbons, 1855
Salmo gairdnerii irideus Gibbons, 1855
Salmo irideus Gibbons, 1855
Trutta iridea (Gibbons, 1855)
Salmo truncatus Suckley, 1859
Salmo masoni Suckley, 1860
Oncorhynchus kamloops Jordan, 1892
Salmo kamloops (Jordan, 1892)
Salmo rivularis kamloops (Jordan, 1892)
Salmo gairdneri shasta Jordan, 1894
Salmo gilberti Jordan, 1894
Salmo nelsoni Evermann, 1908
All the above point to the same fish and have appeared and appear in articles and reports about said fish. Lets not even get into the common name situation:).
Even given that there are models which are made of entities with legitimate attributes which per force define a unique instance, I see no benefit in avoiding the convenience of an arbitrary and simple value for the key. Is it the overhead of generating and storing one more value per tuple that you can’t abide?
--
Adrian Klaver
adrian.klaver@aklaver.com
--
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.
Melvin Davidson wrote: > 9. > >1) What happens if someone mis-types the account-id? > > To correct that, you also need to correct the FK field in the > other dozen tables. > >2) What happens when your company starts a new project (or buys a > I would not consider the general use of natural primary keys to be > best practice. > Let's assume your account_id field is used as a foreign key in a > dozen other tables. > 1) What happens if someone mis-types the account-id? > To correct that, you also need to correct the FK field in the > other dozen tables. ... ON UPDATE CASCADE ?
On 08/25/2015 09:40 AM, Melvin Davidson wrote: > Adrian, > > Stop being so technical. When we/I speak of natural keys, we are > talking about the column > that would NATURALly lend itself as the primary key. > No one ever said a number is not natural. just that there is no need > to duplicate uniqueness > with a separate number. > > IOW: If we have an account table, then the account_id or account_no > would be the primary key. There is no need to have a separate > serial id as the primary key. If I'm following correctly, you're saying that if the definition of the entity contains and arbitrary unique value then use that. Fine. I guess I quibble with the notion of VIN as a "natural" attribute of car. (I have no firsthand experience with VINs but I would bet there's information tucked inside them, which would make me sceptical of using them :) )
Consider:
SELECT c.registration_no,
c.car_make,
p.part_no
FROM car c
JOIN parts p ON ( p.registration_no = c.registration_no)
WHERE registration_no = <some_var>;
versus:
SELECT c.registration_no,
c.car_make,
p.part_no
FROM car c
JOIN parts p ON ( p.id = c.id)
WHERE registration_no = <some_var>;
Why join on id when registration_no is better?
On 08/25/2015 09:40 AM, Melvin Davidson wrote:Adrian,If I'm following correctly, you're saying that if the definition of the entity contains and arbitrary unique value then use that. Fine. I guess I quibble with the notion of VIN as a "natural" attribute of car. (I have no firsthand experience with VINs but I would bet there's information tucked inside them, which would make me sceptical of using them :) )
Stop being so technical. When we/I speak of natural keys, we are talking about the column
that would NATURALly lend itself as the primary key.
No one ever said a number is not natural. just that there is no need to duplicate uniqueness
with a separate number.
IOW: If we have an account table, then the account_id or account_no
would be the primary key. There is no need to have a separate serial id as the primary key.
On 08/25/2015 09:09 AM, Rob Sargent wrote: > On 08/25/2015 09:40 AM, Melvin Davidson wrote: >> Adrian, >> >> Stop being so technical. When we/I speak of natural keys, we are >> talking about the column >> that would NATURALly lend itself as the primary key. >> No one ever said a number is not natural. just that there is no need >> to duplicate uniqueness >> with a separate number. >> >> IOW: If we have an account table, then the account_id or account_no >> would be the primary key. There is no need to have a separate >> serial id as the primary key. > If I'm following correctly, you're saying that if the definition of the > entity contains and arbitrary unique value then use that. Fine. I guess > I quibble with the notion of VIN as a "natural" attribute of car. (I > have no firsthand experience with VINs but I would bet there's > information tucked inside them, which would make me sceptical of using > them :) ) > But a VIN is in fact, UNIQUE so it is useful as a PK. JD -- Command Prompt, Inc. - http://www.commandprompt.com/ 503-667-4564 PostgreSQL Centered full stack support, consulting and development. Announcing "I'm offended" is basically telling the world you can't control your own emotions, so everyone else should do it for you.
> No one ever said a number is not natural. just that there is no need to duplicate uniqueness > with a separate number. The whole point is that people are telling you that surrogate keys do not _duplicate_ uniqueness but rather _generate_ it, artificially, and therefore reliably. Today's external uniqueness is ambiguous tomorrow. Karsten
Consider:
SELECT c.registration_no,
c.car_make,
p.part_no
FROM car c
JOIN parts p ON ( p.registration_no = c.registration_no)
WHERE registration_no = <some_var>;
versus:
SELECT c.registration_no,
c.car_make,
p.part_no
FROM car c
JOIN parts p ON ( p.id = c.id)
WHERE registration_no = <some_var>;
Why join on id when registration_no is better?I believe you are mistaken if you think there are absolute rules you can cling to here. But even then I would lean toward calling primary keys an internal implementation detail that should be under the full control of the database in which they are directly used. Artifical "natural" keys I would lean toward turning into, possibly unique, attributes. Inherent "natural"keys get some consideration for using directly.The issue arise more, say, in a many-to-many situation. Do you define the PK of the linking table as a two-column composite key or do you introduce a third, serial, field to stand in for the pair?David J.
Good suggestion!
On Sat, 2015-08-22 at 15:15 +0000, Melvin Davidson wrote:
> I've been searching for a "PostgreSQL Developer Best Practices" with not
> much luck,
> so I've started my own. At the risk of stirring up a storm of controversy,
> I would appreciate additional suggestions and feedback.
>
You might add: Create all relation names as plurals.
Or, if your site uses predominantly singular names, make that the
standard. Consistency within the site is more important than any
dogmatic belief about whether singular or plural forms is better. If
you don't put it in the standard, someone will eventually create tables
with names that don't gel with everything else.
__
Marc
--
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.
> In most cases developers don’t care about index, unique, foreign key, or primary key names (from a coding standpoint) Until the day they'd like to write a reliable database change script. (PG's internal conventions for object names _have_ changed over the years) Karsten
On 26/08/15 03:40, Melvin Davidson wrote: [...] > IOW: If we have an account table, then the account_id or account_no > would be the primary key. There is no need to have a separate > serial id as the primary key. [...] Account numbers are externally generated, and may potentially change. Management might suddenly decide that they want to start using the year the account started as the first 4 digits, or that the branch code should be reflected in it, or something else. The database should be protected from these arbitrary changes. Hence the account_no is not a good candidate for a primary key. Cheers, Gavin
On 08/25/2015 08:40 AM, Melvin Davidson wrote: > Adrian, > > Stop being so technical. When we/I speak of natural keys, we are talking > about the column > that would NATURALly lend itself as the primary key. Pretty sure this is a technical list:) > No one ever said a number is not natural. just that there is no need to > duplicate uniqueness > with a separate number. I would agree, but I have interacted with people, especially PHBes, where a duplicate 'hidden' key is a life saver. See more below. > > IOW: If we have an account table, then the account_id or account_no > would be the primary key. There is no need to have a separate > serial id as the primary key. > > Likewise, if we have a car table, then registration (or > vehicle_id) is preferred. > > EG: Good > CREATE TABLE car > ( > registration_no varchar(30) not null, > car_make varchar(25) not null, > model varchar(15) not null; > build_year date not null; > owner varchar(50), > CONSTRAINT car_pk PRIMARY KEY (registration_no) > ); > > bad > CREATE TABLE car > ( > id serial not null, > registration_no varchar(30) not null, > car_make varchar(25) not null, > model varchar(15) not null; > build_year date not null; > owner varchar(50), > CONSTRAINT car_pk PRIMARY KEY (id) > ); > > The benefit in avoiding arbitrary and simple values for the key is that > it makes > the database design much more logical. > > Consider: > SELECT c.registration_no, > c.car_make, > p.part_no > FROM car c > JOIN parts p ON ( p.registration_no = c.registration_no) > WHERE registration_no = <some_var>; Pretty sure parts are not unique to an exact vehicle, unless you are talking a totally handmade one. They are not even unique to make and model. As an example, I used to work on Class B Isuzu trucks. These models(FTR) where also built for Chevrolet as the Forward models. So right of the bat there where two part numbers for each part, one that started with 9 if you got it from Chevrolet and one with 11 from Isuzu, if memory serves. Then Isuzu decided to reorganize their part numbers, so that introduced another number, all pointing to the exact same part. Then there where those parts available from the parts houses(NAPA, etc). Then there was the greenhouse I worked for where we supplied UPC coded tags for our customers. In the beginning, it was simple, the item portion of the UPC was unique and with the company prefix served as a 'natural' key for the tags. Then the chain stores we worked with must have all gone to the same seminar on how to be Walmart and decided they did not want unique numbers, but UPCs tied to price groups that covered a variety of plants. Luckily, I was too stupid to know surrogate keys where bad and had a sequence attached to the tag table. This then became the tag id and made life a lot easier during the transition. It still remains there, because people are people and 'natural' tends to be artificial and transient. > > versus: > SELECT c.registration_no, > c.car_make, > p.part_no > FROM car c > JOIN parts p ON ( p.id <http://p.id> = c.id <http://c.id>) > WHERE registration_no = <some_var>; > > Why join on id when registration_no is better? > > > On Tue, Aug 25, 2015 at 10:17 AM, Adrian Klaver > <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote: > > On 08/24/2015 08:44 PM, Rob Sargent wrote: > > > On Aug 24, 2015, at 6:53 PM, Melvin Davidson > <melvin6925@gmail.com <mailto:melvin6925@gmail.com>> wrote: > > You are right, he was probably talking about FK's. I was > just so frustrated about people insisting that using "ID" as > the primary key in every table is a "good" idea, > I didn't bother to reply previously. I stand firm on my > belief that the primary key should be something meaningful > and NOT "id" just for the sake of having a unique numeric key. > > What, pray tell, is the unique natural key of person in any > meaningfully large domain such as state? Certainly not name + > birthdate. Current address isn’t guaranteed. Social isn’t > reliable and actually not truly unique. > > > To add: > > 1) Who determined that a number is not natural? > > 2) One of the older unique natural keys (genus, species) is not so > unique. I am a fisheries biologist by training and in my time the > 'unique' identifier for various fishes has changed. Now that > ichthyologists have discovered DNA testing, it can be expected there > will be even more changes. This is even more apparent when you go > back in in history. As an example: > > https://en.wikipedia.org/wiki/Rainbow_trout > > Rainbow trout > > Current > > Oncorhynchus mykiss > > Past > > Salmo mykiss Walbaum, 1792 > Parasalmo mykiss (Walbaum, 1792) > Salmo purpuratus Pallas, 1814 > Salmo penshinensis Pallas, 1814 > Parasalmo penshinensis (Pallas, 1814) > Salmo gairdnerii Richardson, 1836 <--The one I learned. > Fario gairdneri (Richardson, 1836) > Oncorhynchus gairdnerii (Richardson, 1836) > Salmo gairdnerii gairdnerii Richardson, 1836 > Salmo rivularis Ayres, 1855 > Salmo iridea Gibbons, 1855 > Salmo gairdnerii irideus Gibbons, 1855 > Salmo irideus Gibbons, 1855 > Trutta iridea (Gibbons, 1855) > Salmo truncatus Suckley, 1859 > Salmo masoni Suckley, 1860 > Oncorhynchus kamloops Jordan, 1892 > Salmo kamloops (Jordan, 1892) > Salmo rivularis kamloops (Jordan, 1892) > Salmo gairdneri shasta Jordan, 1894 > Salmo gilberti Jordan, 1894 > Salmo nelsoni Evermann, 1908 > > > All the above point to the same fish and have appeared and appear in > articles and reports about said fish. Lets not even get into the > common name situation:). > > > Even given that there are models which are made of entities with > legitimate attributes which per force define a unique instance, > I see no benefit in avoiding the convenience of an arbitrary and > simple value for the key. Is it the overhead of generating and > storing one more value per tuple that you can’t abide? > > > > > > > -- > Adrian Klaver > adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com> > > > > > -- > *Melvin Davidson* > I reserve the right to fantasize. Whether or not you > wish to share my fantasy is entirely up to you. -- Adrian Klaver adrian.klaver@aklaver.com
On 8/25/2015 1:42 PM, Gavin Flower wrote: > Account numbers are externally generated, and may potentially change. > Management might suddenly decide that they want to start using the > year the account started as the first 4 digits, or that the branch > code should be reflected in it, or something else. The database > should be protected from these arbitrary changes. Hence the > account_no is not a good candidate for a primary key. such practices would raise total havoc on a traditional paper ledger accounting system as well as things like pending AR/AP where external companies will be referencing your account numbers. -- john r pierce, recycling bits in santa cruz
On 08/25/2015 01:56 PM, John R Pierce wrote: > On 8/25/2015 1:42 PM, Gavin Flower wrote: >> Account numbers are externally generated, and may potentially change. >> Management might suddenly decide that they want to start using the >> year the account started as the first 4 digits, or that the branch >> code should be reflected in it, or something else. The database >> should be protected from these arbitrary changes. Hence the >> account_no is not a good candidate for a primary key. > > > such practices would raise total havoc on a traditional paper ledger > accounting system as well as things like pending AR/AP where external > companies will be referencing your account numbers. Agreed, but it happens. When Lowes took over a local hardware chain(Eagles) here in Washington state they moved very quickly on changing the account numbers. The company I worked for who supplied Eagles and then Lowes sat on a check for $22,000 that was sent to us in error because the account numbers got switched. We called them when we got the check, but it still took them six months to own up to it. > > > > -- Adrian Klaver adrian.klaver@aklaver.com
On 26/08/15 08:56, Adrian Klaver wrote: > On 08/25/2015 08:40 AM, Melvin Davidson wrote: >> Adrian, >> >> Stop being so technical. When we/I speak of natural keys, we are talking >> about the column >> that would NATURALly lend itself as the primary key. > > Pretty sure this is a technical list:) > Don't let inconvenient facts get in the way of a good argument! :-) [...] > Pretty sure parts are not unique to an exact vehicle, unless you are > talking a totally handmade one. They are not even unique to make and > model. As an example, I used to work on Class B Isuzu trucks. These > models(FTR) where also built for Chevrolet as the Forward models. So > right of the bat there where two part numbers for each part, one that > started with 9 if you got it from Chevrolet and one with 11 from > Isuzu, if memory serves. Then Isuzu decided to reorganize their part > numbers, so that introduced another number, all pointing to the exact > same part. Then there where those parts available from the parts > houses(NAPA, etc). > > Then there was the greenhouse I worked for where we supplied UPC coded > tags for our customers. In the beginning, it was simple, the item > portion of the UPC was unique and with the company prefix served as a > 'natural' key for the tags. Then the chain stores we worked with must > have all gone to the same seminar on how to be Walmart and decided > they did not want unique numbers, but UPCs tied to price groups that > covered a variety of plants. Luckily, I was too stupid to Natural Stupidity??? :-) (Sorry, couldn't resist!) > know surrogate keys where bad and had a sequence attached to the tag > table. This then became the tag id and made life a lot easier during > the transition. It still remains there, because people are people and > 'natural' tends to be artificial and transient. Extremely good examples, I'll bear them in mind - makes me even more keen on surrogate primary keys. I'm always very wary when people tell me some numbering scheme will NEVER change!!! [...] Cheers, Gavin
On 26/08/15 02:17, Adrian Klaver wrote: [...] > > 2) One of the older unique natural keys (genus, species) is not so > unique. I am a fisheries biologist by training and in my time the > 'unique' identifier for various fishes has changed. Now that > ichthyologists have discovered DNA testing, it can be expected there > will be even more changes. This is even more apparent when you go back > in in history. As an example: > > https://en.wikipedia.org/wiki/Rainbow_trout > > Rainbow trout > > Current > > Oncorhynchus mykiss > > Past > > Salmo mykiss Walbaum, 1792 > Parasalmo mykiss (Walbaum, 1792) [...] > > Salmo gilberti Jordan, 1894 > Salmo nelsoni Evermann, 1908 > So you probably need a date stamp so you could record things relating to the correct name for a given period in a mapping table, and still relate to the same surrogate key for referencing other tables. Maybe even worse, is when a species is suddenly found to be 2 or more distinct species! Something similar could happen with account numbers: 2 companies with similar names might be assigned to the same account number, and lots of transactions recorded before the mistake is discovered. Though obviously a surrogate key would not give you complete protection from a lot of work sorting the mess out, but it would probably help! I read on post a year or 2 back, a guy in Europe had at least 4 different variations on his name depending on the country he was in and the local language and cultural norms. When I worked at a freezing works in the 1970's in Auckland, I heard that the pay roll allowed for over 52 different names per employee (per year?). Though, I was never told the maximum name changes ever used. Essentially management might fire someone, but the union would complain, and they would be rehired under a different name - so I was told! So the correct holiday pay & PAYE tax deductions would still relate to the same individual no matter how many name changes they had. Cheers, Gavin
On 08/25/2015 02:23 PM, Gavin Flower wrote: > On 26/08/15 08:56, Adrian Klaver wrote: >> On 08/25/2015 08:40 AM, Melvin Davidson wrote: >>> Adrian, >>> >>> Stop being so technical. When we/I speak of natural keys, we are talking >>> about the column >>> that would NATURALly lend itself as the primary key. >> >> Pretty sure this is a technical list:) >> > Don't let inconvenient facts get in the way of a good argument! :-) >> have all gone to the same seminar on how to be Walmart and decided >> they did not want unique numbers, but UPCs tied to price groups that >> covered a variety of plants. Luckily, I was too stupid to > Natural Stupidity??? :-) Oh yeah and a long history too, but that needs at least a pitcher of beer to recount. > > (Sorry, couldn't resist!) > >> know surrogate keys where bad and had a sequence attached to the tag >> table. This then became the tag id and made life a lot easier during >> the transition. It still remains there, because people are people and >> 'natural' tends to be artificial and transient. > Extremely good examples, I'll bear them in mind - makes me even more > keen on surrogate primary keys. I'm always very wary when people tell > me some numbering scheme will NEVER change!!! To add a recent one. My partner Nancy signed up for Medicare last year to avoid the premium penalty. This year in July she signed up for Social Security. Turns out, for reasons I do not understand, CMS(https://www.cms.gov/) changes the Medicare account number at that point. The reason we even cared is that the billing system thinks she has two accounts and is double billing. Time on the phone with someone at CMS was not enlightening. We where told to trust the system and eventually it will work itself out. Still waiting:( > > [...] > > > Cheers, > Gavin > -- Adrian Klaver adrian.klaver@aklaver.com
On 08/25/2015 02:44 PM, Gavin Flower wrote: > On 26/08/15 02:17, Adrian Klaver wrote: > [...] >> >> 2) One of the older unique natural keys (genus, species) is not so >> unique. I am a fisheries biologist by training and in my time the >> 'unique' identifier for various fishes has changed. Now that >> ichthyologists have discovered DNA testing, it can be expected there >> will be even more changes. This is even more apparent when you go back >> in in history. As an example: >> >> https://en.wikipedia.org/wiki/Rainbow_trout >> >> Rainbow trout >> >> Current >> >> Oncorhynchus mykiss >> >> Past >> >> Salmo mykiss Walbaum, 1792 >> Parasalmo mykiss (Walbaum, 1792) > [...] >> >> Salmo gilberti Jordan, 1894 >> Salmo nelsoni Evermann, 1908 >> > So you probably need a date stamp so you could record things relating to > the correct name for a given period in a mapping table, and still relate > to the same surrogate key for referencing other tables. > > Maybe even worse, is when a species is suddenly found to be 2 or more > distinct species! Funny you should say that. Furry critters instead of slimy: http://news.nationalgeographic.com/2015/07/150730-jackals-wolves-evolution-new-species-animals-africa/ > > Something similar could happen with account numbers: 2 companies with > similar names might be assigned to the same account number, and lots of > transactions recorded before the mistake is discovered. Though obviously > a surrogate key would not give you complete protection from a lot of > work sorting the mess out, but it would probably help! Or if you have a mortgage with Well Fargo and find your account number is being used in their agent training program which explains why you have been receiving all sorts of correspondence saying your account is in arrears and is facing foreclosure(personal experience). Bottom line is databases are great and theory is useful, but it all goes out the window when people start meddling. > > I read on post a year or 2 back, a guy in Europe had at least 4 > different variations on his name depending on the country he was in and > the local language and cultural norms. I am familiar with that issue. > > When I worked at a freezing works in the 1970's in Auckland, I heard > that the pay roll allowed for over 52 different names per employee (per > year?). Though, I was never told the maximum name changes ever used. > Essentially management might fire someone, but the union would complain, > and they would be rehired under a different name - so I was told! So > the correct holiday pay & PAYE tax deductions would still relate to the > same individual no matter how many name changes they had. Or a system I took over where someone had made a natural primary key of first name, last name and that was all. So you had John Smith, John Smith2, etc. Poor design obviously, but that stuff is out there. > > > Cheers, > Gavin > -- Adrian Klaver adrian.klaver@aklaver.com
NB the attribution colours seems to be mixed up a bit here, but this all dialogue between me & Adrian. On 26/08/15 09:48, Adrian Klaver wrote: > On 08/25/2015 02:23 PM, Gavin Flower wrote: >> On 26/08/15 08:56, Adrian Klaver wrote: [...] >>> have all gone to the same seminar on how to be Walmart and decided >>> they did not want unique numbers, but UPCs tied to price groups that >>> covered a variety of plants. Luckily, I was too stupid to >> Natural Stupidity??? :-) > > Oh yeah and a long history too, but that needs at least a pitcher of > beer to recount. Well if you're ever in Auckland, I'll shout you a beer! (We might even put you up for a night or two.) > >> >> (Sorry, couldn't resist!) >> >>> know surrogate keys where bad and had a sequence attached to the tag >>> table. This then became the tag id and made life a lot easier during >>> the transition. It still remains there, because people are people and >>> 'natural' tends to be artificial and transient. >> Extremely good examples, I'll bear them in mind - makes me even more >> keen on surrogate primary keys. I'm always very wary when people tell >> me some numbering scheme will NEVER change!!! > > To add a recent one. My partner Nancy signed up for Medicare last year > to avoid the premium penalty. This year in July she signed up for > Social Security. Turns out, for reasons I do not understand, > CMS(https://www.cms.gov/) changes the Medicare account number at that > point. The reason we even cared is that the billing system thinks she > has two accounts and is double billing. Time on the phone with someone > at CMS was not enlightening. We where told to trust the system and > eventually it will work itself out. Still waiting:( STOP IT!!! You're making me even more cynical and paranoid! :-)
> On Aug 25, 2015, at 1:38 PM, Karsten Hilbert <Karsten.Hilbert@gmx.net> wrote: > >> In most cases developers don’t care about index, unique, foreign key, or primary key names (from a coding standpoint) > > Until the day they’d like to write a reliable database change script. Not sure I understand. Once the object is created the name is set, it does not change, so I don’t understand why it is notpossible to write a reliable database change script. Dump and restore maintain the name. Of course every project hasperiodic scripts that need to run, so these objects would, if they are dropped or manipulated in the script, have to bemanually named, especially during development since the whole database might be dropped and recreated multiple times. My original comment included that situation. My projects typically have many, many objects that once created are not referredto again, unless a DBA is doing some tuning or troubleshooting. In that case, the DBA just looks up the name. I can see if say 2 years later you want to create a development database from the original SQL that generated the originaltable definitions that could be problematic. But I always have used the current definitions not the original andthose can be exported with the current names. It just seems like busy work to me, but I would love to be enlightened. Neil
Adrian Klaver <adrian.klaver@aklaver.com> writes: > On 08/25/2015 01:56 PM, John R Pierce wrote: > >> On 8/25/2015 1:42 PM, Gavin Flower wrote: >>> Account numbers are externally generated, and may potentially change. >>> Management might suddenly decide that they want to start using the >>> year the account started as the first 4 digits, or that the branch >>> code should be reflected in it, or something else. The database >>> should be protected from these arbitrary changes. Hence the >>> account_no is not a good candidate for a primary key. >> >> >> such practices would raise total havoc on a traditional paper ledger >> accounting system as well as things like pending AR/AP where external >> companies will be referencing your account numbers. > > Agreed, but it happens. When Lowes took over a local hardware > chain(Eagles) here in Washington state they moved very quickly on > changing the account numbers. The company I worked for who supplied > Eagles and then Lowes sat on a check for $22,000 that was sent to us > in error because the account numbers got switched. We called them when > we got the check, but it still took them six months to own up to it. DOH! Next time a screwball outfit sends you a check for $22k erroneously just go deposit it :-) > -- > Adrian Klaver > adrian.klaver@aklaver.com -- Jerry Sievers Postgres DBA/Development Consulting e: postgres.consulting@comcast.net p: 312.241.7800
On 08/25/2015 04:23 PM, Jerry Sievers wrote: > Adrian Klaver <adrian.klaver@aklaver.com> writes: > >> On 08/25/2015 01:56 PM, John R Pierce wrote: >> >>> On 8/25/2015 1:42 PM, Gavin Flower wrote: >>>> Account numbers are externally generated, and may potentially change. >>>> Management might suddenly decide that they want to start using the >>>> year the account started as the first 4 digits, or that the branch >>>> code should be reflected in it, or something else. The database >>>> should be protected from these arbitrary changes. Hence the >>>> account_no is not a good candidate for a primary key. >>> >>> >>> such practices would raise total havoc on a traditional paper ledger >>> accounting system as well as things like pending AR/AP where external >>> companies will be referencing your account numbers. >> >> Agreed, but it happens. When Lowes took over a local hardware >> chain(Eagles) here in Washington state they moved very quickly on >> changing the account numbers. The company I worked for who supplied >> Eagles and then Lowes sat on a check for $22,000 that was sent to us >> in error because the account numbers got switched. We called them when >> we got the check, but it still took them six months to own up to it. > > DOH! > > Next time a screwball outfit sends you a check for $22k erroneously just > go deposit it :-) Well that is what I wanted to do, the owner overruled me:(. Something about Lowes having more lawyers then we did. The strange part was we called them and told them what had happened and supplied the relevant information that explained the mix up. You would have thought us calling to return a check that was supposed to be to us would have raised a flag! > >> -- >> Adrian Klaver >> adrian.klaver@aklaver.com > -- Adrian Klaver adrian.klaver@aklaver.com
On 08/25/2015 04:23 PM, Jerry Sievers wrote:Adrian Klaver <adrian.klaver@aklaver.com> writes:On 08/25/2015 01:56 PM, John R Pierce wrote:On 8/25/2015 1:42 PM, Gavin Flower wrote:Account numbers are externally generated, and may potentially change.
Management might suddenly decide that they want to start using the
year the account started as the first 4 digits, or that the branch
code should be reflected in it, or something else. The database
should be protected from these arbitrary changes. Hence the
account_no is not a good candidate for a primary key.
such practices would raise total havoc on a traditional paper ledger
accounting system as well as things like pending AR/AP where external
companies will be referencing your account numbers.
Agreed, but it happens. When Lowes took over a local hardware
chain(Eagles) here in Washington state they moved very quickly on
changing the account numbers. The company I worked for who supplied
Eagles and then Lowes sat on a check for $22,000 that was sent to us
in error because the account numbers got switched. We called them when
we got the check, but it still took them six months to own up to it.
DOH!
Next time a screwball outfit sends you a check for $22k erroneously just
go deposit it :-)
Well that is what I wanted to do, the owner overruled me:(. Something about Lowes having more lawyers then we did. The strange part was we called them and told them what had happened and supplied the relevant information that explained the mix up. You would have thought us calling to return a check that was supposed to be to us would have raised a flag!--
Adrian Klaver
adrian.klaver@aklaver.com
--
Adrian Klaver
adrian.klaver@aklaver.com
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
--
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.
On 26/08/15 11:34, Adrian Klaver wrote: [...] >>> >>> Agreed, but it happens. When Lowes took over a local hardware >>> chain(Eagles) here in Washington state they moved very quickly on >>> changing the account numbers. The company I worked for who supplied >>> Eagles and then Lowes sat on a check for $22,000 that was sent to us >>> in error because the account numbers got switched. We called them when >>> we got the check, but it still took them six months to own up to it. >> >> DOH! >> >> Next time a screwball outfit sends you a check for $22k erroneously just >> go deposit it :-) > > Well that is what I wanted to do, the owner overruled me:(. Something > about Lowes having more lawyers then we did. The strange part was we > called them and told them what had happened and supplied the relevant > information that explained the mix up. You would have thought us > calling to return a check that was supposed to be to us would have > raised a flag! [...] Many years ago a department store credited our account with a refund for about $150, which obviously was not meant for us - we had never bought that item! They replied, essentially saying we we were due the refund. We sent a letter yet again, explaining the problem, and saying we were not entitled. They then sent used a check for the amount, which we cashed, feeling we had done our best to help them & that we could not be bothered wasting more time trying to sort things out for them! I think any judge would laugh them out of court!
On 08/25/2015 05:17 PM, Melvin Davidson wrote: > I think a lot of people here are missing the point. I was trying to give > examples of natural keys, but a lot of people are taking great delight > in pointing out exceptions to examples, rather than understanding the point. > So for the sake of argument, a natural key is something that in itself > is unique and the possibility of a duplicate does not exist. > Before ANYONE continues to insist that a serial id column is good, > consider the case where the number of tuples will exceed a bigint. > Don't say it cannot happen, because it can. > However, if you have an alphanumeric field, let's say varchar 50, and > it's guaranteed that it will never have a duplicate, then THAT is a > natural primary That is a big IF and a guarantee I would not put money on. > key and beats the hell out of a generic "id" field. > > Further to the point, since I started this thread, I am holding to it > and will not discuss "natural primary keys" any further. > -- Adrian Klaver adrian.klaver@aklaver.com
On 08/25/2015 05:21 PM, Gavin Flower wrote: > On 26/08/15 11:34, Adrian Klaver wrote: > [...] >>>> >>>> Agreed, but it happens. When Lowes took over a local hardware >>>> chain(Eagles) here in Washington state they moved very quickly on >>>> changing the account numbers. The company I worked for who supplied >>>> Eagles and then Lowes sat on a check for $22,000 that was sent to us >>>> in error because the account numbers got switched. We called them when >>>> we got the check, but it still took them six months to own up to it. >>> >>> DOH! >>> >>> Next time a screwball outfit sends you a check for $22k erroneously just >>> go deposit it :-) >> >> Well that is what I wanted to do, the owner overruled me:(. Something >> about Lowes having more lawyers then we did. The strange part was we >> called them and told them what had happened and supplied the relevant >> information that explained the mix up. You would have thought us >> calling to return a check that was supposed to be to us would have >> raised a flag! > [...] > > Many years ago a department store credited our account with a refund for > about $150, which obviously was not meant for us - we had never bought > that item! > > They replied, essentially saying we we were due the refund. > > We sent a letter yet again, explaining the problem, and saying we were > not entitled. > > They then sent used a check for the amount, which we cashed, feeling we > had done our best to help them & that we could not be bothered wasting > more time trying to sort things out for them! I have seen this enough to think either companies would rather take the loss then admit the mistake or employees don't care because it is not their money. > > I think any judge would laugh them out of court! > > > -- Adrian Klaver adrian.klaver@aklaver.com
key and beats the hell out of a generic "id" field.However, if you have an alphanumeric field, let's say varchar 50, and it's guaranteed that it will never have a duplicate, then THAT is a natural primaryBefore ANYONE continues to insist that a serial id column is good, consider the case where the number of tuples will exceed a bigint.Don't say it cannot happen, because it can.
On Tue, 2015-08-25 at 20:17 -0400, Melvin Davidson wrote: > I think a lot of people here are missing the point. I was trying to > give examples of natural keys, but a lot of people are taking great > delight > in pointing out exceptions to examples, rather than understanding the > point. > So for the sake of argument, a natural key is something that in > itself is unique and the possibility of a duplicate does not exist. > Before ANYONE continues to insist that a serial id column is good, > consider the case where the number of tuples will exceed a bigint. > Don't say it cannot happen, because it can. > However, if you have an alphanumeric field, let's say varchar 50, and > it's guaranteed that it will never have a duplicate, then THAT is a > natural primary > key and beats the hell out of a generic "id" field. > > Further to the point, since I started this thread, I am holding to it > and will not discuss "natural primary keys" any further. > > Other suggestions for good PostgreSQL Developer database (not web > app) guidelines are still welcome. > Funny how Melvin's attempt to bring order to the chaos ended up as a discussion about primary keys. We once hired a "genius" to design an application to handle fixed assets. Every table had a primary key named "id". Some were integer and some were character. So the foreign key columns in child tables had to be named differently. Writing the joins was complex. I also know of an airline reservation system where you are unable to alter your e-mail address. It apparently needs a DBA type person to make the change. I can only guess that your e-mail address is used as a foreign key in one or more tables. As well as assigning you a frequent flyer number they also assign another integer identifier. A bit of common sense goes a long way when designing an application. Cheers, rob
On Tue, 2015-08-25 at 20:17 -0400, Melvin Davidson wrote:
> I think a lot of people here are missing the point. I was trying to
> give examples of natural keys, but a lot of people are taking great
> delight
> in pointing out exceptions to examples, rather than understanding the
> point.
> So for the sake of argument, a natural key is something that in
> itself is unique and the possibility of a duplicate does not exist.
> Before ANYONE continues to insist that a serial id column is good,
> consider the case where the number of tuples will exceed a bigint.
> Don't say it cannot happen, because it can.
> However, if you have an alphanumeric field, let's say varchar 50, and
> it's guaranteed that it will never have a duplicate, then THAT is a
> natural primary
> key and beats the hell out of a generic "id" field.
>
> Further to the point, since I started this thread, I am holding to it
> and will not discuss "natural primary keys" any further.
>
> Other suggestions for good PostgreSQL Developer database (not web
> app) guidelines are still welcome.
>
Funny how Melvin's attempt to bring order to the chaos ended up as a
discussion about primary keys.
We once hired a "genius" to design an application to handle fixed
assets. Every table had a primary key named "id". Some were integer and
some were character. So the foreign key columns in child tables had to
be named differently. Writing the joins was complex.
I also know of an airline reservation system where you are unable to
alter your e-mail address. It apparently needs a DBA type person to
make the change. I can only guess that your e-mail address is used as a
foreign key in one or more tables. As well as assigning you a frequent
flyer number they also assign another integer identifier. A bit of
common sense goes a long way when designing an application.
Cheers,
rob
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
I am in favour of using BIGINT "id" for the primary key in each table I create.
I found out that in the fields in my tables that I thought would be unique end up not being so in the longer term.
Also these values may need to be updated for some reason.
I have been using PRIMARY KEY(id) where id is of type BIGINT on each table I create.
I use a sequence to provide a default value to this field.
I create one such sequence DB object per table and the use it in the table definition.
For example if I have a sequenceDB "some_schema.some_table_seq" for table "some_schema.some_table".
In the table definition of "some_schema.some_table" I have the field "id" as follows.
id BIGINT NOT NULL DEFAULT NEXTVAL('some_schema.some_table_seq')
When I use this "id" field as a foreign key in another table, I would prefix it with the name of its parent table followed by a couple of underscores as shown below.
FOREIGN KEY(some_table__id)REFERENCES some_schema.some_table(id)ON UPDATE CASCADE ON DELETE CASCADE
For the composite keys that are unique (for now) I create a unique constraint.
On 26-08-2015 10:13, Allan Kamau wrote: > On Wed, Aug 26, 2015 at 5:23 AM, rob stone <floriparob@gmail.com> > wrote: > >> On Tue, 2015-08-25 at 20:17 -0400, Melvin Davidson wrote: >>> I think a lot of people here are missing the point. I was trying >> to >>> give examples of natural keys, but a lot of people are taking >> great >>> delight >>> in pointing out exceptions to examples, rather than understanding >> the >>> point. >>> So for the sake of argument, a natural key is something that in >>> itself is unique and the possibility of a duplicate does not >> exist. >>> Before ANYONE continues to insist that a serial id column is good, >>> consider the case where the number of tuples will exceed a bigint. >>> Don't say it cannot happen, because it can. >>> However, if you have an alphanumeric field, let's say varchar 50, >> and >>> it's guaranteed that it will never have a duplicate, then THAT is >> a >>> natural primary >>> key and beats the hell out of a generic "id" field. >>> >>> Further to the point, since I started this thread, I am holding to >> it >>> and will not discuss "natural primary keys" any further. >>> >>> Other suggestions for good PostgreSQL Developer database (not web >>> app) guidelines are still welcome. >>> >> >> Funny how Melvin's attempt to bring order to the chaos ended up as a >> discussion about primary keys. >> >> We once hired a "genius" to design an application to handle fixed >> assets. Every table had a primary key named "id". Some were integer >> and >> some were character. So the foreign key columns in child tables had >> to >> be named differently. Writing the joins was complex. >> >> I also know of an airline reservation system where you are unable to >> alter your e-mail address. It apparently needs a DBA type person to >> make the change. I can only guess that your e-mail address is used >> as a >> foreign key in one or more tables. As well as assigning you a >> frequent >> flyer number they also assign another integer identifier. A bit of >> common sense goes a long way when designing an application. >> >> Cheers, >> rob >> >> -- >> Sent via pgsql-general mailing list (pgsql-general@postgresql.org) >> To make changes to your subscription: >> http://www.postgresql.org/mailpref/pgsql-general [1] > > I am in favour of using BIGINT "id" for the primary key in each table > I create. > I found out that in the fields in my tables that I thought would be > unique end up not being so in the longer term. > Also these values may need to be updated for some reason. > > I have been using PRIMARY KEY(id) where id is of type BIGINT on each > table I create. > I use a sequence to provide a default value to this field. > I create one such sequence DB object per table and the use it in the > table definition. > For example if I have a sequenceDB "some_schema.some_table_seq" for > table "some_schema.some_table". > In the table definition of "some_schema.some_table" I have the field > "id" as follows. > > id BIGINT NOT NULL DEFAULT NEXTVAL('some_schema.some_table_seq') > > When I use this "id" field as a foreign key in another table, I would > prefix it with the name of its parent table followed by a couple of > underscores as shown below. > FOREIGN KEY(some_table__id)REFERENCES some_schema.some_table(id)ON > UPDATE CASCADE ON DELETE CASCADE > > For the composite keys that are unique (for now) I create a unique > constraint. > > Allan. I recall the words of my professor at last lecture of Databases was telling us that model of thinking as he told: nomenclature is wrong and not good and we should avoid it in any cost if we can. Cheers, Hristo
On Tue, 25 Aug 2015 18:57:28 -0400, Neil Tiffin <neilt@neiltiffin.com> wrote: > >> On Aug 25, 2015, at 1:38 PM, Karsten Hilbert <Karsten.Hilbert@gmx.net> >> wrote: >> >>> In most cases developers don’t care about index, unique, foreign key, >>> or primary key names (from a coding standpoint) >> >> Until the day they’d like to write a reliable database change script. > > Not sure I understand. Once the object is created the name is set, it > does not change, so I don’t understand why it is not possible to write a > reliable database change script. Dump and restore maintain the name. Of > course every project has periodic scripts that need to run, so these > objects would, if they are dropped or manipulated in the script, have to > be manually named, especially during development since the whole > database might be dropped and recreated multiple times. My original > comment included that situation. My projects typically have many, many > objects that once created are not referred to again, unless a DBA is > doing some tuning or troubleshooting. In that case, the DBA just looks > up the name. > > I can see if say 2 years later you want to create a development database > from the original SQL that generated the original table definitions that > could be problematic. But I always have used the current definitions > not the original and those can be exported with the current names. > > It just seems like busy work to me, but I would love to be enlightened. > > Neil I suspect he's alluding to migration scripts from an ORM - which are typically scaffolded with boilerplate, but almost invariably need to be tweaked in order to effect the desired changes in the database.. - John
On Wed, Aug 26, 2015 at 09:04:08AM -0400, John Turner wrote: > >>>In most cases developers don’t care about index, unique, foreign key, > >>>or primary key names (from a coding standpoint) > >> > >>Until the day they’d like to write a reliable database change script. > > > >Not sure I understand. Once the object is created the name is set, it > >does not change, so I don’t understand why it is not possible to write a > >reliable database change script. Dump and restore maintain the name. Of > >course every project has periodic scripts that need to run, so these > >objects would, if they are dropped or manipulated in the script, have to > >be manually named, especially during development since the whole database > >might be dropped and recreated multiple times. My original comment > >included that situation. My projects typically have many, many objects > >that once created are not referred to again, unless a DBA is doing some > >tuning or troubleshooting. In that case, the DBA just looks up the name. > > > >I can see if say 2 years later you want to create a development database > >from the original SQL that generated the original table definitions that > >could be problematic. But I always have used the current definitions not > >the original and those can be exported with the current names. > > > >It just seems like busy work to me, but I would love to be enlightened. > > I suspect he's alluding to migration scripts from an ORM Not in the least. https://github.com/ncqgm/gnumed/tree/master/gnumed/gnumed/server/sql Karsten -- GPG key ID E4071346 @ eu.pool.sks-keyservers.net E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Melvin Davidson
Sent: Tuesday, August 25, 2015 8:18 PM
To: Adrian Klaver <adrian.klaver@aklaver.com>
Cc: Jerry Sievers <gsievers19@comcast.net>; John R Pierce <pierce@hogranch.com>; pgsql-general@postgresql.org
Subject: Re: [GENERAL] PostgreSQL Developer Best Practices
….
Before ANYONE continues to insist that a serial id column is good, consider the case where the number of tuples will exceed a bigint.
Don't say it cannot happen, because it can.
……………………
Melvin Davidson
Now, it’s easy to overcome this limitation.
You just make concatenated PK (id1, id2) with both columns of BIGINT type.
In general, I see the main advantage of artificial PK in NO NEED to change multiple child tables, when NATURAL key changes in the parent table. And I never saw a system where NATURAL key wouldn’t need to be changed eventually.
So, my conclusion: use artificial PK (for db convenience) and unique NATURAL key (for GUI representation).
Regards,
Igor Neyman
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Melvin Davidson
Sent: Tuesday, August 25, 2015 8:18 PM
To: Adrian Klaver <adrian.klaver@aklaver.com>
Cc: Jerry Sievers <gsievers19@comcast.net>; John R Pierce <pierce@hogranch.com>; pgsql-general@postgresql.org
Subject: Re: [GENERAL] PostgreSQL Developer Best Practices
….
Before ANYONE continues to insist that a serial id column is good, consider the case where the number of tuples will exceed a bigint.
Don't say it cannot happen, because it can.
……………………
Melvin Davidson
Now, it’s easy to overcome this limitation.
You just make concatenated PK (id1, id2) with both columns of BIGINT type.
In general, I see the main advantage of artificial PK in NO NEED to change multiple child tables, when NATURAL key changes in the parent table. And I never saw a system where NATURAL key wouldn’t need to be changed eventually.
So, my conclusion: use artificial PK (for db convenience) and unique NATURAL key (for GUI representation).
On 08/25/2015 05:28 PM, Adrian Klaver wrote: > On 08/25/2015 05:17 PM, Melvin Davidson wrote: >> I think a lot of people here are missing the point. I was trying to give >> examples of natural keys, but a lot of people are taking great delight >> in pointing out exceptions to examples, rather than understanding the >> point. >> So for the sake of argument, a natural key is something that in itself >> is unique and the possibility of a duplicate does not exist. Correct. >> Before ANYONE continues to insist that a serial id column is good, >> consider the case where the number of tuples will exceed a bigint. >> Don't say it cannot happen, because it can. Yes it can. >> However, if you have an alphanumeric field, let's say varchar 50, and >> it's guaranteed that it will never have a duplicate, then THAT is a >> natural primary Wrong. Refer back to your above definition. It is definitely possible, based on a varchar(50) that a duplicate will happen. A better definition would be something along the lines of: A natural key is distinct and is derived from the data being stored. > > That is a big IF and a guarantee I would not put money on. Right, here is a perfect example. Generally speaking if you are storing a United States company's information, a natural primary key could be an FEIN. However, there is an exception that would have to be incorporated into that idea. If the company is a Sole Proprietorship the FEIN may actually be the SSN of the owner, but not necessarily. Then you have to ask yourself if that matters. It may not depending on the application you are building or the reason the data is being stored. > >> key and beats the hell out of a generic "id" field. >> >> Further to the point, since I started this thread, I am holding to it >> and will not discuss "natural primary keys" any further. That doesn't mean others won't. JD -- Command Prompt, Inc. - http://www.commandprompt.com/ 503-667-4564 PostgreSQL Centered full stack support, consulting and development. Announcing "I'm offended" is basically telling the world you can't control your own emotions, so everyone else should do it for you.
Melvin Davidson wrote: > Before ANYONE continues to insist that a serial id column is good, consider > the case where the number of tuples will exceed a bigint. > Don't say it cannot happen, because it can. In practice, it cannot happen. A tuple with a bigint column weighs at least 32 bytes (in the sense that it's what pg_column_size("table".*) reports when "table" has only a bigint column). So the size of your hypothetical table would be at a minimum 32 bytes * 2^63 tuples = 2^68 bytes But a postgres table size weighs 32TB max, or 2^45 bytes (see http://www.postgresql.org/about/ ) So the table with more rows than a bigint can count would have to be 2^23 (=8388608) times bigger than the biggest possible table. Also there's the fact that COUNT() returns a BIGINT, so the tuples couldn't be counted in SQL. That by itself hints at the fact that counts of tuples are expected to always fit in BIGINT these days. Also what about pg_database_size() returning a bigint? Even if the hypothetical table was alone in the database, and even if every tuple occupied only 1 byte instead of 32+, the single table would exceed what pg_database_size() can report. Maybe at some point all these will be 128 bits, but that's years ahead. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite
On 26/08/15 12:17, Melvin Davidson wrote: [...] > So for the sake of argument, a natural key is something that in itself > is unique and the possibility of a duplicate does not exist. > Before ANYONE continues to insist that a serial id column is good, > consider the case where the number of tuples will exceed a bigint. > Don't say it cannot happen, because it can. > If you create tuples in your table at the rate of a million records per second every hour of every day of every year, it will take over 250,000 years to exceed the value of a bigint! [...] > > > -- > *Melvin Davidson* > I reserve the right to fantasize. Whether or not you > wish to share my fantasy is entirely up to you. I will refrain from marking snarky comments about your sig - tempting though it might be! :-)