Thread: PostgreSQL Developer Best Practices

PostgreSQL Developer Best Practices

From
Melvin Davidson
Date:
I've been searching for a "PostgreSQL Developer Best Practices" with not much luck,
so I've started my own. At the risk of stirring up a storm of controversy, I would
appreciate additional suggestions and feedback.

As a way of clarifying, generally, a DBA is someone that is responsible for maintaining the integrity of the database, while a developer is someone that writes code and SQL statements to update the data.

I've attached a file with a few starters that although are numbered, are in no special order.

Please keep in mind the attached are purely based on my years of experience working with developers that are not familiar with PostgreSQL and are not hard and fast rules, but general guidelines.

Hopefully this will result in something that brings about harmony between PostgreSQL DBA's and Developers.

--
Melvin Davidson

Attachment

Re: PostgreSQL Developer Best Practices

From
Karsten Hilbert
Date:
On Sat, Aug 22, 2015 at 11:15:07AM -0400, Melvin Davidson wrote:

> PostgreSQL Developer Best Practices
>
> 1. Prefix ALL literals with an Escape
>    EG:  SELECT E'This is a \'quoted literal \'';
>         SELECT E'This is an unquoted literal';
>
>    Doing so will prevent the annoying "WARNING:  nonstandard use of escape in a string literal"

This is certainly not Best Practice as the warning is
annoying for a reason.

Best Practice would rather be something along the lines:

    Avoid coding in a way that triggers "WARNING:
    nonstandard use of escape in a string literal". If you
    cannot comply with this rule document your reasons.

>         Good example:
>         CREATE TABLE accounts
>         ( accout_id bigint NOT NULL ,

Typo.

Karsten
--
GPG key ID E4071346 @ eu.pool.sks-keyservers.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346


Re: PostgreSQL Developer Best Practices

From
Melvin Davidson
Date:
>This is certainly not Best Practice as the warning is
>annoying for a reason.

>Best Practice would rather be something along the lines:

  >      Avoid coding in a way that triggers "WARNING:
  >      nonstandard use of escape in a string literal". If you
  >      cannot comply with this rule document your reasons.

Thanks for the suggestion. For the past few months I've been dealing with an error log that is filled with these warnings simply because
the developers do not comprehend how to use ( or the requirement to use)  an escape clause.

>         Good example:
>         CREATE TABLE accounts
>         ( accout_id bigint NOT NULL ,

Typo.

So noted, I'll correct.

On Sat, Aug 22, 2015 at 11:46 AM, Karsten Hilbert <Karsten.Hilbert@gmx.net> wrote:
On Sat, Aug 22, 2015 at 11:15:07AM -0400, Melvin Davidson wrote:

> PostgreSQL Developer Best Practices
>
> 1. Prefix ALL literals with an Escape
>    EG:  SELECT E'This is a \'quoted literal \'';
>         SELECT E'This is an unquoted literal';
>
>    Doing so will prevent the annoying "WARNING:  nonstandard use of escape in a string literal"

This is certainly not Best Practice as the warning is
annoying for a reason.

Best Practice would rather be something along the lines:

        Avoid coding in a way that triggers "WARNING:
        nonstandard use of escape in a string literal". If you
        cannot comply with this rule document your reasons.

>         Good example:
>         CREATE TABLE accounts
>         ( accout_id bigint NOT NULL ,

Typo.

Karsten
--
GPG key ID E4071346 @ eu.pool.sks-keyservers.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



--
Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.

Re: PostgreSQL Developer Best Practices

From
Tom Lane
Date:
Melvin Davidson <melvin6925@gmail.com> writes:
>> Best Practice would rather be something along the lines:
>>>> Avoid coding in a way that triggers "WARNING:
>>>> nonstandard use of escape in a string literal". If you
>>>> cannot comply with this rule document your reasons.

> Thanks for the suggestion. For the past few months I've been dealing with
> an error log that is filled with these warnings simply because
> the developers do not comprehend how to use ( or the requirement to use)
> an escape clause.

IMO best practice in this area is "run with standard_conforming_strings = ON".
If you're seeing this warning at all, it's because you aren't doing that,
which means your code is unnecessarily unportable to other DBMSes.
Adopting a coding policy of always using E'' would make that worse.

            regards, tom lane


Re: PostgreSQL Developer Best Practices

From
Melvin Davidson
Date:
Tom,

Thank you for pointing out "run with standard_conforming_strings = ON"..
However, that is NOT the problem.
What is occurring is that the developers are sending strings like 'Mr. M\'vey',
which, if we set standard_conforming_strings = ON, would, and does, result in errors and the statement failing,
which is a lot less desirable that a simple warning.

Therefore, I am trying to educate the developers in the proper method of escaping strings,
instead of loading up the error log with annoying warnings.

On Sat, Aug 22, 2015 at 1:16 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Melvin Davidson <melvin6925@gmail.com> writes:
>> Best Practice would rather be something along the lines:
>>>> Avoid coding in a way that triggers "WARNING:
>>>> nonstandard use of escape in a string literal". If you
>>>> cannot comply with this rule document your reasons.

> Thanks for the suggestion. For the past few months I've been dealing with
> an error log that is filled with these warnings simply because
> the developers do not comprehend how to use ( or the requirement to use)
> an escape clause.

IMO best practice in this area is "run with standard_conforming_strings = ON".
If you're seeing this warning at all, it's because you aren't doing that,
which means your code is unnecessarily unportable to other DBMSes.
Adopting a coding policy of always using E'' would make that worse.

                        regards, tom lane



--
Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.

Re: PostgreSQL Developer Best Practices

From
Andy Colson
Date:
> On Sat, Aug 22, 2015 at 1:16 PM, Tom Lane <tgl@sss.pgh.pa.us <mailto:tgl@sss.pgh.pa.us>> wrote:
>
>     Melvin Davidson <melvin6925@gmail.com <mailto:melvin6925@gmail.com>> writes:
>     >> Best Practice would rather be something along the lines:
>     >>>> Avoid coding in a way that triggers "WARNING:
>     >>>> nonstandard use of escape in a string literal". If you
>     >>>> cannot comply with this rule document your reasons.
>
>     > Thanks for the suggestion. For the past few months I've been dealing with
>     > an error log that is filled with these warnings simply because
>     > the developers do not comprehend how to use ( or the requirement to use)
>     > an escape clause.
>
>     IMO best practice in this area is "run with standard_conforming_strings = ON".
>     If you're seeing this warning at all, it's because you aren't doing that,
>     which means your code is unnecessarily unportable to other DBMSes.
>     Adopting a coding policy of always using E'' would make that worse.
>
>                              regards, tom lane
>

On 08/22/2015 02:40 PM, Melvin Davidson wrote:
> Tom,
>
> Thank you for pointing out "run with standard_conforming_strings = ON"..
> However, that is NOT the problem.
> What is occurring is that the developers are sending strings like 'Mr. M\'vey',
> which, if we set standard_conforming_strings = ON, would, and does, result in errors and the statement failing,
> which is a lot less desirable that a simple warning.
>
> Therefore, I am trying to educate the developers in the proper method of escaping strings,
> instead of loading up the error log with annoying warnings.
>


Please dont top post.

But you are not educating them correctly.  Using E'' isnt right.  The correct way to escape a quote is to double quote
it: 'Mr. M''vey' 

-Andy


Re: PostgreSQL Developer Best Practices

From
Melvin Davidson
Date:
>The correct way to escape a quote is to double quote it:  'Mr. M''vey'

That is a matter of opinion. However, the real problem is the enclosed backslashes, which is
beyond our control at this point. Therefore, the best solution is to use ESCAPE E.

On Sat, Aug 22, 2015 at 3:49 PM, Andy Colson <andy@squeakycode.net> wrote:
On Sat, Aug 22, 2015 at 1:16 PM, Tom Lane <tgl@sss.pgh.pa.us <mailto:tgl@sss.pgh.pa.us>> wrote:

    Melvin Davidson <melvin6925@gmail.com <mailto:melvin6925@gmail.com>> writes:
    >> Best Practice would rather be something along the lines:
    >>>> Avoid coding in a way that triggers "WARNING:
    >>>> nonstandard use of escape in a string literal". If you
    >>>> cannot comply with this rule document your reasons.

    > Thanks for the suggestion. For the past few months I've been dealing with
    > an error log that is filled with these warnings simply because
    > the developers do not comprehend how to use ( or the requirement to use)
    > an escape clause.

    IMO best practice in this area is "run with standard_conforming_strings = ON".
    If you're seeing this warning at all, it's because you aren't doing that,
    which means your code is unnecessarily unportable to other DBMSes.
    Adopting a coding policy of always using E'' would make that worse.

                             regards, tom lane


On 08/22/2015 02:40 PM, Melvin Davidson wrote:
Tom,

Thank you for pointing out "run with standard_conforming_strings = ON"..
However, that is NOT the problem.
What is occurring is that the developers are sending strings like 'Mr. M\'vey',
which, if we set standard_conforming_strings = ON, would, and does, result in errors and the statement failing,
which is a lot less desirable that a simple warning.

Therefore, I am trying to educate the developers in the proper method of escaping strings,
instead of loading up the error log with annoying warnings.



Please dont top post.

But you are not educating them correctly.  Using E'' isnt right.  The correct way to escape a quote is to double quote it:  'Mr. M''vey'

-Andy



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



--
Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.

Re: PostgreSQL Developer Best Practices

From
"David G. Johnston"
Date:
On Saturday, August 22, 2015, Melvin Davidson <melvin6925@gmail.com> wrote:
>The correct way to escape a quote is to double quote it:  'Mr. M''vey'

That is a matter of opinion. However, the real problem is the enclosed backslashes, which is
beyond our control at this point. Therefore, the best solution is to use ESCAPE E.


Why is this a best practice and not just "how things work"?  If you want to use backlash escapes you use E''.  If you don't use a backslash escape it doesn't matter - except if you are writing a backslash and don't want to have to escape it.

Beyond that turn your idea of best practice into a requirement and enable standard-conforming-strings.

Always using E'' is pragmatic advice but hardly worthy of being considered best practice.  The best practice is to write code in such a way that you can leave standard conforming strings off AND not generate any warnings.

David J.


Re: PostgreSQL Developer Best Practices

From
John R Pierce
Date:
On 8/22/2015 12:40 PM, Melvin Davidson wrote:
> What is occurring is that the developers are sending strings like 'Mr.
> M\'vey',
> which, if we set standard_conforming_strings = ON, would, and does,
> result in errors and the statement failing,
> which is a lot less desirable that a simple warning.

if your developers were writing invalid C code (or whatever other
language they program in), would you modify the compiler to accept their
invalid syntax?   or would you tell the developer to fix their code
properly ?   if the developers refused, why would you not fire them on
the spot for incompetence ?



--
john r pierce, recycling bits in santa cruz



Re: PostgreSQL Developer Best Practices

From
Melvin Davidson
Date:
John,

I believe you and I think alike. The truth is, I was brought on as a consultant to help this client, so I do not have the authority to fire the developers. Rather, I am trying to help them fix the absolute worst designed DB and coding I have seen in 15 years of working with PostgreSQL. So I've asked for input on additional guidelines to help try to help them understand the right way to do things.

Unfortunately, so far, people seem to fixate on item one of my guidelines and I've had no additional suggestions.


On Sat, Aug 22, 2015 at 4:37 PM, John R Pierce <pierce@hogranch.com> wrote:
On 8/22/2015 12:40 PM, Melvin Davidson wrote:
What is occurring is that the developers are sending strings like 'Mr. M\'vey',
which, if we set standard_conforming_strings = ON, would, and does, result in errors and the statement failing,
which is a lot less desirable that a simple warning.

if your developers were writing invalid C code (or whatever other language they program in), would you modify the compiler to accept their invalid syntax?   or would you tell the developer to fix their code properly ?   if the developers refused, why would you not fire them on the spot for incompetence ?



--
john r pierce, recycling bits in santa cruz




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



--
Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.

Re: PostgreSQL Developer Best Practices

From
"David G. Johnston"
Date:
On Sat, Aug 22, 2015 at 7:33 PM, Melvin Davidson <melvin6925@gmail.com> wrote:
John,

I believe you and I think alike. The truth is, I was brought on as a consultant to help this client, so I do not have the authority to fire the developers. Rather, I am trying to help them fix the absolute worst designed DB and coding I have seen in 15 years of working with PostgreSQL. So I've asked for input on additional guidelines to help try to help them understand the right way to do things.

Unfortunately, so far, people seem to fixate on item one of my guidelines and I've had no additional suggestions.


​​So is this a style guide for this one client or a best-practices guide?  From the subsequent elaboration I'll assume it is a style guide for a client...

1. OK, better they simply do this going forward than break existing warnings-only stuff by enabling standard conforming strings.
2. I do not get how the advice reconciles with the comment.  Besides, presumes an unstated client application to actually execute those statements.  Again, this is why it is a style guide for one client and not a general purpose best practices.
3. Yes -  though to be honest this only matters at the top-level of the query.  Sub-queries can make use of "SELECT *" without the performance downside (I think) and so there boils down to communication with the reader.
4. Good general advice but a handful of simplistic examples seems like inadequate training.
5. I would include how and when to use underscores.
6. My argument here would be that since in some cases you must use the constraint syntax (e.g., multi-column PK) it should be used in all cases for consistency.  If you are going to recommend they provide their own name the form of that name should be specified.  But the default is usually adequate so that seems like a very minor point to bring up and divert concentration and memory too.
7. (going from memory) Why hasn't PostgreSQL adopted the MySQL syntax of allowing "COMMENT ..." directly within CREATE DDL?  Especially for column comments.
8. Agreed;  I find this potentially opens a "do you prefix all table columns" discussion and have seen arguments to the effect of "but my ORM expects the 'id' convention'.
9. I agree with the sentiment but the example and extent of explanation seems lacking IMO...
10. Style.

Mechanics: 1, 3, 5, 6
Usability: 4, 7, 8
Style: 2,10
Modelling: 9

Based upon your definition of DBA only the following are in-scope:
1 - applications works but logs are full of warnings
3 - application consumes more resources than needed
7 - helps the DBA understand what data is in the database
10 - toss the DBA a bone by having all of their indexes have a consistent form.

If you want to expand the DBA role to application support and maintenance some of the other items would possibly come into scope.

In so far as a poorly defined model can load the database #9 is important...but I would not mix PostgreSQL style and usage suggestions with data modelling education.

David J.

Re: PostgreSQL Developer Best Practices

From
Thomas Kellerer
Date:
Melvin Davidson schrieb am 22.08.2015 um 21:40:
> Thank you for pointing out "run with standard_conforming_strings = ON"..
> However, that is NOT the problem.
> What is occurring is that the developers are sending strings like 'Mr. M\'vey',
> which, if we set standard_conforming_strings = ON, would, and does, result in errors and the statement failing,
> which is a lot less desirable that a simple warning.
>
> Therefore, I am trying to educate the developers in the proper method of escaping strings,
> instead of loading up the error log with annoying warnings.


I strongly disagree: the error is not "annoying" and the statement _should_ fail.

The only way you can make the developers stop using that non-standard syntax is to make the satement fail.

Thomas

Re: PostgreSQL Developer Best Practices

From
Thomas Kellerer
Date:
Melvin Davidson schrieb am 22.08.2015 um 17:15:
> I've attached a file with a few starters that although are numbered,
> are in no special order.


> 2. End ALL queries with a semi-colon (;)
>    EG: SELECT some_column FROM a_table;
>
>    Although autocommit is on by default, it is always a good idea to signal the query processor that a statement is
completewith the semicolon.  
>    Failure to do so could result in <IDLE IN TRANSACTION>, which will
>    hold locks on the tables involved and prevent other queries from being processed.

Terminating a statement with ; has nothing to do with "<idle in transaction>" connections.
It is a mere syntax thing to make the SQL client (e.g. psql) recognize the end of the statement.
If you don't use it, your statement won't be executed in the first place - at least with psql
as it will wait indefinitely until you finish typing the statement. A GUI client might simply send
the wrong statement to the backend.

If you run with autocommit disabled, ending each statement with a semicolon, will not prevent your connection
from getting into that "<idle in transaction>" state. You have to end the _transaction_ using commit or
rollback to avoid that.

I do agree with the "end all queries with a semi-colon" rule, but the explanation is wrong.

You should have another rule that says:

   End all transactions as soon as possible using commit or rollback.

Thomas



Re: PostgreSQL Developer Best Practices

From
Melvin Davidson
Date:
David,

Thank you. This is exactly the type of feedback I was looking for. To answer your question, for now it is a guide for one particular client, however, if I get enough feedback and contributions, I will revise it and submit to the PostgreSQL community.

On Mon, Aug 24, 2015 at 2:34 AM, Thomas Kellerer <spam_eater@gmx.net> wrote:
Melvin Davidson schrieb am 22.08.2015 um 17:15:
> I've attached a file with a few starters that although are numbered,
> are in no special order.


> 2. End ALL queries with a semi-colon (;)
>    EG: SELECT some_column FROM a_table;
>
>    Although autocommit is on by default, it is always a good idea to signal the query processor that a statement is complete with the semicolon.
>    Failure to do so could result in <IDLE IN TRANSACTION>, which will
>    hold locks on the tables involved and prevent other queries from being processed.

Terminating a statement with ; has nothing to do with "<idle in transaction>" connections.
It is a mere syntax thing to make the SQL client (e.g. psql) recognize the end of the statement.
If you don't use it, your statement won't be executed in the first place - at least with psql
as it will wait indefinitely until you finish typing the statement. A GUI client might simply send
the wrong statement to the backend.

If you run with autocommit disabled, ending each statement with a semicolon, will not prevent your connection
from getting into that "<idle in transaction>" state. You have to end the _transaction_ using commit or
rollback to avoid that.

I do agree with the "end all queries with a semi-colon" rule, but the explanation is wrong.

You should have another rule that says:

   End all transactions as soon as possible using commit or rollback.

Thomas





--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



--
Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.

Re: PostgreSQL Developer Best Practices

From
Ray Cote
Date:
On Sat, Aug 22, 2015 at 11:46 AM, Karsten Hilbert <Karsten.Hilbert@gmx.net> wrote:
> 1. Prefix ALL literals with an Escape
>    EG:  SELECT E'This is a \'quoted literal \'';
>         SELECT E'This is an unquoted literal';
>
>    Doing so will prevent the annoying "WARNING:  nonstandard use of escape in a string literal"

I'd be concerned that what is missing here is the bigger issue of  Best Practice #0: Use Bound Variables. 
The only way I've seen invalid literals show up in SQL queries is through the dynamic generation of SQL Statements vs. using bound variables. 
Not using bound variables is your doorway to SQL injection exploits.


9. Do NOT arbitrarily assign an "id" column to a table as a primary key when other columns
    are perfectly suited as a unique primary key.
... 
        Good example:
        CREATE TABLE accounts
        ( accout_id bigint NOT NULL ,

I would not consider the general use of natural primary keys to be best practice. 
Let's assume your account_id field is used as a foreign key in a dozen other tables.
1) What happens if someone mis-types the account-id?
     To correct that, you also need to correct the FK field in the other dozen tables.
2) What happens when your company starts a new project (or buys a competitor) and all the new account numbers are alpha-numeric?
3) Your example shows the id as a bigint, but your rule is not limited to integers. 
What if your table is country populations and the primary key is country name?
Now, you have quite large foreign keys (and a country changing its name is not unheard of).
(and let's not even get started on case-sensitivity or character encodings).

--
Raymond Cote, President
voice: +1.603.924.6079 email: rgacote@AppropriateSolutions.com skype: ray.cote


Re: PostgreSQL Developer Best Practices

From
Melvin Davidson
Date:
9.
>1) What happens if someone mis-types the account-id?
>     To correct that, you also need to correct the FK field in the other dozen tables.
>2) What happens when your company starts a new project (or buys a competitor) >and all the new account numbers are alpha-numeric?

I would reply that in good applications, the user DOES NOT type the key, but rather selects from a drop down list, or the app looks it up / enters it for them. Besides, it's just as easy to miskey an integer as it is an aplha numeric. The point is, do not create two primary pkey's when one will do.

On Mon, Aug 24, 2015 at 9:15 AM, Ray Cote <rgacote@appropriatesolutions.com> wrote:
On Sat, Aug 22, 2015 at 11:46 AM, Karsten Hilbert <Karsten.Hilbert@gmx.net> wrote:
> 1. Prefix ALL literals with an Escape
>    EG:  SELECT E'This is a \'quoted literal \'';
>         SELECT E'This is an unquoted literal';
>
>    Doing so will prevent the annoying "WARNING:  nonstandard use of escape in a string literal"

I'd be concerned that what is missing here is the bigger issue of  Best Practice #0: Use Bound Variables. 
The only way I've seen invalid literals show up in SQL queries is through the dynamic generation of SQL Statements vs. using bound variables. 
Not using bound variables is your doorway to SQL injection exploits.


9. Do NOT arbitrarily assign an "id" column to a table as a primary key when other columns
    are perfectly suited as a unique primary key.
... 
        Good example:
        CREATE TABLE accounts
        ( accout_id bigint NOT NULL ,

I would not consider the general use of natural primary keys to be best practice. 
Let's assume your account_id field is used as a foreign key in a dozen other tables.
1) What happens if someone mis-types the account-id?
     To correct that, you also need to correct the FK field in the other dozen tables.
2) What happens when your company starts a new project (or buys a competitor) and all the new account numbers are alpha-numeric?
3) Your example shows the id as a bigint, but your rule is not limited to integers. 
What if your table is country populations and the primary key is country name?
Now, you have quite large foreign keys (and a country changing its name is not unheard of).
(and let's not even get started on case-sensitivity or character encodings).

--
Raymond Cote, President
voice: +1.603.924.6079 email: rgacote@AppropriateSolutions.com skype: ray.cote





--
Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.

Re: PostgreSQL Developer Best Practices

From
"David G. Johnston"
Date:
On Mon, Aug 24, 2015 at 9:27 AM, Melvin Davidson <melvin6925@gmail.com> wrote:
9.
>1) What happens if someone mis-types the account-id?
>     To correct that, you also need to correct the FK field in the other dozen tables.
>2) What happens when your company starts a new project (or buys a competitor) >and all the new account numbers are alpha-numeric?

I would reply that in good applications, the user DOES NOT type the key, but rather selects from a drop down list, or the app looks it up / enters it for them. Besides, it's just as easy to miskey an integer as it is an aplha numeric. The point is, do not create two primary pkey's when one will do.

​Your missing the point.  The existing "Account ID" that you refer to is apparently externally defined.  Pretend it is a social security number.  How would one create a new user in your system, and record their account_id/social-security-number, without typing it in.  What then if it is discovered that the keyed in value was mis-typed?

​The "point" is to not introduce redundant information.  Creating your own surrogate identifier in order to avoid using a surrogate identifier value created by another system does not introduce redundancy but rather provides the system using the primary key control over its generation and, more importantly, format.  The highly situational nature of this is why "data modelling" is not something I'd incorporate in a "usage" document.​

David J.

Re: PostgreSQL Developer Best Practices

From
"David G. Johnston"
Date:
On Mon, Aug 24, 2015 at 9:15 AM, Ray Cote <rgacote@appropriatesolutions.com> wrote:
On Sat, Aug 22, 2015 at 11:46 AM, Karsten Hilbert <Karsten.Hilbert@gmx.net> wrote:
> 1. Prefix ALL literals with an Escape
>    EG:  SELECT E'This is a \'quoted literal \'';
>         SELECT E'This is an unquoted literal';
>
>    Doing so will prevent the annoying "WARNING:  nonstandard use of escape in a string literal"

I'd be concerned that what is missing here is the bigger issue of  Best Practice #0: Use Bound Variables. 
The only way I've seen invalid literals show up in SQL queries is through the dynamic generation of SQL Statements vs. using bound variables. 
Not using bound variables is your doorway to SQL injection exploits.

​SELECT * FROM joblist WHERE job_status = 'Active';  is not at risk of an exploit...but your point is still a good one.

The other area where this is likely to crop up is in using regular expressions.  From that use case alone I've learned to only use E'' when I need the escaping behavior of the blackslash.  Since you rare need that when constructing a regexp I would rare write a regexp literal using E''.

David J.

Re: PostgreSQL Developer Best Practices

From
Melvin Davidson
Date:
>What then if it is discovered that the keyed in value was mis-typed?

That is why SQL has UPDATE and DELETE statements. If a primary key is incorrect,
it can be fixed, be it one method of another.

On Mon, Aug 24, 2015 at 10:04 AM, David G. Johnston <david.g.johnston@gmail.com> wrote:
On Mon, Aug 24, 2015 at 9:27 AM, Melvin Davidson <melvin6925@gmail.com> wrote:
9.
>1) What happens if someone mis-types the account-id?
>     To correct that, you also need to correct the FK field in the other dozen tables.
>2) What happens when your company starts a new project (or buys a competitor) >and all the new account numbers are alpha-numeric?

I would reply that in good applications, the user DOES NOT type the key, but rather selects from a drop down list, or the app looks it up / enters it for them. Besides, it's just as easy to miskey an integer as it is an aplha numeric. The point is, do not create two primary pkey's when one will do.

​Your missing the point.  The existing "Account ID" that you refer to is apparently externally defined.  Pretend it is a social security number.  How would one create a new user in your system, and record their account_id/social-security-number, without typing it in.  What then if it is discovered that the keyed in value was mis-typed?

​The "point" is to not introduce redundant information.  Creating your own surrogate identifier in order to avoid using a surrogate identifier value created by another system does not introduce redundancy but rather provides the system using the primary key control over its generation and, more importantly, format.  The highly situational nature of this is why "data modelling" is not something I'd incorporate in a "usage" document.​

David J.



--
Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.

Re: PostgreSQL Developer Best Practices

From
"David G. Johnston"
Date:
On Mon, Aug 24, 2015 at 10:32 AM, Melvin Davidson <melvin6925@gmail.com> wrote:
>What then if it is discovered that the keyed in value was mis-typed?

That is why SQL has UPDATE and DELETE statements. If a primary key is incorrect,
it can be fixed, be it one method of another. 

​Yes, a DBA can use ON DELETE CASCADE and ON UPDATE CASCADE​
 
​to manually resolve the issue of a typo.    At scale it is not a clear-cut solution, however.

David J.

Re: PostgreSQL Developer Best Practices

From
John Turner
Date:
On Mon, 24 Aug 2015 09:15:27 -0400, Ray Cote <rgacote@appropriatesolutions.com> wrote:

9. Do NOT arbitrarily assign an "id" column to a table as a primary key when other columns
    are perfectly suited as a unique primary key.
... 
        Good example:
        CREATE TABLE accounts
        ( accout_id bigint NOT NULL ,

I would not consider the general use of natural primary keys to be best practice. 
Let's assume your account_id field is used as a foreign key in a dozen other tables.
1) What happens if someone mis-types the account-id?
     To correct that, you also need to correct the FK field in the other dozen tables.
2) What happens when your company starts a new project (or buys a competitor) and all the new account numbers are alpha-numeric?

Point 9 is well-intentioned, but perhaps needs to be clarified/rephrased:  Developers should not be creating production-grade tables devoid of well-defined business keys, period. That would be regardless of whether they're used as de facto primary keys or simply as unique keys.

As long as that is made clear as a foundational requirement, then developers should be allowed some leeway as to the subsequent design choice between synthetic vs natural keys.  Further to the above remarks, offering some guidelines on the trade-offs would be beneficial.  E.g., if natural keys are chosen as Primary, it's likely that cascading mechanisms ought to be implemented.  Conversely, if synthetic keys are chosen as Primary, they must be accompanied by a legitimate Unique natural key.

- John

Re: PostgreSQL Developer Best Practices

From
"Joshua D. Drake"
Date:
On 08/24/2015 07:58 AM, John Turner wrote:
> On Mon, 24 Aug 2015 09:15:27 -0400, Ray Cote

>
> Point 9 is well-intentioned, but perhaps needs to be
> clarified/rephrased:  Developers should not be creating production-grade
> tables devoid of well-defined business keys, period. That would be
> regardless of whether they're used as de facto primary keys or simply as
> unique keys.

Although I appreciate your argument, I think we need a little foundation
in reality. The "serial" key is the default primary key amongst every
single web development environment in existence.

We can make an argument within the doc to why that can be bad, but to
state that it is "wrong" is just not going to get you anywhere.

JD


--
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Announcing "I'm offended" is basically telling the world you can't
control your own emotions, so everyone else should do it for you.


Re: PostgreSQL Developer Best Practices

From
Melvin Davidson
Date:
>The "serial" key is the default primary key amongst every single web development environment in existence.

Methinks thou doest take too much for granted.

Yes, serial has it's purpose, but I sincerely doubt it is "the default primary key amongst every single web development environment in existence"
I am not sure where you get your stats from. Probably you are referring to "Ruby on Rails". IMHO, RoR is something which has made it easier to code Web apps, at the cost of developers not needing to use brain power. In any case, the idea is to develop good database design. not web apps.

On Mon, Aug 24, 2015 at 11:46 AM, Joshua D. Drake <jd@commandprompt.com> wrote:
On 08/24/2015 07:58 AM, John Turner wrote:
On Mon, 24 Aug 2015 09:15:27 -0400, Ray Cote


Point 9 is well-intentioned, but perhaps needs to be
clarified/rephrased:  Developers should not be creating production-grade
tables devoid of well-defined business keys, period. That would be
regardless of whether they're used as de facto primary keys or simply as
unique keys.

Although I appreciate your argument, I think we need a little foundation in reality. The "serial" key is the default primary key amongst every single web development environment in existence.

We can make an argument within the doc to why that can be bad, but to state that it is "wrong" is just not going to get you anywhere.

JD


--
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Announcing "I'm offended" is basically telling the world you can't
control your own emotions, so everyone else should do it for you.


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



--
Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.

Re: PostgreSQL Developer Best Practices

From
Adrian Klaver
Date:
On 08/24/2015 08:56 AM, Melvin Davidson wrote:
>  >The "serial" key is the default primary key amongst every single web
> development environment in existence.
>
> Methinks thou doest take too much for granted.
>
> Yes, serial has it's purpose, but I sincerely doubt it is "the default
> primary key amongst every single web development environment in existence"
> I am not sure where you get your stats from. Probably you are referring
> to "Ruby on Rails". IMHO, RoR is something which has made it easier to
> code Web apps, at the cost of developers not needing to use brain power.
> In any case, the idea is to develop good database design. not web apps.

Well I think the broader definition is that surrogate key use is a
byproduct of ORM use. Web development frameworks tend to use an ORM as
the default way of interacting with the database so you get surrogate keys.

>
> On Mon, Aug 24, 2015 at 11:46 AM, Joshua D. Drake <jd@commandprompt.com
> <mailto:jd@commandprompt.com>> wrote:
>
>     On 08/24/2015 07:58 AM, John Turner wrote:
>
>         On Mon, 24 Aug 2015 09:15:27 -0400, Ray Cote
>
>
>
>         Point 9 is well-intentioned, but perhaps needs to be
>         clarified/rephrased:  Developers should not be creating
>         production-grade
>         tables devoid of well-defined business keys, period. That would be
>         regardless of whether they're used as de facto primary keys or
>         simply as
>         unique keys.
>
>
>     Although I appreciate your argument, I think we need a little
>     foundation in reality. The "serial" key is the default primary key
>     amongst every single web development environment in existence.
>
>     We can make an argument within the doc to why that can be bad, but
>     to state that it is "wrong" is just not going to get you anywhere.
>
>     JD
>
>
>     --
>     Command Prompt, Inc. - http://www.commandprompt.com/ 503-667-4564
>     <tel:503-667-4564>
>     PostgreSQL Centered full stack support, consulting and development.
>     Announcing "I'm offended" is basically telling the world you can't
>     control your own emotions, so everyone else should do it for you.
>
>
>     --
>     Sent via pgsql-general mailing list (pgsql-general@postgresql.org
>     <mailto:pgsql-general@postgresql.org>)
>     To make changes to your subscription:
>     http://www.postgresql.org/mailpref/pgsql-general
>
>
>
>
> --
> *Melvin Davidson*
> I reserve the right to fantasize.  Whether or not you
> wish to share my fantasy is entirely up to you.


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: PostgreSQL Developer Best Practices

From
"Joshua D. Drake"
Date:
On 08/24/2015 08:56 AM, Melvin Davidson wrote:
>  >The "serial" key is the default primary key amongst every single web
> development environment in existence.
>
> Methinks thou doest take too much for granted.
>
> Yes, serial has it's purpose, but I sincerely doubt it is "the default
> primary key amongst every single web development environment in existence"
> I am not sure where you get your stats from. Probably you are referring
> to "Ruby on Rails". IMHO,

Rails
Anything that uses Hibernate (Java)
Django
Every PHP framework
Pyramid
Anything that uses sql-alchemy

I can go on for miles with this. It is true that a lot of these support
non-serial keys. It is also true that is not the default.

JD


--
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Announcing "I'm offended" is basically telling the world you can't
control your own emotions, so everyone else should do it for you.


Re: PostgreSQL Developer Best Practices

From
Melvin Davidson
Date:
And again, I am talking about _database_ design, not Web apps. Letting Web developers design a database to work with their app, is a very, Very, VERY bad idea.
It is far better to let DBA's and "database develeopers" design a good database, then to let those apps mold a db into a non-optimum design.

On Mon, Aug 24, 2015 at 12:26 PM, Joshua D. Drake <jd@commandprompt.com> wrote:
On 08/24/2015 08:56 AM, Melvin Davidson wrote:
 >The "serial" key is the default primary key amongst every single web
development environment in existence.

Methinks thou doest take too much for granted.

Yes, serial has it's purpose, but I sincerely doubt it is "the default
primary key amongst every single web development environment in existence"
I am not sure where you get your stats from. Probably you are referring
to "Ruby on Rails". IMHO,

Rails
Anything that uses Hibernate (Java)
Django
Every PHP framework
Pyramid
Anything that uses sql-alchemy

I can go on for miles with this. It is true that a lot of these support non-serial keys. It is also true that is not the default.


JD


--
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Announcing "I'm offended" is basically telling the world you can't
control your own emotions, so everyone else should do it for you.



--
Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.

Re: PostgreSQL Developer Best Practices

From
Adrian Klaver
Date:
On 08/24/2015 09:34 AM, Melvin Davidson wrote:
> And again, I am talking about _database_ design, not Web apps. Letting
> Web developers design a database to work with their app, is a very,
> Very, VERY bad idea.

Again this is not restricted to Web apps. Anything that touches a
database via an ORM is fair game.

> It is far better to let DBA's and "database develeopers" design a good
> database, then to let those apps mold a db into a non-optimum design.

Aah, the perfect world I do not live in.

>
> On Mon, Aug 24, 2015 at 12:26 PM, Joshua D. Drake <jd@commandprompt.com
> <mailto:jd@commandprompt.com>> wrote:
>
>     On 08/24/2015 08:56 AM, Melvin Davidson wrote:
>
>           >The "serial" key is the default primary key amongst every
>         single web
>         development environment in existence.
>
>         Methinks thou doest take too much for granted.
>
>         Yes, serial has it's purpose, but I sincerely doubt it is "the
>         default
>         primary key amongst every single web development environment in
>         existence"
>         I am not sure where you get your stats from. Probably you are
>         referring
>         to "Ruby on Rails". IMHO,
>
>
>     Rails
>     Anything that uses Hibernate (Java)
>     Django
>     Every PHP framework
>     Pyramid
>     Anything that uses sql-alchemy
>
>     I can go on for miles with this. It is true that a lot of these
>     support non-serial keys. It is also true that is not the default.
>
>
>     JD
>
>
>     --
>     Command Prompt, Inc. - http://www.commandprompt.com/ 503-667-4564
>     <tel:503-667-4564>
>     PostgreSQL Centered full stack support, consulting and development.
>     Announcing "I'm offended" is basically telling the world you can't
>     control your own emotions, so everyone else should do it for you.
>
>
>
>
> --
> *Melvin Davidson*
> I reserve the right to fantasize.  Whether or not you
> wish to share my fantasy is entirely up to you.


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: PostgreSQL Developer Best Practices

From
"Joshua D. Drake"
Date:
On 08/24/2015 09:34 AM, Melvin Davidson wrote:
> And again, I am talking about _database_ design, not Web apps. Letting
> Web developers design a database to work with their app, is a very,
> Very, VERY bad idea.

And I don't argue that but we also live in a world based on reality.
DBAs are rare, web developers who think they know how to DBA (and are
wrong) are anywhere from 8 to 40 bucks an hour depending on where you
are paying them.

> It is far better to let DBA's and "database develeopers" design a good
> database, then to let those apps mold a db into a non-optimum design.

It is far better to recognize the realities of the market place and
document why/how to do something that understands those realities than
to put in place a pedantic document that will only let DBAs sit there
and feel good about themselves.

In short, I agree with you, we just don't live in that world. I am not
even suggesting that you remove your very good work. I am suggesting
that you allow for the fact that what you are suggesting is probably <
20% of designs out there and therefore we have to compromise perfect to
good enough.

JD

--
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Announcing "I'm offended" is basically telling the world you can't
control your own emotions, so everyone else should do it for you.


Re: PostgreSQL Developer Best Practices

From
John Turner
Date:
On Mon, 24 Aug 2015 12:40:37 -0400, Joshua D. Drake <jd@commandprompt.com>
wrote:

> On 08/24/2015 09:34 AM, Melvin Davidson wrote:
>> And again, I am talking about _database_ design, not Web apps. Letting
>> Web developers design a database to work with their app, is a very,
>> Very, VERY bad idea.
>
> And I don't argue that but we also live in a world based on reality.
> DBAs are rare, web developers who think they know how to DBA (and are
> wrong) are anywhere from 8 to 40 bucks an hour depending on where you
> are paying them.
>
>> It is far better to let DBA's and "database develeopers" design a good
>> database, then to let those apps mold a db into a non-optimum design.
>
> It is far better to recognize the realities of the market place and
> document why/how to do something that understands those realities than
> to put in place a pedantic document that will only let DBAs sit there
> and feel good about themselves.
>
> In short, I agree with you, we just don't live in that world. I am not
> even suggesting that you remove your very good work. I am suggesting
> that you allow for the fact that what you are suggesting is probably <
> 20% of designs out there and therefore we have to compromise perfect to
> good enough.
>
> JD
>


I'm all for development via ORMs, but alas, the ORMs would've done well to
have given more consideration to key conventions...

I would only add that if the app is of any significance for an LOB, then
at some point the schema would need to be validated against real business
keys.

- John


Re: PostgreSQL Developer Best Practices

From
John R Pierce
Date:
On 8/24/2015 9:34 AM, Melvin Davidson wrote:
> And again, I am talking about _database_ design, not Web apps. Letting
> Web developers design a database to work with their app, is a very,
> Very, VERY bad idea.
> It is far better to let DBA's and "database develeopers" design a good
> database, then to let those apps mold a db into a non-optimum design.

if you let the app drive the database design, you tend to end up with a
database which is only useful to that single app, and likely breaks when
that app changes.

--
john r pierce, recycling bits in santa cruz



Re: PostgreSQL Developer Best Practices

From
Melvin Davidson
Date:
Thank you John R. Pierce. Finally someone who understands the purpose of this thread. Otherwise, next thing you know, we'll have Web apps/developers designing bra's for milk cows so they'll look better in the field. :)

On Mon, Aug 24, 2015 at 1:05 PM, John R Pierce <pierce@hogranch.com> wrote:
On 8/24/2015 9:34 AM, Melvin Davidson wrote:
And again, I am talking about _database_ design, not Web apps. Letting Web developers design a database to work with their app, is a very, Very, VERY bad idea.
It is far better to let DBA's and "database develeopers" design a good database, then to let those apps mold a db into a non-optimum design.

if you let the app drive the database design, you tend to end up with a database which is only useful to that single app, and likely breaks when that app changes.

--
john r pierce, recycling bits in santa cruz




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



--
Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.

Re: PostgreSQL Developer Best Practices

From
Melvin Davidson
Date:
>ON UPDATE CASCADE ?

I believe he's talking about triggers.

On Mon, Aug 24, 2015 at 8:00 PM, Berend Tober <btober@computer.org> wrote:
Melvin Davidson wrote:
9.
 >1) What happens if someone mis-types the account-id?
 >     To correct that, you also need to correct the FK field in the
other dozen tables.
 >2) What happens when your company starts a new project (or buys a
    I would not consider the general use of natural primary keys to be
    best practice.
    Let's assume your account_id field is used as a foreign key in a
    dozen other tables.
    1) What happens if someone mis-types the account-id?
          To correct that, you also need to correct the FK field in the
    other dozen tables.


... ON UPDATE CASCADE ?




--
Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.

Re: PostgreSQL Developer Best Practices

From
CaT
Date:
On Mon, Aug 24, 2015 at 08:22:17PM -0400, Melvin Davidson wrote:
> On Mon, Aug 24, 2015 at 8:00 PM, Berend Tober <btober@computer.org> wrote:
> >>     1) What happens if someone mis-types the account-id?
> >>           To correct that, you also need to correct the FK field in the
> >>     other dozen tables.
> >
> > ... ON UPDATE CASCADE ?
>
> I believe he's talking about triggers.

Huh? Why would you use a trigger when FOREIGN KEY has ON UPDATE CASCADE?

--
  "A search of his car uncovered pornography, a homemade sex aid, women's
  stockings and a Jack Russell terrier."
    - http://www.dailytelegraph.com.au/news/wacky/indeed/story-e6frev20-1111118083480


Re: PostgreSQL Developer Best Practices

From
Melvin Davidson
Date:
You are right, he was probably talking about FK's. I was just so frustrated about people insisting that using "ID" as the primary key in every table is a "good" idea,
I didn't bother to reply previously. I stand firm on my belief that the primary key should be something meaningful and NOT "id" just for the sake of having a unique numeric key.

On Mon, Aug 24, 2015 at 8:39 PM, CaT <cat@zip.com.au> wrote:
On Mon, Aug 24, 2015 at 08:22:17PM -0400, Melvin Davidson wrote:
> On Mon, Aug 24, 2015 at 8:00 PM, Berend Tober <btober@computer.org> wrote:
> >>     1) What happens if someone mis-types the account-id?
> >>           To correct that, you also need to correct the FK field in the
> >>     other dozen tables.
> >
> > ... ON UPDATE CASCADE ?
>
> I believe he's talking about triggers.

Huh? Why would you use a trigger when FOREIGN KEY has ON UPDATE CASCADE?

--
  "A search of his car uncovered pornography, a homemade sex aid, women's
  stockings and a Jack Russell terrier."
    - http://www.dailytelegraph.com.au/news/wacky/indeed/story-e6frev20-1111118083480



--
Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.

Re: PostgreSQL Developer Best Practices

From
Gavin Flower
Date:
On 25/08/15 01:15, Ray Cote wrote:
> On Sat, Aug 22, 2015 at 11:46 AM, Karsten Hilbert
> <Karsten.Hilbert@gmx.net <mailto:Karsten.Hilbert@gmx.net>> wrote:
>
[...]
>
>     9. Do NOT arbitrarily assign an "id" column to a table as a
>     primary key when other columns
>         are perfectly suited as a unique primary key.
>
>     ...
>
>           Good example:
>             CREATE TABLE accounts
>             ( accout_id bigint NOT NULL ,
>
>
> I would not consider the general use of natural primary keys to be
> best practice.
[...]

Neither would I.

The database has primary keys that are often foreign keys for other
tables.  So if the primary key is a natural key, then if the external
world redefines the nature of the natural key, for example changing its
type or format, then this would have unnecessary invasive changes to
multiple tables within the database.  Also you are at the mercy of
external control of what constitutes uniqueness, for example the
American Social Security Number is not unique!

Also the best practice is to make the primary key name 'id' as you do
know the table it is in, so prepending the table name is redundant - so
you can clearly identify foreign keys because the suffix '_id 'is
prepended by the table name of the referenced table.  Hence 'id' is a
primary key, and account_id is a foreign key pointing into the account
table.

I have had to deal with databases were a child table's primary key is
the parent table's primary key with extra characters appended, so you
can have a child table's primary key exceeding 45 characters. The child
table only need to know the primary key of it direct parent, so using
int, or bigint, would be a far better solution!

Having said the above, there may well be valid reasons to use a natural
key for the primary key - so it should NOT be an absolute rule to
disallow it.


Cheers,
Gavin


Re: PostgreSQL Developer Best Practices

From
Gavin Flower
Date:
On 25/08/15 02:58, John Turner wrote:
[...]
> Conversely, if synthetic keys are chosen as Primary, they must be
> accompanied by a legitimate Unique natural key.
Agreed, but only where appropriate.

Cheers,
Gavin


Re: PostgreSQL Developer Best Practices

From
Gavin Flower
Date:
On 25/08/15 04:26, Joshua D. Drake wrote:
> On 08/24/2015 08:56 AM, Melvin Davidson wrote:
>>  >The "serial" key is the default primary key amongst every single web
>> development environment in existence.
>>
>> Methinks thou doest take too much for granted.
>>
>> Yes, serial has it's purpose, but I sincerely doubt it is "the default
>> primary key amongst every single web development environment in
>> existence"
>> I am not sure where you get your stats from. Probably you are referring
>> to "Ruby on Rails". IMHO,
>
> Rails
> Anything that uses Hibernate (Java)
> Django
> Every PHP framework
> Pyramid
> Anything that uses sql-alchemy
>
> I can go on for miles with this. It is true that a lot of these
> support non-serial keys. It is also true that is not the default.
>
> JD
>
>
I came to the idea of using surrogate primary keys long before I knew
anything about the software on the above list or anything similar!


Cheers,
Gavin


Re: PostgreSQL Developer Best Practices

From
CaT
Date:
On Mon, Aug 24, 2015 at 08:53:43PM -0400, Melvin Davidson wrote:
> You are right, he was probably talking about FK's. I was just so frustrated
> about people insisting that using "ID" as the primary key in every table is
> a "good" idea,
> I didn't bother to reply previously. I stand firm on my belief that the
> primary key should be something meaningful and NOT "id" just for the sake
> of having a unique numeric key.

I think there is a case to be made for pragmatism. Perhaps you should
split your points up into "ideal" and "pragmatic" requirements. ie you
should aim for the "ideal" but, with good justification, "pragmatic"
will be acceptable.

Get those doing the work to think about their decisions.

--
  "A search of his car uncovered pornography, a homemade sex aid, women's
  stockings and a Jack Russell terrier."
    - http://www.dailytelegraph.com.au/news/wacky/indeed/story-e6frev20-1111118083480


Re: PostgreSQL Developer Best Practices

From
"David G. Johnston"
Date:
On Mon, Aug 24, 2015 at 10:02 PM, Gavin Flower <GavinFlower@archidevsys.co.nz> wrote:
Also the best practice is to make the primary key name 'id' as you do know the table it is in, so prepending the table name is redundant - so you can clearly identify foreign keys because the suffix '_id 'is prepended by the table name of the referenced table.  Hence 'id' is a primary key, and account_id is a foreign key pointing into the account table.

​I would much rather be able to write:

SELECT parent_id, child_id, [...]
FROM parent
JOIN child USING (parent_id)

instead of

SELECT parent.id AS parent_id, child.id AS child_id, [...]
FROM parent
JOIN child ON (parent.id = child.parent_id)

​Yes, looking at the parent table it is obvious that the id you are looking at is the "parent" id.  But as soon as you join two or more tables you are guaranteed to have multiple columns with the name "id" that you now need to disambiguate.


The column name "table_id" refers to the primary identifier for that entity no matter where it appears.  I'd rather have one redundant situation than one exception to the rule.

David J.

Re: PostgreSQL Developer Best Practices

From
Rob Sargent
Date:
> On Aug 24, 2015, at 6:53 PM, Melvin Davidson <melvin6925@gmail.com> wrote:
>
> You are right, he was probably talking about FK's. I was just so frustrated about people insisting that using "ID" as
theprimary key in every table is a "good" idea, 
> I didn't bother to reply previously. I stand firm on my belief that the primary key should be something meaningful
andNOT "id" just for the sake of having a unique numeric key. 
>
What, pray tell, is the unique natural key of person in any meaningfully large domain such as state? Certainly not name
+birthdate.  Current address isn’t guaranteed. Social isn’t reliable and actually not truly unique. 

Even given that there are models which are made of entities with legitimate attributes which per force define a unique
instance,I see no benefit in avoiding the convenience of an arbitrary and simple value for the key.  Is it the overhead
ofgenerating and storing one more value per tuple that you can’t abide? 




Re: PostgreSQL Developer Best Practices

From
Gavin Flower
Date:
On 25/08/15 14:45, David G. Johnston wrote:
> On Mon, Aug 24, 2015 at 10:02 PM, Gavin Flower
> <GavinFlower@archidevsys.co.nz
> <mailto:GavinFlower@archidevsys.co.nz>>wrote:
>
>     Also the best practice is to make the primary key name 'id' as you
>     do know the table it is in, so prepending the table name is
>     redundant - so you can clearly identify foreign keys because the
>     suffix '_id 'is prepended by the table name of the referenced
>     table.  Hence 'id' is a primary key, and account_id is a foreign
>     key pointing into the account table.
>
>
> ​ I would much rather be able to write:
>
> SELECT parent_id, child_id, [...]
> FROM parent
> JOIN child USING (parent_id)
>
> instead of
>
> SELECT parent.id <http://parent.id> AS parent_id, child.id
> <http://child.id> AS child_id, [...]
> FROM parent
> JOIN child ON (parent.id <http://parent.id> = child.parent_id)
>
> ​ Yes, looking at the parent table it is obvious that the id you are
> looking at is the "parent" id.  But as soon as you join two or more
> tables you are guaranteed to have multiple columns with the name "id"
> that you now need to disambiguate.
>
>
> The column name "table_id" refers to the primary identifier for that
> entity no matter where it appears.  I'd rather have one redundant
> situation than one exception to the rule.
>
> David J.
>
Hmm...

I consider it good practice to always give an alias for each table used,
especially for non trivial SQL statements.

So I think the above would look better (using slightly more realistic
table names) as:

SELECT
     c.id,
     s.id,
     [...]
FROM
     company c
     JOIN shop s USING (s.company_id = c.id);

Which is I think a lot clearer (it is obvious that you are joining a
foreign key with a primary key), and you can add more stuff without it
suddenly becoming ambiguous.

I once wrote a Sybase stored proc with over 3000 lines of SQL (not
practical to split it up, unfortunately), individual selects were often
over half a page.  It interrogated 17 tables from two different
databases and needed 5 temporary tables.




Cheers,
Gavin

P.S.  the use of '[...]' was started by me way back in the heyday of
usenet, in the beginning of the 1990's!  Previously people used '[
omitted ]'.



Re: PostgreSQL Developer Best Practices

From
Karsten Hilbert
Date:
On Tue, Aug 25, 2015 at 02:02:17PM +1200, Gavin Flower wrote:

> On 25/08/15 01:15, Ray Cote wrote:
> >On Sat, Aug 22, 2015 at 11:46 AM, Karsten Hilbert <Karsten.Hilbert@gmx.net
> ><mailto:Karsten.Hilbert@gmx.net>> wrote:
> >
> [...]
> >
> >    9. Do NOT arbitrarily assign an "id" column to a table as a
> >    primary key when other columns
> >        are perfectly suited as a unique primary key.
> >
> >    ...
> >
> >          Good example:
> >            CREATE TABLE accounts
> >            ( accout_id bigint NOT NULL ,
> >
> >
> >I would not consider the general use of natural primary keys to be best
> >practice.

Gavin, Ray,

I certainly didn't write any of the above.

Karsten
--
GPG key ID E4071346 @ eu.pool.sks-keyservers.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346


Re: PostgreSQL Developer Best Practices

From
Gavin Flower
Date:
On 25/08/15 19:04, Karsten Hilbert wrote:
> On Tue, Aug 25, 2015 at 02:02:17PM +1200, Gavin Flower wrote:
>
>> On 25/08/15 01:15, Ray Cote wrote:
>>> On Sat, Aug 22, 2015 at 11:46 AM, Karsten Hilbert <Karsten.Hilbert@gmx.net
>>> <mailto:Karsten.Hilbert@gmx.net>> wrote:
>>>
>> [...]
>>>     9. Do NOT arbitrarily assign an "id" column to a table as a
>>>     primary key when other columns
>>>         are perfectly suited as a unique primary key.
>>>
>>>     ...
>>>
>>>           Good example:
>>>             CREATE TABLE accounts
>>>             ( accout_id bigint NOT NULL ,
>>>
>>>
>>> I would not consider the general use of natural primary keys to be best
>>> practice.
> Gavin, Ray,
>
> I certainly didn't write any of the above.
>
> Karsten
Hi Karsten,

It took me a couple of minutes, but I traced "9.  ..." to
melvin6925@gmail.com who opened the thread

Looks like Ray misquoted back in the entry that can be identified by
(using the 'source' option on my mail client)

    From: Ray Cote <rgacote@appropriatesolutions.com>
    Date: Mon, 24 Aug 2015 09:15:27 -0400
    Message-ID:
    <CAG5tnzqTausEhFtRpfWCunx4YNFuGTFyUZyTkn5f2E7RaYKE=g@mail.gmail.com>

which was

    On Sat, Aug 22, 2015 at 11:46 AM, Karsten Hilbert <Karsten.Hilbert@gmx.net>
    wrote:

    > > 1. Prefix ALL literals with an Escape
    > >    EG:  SELECT E'This is a \'quoted literal \'';
    > >         SELECT E'This is an unquoted literal';
    > >
    > >    Doing so will prevent the annoying "WARNING:  nonstandard use of
    > escape in a string literal"
    >

    I'd be concerned that what is missing here is the bigger issue of  Best
    Practice #0: Use Bound Variables.
    The only way I've seen invalid literals show up in SQL queries is through
    the dynamic generation of SQL Statements vs. using bound variables.
    Not using bound variables is your doorway to SQL injection exploits.


    9. Do NOT arbitrarily assign an "id" column to a table as a primary key
    > when other columns
    >     are perfectly suited as a unique primary key.

    ...

             Good example:
    >         CREATE TABLE accounts
    >         ( accout_id bigint NOT NULL ,


    I would not consider the general use of natural primary keys to be best
    practice.
    Let's assume your account_id field is used as a foreign key in a dozen
    other tables.
    1) What happens if someone mis-types the account-id?
          To correct that, you also need to correct the FK field in the other
    dozen tables.
    2) What happens when your company starts a new project (or buys a
    competitor) and all the new account numbers are alpha-numeric?
    3) Your example shows the id as a bigint, but your rule is not limited to
    integers.
    What if your table is country populations and the primary key is country
    name?
    Now, you have quite large foreign keys (and a country changing its name is
    not unheard of).
    (and let's not even get started on case-sensitivity or character encodings).


Cheers,
Gavin


Re: PostgreSQL Developer Best Practices

From
Neil Tiffin
Date:
> On Aug 22, 2015, at 10:15 AM, Melvin Davidson <melvin6925@gmail.com> wrote:
> 6. Although it is legal to use the form "column TYPE PRIMARY KEY," It is best to specify as a CONSTRAINT,
>    that way YOU get to choose the name, otherwise postgres assigns a default name which may not be to your liking.
>    EG: , CONSTRAINT accounts_pk PRIMARY KEY (account_id)

> 10.  Standardize Index names with the form table_name + col(s) + “idx”
>         EG: For accounts table:
>                 accounts_name_idx
>                 accounts_city_state_idx
>


I really like the standardization that PostgreSQL uses in auto generating default names.  The rule I use is to always
usethe auto generated names unless the object is referenced routinely in code. In most cases developers don’t care
aboutindex, unique, foreign key, or primary key names (from a coding standpoint) so why should they be creating the
names.Since the postgresql standard uses auto generated names with ‘_pkey’ for PRIMARY KEY  ‘_fkey’ for FOREIGN KEY,
and‘_key’ for UNIQUE, why not use the same rules for consistency?  So I disagree with 6 and would extend 10 to include
theseother names if they are manually generated. 

interestingly enough, when I searched 9.5 docs I could not find a description of these postgreSQL naming convention.
Probablybecause the developers consider it an internal detail that could change which is fine, since the names usually
don’tmatter, until they do.   

I would say use “column TYPE PRIMARY KEY”, “column TYPE UNIQUE”, and ‘column TYPE REFERENCES …’ every place you can and
onlycreate manual names when absolutely necessary.  When you do create manual names follow the standard PostgreSQL
convention.

Now I have worked on mostly smaller installations so maybe someone should chime in if this is a bad best practice.

Neil




Re: PostgreSQL Developer Best Practices

From
Adrian Klaver
Date:
On 08/24/2015 08:44 PM, Rob Sargent wrote:
>
>> On Aug 24, 2015, at 6:53 PM, Melvin Davidson <melvin6925@gmail.com> wrote:
>>
>> You are right, he was probably talking about FK's. I was just so frustrated about people insisting that using "ID"
asthe primary key in every table is a "good" idea, 
>> I didn't bother to reply previously. I stand firm on my belief that the primary key should be something meaningful
andNOT "id" just for the sake of having a unique numeric key. 
>>
> What, pray tell, is the unique natural key of person in any meaningfully large domain such as state? Certainly not
name+ birthdate.  Current address isn’t guaranteed. Social isn’t reliable and actually not truly unique. 

To add:

1) Who determined that a number is not natural?

2) One of the older unique natural keys (genus, species) is not so
unique. I am a fisheries biologist by training and in my time the
'unique' identifier for various fishes has changed. Now that
ichthyologists have discovered DNA testing, it can be expected there
will be even more changes. This is even more apparent when you go back
in in history. As an example:

https://en.wikipedia.org/wiki/Rainbow_trout

Rainbow trout

Current

Oncorhynchus mykiss

Past

Salmo mykiss Walbaum, 1792
Parasalmo mykiss (Walbaum, 1792)
Salmo purpuratus Pallas, 1814
Salmo penshinensis Pallas, 1814
Parasalmo penshinensis (Pallas, 1814)
Salmo gairdnerii Richardson, 1836  <--The one I learned.
Fario gairdneri (Richardson, 1836)
Oncorhynchus gairdnerii (Richardson, 1836)
Salmo gairdnerii gairdnerii Richardson, 1836
Salmo rivularis Ayres, 1855
Salmo iridea Gibbons, 1855
Salmo gairdnerii irideus Gibbons, 1855
Salmo irideus Gibbons, 1855
Trutta iridea (Gibbons, 1855)
Salmo truncatus Suckley, 1859
Salmo masoni Suckley, 1860
Oncorhynchus kamloops Jordan, 1892
Salmo kamloops (Jordan, 1892)
Salmo rivularis kamloops (Jordan, 1892)
Salmo gairdneri shasta Jordan, 1894
Salmo gilberti Jordan, 1894
Salmo nelsoni Evermann, 1908


All the above point to the same fish and have appeared and appear in
articles and reports about said fish. Lets not even get into the common
name situation:).

>
> Even given that there are models which are made of entities with legitimate attributes which per force define a
uniqueinstance, I see no benefit in avoiding the convenience of an arbitrary and simple value for the key.  Is it the
overheadof generating and storing one more value per tuple that you can’t abide? 
>
>
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: PostgreSQL Developer Best Practices

From
Melvin Davidson
Date:
Adrian,

Stop being so technical. When we/I speak of natural keys, we are talking about the column
that would NATURALly lend itself as the primary key.
No one ever said a number is not natural. just that there is no need to duplicate uniqueness
with a separate number.

IOW: If we have an account table, then the account_id or account_no
     would be the primary key. There is no need to have a separate serial id as the primary key.
     
     Likewise, if we have a car table, then registration (or vehicle_id) is preferred.
     
     EG: Good
     CREATE TABLE car
     (
      registration_no varchar(30) not null,
      car_make       varchar(25) not null,
      model          varchar(15) not null;
      build_year     date not null;
      owner          varchar(50),
      CONSTRAINT car_pk PRIMARY KEY (registration_no)
      );
     
     bad
     CREATE TABLE car
     (
      id              serial not null,
      registration_no varchar(30) not null,
      car_make       varchar(25) not null,
      model          varchar(15) not null;
      build_year     date not null;
      owner          varchar(50),
      CONSTRAINT car_pk PRIMARY KEY (id)
      );
     
The benefit in avoiding arbitrary and simple values for the key is that it makes
the database design much more logical.

Consider:
SELECT c.registration_no,
       c.car_make,
       p.part_no
   FROM car c
   JOIN parts p ON ( p.registration_no = c.registration_no)
 WHERE registration_no = <some_var>;
 
 versus:
 SELECT c.registration_no,
       c.car_make,
       p.part_no
   FROM car c
   JOIN parts p ON ( p.id = c.id)
 WHERE registration_no = <some_var>;

 Why join on id when registration_no is better?


On Tue, Aug 25, 2015 at 10:17 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 08/24/2015 08:44 PM, Rob Sargent wrote:

On Aug 24, 2015, at 6:53 PM, Melvin Davidson <melvin6925@gmail.com> wrote:

You are right, he was probably talking about FK's. I was just so frustrated about people insisting that using "ID" as the primary key in every table is a "good" idea,
I didn't bother to reply previously. I stand firm on my belief that the primary key should be something meaningful and NOT "id" just for the sake of having a unique numeric key.

What, pray tell, is the unique natural key of person in any meaningfully large domain such as state? Certainly not name + birthdate.  Current address isn’t guaranteed. Social isn’t reliable and actually not truly unique.

To add:

1) Who determined that a number is not natural?

2) One of the older unique natural keys (genus, species) is not so unique. I am a fisheries biologist by training and in my time the 'unique' identifier for various fishes has changed. Now that ichthyologists have discovered DNA testing, it can be expected there will be even more changes. This is even more apparent when you go back in in history. As an example:

https://en.wikipedia.org/wiki/Rainbow_trout

Rainbow trout

Current

Oncorhynchus mykiss

Past

Salmo mykiss Walbaum, 1792
Parasalmo mykiss (Walbaum, 1792)       
Salmo purpuratus Pallas, 1814
Salmo penshinensis Pallas, 1814
Parasalmo penshinensis (Pallas, 1814)
Salmo gairdnerii Richardson, 1836  <--The one I learned.
Fario gairdneri (Richardson, 1836)
Oncorhynchus gairdnerii (Richardson, 1836)
Salmo gairdnerii gairdnerii Richardson, 1836
Salmo rivularis Ayres, 1855
Salmo iridea Gibbons, 1855
Salmo gairdnerii irideus Gibbons, 1855
Salmo irideus Gibbons, 1855
Trutta iridea (Gibbons, 1855)
Salmo truncatus Suckley, 1859
Salmo masoni Suckley, 1860
Oncorhynchus kamloops Jordan, 1892
Salmo kamloops (Jordan, 1892)
Salmo rivularis kamloops (Jordan, 1892)
Salmo gairdneri shasta Jordan, 1894
Salmo gilberti Jordan, 1894
Salmo nelsoni Evermann, 1908


All the above point to the same fish and have appeared and appear in articles and reports about said fish. Lets not even get into the common name situation:).


Even given that there are models which are made of entities with legitimate attributes which per force define a unique instance, I see no benefit in avoiding the convenience of an arbitrary and simple value for the key.  Is it the overhead of generating and storing one more value per tuple that you can’t abide?






--
Adrian Klaver
adrian.klaver@aklaver.com



--
Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.

Re: PostgreSQL Developer Best Practices

From
Berend Tober
Date:
Melvin Davidson wrote:
> 9.
>  >1) What happens if someone mis-types the account-id?
>  >     To correct that, you also need to correct the FK field in the
> other dozen tables.
>  >2) What happens when your company starts a new project (or buys a
>     I would not consider the general use of natural primary keys to be
>     best practice.
>     Let's assume your account_id field is used as a foreign key in a
>     dozen other tables.
>     1) What happens if someone mis-types the account-id?
>           To correct that, you also need to correct the FK field in the
>     other dozen tables.


... ON UPDATE CASCADE ?



Re: PostgreSQL Developer Best Practices

From
Rob Sargent
Date:
On 08/25/2015 09:40 AM, Melvin Davidson wrote:
> Adrian,
>
> Stop being so technical. When we/I speak of natural keys, we are
> talking about the column
> that would NATURALly lend itself as the primary key.
> No one ever said a number is not natural. just that there is no need
> to duplicate uniqueness
> with a separate number.
>
> IOW: If we have an account table, then the account_id or account_no
>      would be the primary key. There is no need to have a separate
> serial id as the primary key.
If I'm following correctly, you're saying that if the definition of the
entity contains and arbitrary unique value then use that. Fine.  I guess
I quibble with the notion of VIN as a "natural" attribute of car.  (I
have no firsthand experience with VINs but I would bet there's
information tucked inside them, which would make me sceptical of using
them :) )


Re: PostgreSQL Developer Best Practices

From
"David G. Johnston"
Date:
On Tue, Aug 25, 2015 at 11:40 AM, Melvin Davidson <melvin6925@gmail.com> wrote:
Consider:
SELECT c.registration_no,
       c.car_make,
       p.part_no
   FROM car c
   JOIN parts p ON ( p.registration_no = c.registration_no)
 WHERE registration_no = <some_var>;
 
 versus:
 SELECT c.registration_no,
       c.car_make,
       p.part_no
   FROM car c
   JOIN parts p ON ( p.id = c.id)
 WHERE registration_no = <some_var>;

 Why join on id when registration_no is better?


​I believe you are mistaken if you think there are absolute rules you can cling to here.  But even then I would lean toward calling primary keys an internal implementation detail that should be under the full control of the database in which they are directly used.  Artifical "natural" keys I would lean toward turning into, possibly unique, attributes.  Inherent "natural"​
 
​keys get some consideration for using directly.

The issue arise more, say, in a many-to-many situation.  Do you define the PK of the linking table as a two-column composite key or do you introduce ​a third, serial, field to stand in for the pair?

David J.

Re: PostgreSQL Developer Best Practices

From
"David G. Johnston"
Date:
On Tue, Aug 25, 2015 at 12:09 PM, Rob Sargent <robjsargent@gmail.com> wrote:
On 08/25/2015 09:40 AM, Melvin Davidson wrote:
Adrian,

Stop being so technical. When we/I speak of natural keys, we are talking about the column
that would NATURALly lend itself as the primary key.
No one ever said a number is not natural. just that there is no need to duplicate uniqueness
with a separate number.

IOW: If we have an account table, then the account_id or account_no
     would be the primary key. There is no need to have a separate serial id as the primary key.
If I'm following correctly, you're saying that if the definition of the entity contains and arbitrary unique value then use that. Fine.  I guess I quibble with the notion of VIN as a "natural" attribute of car.  (I have no firsthand experience with VINs but I would bet there's information tucked inside them, which would make me sceptical of using them :) )


​Yes, the VIN is an encoding in a similar fashion to how Object IDs function in the computer world: http://www.zytrax.com/books/ldap/apa/oid.html

The problem with using a VIN is a combination of usability during manual entry - even with the checksum feature - and the fact that only physically produced vehicles are assigned one but both manufacturers and their dealers end up dealing with the concept of a vehicle before one is ever produced.  Neither are overly problematic but they are annoying enough that usually additional identifiers are constructed an used by the business in order to avoid having to see the VIN as anything other than an attribute.  The length itself is also problematic - 17 characters typically is a bit much when the user likely only care about thousands or tens of thousands of entities at any given time.

David J.


Re: PostgreSQL Developer Best Practices

From
"Joshua D. Drake"
Date:
On 08/25/2015 09:09 AM, Rob Sargent wrote:
> On 08/25/2015 09:40 AM, Melvin Davidson wrote:
>> Adrian,
>>
>> Stop being so technical. When we/I speak of natural keys, we are
>> talking about the column
>> that would NATURALly lend itself as the primary key.
>> No one ever said a number is not natural. just that there is no need
>> to duplicate uniqueness
>> with a separate number.
>>
>> IOW: If we have an account table, then the account_id or account_no
>>      would be the primary key. There is no need to have a separate
>> serial id as the primary key.
> If I'm following correctly, you're saying that if the definition of the
> entity contains and arbitrary unique value then use that. Fine.  I guess
> I quibble with the notion of VIN as a "natural" attribute of car.  (I
> have no firsthand experience with VINs but I would bet there's
> information tucked inside them, which would make me sceptical of using
> them :) )
>

But a VIN is in fact, UNIQUE so it is useful as a PK.

JD


--
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Announcing "I'm offended" is basically telling the world you can't
control your own emotions, so everyone else should do it for you.


Re: PostgreSQL Developer Best Practices

From
"Karsten Hilbert"
Date:
> No one ever said a number is not natural. just that there is no need to duplicate uniqueness
> with a separate number.

The whole point is that people are telling you that surrogate keys do not _duplicate_ uniqueness but
rather _generate_ it, artificially, and therefore reliably.

Today's external uniqueness is ambiguous tomorrow.

Karsten


Re: PostgreSQL Developer Best Practices

From
Adam Brusselback
Date:
Personally I always set the natural key with a not null and unique constraint, but create an artificial key for it as well.  As an example, if we had a product table, the product_sku is defined as not null with a unique constraint on it, while product_id is the primary key which all other tables reference as a foreign key.

In the case of a many to many situation, I prefer to use a two column composite key.  In the case of a many to many, i've never run into a case where I needed to reference a single row in that table without knowing about both sides of that relation.

Just my $0.02
-Adam

On Tue, Aug 25, 2015 at 12:15 PM, David G. Johnston <david.g.johnston@gmail.com> wrote:
On Tue, Aug 25, 2015 at 11:40 AM, Melvin Davidson <melvin6925@gmail.com> wrote:
Consider:
SELECT c.registration_no,
       c.car_make,
       p.part_no
   FROM car c
   JOIN parts p ON ( p.registration_no = c.registration_no)
 WHERE registration_no = <some_var>;
 
 versus:
 SELECT c.registration_no,
       c.car_make,
       p.part_no
   FROM car c
   JOIN parts p ON ( p.id = c.id)
 WHERE registration_no = <some_var>;

 Why join on id when registration_no is better?


​I believe you are mistaken if you think there are absolute rules you can cling to here.  But even then I would lean toward calling primary keys an internal implementation detail that should be under the full control of the database in which they are directly used.  Artifical "natural" keys I would lean toward turning into, possibly unique, attributes.  Inherent "natural"​
 
​keys get some consideration for using directly.

The issue arise more, say, in a many-to-many situation.  Do you define the PK of the linking table as a two-column composite key or do you introduce ​a third, serial, field to stand in for the pair?

David J.


Re: PostgreSQL Developer Best Practices

From
Melvin Davidson
Date:
Consistency in naming convention.

Good suggestion!

On Tue, Aug 25, 2015 at 12:33 PM, Marc Munro <marc.munro@gmail.com> wrote:
On Sat, 2015-08-22 at 15:15 +0000, Melvin Davidson wrote:

> I've been searching for a "PostgreSQL Developer Best Practices" with not
> much luck,
> so I've started my own. At the risk of stirring up a storm of controversy,
> I would appreciate additional suggestions and feedback.
>

You might add: Create all relation names as plurals.

Or, if your site uses predominantly singular names, make that the
standard.  Consistency within the site is more important than any
dogmatic belief about whether singular or plural forms is better.  If
you don't put it in the standard, someone will eventually create tables
with names that don't gel with everything else.

__
Marc






--
Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.

Re: PostgreSQL Developer Best Practices

From
"Karsten Hilbert"
Date:
> In most cases developers don’t care about index, unique, foreign key, or primary key names (from a coding standpoint)

Until the day they'd like to write a reliable database change script.

(PG's internal conventions for object names _have_ changed over the years)

Karsten


Re: PostgreSQL Developer Best Practices

From
Gavin Flower
Date:
On 26/08/15 03:40, Melvin Davidson wrote:
[...]
> IOW: If we have an account table, then the account_id or account_no
>      would be the primary key. There is no need to have a separate
> serial id as the primary key.
[...]

Account numbers are externally generated, and may potentially change.
Management might suddenly decide that they want to start using the year
the account started as the first 4 digits, or that the branch code
should be reflected in it, or something else.  The database should be
protected from these arbitrary changes.  Hence the account_no is not a
good candidate for a primary key.


Cheers,
Gavin


Re: PostgreSQL Developer Best Practices

From
Adrian Klaver
Date:
On 08/25/2015 08:40 AM, Melvin Davidson wrote:
> Adrian,
>
> Stop being so technical. When we/I speak of natural keys, we are talking
> about the column
> that would NATURALly lend itself as the primary key.

Pretty sure this is a technical list:)

> No one ever said a number is not natural. just that there is no need to
> duplicate uniqueness
> with a separate number.

I would agree, but I have interacted with people, especially PHBes,
where a duplicate 'hidden' key is a life saver. See more below.

>
> IOW: If we have an account table, then the account_id or account_no
>       would be the primary key. There is no need to have a separate
> serial id as the primary key.
>
>       Likewise, if we have a car table, then registration (or
> vehicle_id) is preferred.
>
>       EG: Good
>       CREATE TABLE car
>       (
>        registration_no varchar(30) not null,
>        car_make       varchar(25) not null,
>        model          varchar(15) not null;
>        build_year     date not null;
>        owner          varchar(50),
>        CONSTRAINT car_pk PRIMARY KEY (registration_no)
>        );
>
>       bad
>       CREATE TABLE car
>       (
>        id              serial not null,
>        registration_no varchar(30) not null,
>        car_make       varchar(25) not null,
>        model          varchar(15) not null;
>        build_year     date not null;
>        owner          varchar(50),
>        CONSTRAINT car_pk PRIMARY KEY (id)
>        );
>
> The benefit in avoiding arbitrary and simple values for the key is that
> it makes
> the database design much more logical.
>
> Consider:
> SELECT c.registration_no,
>         c.car_make,
>         p.part_no
>     FROM car c
>     JOIN parts p ON ( p.registration_no = c.registration_no)
>   WHERE registration_no = <some_var>;

Pretty sure parts are not unique to an exact vehicle, unless you are
talking a totally handmade one. They are not even unique to make and
model. As an example, I used to work on Class B Isuzu trucks. These
models(FTR) where also built for Chevrolet as the Forward models. So
right of the bat there where two part numbers for each part, one that
started with 9 if you got it from Chevrolet and one with 11 from Isuzu,
if memory serves. Then Isuzu decided to reorganize their part numbers,
so that introduced another number, all pointing to the exact same part.
Then there where those parts available from the parts houses(NAPA, etc).

Then there was the greenhouse I worked for where we supplied UPC coded
tags for our customers. In the beginning, it was simple, the item
portion of the UPC was unique and with the company prefix served as a
'natural' key for the tags. Then the chain stores we worked with must
have all gone to the same seminar on how to be Walmart and decided they
did not want unique numbers, but UPCs tied to price groups that covered
a variety of plants. Luckily, I was too stupid to know surrogate keys
where bad and had a sequence attached to the tag table. This then became
the tag id and made life a lot easier during the transition. It still
remains there, because people are people and 'natural' tends to be
artificial and transient.

>
>   versus:
>   SELECT c.registration_no,
>         c.car_make,
>         p.part_no
>     FROM car c
>     JOIN parts p ON ( p.id <http://p.id> = c.id <http://c.id>)
>   WHERE registration_no = <some_var>;
>
>   Why join on id when registration_no is better?
>
>
> On Tue, Aug 25, 2015 at 10:17 AM, Adrian Klaver
> <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote:
>
>     On 08/24/2015 08:44 PM, Rob Sargent wrote:
>
>
>             On Aug 24, 2015, at 6:53 PM, Melvin Davidson
>             <melvin6925@gmail.com <mailto:melvin6925@gmail.com>> wrote:
>
>             You are right, he was probably talking about FK's. I was
>             just so frustrated about people insisting that using "ID" as
>             the primary key in every table is a "good" idea,
>             I didn't bother to reply previously. I stand firm on my
>             belief that the primary key should be something meaningful
>             and NOT "id" just for the sake of having a unique numeric key.
>
>         What, pray tell, is the unique natural key of person in any
>         meaningfully large domain such as state? Certainly not name +
>         birthdate.  Current address isn’t guaranteed. Social isn’t
>         reliable and actually not truly unique.
>
>
>     To add:
>
>     1) Who determined that a number is not natural?
>
>     2) One of the older unique natural keys (genus, species) is not so
>     unique. I am a fisheries biologist by training and in my time the
>     'unique' identifier for various fishes has changed. Now that
>     ichthyologists have discovered DNA testing, it can be expected there
>     will be even more changes. This is even more apparent when you go
>     back in in history. As an example:
>
>     https://en.wikipedia.org/wiki/Rainbow_trout
>
>     Rainbow trout
>
>     Current
>
>     Oncorhynchus mykiss
>
>     Past
>
>     Salmo mykiss Walbaum, 1792
>     Parasalmo mykiss (Walbaum, 1792)
>     Salmo purpuratus Pallas, 1814
>     Salmo penshinensis Pallas, 1814
>     Parasalmo penshinensis (Pallas, 1814)
>     Salmo gairdnerii Richardson, 1836  <--The one I learned.
>     Fario gairdneri (Richardson, 1836)
>     Oncorhynchus gairdnerii (Richardson, 1836)
>     Salmo gairdnerii gairdnerii Richardson, 1836
>     Salmo rivularis Ayres, 1855
>     Salmo iridea Gibbons, 1855
>     Salmo gairdnerii irideus Gibbons, 1855
>     Salmo irideus Gibbons, 1855
>     Trutta iridea (Gibbons, 1855)
>     Salmo truncatus Suckley, 1859
>     Salmo masoni Suckley, 1860
>     Oncorhynchus kamloops Jordan, 1892
>     Salmo kamloops (Jordan, 1892)
>     Salmo rivularis kamloops (Jordan, 1892)
>     Salmo gairdneri shasta Jordan, 1894
>     Salmo gilberti Jordan, 1894
>     Salmo nelsoni Evermann, 1908
>
>
>     All the above point to the same fish and have appeared and appear in
>     articles and reports about said fish. Lets not even get into the
>     common name situation:).
>
>
>         Even given that there are models which are made of entities with
>         legitimate attributes which per force define a unique instance,
>         I see no benefit in avoiding the convenience of an arbitrary and
>         simple value for the key.  Is it the overhead of generating and
>         storing one more value per tuple that you can’t abide?
>
>
>
>
>
>
>     --
>     Adrian Klaver
>     adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
>
>
>
>
> --
> *Melvin Davidson*
> I reserve the right to fantasize.  Whether or not you
> wish to share my fantasy is entirely up to you.


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: PostgreSQL Developer Best Practices

From
John R Pierce
Date:
On 8/25/2015 1:42 PM, Gavin Flower wrote:
> Account numbers are externally generated, and may potentially change.
> Management might suddenly decide that they want to start using the
> year the account started as the first 4 digits, or that the branch
> code should be reflected in it, or something else.  The database
> should be protected from these arbitrary changes.  Hence the
> account_no is not a good candidate for a primary key.


such practices would raise total havoc on a traditional paper ledger
accounting system as well as things like pending AR/AP where external
companies will be referencing your account numbers.




--
john r pierce, recycling bits in santa cruz



Re: PostgreSQL Developer Best Practices

From
Adrian Klaver
Date:
On 08/25/2015 01:56 PM, John R Pierce wrote:
> On 8/25/2015 1:42 PM, Gavin Flower wrote:
>> Account numbers are externally generated, and may potentially change.
>> Management might suddenly decide that they want to start using the
>> year the account started as the first 4 digits, or that the branch
>> code should be reflected in it, or something else.  The database
>> should be protected from these arbitrary changes.  Hence the
>> account_no is not a good candidate for a primary key.
>
>
> such practices would raise total havoc on a traditional paper ledger
> accounting system as well as things like pending AR/AP where external
> companies will be referencing your account numbers.

Agreed, but it happens. When Lowes took over a local hardware
chain(Eagles) here in Washington state they moved very quickly on
changing the account numbers. The company I worked for who supplied
Eagles and then Lowes sat on a check for $22,000 that was sent to us in
error because the account numbers got switched. We called them when we
got the check, but it still took them six months to own up to it.

>
>
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: PostgreSQL Developer Best Practices

From
Gavin Flower
Date:
On 26/08/15 08:56, Adrian Klaver wrote:
> On 08/25/2015 08:40 AM, Melvin Davidson wrote:
>> Adrian,
>>
>> Stop being so technical. When we/I speak of natural keys, we are talking
>> about the column
>> that would NATURALly lend itself as the primary key.
>
> Pretty sure this is a technical list:)
>
Don't let inconvenient facts get in the way of a good argument!  :-)

[...]
> Pretty sure parts are not unique to an exact vehicle, unless you are
> talking a totally handmade one. They are not even unique to make and
> model. As an example, I used to work on Class B Isuzu trucks. These
> models(FTR) where also built for Chevrolet as the Forward models. So
> right of the bat there where two part numbers for each part, one that
> started with 9 if you got it from Chevrolet and one with 11 from
> Isuzu, if memory serves. Then Isuzu decided to reorganize their part
> numbers, so that introduced another number, all pointing to the exact
> same part. Then there where those parts available from the parts
> houses(NAPA, etc).
>
> Then there was the greenhouse I worked for where we supplied UPC coded
> tags for our customers. In the beginning, it was simple, the item
> portion of the UPC was unique and with the company prefix served as a
> 'natural' key for the tags. Then the chain stores we worked with must
> have all gone to the same seminar on how to be Walmart and decided
> they did not want unique numbers, but UPCs tied to price groups that
> covered a variety of plants. Luckily, I was too stupid to
Natural Stupidity??? :-)

(Sorry, couldn't resist!)

> know surrogate keys where bad and had a sequence attached to the tag
> table. This then became the tag id and made life a lot easier during
> the transition. It still remains there, because people are people and
> 'natural' tends to be artificial and transient.
Extremely good examples, I'll bear them in mind - makes me even more
keen on surrogate primary keys.  I'm always very wary when people tell
me some numbering scheme will NEVER change!!!

[...]


Cheers,
Gavin


Re: PostgreSQL Developer Best Practices

From
Gavin Flower
Date:
On 26/08/15 02:17, Adrian Klaver wrote:
[...]
>
> 2) One of the older unique natural keys (genus, species) is not so
> unique. I am a fisheries biologist by training and in my time the
> 'unique' identifier for various fishes has changed. Now that
> ichthyologists have discovered DNA testing, it can be expected there
> will be even more changes. This is even more apparent when you go back
> in in history. As an example:
>
> https://en.wikipedia.org/wiki/Rainbow_trout
>
> Rainbow trout
>
> Current
>
> Oncorhynchus mykiss
>
> Past
>
> Salmo mykiss Walbaum, 1792
> Parasalmo mykiss (Walbaum, 1792)
[...]
>
> Salmo gilberti Jordan, 1894
> Salmo nelsoni Evermann, 1908
>
So you probably need a date stamp so you could record things relating to
the correct name for a given period in a mapping table, and still relate
to the same surrogate key for referencing other tables.

Maybe even worse, is when a species is suddenly found to be 2 or more
distinct species!

Something similar could happen with account numbers: 2 companies with
similar names might be assigned to the same account number, and lots of
transactions recorded before the mistake is discovered. Though obviously
a surrogate key would not give you complete protection from a lot of
work sorting the mess out, but it would probably help!

I read on post a year or 2 back, a guy in Europe had at least 4
different variations on his name depending on the country he was in and
the local language and cultural norms.

When I worked at a freezing works in the 1970's in Auckland, I heard
that the pay roll allowed for over 52 different names per employee (per
year?).  Though, I was never told the maximum name changes ever used.
Essentially management might fire someone, but the union would complain,
and they would be rehired under a different name - so I was told!  So
the correct holiday pay & PAYE tax deductions would still relate to the
same individual no matter how many name changes they had.


Cheers,
Gavin


Re: PostgreSQL Developer Best Practices

From
Adrian Klaver
Date:
On 08/25/2015 02:23 PM, Gavin Flower wrote:
> On 26/08/15 08:56, Adrian Klaver wrote:
>> On 08/25/2015 08:40 AM, Melvin Davidson wrote:
>>> Adrian,
>>>
>>> Stop being so technical. When we/I speak of natural keys, we are talking
>>> about the column
>>> that would NATURALly lend itself as the primary key.
>>
>> Pretty sure this is a technical list:)
>>
> Don't let inconvenient facts get in the way of a good argument!  :-)


>> have all gone to the same seminar on how to be Walmart and decided
>> they did not want unique numbers, but UPCs tied to price groups that
>> covered a variety of plants. Luckily, I was too stupid to
> Natural Stupidity??? :-)

Oh yeah and a long history too, but that needs at least a pitcher of
beer to recount.

>
> (Sorry, couldn't resist!)
>
>> know surrogate keys where bad and had a sequence attached to the tag
>> table. This then became the tag id and made life a lot easier during
>> the transition. It still remains there, because people are people and
>> 'natural' tends to be artificial and transient.
> Extremely good examples, I'll bear them in mind - makes me even more
> keen on surrogate primary keys.  I'm always very wary when people tell
> me some numbering scheme will NEVER change!!!

To add a recent one. My partner Nancy signed up for Medicare last year
to avoid the premium penalty. This year in July she signed up for Social
Security. Turns out, for reasons I do not understand,
CMS(https://www.cms.gov/) changes the Medicare account number at that
point. The reason we even cared is that the billing system thinks she
has two accounts and is double billing. Time on the phone with someone
at CMS was not enlightening. We where told to trust the system and
eventually it will work itself out. Still waiting:(

>
> [...]
>
>
> Cheers,
> Gavin
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: PostgreSQL Developer Best Practices

From
Adrian Klaver
Date:
On 08/25/2015 02:44 PM, Gavin Flower wrote:
> On 26/08/15 02:17, Adrian Klaver wrote:
> [...]
>>
>> 2) One of the older unique natural keys (genus, species) is not so
>> unique. I am a fisheries biologist by training and in my time the
>> 'unique' identifier for various fishes has changed. Now that
>> ichthyologists have discovered DNA testing, it can be expected there
>> will be even more changes. This is even more apparent when you go back
>> in in history. As an example:
>>
>> https://en.wikipedia.org/wiki/Rainbow_trout
>>
>> Rainbow trout
>>
>> Current
>>
>> Oncorhynchus mykiss
>>
>> Past
>>
>> Salmo mykiss Walbaum, 1792
>> Parasalmo mykiss (Walbaum, 1792)
> [...]
>>
>> Salmo gilberti Jordan, 1894
>> Salmo nelsoni Evermann, 1908
>>
> So you probably need a date stamp so you could record things relating to
> the correct name for a given period in a mapping table, and still relate
> to the same surrogate key for referencing other tables.
>
> Maybe even worse, is when a species is suddenly found to be 2 or more
> distinct species!

Funny you should say that. Furry critters instead of slimy:

http://news.nationalgeographic.com/2015/07/150730-jackals-wolves-evolution-new-species-animals-africa/

>
> Something similar could happen with account numbers: 2 companies with
> similar names might be assigned to the same account number, and lots of
> transactions recorded before the mistake is discovered. Though obviously
> a surrogate key would not give you complete protection from a lot of
> work sorting the mess out, but it would probably help!

Or if you have a mortgage with Well Fargo and find your account number
is being used in their agent training program which explains why you
have been receiving all sorts of correspondence saying your account is
in arrears and is facing foreclosure(personal experience).

Bottom line is databases are great and theory is useful, but it all goes
out the window when people start meddling.

>
> I read on post a year or 2 back, a guy in Europe had at least 4
> different variations on his name depending on the country he was in and
> the local language and cultural norms.

I am familiar with that issue.

>
> When I worked at a freezing works in the 1970's in Auckland, I heard
> that the pay roll allowed for over 52 different names per employee (per
> year?).  Though, I was never told the maximum name changes ever used.
> Essentially management might fire someone, but the union would complain,
> and they would be rehired under a different name - so I was told!  So
> the correct holiday pay & PAYE tax deductions would still relate to the
> same individual no matter how many name changes they had.

Or a system I took over where someone had made a natural primary key of
first name, last name and that was all. So you had John Smith, John
Smith2, etc. Poor design obviously, but that stuff is out there.

>
>
> Cheers,
> Gavin
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: PostgreSQL Developer Best Practices

From
Gavin Flower
Date:
NB the attribution colours seems to be mixed up a bit here, but this all
dialogue between me & Adrian.


On 26/08/15 09:48, Adrian Klaver wrote:
> On 08/25/2015 02:23 PM, Gavin Flower wrote:
>> On 26/08/15 08:56, Adrian Klaver wrote:
[...]
>>> have all gone to the same seminar on how to be Walmart and decided
>>> they did not want unique numbers, but UPCs tied to price groups that
>>> covered a variety of plants. Luckily, I was too stupid to
>> Natural Stupidity??? :-)
>
> Oh yeah and a long history too, but that needs at least a pitcher of
> beer to recount.
Well if you're ever in Auckland, I'll shout you a beer!
(We might even put you up for a night or two.)
>
>>
>> (Sorry, couldn't resist!)
>>
>>> know surrogate keys where bad and had a sequence attached to the tag
>>> table. This then became the tag id and made life a lot easier during
>>> the transition. It still remains there, because people are people and
>>> 'natural' tends to be artificial and transient.
>> Extremely good examples, I'll bear them in mind - makes me even more
>> keen on surrogate primary keys.  I'm always very wary when people tell
>> me some numbering scheme will NEVER change!!!
>
> To add a recent one. My partner Nancy signed up for Medicare last year
> to avoid the premium penalty. This year in July she signed up for
> Social Security. Turns out, for reasons I do not understand,
> CMS(https://www.cms.gov/) changes the Medicare account number at that
> point. The reason we even cared is that the billing system thinks she
> has two accounts and is double billing. Time on the phone with someone
> at CMS was not enlightening. We where told to trust the system and
> eventually it will work itself out. Still waiting:(

STOP IT!!!  You're making me even more cynical and paranoid!  :-)



Re: PostgreSQL Developer Best Practices

From
Neil Tiffin
Date:
> On Aug 25, 2015, at 1:38 PM, Karsten Hilbert <Karsten.Hilbert@gmx.net> wrote:
>
>> In most cases developers don’t care about index, unique, foreign key, or primary key names (from a coding
standpoint)
>
> Until the day they’d like to write a reliable database change script.

Not sure I understand.  Once the object is created the name is set, it does not change, so I don’t understand why it is
notpossible to write a reliable database change script.  Dump and restore maintain the name. Of course every project
hasperiodic scripts that need to run, so these objects would, if they are dropped or manipulated in the script, have to
bemanually named, especially during development since the whole database might be dropped and recreated multiple times.
My original comment included that situation. My projects typically have many, many objects that once created are not
referredto again, unless a DBA is doing some tuning or troubleshooting.  In that case, the DBA just looks up the name. 

I can see if say 2 years later you want to create a development database from the original SQL that generated the
originaltable definitions that could be problematic.  But I always have used the current definitions not the original
andthose can be exported with the current names. 

It just seems like busy work to me, but I would love to be enlightened.

Neil



Re: PostgreSQL Developer Best Practices

From
Jerry Sievers
Date:
Adrian Klaver <adrian.klaver@aklaver.com> writes:

> On 08/25/2015 01:56 PM, John R Pierce wrote:
>
>> On 8/25/2015 1:42 PM, Gavin Flower wrote:
>>> Account numbers are externally generated, and may potentially change.
>>> Management might suddenly decide that they want to start using the
>>> year the account started as the first 4 digits, or that the branch
>>> code should be reflected in it, or something else.  The database
>>> should be protected from these arbitrary changes.  Hence the
>>> account_no is not a good candidate for a primary key.
>>
>>
>> such practices would raise total havoc on a traditional paper ledger
>> accounting system as well as things like pending AR/AP where external
>> companies will be referencing your account numbers.
>
> Agreed, but it happens. When Lowes took over a local hardware
> chain(Eagles) here in Washington state they moved very quickly on
> changing the account numbers. The company I worked for who supplied
> Eagles and then Lowes sat on a check for $22,000 that was sent to us
> in error because the account numbers got switched. We called them when
> we got the check, but it still took them six months to own up to it.

DOH!

Next time a screwball outfit sends you a check for $22k erroneously just
go deposit it :-)

> --
> Adrian Klaver
> adrian.klaver@aklaver.com

--
Jerry Sievers
Postgres DBA/Development Consulting
e: postgres.consulting@comcast.net
p: 312.241.7800


Re: PostgreSQL Developer Best Practices

From
Adrian Klaver
Date:
On 08/25/2015 04:23 PM, Jerry Sievers wrote:
> Adrian Klaver <adrian.klaver@aklaver.com> writes:
>
>> On 08/25/2015 01:56 PM, John R Pierce wrote:
>>
>>> On 8/25/2015 1:42 PM, Gavin Flower wrote:
>>>> Account numbers are externally generated, and may potentially change.
>>>> Management might suddenly decide that they want to start using the
>>>> year the account started as the first 4 digits, or that the branch
>>>> code should be reflected in it, or something else.  The database
>>>> should be protected from these arbitrary changes.  Hence the
>>>> account_no is not a good candidate for a primary key.
>>>
>>>
>>> such practices would raise total havoc on a traditional paper ledger
>>> accounting system as well as things like pending AR/AP where external
>>> companies will be referencing your account numbers.
>>
>> Agreed, but it happens. When Lowes took over a local hardware
>> chain(Eagles) here in Washington state they moved very quickly on
>> changing the account numbers. The company I worked for who supplied
>> Eagles and then Lowes sat on a check for $22,000 that was sent to us
>> in error because the account numbers got switched. We called them when
>> we got the check, but it still took them six months to own up to it.
>
> DOH!
>
> Next time a screwball outfit sends you a check for $22k erroneously just
> go deposit it :-)

Well that is what I wanted to do, the owner overruled me:(. Something
about Lowes having more lawyers then we did. The strange part was we
called them and told them what had happened and supplied the relevant
information that explained the mix up. You would have thought us calling
to return a check that was supposed to be to us would have raised a flag!

>
>> --
>> Adrian Klaver
>> adrian.klaver@aklaver.com
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: PostgreSQL Developer Best Practices

From
Melvin Davidson
Date:
I think a lot of people here are missing the point. I was trying to give examples of natural keys, but a lot of people are taking great delight
in pointing out exceptions to examples, rather than understanding the point.
So for the sake of argument, a natural key is something that in itself is unique and the possibility of a duplicate does not exist.
Before ANYONE continues to insist that a serial id column is good, consider the case where the number of tuples will exceed a bigint.
Don't say it cannot happen, because it can.
However, if you have an alphanumeric field, let's say varchar 50, and it's guaranteed that it will never have a duplicate, then THAT is a natural primary
key and beats the hell out of a generic "id" field.

Further to the point, since I started this thread, I am holding to it and will not discuss "natural primary keys" any further.

Other suggestions for good PostgreSQL Developer database (not web app) guidelines are still welcome.

On Tue, Aug 25, 2015 at 7:34 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 08/25/2015 04:23 PM, Jerry Sievers wrote:
Adrian Klaver <adrian.klaver@aklaver.com> writes:

On 08/25/2015 01:56 PM, John R Pierce wrote:

On 8/25/2015 1:42 PM, Gavin Flower wrote:
Account numbers are externally generated, and may potentially change.
Management might suddenly decide that they want to start using the
year the account started as the first 4 digits, or that the branch
code should be reflected in it, or something else.  The database
should be protected from these arbitrary changes.  Hence the
account_no is not a good candidate for a primary key.


such practices would raise total havoc on a traditional paper ledger
accounting system as well as things like pending AR/AP where external
companies will be referencing your account numbers.

Agreed, but it happens. When Lowes took over a local hardware
chain(Eagles) here in Washington state they moved very quickly on
changing the account numbers. The company I worked for who supplied
Eagles and then Lowes sat on a check for $22,000 that was sent to us
in error because the account numbers got switched. We called them when
we got the check, but it still took them six months to own up to it.

DOH!

Next time a screwball outfit sends you a check for $22k erroneously just
go deposit it :-)

Well that is what I wanted to do, the owner overruled me:(. Something about Lowes having more lawyers then we did. The strange part was we called them and told them what had happened and supplied the relevant information that explained the mix up. You would have thought us calling to return a check that was supposed to be to us would have raised a flag!


--
Adrian Klaver
adrian.klaver@aklaver.com



--
Adrian Klaver
adrian.klaver@aklaver.com



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



--
Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.

Re: PostgreSQL Developer Best Practices

From
Gavin Flower
Date:
On 26/08/15 11:34, Adrian Klaver wrote:
[...]
>>>
>>> Agreed, but it happens. When Lowes took over a local hardware
>>> chain(Eagles) here in Washington state they moved very quickly on
>>> changing the account numbers. The company I worked for who supplied
>>> Eagles and then Lowes sat on a check for $22,000 that was sent to us
>>> in error because the account numbers got switched. We called them when
>>> we got the check, but it still took them six months to own up to it.
>>
>> DOH!
>>
>> Next time a screwball outfit sends you a check for $22k erroneously just
>> go deposit it :-)
>
> Well that is what I wanted to do, the owner overruled me:(. Something
> about Lowes having more lawyers then we did. The strange part was we
> called them and told them what had happened and supplied the relevant
> information that explained the mix up. You would have thought us
> calling to return a check that was supposed to be to us would have
> raised a flag!
[...]

Many years ago a department store credited our account with a refund for
about $150, which obviously was not meant for us - we had never bought
that item!

They replied, essentially saying we we were due the refund.

We sent a letter yet again, explaining the problem, and saying we were
not entitled.

They then sent used a check for the amount, which we cashed, feeling we
had done our best to help them & that we could not be bothered wasting
more time trying to sort things out for them!

I think any judge would laugh them out of court!




Re: PostgreSQL Developer Best Practices

From
Adrian Klaver
Date:
On 08/25/2015 05:17 PM, Melvin Davidson wrote:
> I think a lot of people here are missing the point. I was trying to give
> examples of natural keys, but a lot of people are taking great delight
> in pointing out exceptions to examples, rather than understanding the point.
> So for the sake of argument, a natural key is something that in itself
> is unique and the possibility of a duplicate does not exist.
> Before ANYONE continues to insist that a serial id column is good,
> consider the case where the number of tuples will exceed a bigint.
> Don't say it cannot happen, because it can.
> However, if you have an alphanumeric field, let's say varchar 50, and
> it's guaranteed that it will never have a duplicate, then THAT is a
> natural primary

That is a big IF and a guarantee I would not put money on.

> key and beats the hell out of a generic "id" field.
>
> Further to the point, since I started this thread, I am holding to it
> and will not discuss "natural primary keys" any further.
>



--
Adrian Klaver
adrian.klaver@aklaver.com


Re: PostgreSQL Developer Best Practices

From
Adrian Klaver
Date:
On 08/25/2015 05:21 PM, Gavin Flower wrote:
> On 26/08/15 11:34, Adrian Klaver wrote:
> [...]
>>>>
>>>> Agreed, but it happens. When Lowes took over a local hardware
>>>> chain(Eagles) here in Washington state they moved very quickly on
>>>> changing the account numbers. The company I worked for who supplied
>>>> Eagles and then Lowes sat on a check for $22,000 that was sent to us
>>>> in error because the account numbers got switched. We called them when
>>>> we got the check, but it still took them six months to own up to it.
>>>
>>> DOH!
>>>
>>> Next time a screwball outfit sends you a check for $22k erroneously just
>>> go deposit it :-)
>>
>> Well that is what I wanted to do, the owner overruled me:(. Something
>> about Lowes having more lawyers then we did. The strange part was we
>> called them and told them what had happened and supplied the relevant
>> information that explained the mix up. You would have thought us
>> calling to return a check that was supposed to be to us would have
>> raised a flag!
> [...]
>
> Many years ago a department store credited our account with a refund for
> about $150, which obviously was not meant for us - we had never bought
> that item!
>
> They replied, essentially saying we we were due the refund.
>
> We sent a letter yet again, explaining the problem, and saying we were
> not entitled.
>
> They then sent used a check for the amount, which we cashed, feeling we
> had done our best to help them & that we could not be bothered wasting
> more time trying to sort things out for them!

I have seen this enough to think either companies would rather take the
loss then admit the mistake or employees don't care because it is not
their money.

>
> I think any judge would laugh them out of court!
>
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: PostgreSQL Developer Best Practices

From
"David G. Johnston"
Date:
On Tuesday, August 25, 2015, Melvin Davidson <melvin6925@gmail.com> wrote:
Before ANYONE continues to insist that a serial id column is good, consider the case where the number of tuples will exceed a bigint.
Don't say it cannot happen, because it can.
However, if you have an alphanumeric field, let's say varchar 50, and it's guaranteed that it will never have a duplicate, then THAT is a natural primary
key and beats the hell out of a generic "id" field.
 
Except for it being fatter.  400 bits wide instead of 64.  But that, too, is simply another consideration to evaluate.

David J.

Re: PostgreSQL Developer Best Practices

From
rob stone
Date:
On Tue, 2015-08-25 at 20:17 -0400, Melvin Davidson wrote:
> I think a lot of people here are missing the point. I was trying to
> give examples of natural keys, but a lot of people are taking great
> delight
> in pointing out exceptions to examples, rather than understanding the
> point.
> So for the sake of argument, a natural key is something that in
> itself is unique and the possibility of a duplicate does not exist.
> Before ANYONE continues to insist that a serial id column is good,
> consider the case where the number of tuples will exceed a bigint.
> Don't say it cannot happen, because it can.
> However, if you have an alphanumeric field, let's say varchar 50, and
> it's guaranteed that it will never have a duplicate, then THAT is a
> natural primary
> key and beats the hell out of a generic "id" field.
>
> Further to the point, since I started this thread, I am holding to it
> and will not discuss "natural primary keys" any further.
>
> Other suggestions for good PostgreSQL Developer database (not web
> app) guidelines are still welcome.
>

Funny how Melvin's attempt to bring order to the chaos ended up as a
discussion about primary keys.

We once hired a "genius" to design an application to handle fixed
assets. Every table had a primary key named "id". Some were integer and
some were character. So the foreign key columns in child tables had to
be named differently. Writing the joins was complex.

I also know of an airline reservation system where you are unable to
alter your e-mail address. It apparently needs a DBA type person to
make the change. I can only guess that your e-mail address is used as a
foreign key in one or more tables. As well as assigning you a frequent
flyer number they also assign another integer identifier. A bit of
common sense goes a long way when designing an application.

Cheers,
rob



Re: PostgreSQL Developer Best Practices

From
Allan Kamau
Date:
On Wed, Aug 26, 2015 at 5:23 AM, rob stone <floriparob@gmail.com> wrote:
On Tue, 2015-08-25 at 20:17 -0400, Melvin Davidson wrote:
> I think a lot of people here are missing the point. I was trying to
> give examples of natural keys, but a lot of people are taking great
> delight
> in pointing out exceptions to examples, rather than understanding the
> point.
> So for the sake of argument, a natural key is something that in
> itself is unique and the possibility of a duplicate does not exist.
> Before ANYONE continues to insist that a serial id column is good,
> consider the case where the number of tuples will exceed a bigint.
> Don't say it cannot happen, because it can.
> However, if you have an alphanumeric field, let's say varchar 50, and
> it's guaranteed that it will never have a duplicate, then THAT is a
> natural primary
> key and beats the hell out of a generic "id" field.
>
> Further to the point, since I started this thread, I am holding to it
> and will not discuss "natural primary keys" any further.
>
> Other suggestions for good PostgreSQL Developer database (not web
> app) guidelines are still welcome.
>

Funny how Melvin's attempt to bring order to the chaos ended up as a
discussion about primary keys.

We once hired a "genius" to design an application to handle fixed
assets. Every table had a primary key named "id". Some were integer and
some were character. So the foreign key columns in child tables had to
be named differently. Writing the joins was complex.

I also know of an airline reservation system where you are unable to
alter your e-mail address. It apparently needs a DBA type person to
make the change. I can only guess that your e-mail address is used as a
foreign key in one or more tables. As well as assigning you a frequent
flyer number they also assign another integer identifier. A bit of
common sense goes a long way when designing an application.

Cheers,
rob



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



I am in favour of using BIGINT "id" for the primary key in each table I create.
I found out that in the fields in my tables that I thought would be unique end up not being so in the longer term.
Also these values may need to be updated for some reason.

I have been using PRIMARY KEY(id) where id is of type BIGINT on each table I create.
I use a sequence to provide a default value to this field.
I create one such sequence DB object per table and the use it in the table definition.
For example if I have a sequenceDB "some_schema.some_table_seq" for table "some_schema.some_table".
In the table definition of "some_schema.some_table" I have the field "id" as follows.

id BIGINT NOT NULL DEFAULT NEXTVAL('some_schema.some_table_seq')

When I use this "id" field as a foreign key in another table, I would prefix it with the name of its parent table followed by a couple of underscores as shown below.
FOREIGN KEY(some_table__id)REFERENCES some_schema.some_table(id)ON UPDATE CASCADE ON DELETE CASCADE

For the composite keys that are unique (for now) I create a unique constraint.

Allan.

Re: PostgreSQL Developer Best Practices

From
Condor
Date:
On 26-08-2015 10:13, Allan Kamau wrote:
> On Wed, Aug 26, 2015 at 5:23 AM, rob stone <floriparob@gmail.com>
> wrote:
>
>> On Tue, 2015-08-25 at 20:17 -0400, Melvin Davidson wrote:
>>> I think a lot of people here are missing the point. I was trying
>> to
>>> give examples of natural keys, but a lot of people are taking
>> great
>>> delight
>>> in pointing out exceptions to examples, rather than understanding
>> the
>>> point.
>>> So for the sake of argument, a natural key is something that in
>>> itself is unique and the possibility of a duplicate does not
>> exist.
>>> Before ANYONE continues to insist that a serial id column is good,
>>> consider the case where the number of tuples will exceed a bigint.
>>> Don't say it cannot happen, because it can.
>>> However, if you have an alphanumeric field, let's say varchar 50,
>> and
>>> it's guaranteed that it will never have a duplicate, then THAT is
>> a
>>> natural primary
>>> key and beats the hell out of a generic "id" field.
>>>
>>> Further to the point, since I started this thread, I am holding to
>> it
>>> and will not discuss "natural primary keys" any further.
>>>
>>> Other suggestions for good PostgreSQL Developer database (not web
>>> app) guidelines are still welcome.
>>>
>>
>> Funny how Melvin's attempt to bring order to the chaos ended up as a
>> discussion about primary keys.
>>
>> We once hired a "genius" to design an application to handle fixed
>> assets. Every table had a primary key named "id". Some were integer
>> and
>> some were character. So the foreign key columns in child tables had
>> to
>> be named differently. Writing the joins was complex.
>>
>> I also know of an airline reservation system where you are unable to
>> alter your e-mail address. It apparently needs a DBA type person to
>> make the change. I can only guess that your e-mail address is used
>> as a
>> foreign key in one or more tables. As well as assigning you a
>> frequent
>> flyer number they also assign another integer identifier. A bit of
>> common sense goes a long way when designing an application.
>>
>> Cheers,
>> rob
>>
>> --
>> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-general [1]
>
> I am in favour of using BIGINT "id" for the primary key in each table
> I create.
> I found out that in the fields in my tables that I thought would be
> unique end up not being so in the longer term.
> Also these values may need to be updated for some reason.
>
> I have been using PRIMARY KEY(id) where id is of type BIGINT on each
> table I create.
> I use a sequence to provide a default value to this field.
> I create one such sequence DB object per table and the use it in the
> table definition.
> For example if I have a sequenceDB "some_schema.some_table_seq" for
> table "some_schema.some_table".
> In the table definition of "some_schema.some_table" I have the field
> "id" as follows.
>
> id BIGINT NOT NULL DEFAULT NEXTVAL('some_schema.some_table_seq')
>
> When I use this "id" field as a foreign key in another table, I would
> prefix it with the name of its parent table followed by a couple of
> underscores as shown below.
> FOREIGN KEY(some_table__id)REFERENCES some_schema.some_table(id)ON
> UPDATE CASCADE ON DELETE CASCADE
>
> For the composite keys that are unique (for now) I create a unique
> constraint.
>
> Allan.


I recall the words of my professor at last lecture of Databases was
telling us that model of thinking as he told: nomenclature is wrong and
not good and we should avoid it in any cost if we can.


Cheers,
Hristo




Re: PostgreSQL Developer Best Practices

From
John Turner
Date:
On Tue, 25 Aug 2015 18:57:28 -0400, Neil Tiffin <neilt@neiltiffin.com>
wrote:

>
>> On Aug 25, 2015, at 1:38 PM, Karsten Hilbert <Karsten.Hilbert@gmx.net>
>> wrote:
>>
>>> In most cases developers don’t care about index, unique, foreign key,
>>> or primary key names (from a coding standpoint)
>>
>> Until the day they’d like to write a reliable database change script.
>
> Not sure I understand.  Once the object is created the name is set, it
> does not change, so I don’t understand why it is not possible to write a
> reliable database change script.  Dump and restore maintain the name. Of
> course every project has periodic scripts that need to run, so these
> objects would, if they are dropped or manipulated in the script, have to
> be manually named, especially during development since the whole
> database might be dropped and recreated multiple times.  My original
> comment included that situation. My projects typically have many, many
> objects that once created are not referred to again, unless a DBA is
> doing some tuning or troubleshooting.  In that case, the DBA just looks
> up the name.
>
> I can see if say 2 years later you want to create a development database
> from the original SQL that generated the original table definitions that
> could be problematic.  But I always have used the current definitions
> not the original and those can be exported with the current names.
>
> It just seems like busy work to me, but I would love to be enlightened.
>
> Neil

I suspect he's alluding to migration scripts from an ORM - which are
typically scaffolded with boilerplate, but almost invariably need to be
tweaked in order to effect the desired changes in the database..

- John


Re: PostgreSQL Developer Best Practices

From
Karsten Hilbert
Date:
On Wed, Aug 26, 2015 at 09:04:08AM -0400, John Turner wrote:

> >>>In most cases developers don’t care about index, unique, foreign key,
> >>>or primary key names (from a coding standpoint)
> >>
> >>Until the day they’d like to write a reliable database change script.
> >
> >Not sure I understand.  Once the object is created the name is set, it
> >does not change, so I don’t understand why it is not possible to write a
> >reliable database change script.  Dump and restore maintain the name. Of
> >course every project has periodic scripts that need to run, so these
> >objects would, if they are dropped or manipulated in the script, have to
> >be manually named, especially during development since the whole database
> >might be dropped and recreated multiple times.  My original comment
> >included that situation. My projects typically have many, many objects
> >that once created are not referred to again, unless a DBA is doing some
> >tuning or troubleshooting.  In that case, the DBA just looks up the name.
> >
> >I can see if say 2 years later you want to create a development database
> >from the original SQL that generated the original table definitions that
> >could be problematic.  But I always have used the current definitions not
> >the original and those can be exported with the current names.
> >
> >It just seems like busy work to me, but I would love to be enlightened.
>
> I suspect he's alluding to migration scripts from an ORM

Not in the least.

    https://github.com/ncqgm/gnumed/tree/master/gnumed/gnumed/server/sql

Karsten
--
GPG key ID E4071346 @ eu.pool.sks-keyservers.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346


Re: PostgreSQL Developer Best Practices

From
Igor Neyman
Date:

From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Melvin Davidson
Sent: Tuesday, August 25, 2015 8:18 PM
To: Adrian Klaver <adrian.klaver@aklaver.com>
Cc: Jerry Sievers <gsievers19@comcast.net>; John R Pierce <pierce@hogranch.com>; pgsql-general@postgresql.org
Subject: Re: [GENERAL] PostgreSQL Developer Best Practices

 

….

Before ANYONE continues to insist that a serial id column is good, consider the case where the number of tuples will exceed a bigint.

Don't say it cannot happen, because it can.

……………………

Melvin Davidson

 

Now, it’s easy to overcome this limitation.

You just make concatenated PK (id1, id2) with both columns of BIGINT type.

 

In general, I see the main advantage of artificial PK in NO NEED to change multiple child tables, when NATURAL key changes in the parent table.  And I never saw a system where NATURAL key wouldn’t need to be changed eventually.

So, my conclusion: use artificial PK (for db convenience)  and unique NATURAL key (for GUI representation).

 

Regards,

Igor Neyman

 

Re: PostgreSQL Developer Best Practices

From
"David G. Johnston"
Date:
On Wed, Aug 26, 2015 at 9:45 AM, Igor Neyman <ineyman@perceptron.com> wrote:

From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Melvin Davidson
Sent: Tuesday, August 25, 2015 8:18 PM
To: Adrian Klaver <adrian.klaver@aklaver.com>
Cc: Jerry Sievers <gsievers19@comcast.net>; John R Pierce <pierce@hogranch.com>; pgsql-general@postgresql.org
Subject: Re: [GENERAL] PostgreSQL Developer Best Practices

 

….

Before ANYONE continues to insist that a serial id column is good, consider the case where the number of tuples will exceed a bigint.

Don't say it cannot happen, because it can.

……………………

Melvin Davidson

 

Now, it’s easy to overcome this limitation.

You just make concatenated PK (id1, id2) with both columns of BIGINT type.

 


​Easy, yes, but at this point I'd probably resort to converting to a length-limited text field (so as ensure toasting never occurs).​

In general, I see the main advantage of artificial PK in NO NEED to change multiple child tables, when NATURAL key changes in the parent table.  And I never saw a system where NATURAL key wouldn’t need to be changed eventually.

So, my conclusion: use artificial PK (for db convenience)  and unique NATURAL key (for GUI representation).



​I haven't really had a chance to implement this formally but I've had similar thoughts along these lines.  One nice thing about this, in theory, is that you can have a different lifecycle and usage policy for those GUI identifiers and they can be made to be inherently changeable.  A unique tag that you can remove from one entity and reuse on a different one should the need arise.

David J.

Re: PostgreSQL Developer Best Practices

From
"Joshua D. Drake"
Date:
On 08/25/2015 05:28 PM, Adrian Klaver wrote:
> On 08/25/2015 05:17 PM, Melvin Davidson wrote:
>> I think a lot of people here are missing the point. I was trying to give
>> examples of natural keys, but a lot of people are taking great delight
>> in pointing out exceptions to examples, rather than understanding the
>> point.
>> So for the sake of argument, a natural key is something that in itself
>> is unique and the possibility of a duplicate does not exist.

Correct.

>> Before ANYONE continues to insist that a serial id column is good,
>> consider the case where the number of tuples will exceed a bigint.
>> Don't say it cannot happen, because it can.

Yes it can.


>> However, if you have an alphanumeric field, let's say varchar 50, and
>> it's guaranteed that it will never have a duplicate, then THAT is a
>> natural primary

Wrong. Refer back to your above definition. It is definitely possible,
based on a varchar(50) that a duplicate will happen. A better definition
would be something along the lines of:

A natural key is distinct and is derived from the data being stored.

>
> That is a big IF and a guarantee I would not put money on.

Right, here is a perfect example. Generally speaking if you are storing
a United States company's information, a natural primary key could be an
FEIN. However, there is an exception that would have to be incorporated
into that idea. If the company is a Sole Proprietorship the FEIN may
actually be the SSN of the owner, but not necessarily. Then you have to
ask yourself if that matters. It may not depending on the application
you are building or the reason the data is being stored.

>
>> key and beats the hell out of a generic "id" field.
>>
>> Further to the point, since I started this thread, I am holding to it
>> and will not discuss "natural primary keys" any further.

That doesn't mean others won't.

JD



--
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Announcing "I'm offended" is basically telling the world you can't
control your own emotions, so everyone else should do it for you.


Re: PostgreSQL Developer Best Practices

From
"Daniel Verite"
Date:
    Melvin Davidson wrote:

> Before ANYONE continues to insist that a serial id column is good, consider
> the case where the number of tuples will exceed a bigint.
> Don't say it cannot happen, because it can.

In practice, it cannot happen.

A tuple with a bigint column weighs at least 32 bytes (in the sense
that it's what pg_column_size("table".*) reports when "table" has
only a bigint column).

So the size of your hypothetical table would be at a minimum
32 bytes *  2^63 tuples = 2^68 bytes

But a postgres table size weighs 32TB max, or 2^45 bytes
(see http://www.postgresql.org/about/ )

So the table with more rows than a bigint can count would have to be
2^23 (=8388608) times bigger than the biggest possible table.

Also there's the fact that COUNT() returns a BIGINT, so the tuples
couldn't be counted in SQL. That by itself hints at the fact that counts of
tuples are expected to always fit in BIGINT these days.

Also what about pg_database_size() returning a bigint?

Even if the hypothetical table was alone in the database, and even if every
tuple occupied only 1 byte instead of 32+,  the single table would exceed
what pg_database_size() can report.

Maybe at some point all these will be 128 bits, but that's years ahead.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


Re: PostgreSQL Developer Best Practices

From
Gavin Flower
Date:
On 26/08/15 12:17, Melvin Davidson wrote:
[...]
> So for the sake of argument, a natural key is something that in itself
> is unique and the possibility of a duplicate does not exist.
> Before ANYONE continues to insist that a serial id column is good,
> consider the case where the number of tuples will exceed a bigint.
> Don't say it cannot happen, because it can.
>
If you create tuples in your table at the rate of a million records per
second every hour of every day of every year, it will take over 250,000
years to exceed the value of a bigint!

[...]
>
>
> --
> *Melvin Davidson*
> I reserve the right to fantasize.  Whether or not you
> wish to share my fantasy is entirely up to you.
I will refrain from marking snarky comments about your sig - tempting
though it might be!  :-)