Thread: Allow disabling folding of unquoted identifiers to lowercase

Allow disabling folding of unquoted identifiers to lowercase

From
Evgeny Morozov
Date:
It would be great if Postgres had a server setting that allowed the automatic folding of identifiers to lowercase to be disabled, so that camel case identifiers could be used without having to quote every single identifier, i.e.

SELECT MyColumn FROM MyTable ORDER BY MyColumn

instead of

SELECT "MyColumn" FROM "MyTable" ORDER BY "MyColumn"

I understand that Postgres has a well-established naming convention and I'm certainly not suggesting changing the default behaviour, but having this as an option would remove a major pain point for users migrating from another RDBMS, like MS SQL Server.

My company is looking into doing this. Currently our table and column names exactly match our class and property names, which are in camel case. MSSQL supports this just fine. To move to Postgres we would have to either quote *everything* or translate names back-and-forth between code and database. Both options are OK for auto-generated SQL, but we also have many users writing ad-hoc SQL queries. Having to quote everything would have those users screaming to switch back to MSSQL very quickly! That leaves us with the mapping approach, which is doable, but also a constant "mental speedbump" at best.

Re: Allow disabling folding of unquoted identifiers to lowercase

From
John R Pierce
Date:
On 4/29/2016 10:21 AM, Evgeny Morozov wrote:
It would be great if Postgres had a server setting that allowed the automatic folding of identifiers to lowercase to be disabled, so that camel case identifiers could be used without having to quote every single identifier, i.e.

SELECT MyColumn FROM MyTable ORDER BY MyColumn

instead of

SELECT "MyColumn" FROM "MyTable" ORDER BY "MyColumn"



I suspect this would be painful for the parser, unless you also enforced that all SQL keywords were in a specific case (all lower would be the minimal impact to the code).   otherwise the parser would have to lower() every token to check to see if its a keyword, but if not, revert it to its original case.



-- 
john r pierce, recycling bits in santa cruz

Re: Allow disabling folding of unquoted identifiers to lowercase

From
John McKown
Date:
On Fri, Apr 29, 2016 at 2:44 PM, John R Pierce <pierce@hogranch.com> wrote:
On 4/29/2016 10:21 AM, Evgeny Morozov wrote:
It would be great if Postgres had a server setting that allowed the automatic folding of identifiers to lowercase to be disabled, so that camel case identifiers could be used without having to quote every single identifier, i.e.

SELECT MyColumn FROM MyTable ORDER BY MyColumn

instead of

SELECT "MyColumn" FROM "MyTable" ORDER BY "MyColumn"

I suspect this would be painful for the parser, unless you also enforced that all SQL keywords were in a specific case (all lower would be the minimal impact to the code).   otherwise the parser would have to lower() every token to check to see if its a keyword, but if not, revert it to its original case.


​Why? PostgreSQL is written in C. So use strncasecmp() instead of strncmp() or strcasecmp() instead of strcmp() to test for a token.
-- 
john r pierce, recycling bits in santa cruz



--
The unfacts, did we have them, are too imprecisely few to warrant our certitude.

Maranatha! <><
John McKown

Re: Allow disabling folding of unquoted identifiers to lowercase

From
John R Pierce
Date:
On 4/29/2016 12:56 PM, John McKown wrote:

I suspect this would be painful for the parser, unless you also enforced that all SQL keywords were in a specific case (all lower would be the minimal impact to the code).   otherwise the parser would have to lower() every token to check to see if its a keyword, but if not, revert it to its original case.


​ Why? PostgreSQL is written in C. So use strncasecmp() instead of strncmp() or strcasecmp() instead of strcmp() to test for a token.

are those the APIs the parser uses?


-- 
john r pierce, recycling bits in santa cruz

Re: Allow disabling folding of unquoted identifiers to lowercase

From
John McKown
Date:
On Fri, Apr 29, 2016 at 3:38 PM, John R Pierce <pierce@hogranch.com> wrote:
On 4/29/2016 12:56 PM, John McKown wrote:

I suspect this would be painful for the parser, unless you also enforced that all SQL keywords were in a specific case (all lower would be the minimal impact to the code).   otherwise the parser would have to lower() every token to check to see if its a keyword, but if not, revert it to its original case.


​ Why? PostgreSQL is written in C. So use strncasecmp() instead of strncmp() or strcasecmp() instead of strcmp() to test for a token.

are those the APIs the parser uses?

​Ah. Good point. I ASSuMEd that what any C language program would use. IIRC, PostgreSQL uses Bison and Flex to generate the parsing grammars, and I don't know what those products output.​

 


-- 
john r pierce, recycling bits in santa cruz



--
The unfacts, did we have them, are too imprecisely few to warrant our certitude.

Maranatha! <><
John McKown

Re: Allow disabling folding of unquoted identifiers to lowercase

From
John McKown
Date:
On Fri, Apr 29, 2016 at 3:38 PM, John R Pierce <pierce@hogranch.com> wrote:
On 4/29/2016 12:56 PM, John McKown wrote:

I suspect this would be painful for the parser, unless you also enforced that all SQL keywords were in a specific case (all lower would be the minimal impact to the code).   otherwise the parser would have to lower() every token to check to see if its a keyword, but if not, revert it to its original case.


​ Why? PostgreSQL is written in C. So use strncasecmp() instead of strncmp() or strcasecmp() instead of strcmp() to test for a token.

are those the APIs the parser uses?

​Did a quick check of the .c files in the backend/parser directory source code. All the programs use the, oldest, strcmp() function. Too bad.​ I'm not a C expert, nor extremely familiar with the PostgreSQL source code. But doing some scans, I see the use of a function called "ScanKeywordLookup" and it does a case insensitive search "the hard way". The comments indicate that this is due the the SQL standard requiring proper use of Unicode translation:
comment:
/*
         * Apply an ASCII-only downcasing.  We must not use tolower() since it may
         * produce the wrong translation in some locales (eg, Turkish).
         */

​Oh well, it's been interesting, but I don't think that we'll come to a resolution for the OP on this issue. I just blame both PostgreSQL and MySQL for this problem because the SQL standard says that the names are automatically UPPERCASEd unless enclosed in quotes. Not lower cased as PostgreSQL does it, nor unchanged as MySQL does it.​


--
The unfacts, did we have them, are too imprecisely few to warrant our certitude.

Maranatha! <><
John McKown

Re: Allow disabling folding of unquoted identifiers to lowercase

From
Tom Lane
Date:
John McKown <john.archie.mckown@gmail.com> writes:
> ​Oh well, it's been interesting, but I don't think that we'll come to a
> resolution for the OP on this issue. I just blame both PostgreSQL and MySQL
> for this problem because the SQL standard says that the names are
> automatically UPPERCASEd unless enclosed in quotes. Not lower cased as
> PostgreSQL does it, nor unchanged as MySQL does it.​

Yeah, this isn't going to happen.  Years ago we looked into what it would
take to be able to flip a switch and have the standard-compliant behavior
(fold to upper not lower).  It was impractical then and no doubt is far
more so now.  I do not remember all the details, but there were multiple
pain points even in terms of server-side implementation, never mind all
the applications we'd break.

What the OP is asking for doesn't even have the argument "but it's
standards compliant!" going for it.  So I doubt we'd accept such a patch
even if someone managed to create one.

            regards, tom lane


Re: Allow disabling folding of unquoted identifiers to lowercase

From
Evgeny Morozov
Date:
On 30 April 2016 at 01:31, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Yeah, this isn't going to happen.  Years ago we looked into what it would
take to be able to flip a switch and have the standard-compliant behavior
(fold to upper not lower).  It was impractical then and no doubt is far
more so now.  I do not remember all the details, but there were multiple
pain points even in terms of server-side implementation, never mind all
the applications we'd break.

Alright, thanks to everyone for looking into this. Not knowing the code, I naively assumed it should be easy to add an option to not do the case folding.
 

What the OP is asking for doesn't even have the argument "but it's
standards compliant!" going for it. 

Indeed, my argument was it would allow people to choose their own naming convention to (easily) use with Postgres, which would in turn make it easier to migrate from other RDBMSes to Postgres. Although, if you want a standard compliance argument, that can easily be added. :) Just have 3 options: fold to lowercase (current behaviour, default), fold to uppercase (standards compliant), do not fold (most flexible, compatible with MSSQL and MySQL).

> So I doubt we'd accept such a patch even if someone managed to create one.

Well, I was going to ask if paying someone to fix this was an option, but this preempts that!

Re: Allow disabling folding of unquoted identifiers to lowercase

From
John McKown
Date:
On Mon, May 2, 2016 at 3:03 AM, Evgeny Morozov <evgeny.morozov+list+pgsql@shift-technology.com> wrote:
On 30 April 2016 at 01:31, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Yeah, this isn't going to happen.  Years ago we looked into what it would
take to be able to flip a switch and have the standard-compliant behavior
(fold to upper not lower).  It was impractical then and no doubt is far
more so now.  I do not remember all the details, but there were multiple
pain points even in terms of server-side implementation, never mind all
the applications we'd break.

Alright, thanks to everyone for looking into this. Not knowing the code, I naively assumed it should be easy to add an option to not do the case folding.
 

What the OP is asking for doesn't even have the argument "but it's
standards compliant!" going for it. 

Indeed, my argument was it would allow people to choose their own naming convention to (easily) use with Postgres, which would in turn make it easier to migrate from other RDBMSes to Postgres. Although, if you want a standard compliance argument, that can easily be added. :) Just have 3 options: fold to lowercase (current behaviour, default), fold to uppercase (standards compliant), do not fold (most flexible, compatible with MSSQL and MySQL).

> So I doubt we'd accept such a patch even if someone managed to create one.

Well, I was going to ask if paying someone to fix this was an option, but this preempts that!

​I have a silly idea. IIRC, your original problem is that your users are in the habit of entering something like:

select ... from SomeTable ...

And MySQL would actually use the name "SomeTable" (case preserving) and not "sometable" (PostgreSQL) or "SOMETABLE" (SQL standard). What program are the users actually using to do the select? If it is something like "psql", perhaps it would actually be easier to create a modified version which automatically inserts the " marks for them. Of course, you are now doing double parsing of the SQL. First in the client, to modify it before sending to the server. Then again in the server. OK, maybe it is going too far. I guess this might be a "quote everything which is not a keyword" option for a psql replacement. Or whatever the front end is that the users use.


--
The unfacts, did we have them, are too imprecisely few to warrant our certitude.

Maranatha! <><
John McKown

Re: Allow disabling folding of unquoted identifiers to lowercase

From
Evgeny Morozov
Date:
That's an interesting idea! The client users would use is probably pgAdmin. I don't know whether pgAdmin parses the query, though. If it does then it should be relatively easy to add this. If not, I'd imagine it's not going to happen.

On 2 May 2016 at 13:59, John McKown <john.archie.mckown@gmail.com> wrote:
On Mon, May 2, 2016 at 3:03 AM, Evgeny Morozov <evgeny.morozov+list+pgsql@shift-technology.com> wrote:
On 30 April 2016 at 01:31, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Yeah, this isn't going to happen.  Years ago we looked into what it would
take to be able to flip a switch and have the standard-compliant behavior
(fold to upper not lower).  It was impractical then and no doubt is far
more so now.  I do not remember all the details, but there were multiple
pain points even in terms of server-side implementation, never mind all
the applications we'd break.

Alright, thanks to everyone for looking into this. Not knowing the code, I naively assumed it should be easy to add an option to not do the case folding.
 

What the OP is asking for doesn't even have the argument "but it's
standards compliant!" going for it. 

Indeed, my argument was it would allow people to choose their own naming convention to (easily) use with Postgres, which would in turn make it easier to migrate from other RDBMSes to Postgres. Although, if you want a standard compliance argument, that can easily be added. :) Just have 3 options: fold to lowercase (current behaviour, default), fold to uppercase (standards compliant), do not fold (most flexible, compatible with MSSQL and MySQL).

> So I doubt we'd accept such a patch even if someone managed to create one.

Well, I was going to ask if paying someone to fix this was an option, but this preempts that!

​I have a silly idea. IIRC, your original problem is that your users are in the habit of entering something like:

select ... from SomeTable ...

And MySQL would actually use the name "SomeTable" (case preserving) and not "sometable" (PostgreSQL) or "SOMETABLE" (SQL standard). What program are the users actually using to do the select? If it is something like "psql", perhaps it would actually be easier to create a modified version which automatically inserts the " marks for them. Of course, you are now doing double parsing of the SQL. First in the client, to modify it before sending to the server. Then again in the server. OK, maybe it is going too far. I guess this might be a "quote everything which is not a keyword" option for a psql replacement. Or whatever the front end is that the users use.


--
The unfacts, did we have them, are too imprecisely few to warrant our certitude.

Maranatha! <><
John McKown

Re: Allow disabling folding of unquoted identifiers to lowercase

From
Guillaume Lelarge
Date:

Le 3 mai 2016 7:01 PM, "Evgeny Morozov" <evgeny.morozov@shift-technology.com> a écrit :
>
> That's an interesting idea! The client users would use is probably pgAdmin. I don't know whether pgAdmin parses the query, though. If it does then it should be relatively easy to add this. If not, I'd imagine it's not going to happen.
>

The pgAdmin query tool doesn't parse the query. It sends to the server without parsing itself.

> On 2 May 2016 at 13:59, John McKown <john.archie.mckown@gmail.com> wrote:
>>
>> On Mon, May 2, 2016 at 3:03 AM, Evgeny Morozov <evgeny.morozov+list+pgsql@shift-technology.com> wrote:
>>>
>>> On 30 April 2016 at 01:31, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>>
>>>> Yeah, this isn't going to happen.  Years ago we looked into what it would
>>>> take to be able to flip a switch and have the standard-compliant behavior
>>>> (fold to upper not lower).  It was impractical then and no doubt is far
>>>> more so now.  I do not remember all the details, but there were multiple
>>>> pain points even in terms of server-side implementation, never mind all
>>>> the applications we'd break.
>>>
>>>
>>> Alright, thanks to everyone for looking into this. Not knowing the code, I naively assumed it should be easy to add an option to not do the case folding.
>>>  
>>>>
>>>>
>>>> What the OP is asking for doesn't even have the argument "but it's
>>>> standards compliant!" going for it. 
>>>>
>>> Indeed, my argument was it would allow people to choose their own naming convention to (easily) use with Postgres, which would in turn make it easier to migrate from other RDBMSes to Postgres. Although, if you want a standard compliance argument, that can easily be added. :) Just have 3 options: fold to lowercase (current behaviour, default), fold to uppercase (standards compliant), do not fold (most flexible, compatible with MSSQL and MySQL).
>>>
>>> > So I doubt we'd accept such a patch even if someone managed to create one.
>>>
>>> Well, I was going to ask if paying someone to fix this was an option, but this preempts that!
>>
>>
>> ​I have a silly idea. IIRC, your original problem is that your users are in the habit of entering something like:
>>
>> select ... from SomeTable ...
>>
>> And MySQL would actually use the name "SomeTable" (case preserving) and not "sometable" (PostgreSQL) or "SOMETABLE" (SQL standard). What program are the users actually using to do the select? If it is something like "psql", perhaps it would actually be easier to create a modified version which automatically inserts the " marks for them. Of course, you are now doing double parsing of the SQL. First in the client, to modify it before sending to the server. Then again in the server. OK, maybe it is going too far. I guess this might be a "quote everything which is not a keyword" option for a psql replacement. Or whatever the front end is that the users use.
>>
>>
>> --
>> The unfacts, did we have them, are too imprecisely few to warrant our certitude.
>>
>> Maranatha! <><
>> John McKown
>
>

Re: Allow disabling folding of unquoted identifiers to lowercase

From
"Peter J. Holzer"
Date:
On 2016-04-29 19:21:30 +0200, Evgeny Morozov wrote:
> It would be great if Postgres had a server setting that allowed the automatic
> folding of identifiers to lowercase to be disabled, so that camel case
> identifiers could be used without having to quote every single identifier, i.e.
>
> SELECT MyColumn FROM MyTable ORDER BY MyColumn
>
> instead of
>
> SELECT "MyColumn" FROM "MyTable" ORDER BY "MyColumn"
[...]
> My company is looking into doing this. Currently our table and column names
> exactly match our class and property names, which are in camel case. MSSQL
> supports this just fine. To move to Postgres we would have to either quote
> *everything* or translate names back-and-forth between code and database. Both
> options are OK for auto-generated SQL, but we also have many users writing
> ad-hoc SQL queries. Having to quote everything would have those users screaming
> to switch back to MSSQL very quickly! That leaves us with the mapping approach,
> which is doable, but also a constant "mental speedbump" at best.

What exactly is the problem you are trying to solve?

If you and your users are consistent about never using quotes, your
users can write:

SELECT MyColumn FROM MyTable ORDER BY MyColumn;

It will select mycolumn from mytable, but that doesn't matter, since you
created the table with

    CREATE MyTable (MyColumn varchar);

so you really have a table mytable with a column mycolumn, not a table
MyTable with a column MyColumn.

There are three potential problems I can see:

1) Users might be confused that PgAdmin (or whatever tool they use to
   inspect the database) displays all the names in lowercase, and they
   might find a name like sometablewithareallylongname less readable
   than SomeTableWithAReallyLongName.

2) Since case doesn't matter, they might be inconsistent: One programmer
   might write MyTable, another mytable, the third MYTABLE, ...

3) You might want to use a tool to automatically generate SQL queries,
   but that tool quotes identifiers.

        hp

--
   _  | Peter J. Holzer    | I want to forget all about both belts and
|_|_) |                    | suspenders; instead, I want to buy pants
| |   | hjp@hjp.at         | that actually fit.
__/   | http://www.hjp.at/ |   -- http://noncombatant.org/

Attachment

Re: Allow disabling folding of unquoted identifiers to lowercase

From
"Klaus P. Pieper"
Date:
> -----Ursprüngliche Nachricht-----
>
> What exactly is the problem you are trying to solve?
>
> If you and your users are consistent about never using quotes, your users
can
> write:
>
> SELECT MyColumn FROM MyTable ORDER BY MyColumn;
>
> It will select mycolumn from mytable, but that doesn't matter, since you
> created the table with
>
>     CREATE MyTable (MyColumn varchar);
>
> so you really have a table mytable with a column mycolumn, not a table
> MyTable with a column MyColumn.

I use 50% of my time Sybase and 50% PostgreSQL.

For me is the way Sybase works is just more convenient:

CREATE MyTable (MyColumn varchar);

creates a camel cased table MyType and field MyColumn.

SELECT * FROM SYSCATALOG gives MyTable.

This is better readable when you use long table / fiel names.

The user can then use whatever he wants: mytable, MyTable, myTABLE, ...

Klaus




Re: Allow disabling folding of unquoted identifiers to lowercase

From
Raymond O'Donnell
Date:
On 08/05/2016 10:41, Klaus P. Pieper wrote:
> For me is the way Sybase works is just more convenient:
>
> CREATE MyTable (MyColumn varchar);
>
> creates a camel cased table MyType and field MyColumn.
>
> SELECT * FROM SYSCATALOG gives MyTable.
>
> This is better readable when you use long table / fiel names.

I reckon this is just a matter of style... I use underscores to make
long names readable...

   create table my_table_with_a_really_long_name;

....and it works fine.

Ray.

--
Raymond O'Donnell :: Galway :: Ireland
rod@iol.ie