Locale-dependent case conversion in {identifier} - Mailing list pgsql-hackers

From Nicolai Tufar
Subject Locale-dependent case conversion in {identifier}
Date
Msg-id 01df01c29811$7cea48b0$8016a8c0@apb.com.tr
In response to 7.4 Wishlist  ("Christopher Kings-Lynne" <chriskl@familyhealth.com.au>)
Responses Re: Locale-dependent case conversion in {identifier}  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Locale-dependent case conversion in {identifier}  (Hannu Krosing <hannu@tm.ee>)
List pgsql-hackers
Comment in {identifier} section in src/backend/parser/scan.l states:

    [...]
     * Note: here we use a locale-dependent case conversion,
     * which seems appropriate under SQL99 rules, whereas
     * the keyword comparison was NOT locale-dependent.
     */
 

And in ScanKeywordLookup() in src/backend/parser/keywords.c:

    /*
     * Apply an ASCII-only downcasing.  We must not use tolower() since it
     * may produce the wrong translation in some locales (eg, Turkish),
     * and we don't trust isupper() very much either.  In an ASCII-based
     * encoding the tests against A and Z are sufficient, but we also
     * check isupper() so that we will work correctly under EBCDIC.  The
     * actual case conversion step should work for either ASCII or EBCDIC.
     */
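
For illustration, the ASCII-only downcasing that the comment describes could be sketched roughly as below. This is a minimal sketch, not the code from keywords.c: the helper name ascii_downcase is hypothetical, and it assumes an ASCII-based encoding, so the extra isupper() check that the comment mentions for EBCDIC is left out.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not a PostgreSQL function): lower-case only the
 * ASCII letters A-Z, in place.  Every other byte, including any
 * locale-specific letter, is left untouched, so the result does not
 * depend on the current locale.  Assumes an ASCII-based encoding.
 */
static void
ascii_downcase(char *s)
{
    size_t i;

    for (i = 0; s[i] != '\0'; i++)
    {
        char ch = s[i];

        if (ch >= 'A' && ch <= 'Z')
            ch += 'a' - 'A';
        s[i] = ch;
    }
}
```

Called on a writable buffer holding "PUBLIC", this yields "public" whatever locale is in effect, because tolower() is never consulted.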

And I happen to have the bad luck to use PostgreSQL with a Turkish locale. And, as
you may know, our "I" is not your "I":

   pgsql=# create table a(x char(1));
   CREATE TABLE
   pgsql=# grant SELECT ON a to PUBLIC;
   ERROR:  user "public" does not exist
   pgsql=#
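
The likely failure mode can be sketched without a Turkish locale installed. This is only an illustration under stated assumptions: turkish_like_tolower is a hypothetical stand-in for tolower() under a Turkish ISO-8859-9 locale (where upper-case "I" lowers to dotless i, byte 0xFD), and matches_public assumes the downcased name is compared against the literal "public", which is a reading of the error above rather than something the session output itself confirms.

```c
#include <string.h>

/*
 * Hypothetical stand-in for tolower() under a Turkish ISO-8859-9 locale:
 * 'I' maps to dotless i (byte 0xFD) instead of ASCII 'i'.  Other ASCII
 * upper-case letters are lowered as usual.
 */
static unsigned char
turkish_like_tolower(unsigned char c)
{
    if (c == 'I')
        return 0xFD;
    if (c >= 'A' && c <= 'Z')
        return (unsigned char) (c + ('a' - 'A'));
    return c;
}

/*
 * Returns 1 if name, downcased with the mapping above, equals "public",
 * and 0 otherwise.  Names longer than the buffer are truncated.
 */
static int
matches_public(const char *name)
{
    char   buf[64];
    size_t i;

    for (i = 0; name[i] != '\0' && i < sizeof(buf) - 1; i++)
        buf[i] = (char) turkish_like_tolower((unsigned char) name[i]);
    buf[i] = '\0';
    return strcmp(buf, "public") == 0;
}
```

With this mapping, matches_public("PUBLIC") is 0 while matches_public("public") is 1: the upper-case spelling no longer folds to the name being looked up.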
 

Oracle, the second best database I have, does seem to convert relation names
in a locale-dependent fashion:

  SQL> alter session set NLS_LANGUAGE='TURKISH';
  Session altered.
  SQL> create table a(x char(1));
  Table created.
  SQL> grant select on a to PUBLIC;
  Grant succeeded.
 

Further, if I try to create a table in Oracle using Turkish-specific
characters, it creates it all right, without trying to make them upper-case
as it usually does.

So I have changed the lower-case conversion code in scan.l to make it purely
ASCII-based, as in keywords.c. A mini-patch is given below. Please bear in
mind that it is my first attempt at hacking PostgreSQL code, so there may be
some mistakes.

Regards,
Nick


diff -Nur src/backend/parser/scan.l.orig src/backend/parser/scan.l
--- src/backend/parser/scan.l.orig      Sat Nov 30 02:54:06 2002
+++ src/backend/parser/scan.l   Sat Nov 30 02:57:45 2002
@@ -551,9 +551,12 @@
                                        ident = pstrdup(yytext);
                                        for (i = 0; ident[i]; i++)
                                        {
-                                               if (isupper((unsigned char) ident[i]))
-                                                       ident[i] = tolower((unsigned char) ident[i]);
+                                               char            ch = ident[i];
+                                               if (ch >= 'A' && ch <= 'Z' && isupper((unsigned char) ch))
+                                                       ch += 'a' - 'A';
+                                                       ident[i] = ch;
                                        }
+                                       ident[i] = '\0';
                                        if (i >= NAMEDATALEN)
                                        {
                                                int len;
 



