Thread: Patch to make Turks happy.

Patch to make Turks happy.

From
Nicolai Tufar
Date:
Hi,

Yet another problem with Turkish encoding. clean_encoding_name()
in src/backend/utils/mb/encnames.c uses tolower() to convert locale
names to lower case. This causes errors if the locale name contains
a capital "I" and the current locale is Turkish, because tolower('I')
then returns the dotless "ı" instead of "i" and the name lookup
fails. Some examples:

aaa=# \l
      List of databases
   Name    | Owner | Encoding
-----------+-------+----------
 aaa       | pgsql | LATIN5
 bbb       | pgsql | LATIN5
 template0 | pgsql | LATIN5
 template1 | pgsql | LATIN5
(4 rows)
aaa=# CREATE DATABASE ccc ENCODING='LATIN5';
ERROR:  LATIN5 is not a valid encoding name
aaa=# \encoding
SQL_ASCII
aaa=# \encoding SQL_ASCII
SQL_ASCII: invalid encoding name or conversion procedure not found
aaa=# \encoding LATIN5
LATIN5: invalid encoding name or conversion procedure not found
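
For anyone who wants to check whether a given system is affected, here is a minimal sketch that is not part of the patch. It switches LC_CTYPE to a named locale and reports whether tolower('I') still yields 'i'. The helper name tolower_maps_I_to_i is made up for illustration, and the Turkish locale name in the usage note is an assumption, since locale names differ between platforms.

```c
#include <assert.h>
#include <ctype.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from the PostgreSQL sources).
 * Returns 1 if tolower('I') yields 'i' under the named LC_CTYPE locale,
 * 0 if it yields something else, and -1 if the locale cannot be selected.
 * The previous LC_CTYPE setting is restored before returning.
 */
static int tolower_maps_I_to_i(const char *locale_name)
{
    char        saved[128];
    const char *cur;
    int         result;

    assert(locale_name != NULL);

    /* The string returned by setlocale() may be overwritten later, so copy it. */
    cur = setlocale(LC_CTYPE, NULL);
    if (cur == NULL || strlen(cur) >= sizeof(saved))
        return -1;
    strcpy(saved, cur);

    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -1;              /* locale not installed on this system */

    result = (tolower((unsigned char) 'I') == 'i');

    setlocale(LC_CTYPE, saved); /* put things back the way they were */
    return result;
}
```

Calling tolower_maps_I_to_i("C") should return 1. On a system that has a Turkish single-byte locale installed (for example one named tr_TR.ISO8859-9, which is an assumption), a return value of 0 would indicate the behaviour that breaks the encoding name lookup.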


The patch is a simple change to use ASCII-only lower-case conversion
instead of the locale-dependent tolower().

Best regards,
Nic.






*** ./src/backend/utils/mb/encnames.c.orig    Mon Dec  2 15:58:49 2002
--- ./src/backend/utils/mb/encnames.c    Mon Dec  2 18:13:23 2002
***************
*** 407,413 ****
      for (p = key, np = newkey; *p != '\0'; p++)
      {
          if (isalnum((unsigned char) *p))
!             *np++ = tolower((unsigned char) *p);
      }
      *np = '\0';
      return newkey;
--- 407,416 ----
      for (p = key, np = newkey; *p != '\0'; p++)
      {
          if (isalnum((unsigned char) *p))
!             if (*p >= 'A' && *p <= 'Z')
!                 *np++ = *p + 'a' - 'A';
!             else
!                 *np++ = *p;
      }
      *np = '\0';
      return newkey;
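
To show the idea outside of the backend, here is a standalone sketch of the same ASCII-only cleaning loop. It is an illustration only, not the committed code: the names clean_name_sketch, ascii_tolower and ascii_isalnum are made up, and unlike the patch (which keeps isalnum()) the sketch also does the alphanumeric test without consulting the locale.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* ASCII-only lower-casing: never consults the current locale. */
static char ascii_tolower(char c)
{
    return (c >= 'A' && c <= 'Z') ? (char) (c + ('a' - 'A')) : c;
}

/* ASCII-only alphanumeric test (an assumption; the patch keeps isalnum()). */
static int ascii_isalnum(char c)
{
    return (c >= 'A' && c <= 'Z') ||
           (c >= 'a' && c <= 'z') ||
           (c >= '0' && c <= '9');
}

/*
 * Copy key into newkey, keeping only ASCII letters and digits and folding
 * letters to lower case. The caller must provide room for
 * strlen(key) + 1 bytes in newkey.
 */
static char *clean_name_sketch(const char *key, char *newkey)
{
    const char *p;
    char       *np;

    assert(key != NULL && newkey != NULL);

    for (p = key, np = newkey; *p != '\0'; p++)
    {
        if (ascii_isalnum(*p))
            *np++ = ascii_tolower(*p);
    }
    *np = '\0';
    return newkey;
}
```

With a buffer such as char buf[32], clean_name_sketch("LATIN5", buf) gives "latin5" and clean_name_sketch("SQL_ASCII", buf) gives "sqlascii", whatever LC_CTYPE is set to.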


Re: [HACKERS] Patch to make Turks happy.

From
Bruce Momjian
Date:
I am not going to apply this patch because I think it will mess up the
handling of other locales.



--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: [HACKERS] Patch to make Turks happy.

From
Nicolai Tufar
Date:
Bruce Momjian wrote:
> I am not going to apply this patch because I think it will mess up the
> handling of other locales.

As far as I can tell from the source code, this function only cleans up
locale names and nothing else. Since all the locale names are plain
ASCII, I think it is safe to use ASCII-only lower-case conversion.

By the way, I noticed only after sending the patch that the compiler
complains about an ambiguous `else', so it can be rewritten as:


    if (*p >= 'A' && *p <= 'Z') {
        *np++ = *p + 'a' - 'A';
    } else {
        *np++ = *p;
    }



Regards,
Nicolai






Re: [HACKERS] Patch to make Turks happy.

From
Peter Eisentraut
Date:
Bruce Momjian writes:

> I am not going to apply this patch because I think it will mess up the
> handling of other locales.

This patch looks OK to me.  Normally, character set names should use
identifier case-folding rules anyway, so this seems to be a step in the
right direction.  That is much better than saying that users of certain
locales can't properly use PostgreSQL.


--
Peter Eisentraut   peter_e@gmx.net




Re: [HACKERS] Patch to make Turks happy.

From
Bruce Momjian
Date:
OK, Peter, that helps.  Thanks.  I will apply it.



Re: [HACKERS] Patch to make Turks happy.

From
Bruce Momjian
Date:
OK, patch applied.  Peter, should this appear in 7.3.1 too?


Index: src/backend/utils/mb/encnames.c
===================================================================
RCS file: /cvsroot/pgsql-server/src/backend/utils/mb/encnames.c,v
retrieving revision 1.10
diff -c -c -r1.10 encnames.c
*** src/backend/utils/mb/encnames.c    4 Sep 2002 20:31:31 -0000    1.10
--- src/backend/utils/mb/encnames.c    5 Dec 2002 23:19:40 -0000
***************
*** 407,413 ****
      for (p = key, np = newkey; *p != '\0'; p++)
      {
          if (isalnum((unsigned char) *p))
!             *np++ = tolower((unsigned char) *p);
      }
      *np = '\0';
      return newkey;
--- 407,418 ----
      for (p = key, np = newkey; *p != '\0'; p++)
      {
          if (isalnum((unsigned char) *p))
!         {
!             if (*p >= 'A' && *p <= 'Z')
!                 *np++ = *p + 'a' - 'A';
!             else
!                 *np++ = *p;
!         }
      }
      *np = '\0';
      return newkey;

Re: [HACKERS] Patch to make Turks happy.

From
Bruce Momjian
Date:
Peter, is that patch OK for 7.3.1?  I am not sure.



Re: [HACKERS] Patch to make Turks happy.

From
Peter Eisentraut
Date:
Bruce Momjian writes:

> Peter, is that patch OK for 7.3.1?  I am not sure.

Definitely.  It's a bug fix.

--
Peter Eisentraut   peter_e@gmx.net


Re: [HACKERS] Patch to make Turks happy.

From
Bruce Momjian
Date:
Thanks.   Applied for 7.3.1.

