Thread: lower() silently fails for 9.5.3 on Windows, but works on Mac, Linux

lower() silently fails for 9.5.3 on Windows, but works on Mac, Linux

From
Alexander Farber
Date:
Hello fellow PostgreSQL users,

does anybody else observe the problem, that calling lower() method on UTF8 cyrillic strings works on Mac and Linux for version 9.5.3, but fails on Windows 7 / 64 bit (I am using the unzippable version w/o installer)?

I am probably not providing enough information here... not sure what else to write.

Here is my code (trying to lowercase a string before saving it in 2-dim. array):


        FOR _tile IN SELECT * FROM JSONB_ARRAY_ELEMENTS(in_tiles)
        LOOP
                _letter :=  _tile->>'letter';
                _value  := (_tile->>'value')::int;
                _col    := (_tile->>'col')::int + 1;
                _row    := (_tile->>'row')::int + 1;

                RAISE NOTICE 'tile = %', _tile;

                IF NOT words_valid_tile(_letter,
                                        _value,
                                        _col,
                                        _row) THEN
                        RAISE EXCEPTION 'Invalid tile = %', _tile;
                END IF;

                IF (_letters[_col][_row] IS NOT NULL) THEN
                        RAISE EXCEPTION 'Cell already occupied %', _tile;
                END IF;
                 
                _letters[_col][_row] := lower(_letter);
                RAISE NOTICE 'letter = %', _letters[_col][_row];  -- STILL UPPERCASE
                _values[_col][_row]  := _value;

        END LOOP;

I wonder if there is a workaround for this problem on Windows

Thank you
Alex


Re: lower() silently fails for 9.5.3 on Windows, but works on Mac, Linux

From
Alexander Farber
Date:
Here the Windows-log excerpt (the 5 cyrillic letters stay uppercased):

LOG:  database system is ready to accept connections
LOG:  autovacuum launcher started
LOG:  statement: SET client_encoding = 'UTF8';

LOG:  execute <unnamed>: SELECT out_gid AS gid FROM words_play_game($1, $2, $3::jsonb)
DETAIL:  parameters: $1 = '1', $2 = '3', $3 = '[{"col":7,"letter":"П","row":11,"value":2},{"col":7,"letter":"И","row":10,"value":1},{"col":7,"letter":"Л","row":9,"value":2},{"col":7,"letter":"О","row":8,"value":1},{"col":7,"letter":"П","row":7,"value":2}]'
NOTICE:  tile = {"col": 7, "row": 11, "value": 2, "letter": "П"}
NOTICE:  letter = П
NOTICE:  tile = {"col": 7, "row": 10, "value": 1, "letter": "И"}
NOTICE:  letter = И
NOTICE:  tile = {"col": 7, "row": 9, "value": 2, "letter": "Л"}
NOTICE:  letter = Л
NOTICE:  tile = {"col": 7, "row": 8, "value": 1, "letter": "О"}
NOTICE:  letter = О
NOTICE:  tile = {"col": 7, "row": 7, "value": 2, "letter": "П"}
NOTICE:  letter = П

Re: lower() silently fails for 9.5.3 on Windows, but works on Mac, Linux

From
Alexander Farber
Date:
More info:

# \l+
                                                  
   Name    |  Owner  | Encoding | Collate | Ctype |
-----------+---------+----------+---------+-------+
 postgres  | user1   | UTF8     | C       | C     |
 template0 | user1   | UTF8     | C       | C     |
           |         |          |         |       |
 template1 | user1   | UTF8     | C       | C     |
           |         |          |         |       |
 words     | user1   | UTF8     | C       | C     |

Re: lower() silently fails for 9.5.3 on Windows, but works on Mac, Linux

From
"Charles Clavadetscher"
Date:
Hello

> From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Alexander Farber
> Sent: Montag, 8. August 2016 09:10
> To: pgsql-general <pgsql-general@postgresql.org>
> Subject: Re: [GENERAL] lower() silently fails for 9.5.3 on Windows, but works on Mac, Linux
>
> More info:
>
> # \l+
>
>   Name    |  Owner  | Encoding | Collate | Ctype |
> -----------+---------+----------+---------+-------+
> postgres  | user1   | UTF8     | C       | C     |
> template0 | user1   | UTF8     | C       | C     |
>           |         |          |         |       |
> template1 | user1   | UTF8     | C       | C     |
>           |         |          |         |       |
> words     | user1   | UTF8     | C       | C     |


I cannot test on 9.5, which you are using, but I was able to reproduce the problem on 9.3:

kofadmin@kofdb.localhost=> CREATE DATABASE test TEMPLATE template0 ENCODING 'UTF8' LC_COLLATE 'C' LC_CTYPE 'C';
CREATE DATABASE
kofadmin@kofdb.localhost=> \l
                                                 List of databases
   Name    |  Owner   | Encoding |          Collate           |           Ctype            |   Access privileges
-----------+----------+----------+----------------------------+----------------------------+-----------------------
 kofdb     | kofadmin | UTF8     | English_United States.1252 | English_United States.1252 |
 test      | kofadmin | UTF8     | C                          | C                          |

kofadmin@kofdb.localhost=> \c test
psql (9.6devel, server 9.3.6)
You are now connected to database "test" as user "kofadmin".
kofadmin@test.localhost=> select lower('P');  <-- This works (latin letters)
 lower
-------
 p
(1 row)

kofadmin@test.localhost=> select lower('И'); <-- This does not work (cyrillic)
 lower
-------
 И
(1 row)

kofadmin@test.localhost=> \c kofdb
psql (9.6devel, server 9.3.6)
You are now connected to database "kofdb" as user "kofadmin".
kofadmin@kofdb.localhost=> select lower('И'); <-- This works on a DB with another collation and ctype
 lower
-------
 и
(1 row)

It seems to be a problem with collation and or ctype.

What are the settings of the database on your Linux system where all works correct?

Bye
Charles




Re: lower() silently fails for 9.5.3 on Windows, but works on Mac, Linux

From
"Charles Clavadetscher"
Date:
Hello

> -----Original Message-----
> From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Charles
> Clavadetscher
> Sent: Montag, 8. August 2016 09:30
> To: 'Alexander Farber' <alexander.farber@gmail.com>; 'pgsql-general' <pgsql-general@postgresql.org>
> Subject: Re: [GENERAL] lower() silently fails for 9.5.3 on Windows, but works on Mac, Linux
>
> Hello
>
> > From: pgsql-general-owner@postgresql.org
> > [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Alexander
> > Farber
> > Sent: Montag, 8. August 2016 09:10
> > To: pgsql-general <pgsql-general@postgresql.org>
> > Subject: Re: [GENERAL] lower() silently fails for 9.5.3 on Windows,
> > but works on Mac, Linux
> >
> > More info:
> >
> > # \l+
> >
> >   Name    |  Owner  | Encoding | Collate | Ctype |
> > -----------+---------+----------+---------+-------+
> > postgres  | user1   | UTF8     | C       | C     |
> > template0 | user1   | UTF8     | C       | C     |
> >           |         |          |         |       |
> > template1 | user1   | UTF8     | C       | C     |
> >           |         |          |         |       |
> > words     | user1   | UTF8     | C       | C     |
>
>
> I cannot test on 9.5, which you are using, but I was able to reproduce the problem on 9.3:
>
> kofadmin@kofdb.localhost=> CREATE DATABASE test TEMPLATE template0 ENCODING 'UTF8' LC_COLLATE 'C' LC_CTYPE 'C';
> CREATE DATABASE kofadmin@kofdb.localhost=> \l
>                                                  List of databases
>    Name    |  Owner   | Encoding |          Collate           |           Ctype            |   Access privileges
> -----------+----------+----------+----------------------------+----------------------------+-----------------------
>  kofdb     | kofadmin | UTF8     | English_United States.1252 | English_United States.1252 |
>  test      | kofadmin | UTF8     | C                          | C                          |
>
> kofadmin@kofdb.localhost=> \c test
> psql (9.6devel, server 9.3.6)
> You are now connected to database "test" as user "kofadmin".
> kofadmin@test.localhost=> select lower('P');  <-- This works (latin letters)  lower
> -------
>  p
> (1 row)
>
> kofadmin@test.localhost=> select lower('И'); <-- This does not work (cyrillic)  lower
> -------
>  И
> (1 row)
>
> kofadmin@test.localhost=> \c kofdb
> psql (9.6devel, server 9.3.6)
> You are now connected to database "kofdb" as user "kofadmin".
> kofadmin@kofdb.localhost=> select lower('И'); <-- This works on a DB with another collation and ctype  lower
> -------
>  и
> (1 row)
>
> It seems to be a problem with collation and or ctype.
>
> What are the settings of the database on your Linux system where all works correct?

A possible workaround:

kofadmin@test.localhost=> CREATE COLLATION "en_US" (LOCALE = 'English_United States.1252');
CREATE COLLATION

kofadmin@test.localhost=> select * from pg_collation;
 collname | collnamespace | collowner | collencoding |        collcollate         |         collctype
----------+---------------+-----------+--------------+----------------------------+----------------------------
 default  |            11 |        10 |           -1 |                            |
 C        |            11 |        10 |           -1 | C                          | C
 POSIX    |            11 |        10 |           -1 | POSIX                      | POSIX
 en_US    |          2200 |     16394 |            6 | English_United States.1252 | English_United States.1252
(4 rows)

kofadmin@test.localhost=> with x as (select 'И'::text collate "en_US" as letter) select lower(letter) from x ;
 lower
-------
 и
(1 row)

Sure, bit uncomfortable.
Bye
Charles

>
> Bye
> Charles
>
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general



Re: lower() silently fails for 9.5.3 on Windows, but works on Mac, Linux

From
Alexander Farber
Date:
Thank you for the replies.

On CentOS 7 Linux with pgdg 9.5.3 the lower() method works and I have:

 # \l
                                  List of databases
   Name    |  Owner   | Encoding |   Collate   |    Ctype    |
-----------+----------+----------+-------------+-------------+
 postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 template0 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
           |          |          |             |             |
 template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
           |          |          |             |             |
 words     | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |


Re: lower() silently fails for 9.5.3 on Windows, but works on Mac, Linux

From
"Charles Clavadetscher"
Date:
Hello

> From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Alexander Farber
> Sent: Montag, 8. August 2016 10:12
> Cc: pgsql-general <pgsql-general@postgresql.org>
> Subject: Re: [GENERAL] lower() silently fails for 9.5.3 on Windows, but works on Mac, Linux
>
> Thank you for the replies.
> On CentOS 7 Linux with pgdg 9.5.3 the lower() method works and I have:
>
> # \l
>                                  List of databases
>   Name    |  Owner   | Encoding |   Collate   |    Ctype    |
> -----------+----------+----------+-------------+-------------+
> postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
> template0 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
>           |          |          |             |             |
> template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
>           |          |          |             |             |
> words     | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |

Well yes, anogher collation and ctype.

I found an easier way to use the workaround after create collation:

kofadmin@test.localhost=> select lower(('И'::text collate "en_US")) ;
 lower
-------
 и
(1 row)

Maybe other more expert than me on this topic will suggest better solution.
Bye
Charles




Re: lower() silently fails for 9.5.3 on Windows, but works on Mac, Linux

From
Alexander Farber
Date:
Hello Charles, unfortunately on Windows 7 this fails:

psql (9.5.3)
Type "help" for help.

# select lower(('И'::text collate "en_US")) ;
ERROR:  collation "en_US" for encoding "UTF8" does not exist
LINE 1: select lower(('?'::text collate "en_US")) ;
                                ^
By the way I the following code works well for me on all 3 platforms:

CREATE TABLE words_verbs (
        word varchar(255) PRIMARY KEY CHECK (
                word ~ '^[А-Я]{2,}$' AND
                word !~ '[ЖШ]Ы' AND
                word !~ '[ЧЩ]Я' AND
                word !~ 'Ц[ЮЯ]' AND
                (word ~ '[ТЧ]ЬСЯ$' OR
                 word ~ '[ТЧ]Ь$' OR 
                 word ~ 'ТИ$')),
        hashed varchar(32) NOT NULL
);

but I understand that it is probably different methods on the lower layer (pcre instead of some collating functions?)....

Regards
Alex

Re: lower() silently fails for 9.5.3 on Windows, but works on Mac, Linux

From
"Charles Clavadetscher"
Date:
Hello Alexander

> -----Original Message-----
> From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Alexander Farber
> Sent: Montag, 8. August 2016 10:21
> Cc: pgsql-general <pgsql-general@postgresql.org>
> Subject: Re: [GENERAL] lower() silently fails for 9.5.3 on Windows, but works on Mac, Linux
>
> Hello Charles, unfortunately on Windows 7 this fails:
>
> psql (9.5.3)
> Type "help" for help.
>
> # select lower(('И'::text collate "en_US")) ;
> ERROR:  collation "en_US" for encoding "UTF8" does not exist LINE 1: select lower(('?'::text collate "en_US")) ;

I assume that you did not create the collation yet as I mentioned in a previous mail.

kofadmin@test.localhost=> CREATE COLLATION "en_US" (LOCALE = 'English_United States.1252');
CREATE COLLATION

Which locale can be created depends on those available on your OS.

Bye
Charles

> By the way I the following code works well for me on all 3 platforms:
>
> CREATE TABLE words_verbs (
>         word varchar(255) PRIMARY KEY CHECK (
>                 word ~ '^[А-Я]{2,}$' AND
>                 word !~ '[ЖШ]Ы' AND
>                 word !~ '[ЧЩ]Я' AND
>                 word !~ 'Ц[ЮЯ]' AND
>                 (word ~ '[ТЧ]ЬСЯ$' OR
>                  word ~ '[ТЧ]Ь$' OR
>                  word ~ 'ТИ$')),
>         hashed varchar(32) NOT NULL
> );
>
>
> but I understand that it is probably different methods on the lower layer (pcre instead of some collating
> functions?)....
>
>
> Regards
>
> Alex




Re: lower() silently fails for 9.5.3 on Windows, but works on Mac, Linux

From
Tom Lane
Date:
Alexander Farber <alexander.farber@gmail.com> writes:
> More info:
> # \l+

>    Name    |  Owner  | Encoding | Collate | Ctype |
> -----------+---------+----------+---------+-------+
>  postgres  | user1   | UTF8     | C       | C     |
>  template0 | user1   | UTF8     | C       | C     |
>            |         |          |         |       |
>  template1 | user1   | UTF8     | C       | C     |
>            |         |          |         |       |
>  words     | user1   | UTF8     | C       | C     |

Well, there's your problem: in C locale, only the standard ASCII letters
will be transformed by upper/lower.

            regards, tom lane