Re: Possibly a bug - Mailing list pgsql-bugs

From surya poondla
Subject Re: Possibly a bug
Date
Msg-id CAOVWO5osy_PbZnbdGH0QZgL0LpkrhrQ475MJLX80ioASBBhOkw@mail.gmail.com
Whole thread Raw
In response to Possibly a bug  (Анатолий <anatoly@email.su>)
Responses Re: Possibly a bug
List pgsql-bugs
Hi Анатолий, Tom

Thank you for reporting the issue.
I did some testing and see a similar issue. I'm seeing a behavior that suggests this isn't simply a C locale configuration issue, but rather an inconsistency in how collations are resolved within SQL functions.

My environment is configured with UTF-8 locale:

  postgres=# show lc_collate;
   lc_collate
  -------------
   en_US.UTF-8
  (1 row)

  postgres=# show lc_ctype;
    lc_ctype
  -------------
   en_US.UTF-8
  (1 row)

Test 1: SQL function with user parameter. Cyrillic is not uppercased
  postgres=# create or replace function to_upper_first(param1 text, param2 text)
  returns text as $
      select upper(param1) || ' ' || upper(param2) || ' ' || param2;
  $ language sql;
CREATE FUNCTION
postgres=# select to_upper_first('тест', user::text);
  to_upper_first
------------------
 тест SURYA surya
(1 row)

Test 2: PL/pgSQL function. Cyrillic is not uppercased
postgres=# create or replace function to_upper_first(param1 text, param2 text)
postgres-# returns text language plpgsql as $$
postgres$# begin
postgres$#     return upper(param1) || ' ' || upper(param2) || ' ' || param2;
postgres$# end;
postgres$# $$;
CREATE FUNCTION
postgres=#
postgres=# select to_upper_first('тест', user::text);
  to_upper_first
------------------
 тест SURYA surya
(1 row)


Test 3: Explicit collation, uppercase works correctly
postgres=# create or replace function to_upper_first(param1 text, param2 text)
postgres-# returns text as $$
postgres$#     select upper(param1 collate "en_US.UTF-8")
postgres$#          || ' '
postgres$#          || upper(param2 collate "en_US.UTF-8")
postgres$#          || ' '
postgres$#          || param2;
postgres$# $$ language sql;
CREATE FUNCTION
postgres=#
postgres=# select to_upper_first('тест', user::text);
  to_upper_first
------------------
 ТЕСТ SURYA surya
(1 row)

Test 4: Direct upper() calls - Work correctly
postgres=# select upper('тест');
 upper
-------
 ТЕСТ
(1 row)

postgres=# select upper('тест'::text);
 upper
-------
 ТЕСТ
(1 row)

postgres=# select upper(('тест')::text);
 upper
-------
 ТЕСТ
(1 row)

postgres=# select upper(('тест') collate "en_US.UTF-8");
 upper
-------
 ТЕСТ
(1 row)

postgres=# select upper(('тест') collate "C");
 upper
-------
 тест
(1 row)


My observations:
  1. Direct calls to upper('тест') correctly uppercases the Cyrillic using the database's en_US.UTF-8 locale
  2. Inside functions, upper(param1) fails to uppercase Cyrillic when user::text is passed as param2
       i. The issue occurs in both SQL and PL/pgSQL functions
  3. Explicitly specifying collate "en_US.UTF-8" in the function resolves the issue.

If select upper('тест'::text) works correctly (i.e it converts to uppercase) outside the function, why does upper(param1) appear to use C collation inside the function when user::text is involved as another parameter?
It seems that the user session variable as a parameter is somehow affecting collation resolution for the independent param1 parameter, which seems unexpected.

I will investigate more into this issue.

Regards,
Surya Poondla

pgsql-bugs by date:

Previous
From: Nathan Bossart
Date:
Subject: Re: basic_archive lost archive_directory
Next
From: Michael Paquier
Date:
Subject: Re: BUG #19396: Standby and DR site replication broken with PANIC: WAL contains references to invalid pages messge