Thread: foreign_data test fails with non-C locale

foreign_data test fails with non-C locale

From

Heikki Linnakangas

Date:

09 January 2009, 10:12:27

The foreign_data test case is failing when I run "make installcheck"
against a server that's been initialized with a locale other than C
(en_GB.UTF-8).

The reason is the different ordering of upper and lower case characters,
per attached diff file. We can simply add an alternative expected output
file, but I'd prefer not to if we can modify the test case instead. We
could rename some of the objects so that they sort the same in all
locales, but that seems a bit awkward in this case.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com
*** /home/hlinnaka/git-sandbox/pgsql/src/test/regress/expected/foreign_data.out    Fri Jan  9 13:11:06 2009
--- /home/hlinnaka/git-sandbox/pgsql/src/test/regress/results/foreign_data.out    Fri Jan  9 15:47:27 2009
***************
*** 658,667 ****
  SELECT * FROM information_schema.foreign_servers ORDER BY 1, 2;
   foreign_server_catalog | foreign_server_name | foreign_data_wrapper_catalog | foreign_data_wrapper_name | foreign_server_type | foreign_server_version | authorization_identifier
  ------------------------+---------------------+------------------------------+---------------------------+---------------------+------------------------+--------------------------
-  regression             | S6                  | regression                   | foo                       |                     |                        | foreign_data_user
   regression             | s4                  | regression                   | foo                       | oracle              |                        | foreign_data_user
   regression             | s5                  | regression                   | foo                       |                     | 15.0                   | regress_test_role
   regression             | s6                  | regression                   | foo                       |                     | 16.0                   | regress_test_indirect
   regression             | s8                  | regression                   | postgresql                |                     |                        | foreign_data_user
   regression             | st1                 | regression                   | foo                       |                     |                        | regress_test_indirect
   regression             | st2                 | regression                   | foo                       |                     |                        | regress_test_role
--- 658,667 ----
  SELECT * FROM information_schema.foreign_servers ORDER BY 1, 2;
   foreign_server_catalog | foreign_server_name | foreign_data_wrapper_catalog | foreign_data_wrapper_name | foreign_server_type | foreign_server_version | authorization_identifier
  ------------------------+---------------------+------------------------------+---------------------------+---------------------+------------------------+--------------------------
   regression             | s4                  | regression                   | foo                       | oracle              |                        | foreign_data_user
   regression             | s5                  | regression                   | foo                       |                     | 15.0                   | regress_test_role
   regression             | s6                  | regression                   | foo                       |                     | 16.0                   | regress_test_indirect
+  regression             | S6                  | regression                   | foo                       |                     |                        | foreign_data_user
   regression             | s8                  | regression                   | postgresql                |                     |                        | foreign_data_user
   regression             | st1                 | regression                   | foo                       |                     |                        | regress_test_indirect
   regression             | st2                 | regression                   | foo                       |                     |                        | regress_test_role
***************
*** 670,680 ****
  SELECT * FROM information_schema.foreign_server_options ORDER BY 1, 2, 3;
   foreign_server_catalog | foreign_server_name |   option_name    | option_value
  ------------------------+---------------------+------------------+--------------
-  regression             | S6                  | mixed_case_names | true
   regression             | s4                  | dbname           | b
   regression             | s4                  | host             | a
   regression             | s6                  | dbname           | b
   regression             | s6                  | host             | a
   regression             | s8                  | connect_timeout  | 30
   regression             | s8                  | dbname           | db1
  (7 rows)
--- 670,680 ----
  SELECT * FROM information_schema.foreign_server_options ORDER BY 1, 2, 3;
   foreign_server_catalog | foreign_server_name |   option_name    | option_value
  ------------------------+---------------------+------------------+--------------
   regression             | s4                  | dbname           | b
   regression             | s4                  | host             | a
   regression             | s6                  | dbname           | b
   regression             | s6                  | host             | a
+  regression             | S6                  | mixed_case_names | true
   regression             | s8                  | connect_timeout  | 30
   regression             | s8                  | dbname           | db1
  (7 rows)
***************
*** 682,693 ****
  SELECT * FROM information_schema.user_mappings ORDER BY 1, 2, 3;
   authorization_identifier | foreign_server_catalog | foreign_server_name
  --------------------------+------------------------+---------------------
   PUBLIC                   | regression             | s4
   PUBLIC                   | regression             | s8
   PUBLIC                   | regression             | st1
-  foreign_data_user        | regression             | S6
-  foreign_data_user        | regression             | s4
-  foreign_data_user        | regression             | s8
   regress_test_role        | regression             | s5
   regress_test_role        | regression             | s6
   regress_test_role        | regression             | st1
--- 682,693 ----
  SELECT * FROM information_schema.user_mappings ORDER BY 1, 2, 3;
   authorization_identifier | foreign_server_catalog | foreign_server_name
  --------------------------+------------------------+---------------------
+  foreign_data_user        | regression             | s4
+  foreign_data_user        | regression             | S6
+  foreign_data_user        | regression             | s8
   PUBLIC                   | regression             | s4
   PUBLIC                   | regression             | s8
   PUBLIC                   | regression             | st1
   regress_test_role        | regression             | s5
   regress_test_role        | regression             | s6
   regress_test_role        | regression             | st1
***************
*** 696,705 ****
  SELECT * FROM information_schema.user_mapping_options ORDER BY 1, 2, 3, 4;
   authorization_identifier | foreign_server_catalog | foreign_server_name | option_name |  option_value
  --------------------------+------------------------+---------------------+-------------+-----------------
-  PUBLIC                   | regression             | s4                  | mapping     | is public
-  PUBLIC                   | regression             | st1                 | modified    | 1
   foreign_data_user        | regression             | S6                  | username    | test_mixed_case
   foreign_data_user        | regression             | s8                  | password    | public
   regress_test_role        | regression             | s5                  | modified    | 1
   regress_test_role        | regression             | s6                  | username    | test
   regress_test_role        | regression             | st1                 | password    | boo
--- 696,705 ----
  SELECT * FROM information_schema.user_mapping_options ORDER BY 1, 2, 3, 4;
   authorization_identifier | foreign_server_catalog | foreign_server_name | option_name |  option_value
  --------------------------+------------------------+---------------------+-------------+-----------------
   foreign_data_user        | regression             | S6                  | username    | test_mixed_case
   foreign_data_user        | regression             | s8                  | password    | public
+  PUBLIC                   | regression             | s4                  | mapping     | is public
+  PUBLIC                   | regression             | st1                 | modified    | 1
   regress_test_role        | regression             | s5                  | modified    | 1
   regress_test_role        | regression             | s6                  | username    | test
   regress_test_role        | regression             | st1                 | password    | boo
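
The reordering in the diff above can be illustrated without a database. What follows is only a minimal Python sketch: `approx_locale_key` is a hypothetical stand-in for en_GB.UTF-8 collation (compare case-insensitively first, lowercase before uppercase on ties), not glibc's actual algorithm, and the names are copied from the test output.

```python
# Server names taken from the foreign_servers output in the diff above.
names = ["S6", "s4", "s5", "s6", "s8", "st1", "st2"]

# C locale: strings compare by code point, so "S" (0x53) sorts before "s" (0x73).
c_order = sorted(names)


def approx_locale_key(name):
    """Hypothetical approximation of a natural-language collation:
    compare case-insensitively first; on ties, put lowercase first."""
    return (name.lower(), name.swapcase())


# Ordering under the approximated non-C collation.
locale_order = sorted(names, key=approx_locale_key)

print("C locale:    ", c_order)
print("approximated:", locale_order)
```

Under the C ordering "S6" leads the list, while under the approximated collation it lands between "s6" and "s8", which is the same move the diff shows. The same key also puts "foreign_data_user" ahead of "PUBLIC", as in the user_mappings hunk.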

======================================================================

Re: foreign_data test fails with non-C locale

From

Andrew Dunstan

Date:

09 January 2009, 10:17:50


Heikki Linnakangas wrote:
> The foreign_data test case is failing when I run "make installcheck" 
> against a server that's been initialized with a locale other than C 
> (en_GB.UTF-8).
>
> The reason is the different ordering of upper and lower case 
> characters, per attached diff file. We can simply add an alternative 
> expected output file, but I'd prefer not to if we can modify the test 
> case instead. We could rename some of the objects so that they sort the 
> same in all locales, but that seems a bit awkward in this case.

Regression tests have always failed on non-C locales AFAIK. The 
buildfarm goes out of its way to avoid that.

cheers

andrew

Re: foreign_data test fails with non-C locale

From

Heikki Linnakangas

Date:

09 January 2009, 10:23:41

Andrew Dunstan wrote:
> Heikki Linnakangas wrote:
>> The foreign_data test case is failing when I run "make installcheck" 
>> against a server that's been initialized with a locale other than C 
>> (en_GB.UTF-8).
>>
>> The reason is the different ordering of upper and lower case 
>> characters, per attached diff file. We can simply add an alternative
>> expected output file, but I'd prefer not to if we can modify the test 
>> case instead. We could rename some of the objects so that they sort the 
>> same in all locales, but that seems a bit awkward in this case.
> 
> Regression tests have always failed on non-C locales AFAIK. The 
> buildfarm goes out of its way to avoid that.

No, that's the only test case that's failing.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: foreign_data test fails with non-C locale

From

Peter Eisentraut

Date:

09 January 2009, 10:25:40

Andrew Dunstan wrote:
> Regression tests have always failed on non-C locales AFAIK. The 
> buildfarm goes out of its way to avoid that.

The regression tests should work just fine in non-C locales.  If the 
buildfarm goes out of its way to avoid non-C locales, then it loses some 
significant code coverage, considering that there are several variant 
code paths for locales, and considering the number of users that use them.

Re: foreign_data test fails with non-C locale

From

Peter Eisentraut

Date:

09 January 2009, 10:52:03

Heikki Linnakangas wrote:
> The foreign_data test case is failing when I run "make installcheck" 
> against a server that's been initialized with a locale other than C 
> (en_GB.UTF-8).

I have removed one of the differences but can't reproduce the other 
right now (although it looks consequential).  I'll check that on a 
different machine.

Re: foreign_data test fails with non-C locale

From

Andrew Dunstan

Date:

09 January 2009, 11:21:53

Peter Eisentraut wrote:
> Andrew Dunstan wrote:
>> Regression tests have always failed on non-C locales AFAIK. The 
>> buildfarm goes out of its way to avoid that.
>
> The regression tests should work just fine in non-C locales.  If the 
> buildfarm goes out of its way to avoid non-C locales, then it loses 
> some significant code coverage, considering that there are several 
> variant code paths for locales, and considering the number of users 
> that use them.

It was discussed here at the time, IIRC, and we put in the check 
precisely because other locales broke the buildfarm. Originally 
buildfarm just inherited the locale from its environment.

If it is no longer true that other locales break the tests, then I'm 
happy to examine alternatives.

cheers

andrew

Re: foreign_data test fails with non-C locale

From

Tom Lane

Date:

09 January 2009, 12:25:41

Andrew Dunstan <andrew@dunslane.net> writes:
> Peter Eisentraut wrote:
>> The regression tests should work just fine in non-C locales.

> It was discussed here at the time, IIRC, and we put in the check 
> precisely because other locales broke the buildfarm. Originally 
> buildfarm just inherited the locale from its environment.

I don't think we are prepared to buy into a general policy that the
regression tests should pass in *any* locale; maintaining a large
number of variant expected-files isn't very practical.  However, the
de facto policy is that we try to keep them passing in locales that
are used by any of the regular developers.  I think it would be useful
to have buildfarm members testing in a few common locales.
        regards, tom lane

Re: foreign_data test fails with non-C locale

From

"Guillaume Smet"

Date:

09 January 2009, 12:45:15

On Fri, Jan 9, 2009 at 5:24 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> However, the
> de facto policy is that we try to keep them passing in locales that
> are used by any of the regular developers.  I think it would be useful
> to have buildfarm members testing in a few common locales.

If you define common locales, I can set up as many new animals as
needed to cover the locales needed for any branch we'd like to test.

Perhaps we should add a parameter to the buildfarm config file so that
the buildfarm script can check the locale is accepted and set it
directly. Considering that we won't have the locale information in the
animal description, it's a good way to have it in the report.

Just let me know.

-- 
Guillaume

Re: foreign_data test fails with non-C locale

From

Andrew Dunstan

Date:

09 January 2009, 13:16:30


Guillaume Smet wrote:
> On Fri, Jan 9, 2009 at 5:24 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>   
>> However, the
>> de facto policy is that we try to keep them passing in locales that
>> are used by any of the regular developers.  I think it would be useful
>> to have buildfarm members testing in a few common locales.
>>     
>
> If you define common locales, I can set up as many new animals as
> needed to cover the locales needed for any branch we'd like to test.
>
> Perhaps we should add a parameter to the buildfarm config file so that
> the buildfarm script can check the locale is accepted and set it
> directly. Considering that we won't have the locale information in the
> animal description, it's a good way to have it in the report.
>
>
>   

Sure, we can easily have buildfarm's initdb step set any locale (and 
encoding, for that matter) we like. That's a simple change.

cheers

andrew

Re: foreign_data test fails with non-C locale

From

Peter Eisentraut

Date:

11 January 2009, 05:42:09

On Friday 09 January 2009 16:51:44 Peter Eisentraut wrote:
> Heikki Linnakangas wrote:
> > The foreign_data test case is failing when I run "make installcheck"
> > against a server that's been initialized with a locale other than C
> > (en_GB.UTF-8).
>
> I have removed one of the differences but can't reproduce the other
> right now (although it looks consequential).  I'll check that on a
> different machine.

Also fixed now.

Re: foreign_data test fails with non-C locale

From

Peter Eisentraut

Date:

11 January 2009, 06:54:11

On Friday 09 January 2009 18:24:55 Tom Lane wrote:
> I don't think we are prepared to buy into a general policy that the
> regression tests should pass in *any* locale; maintaining a large
> number of variant expected-files isn't very practical.  However, the
> de facto policy is that we try to keep them passing in locales that
> are used by any of the regular developers.  I think it would be useful
> to have buildfarm members testing in a few common locales.

This called for an extensive test ... :-)

My glibc installation supplies 668 locales (locale -a), which appear to
represent about 225 distinct language/country combinations.  (The rest are
encoding variants.)

I ran the regression tests with all of them, and got 95 failures (out of 668).

15 out of the 95 failures are initdb not completing because the encoding
specified by the locale is not supported by PostgreSQL.  But it appears that
at least xx_XX.utf8 works for each of these cases, so the language is
supported in some way.

The remaining 80 failures are more-or-less linguistic issues that belong to
the following 26 language/country combinations:

az_AZ    sorts k < q < l; Turkish i
br_FR    sorts ch separately
crh_UA    Turkish i
cs_CZ    sorts ch separately; sorts st = s
cy_GB    sorts ch separately
da_DK    sorts aa = å > z
es_EC    sorts ch separately
es_US    sorts ch separately
et_EE    sorts v = w
fo_FO    sorts aa = å > z
ha_NG    sorts sh separately
hsb_DE    sorts ch separately
ig_NG    sorts ch separately; sorts sh separately
ik_CA    sorts ch separately
kl_GL    sorts aa = å > z
nb_NO    sorts aa = å > z
nn_NO    sorts aa = å > z
om_ET    sorts ch separately (> z); sorts sh separately
om_KE    sorts ch separately (> z); sorts sh separately
pl_PL    (some other inexplicable sorting regression)
sk_SK    sorts ch separately; sorts st = s
sv_SE    sorts v = w
tk_TM    sorts v = w
tr_CY    Turkish i
tr_TR    Turkish i
tt_RU    sorts k < q < l

The "Turkish i" failures are in the tsearch tests.  I'm not completely
comfortable that it's doing the right thing there.

We could easily get rid of the aa, ch, and v/w failures by adjusting the test
data, since the data is completely coincidental anyway.  I propose to do
that, and document these issues so that they can be avoided in future tests.

I'm not so worried about the other cases.

Also, considering that some of these alternative sorting rules appear to be
controversial even among users of the language (e.g., we have had actual bug
reports that the es_EC rule is wrong, and the sv_SE rule is also obsolete
according to the language regulators), it might be interesting to write a
small test program that can tell users how their current locale behaves in
known corner cases.
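
A minimal sketch of what such a test program might look like, in Python. The probe pairs are an illustrative selection from the list above, and `compare` and `report` are hypothetical helper names. Note that `locale.strcoll` may still distinguish strings that a locale treats as equal at the primary level, so "=" is not guaranteed even for the v/w case.

```python
import locale

# Pairs chosen from the corner cases listed above; the selection is illustrative.
PROBES = [("aa", "z"), ("ch", "d"), ("v", "w"), ("S6", "s4")]


def compare(a, b):
    """Return '<', '=' or '>' according to the current LC_COLLATE setting."""
    result = locale.strcoll(a, b)
    if result < 0:
        return "<"
    if result > 0:
        return ">"
    return "="


def report(probes=PROBES):
    """Describe how the active collation orders each probe pair."""
    return ["%s %s %s" % (a, compare(a, b), b) for a, b in probes]


if __name__ == "__main__":
    try:
        # Adopt the collation from the environment (LANG / LC_ALL / LC_COLLATE).
        locale.setlocale(locale.LC_COLLATE, "")
    except locale.Error:
        # Keep the default "C" collation if the environment's locale is unavailable.
        pass
    print("LC_COLLATE =", locale.setlocale(locale.LC_COLLATE))
    for line in report():
        print(line)
```

Run under different LC_ALL settings, the output shows at a glance whether the active locale sorts "aa" after "z" or uppercase before lowercase.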

Re: foreign_data test fails with non-C locale

From

Tom Lane

Date:

11 January 2009, 12:47:14

Peter Eisentraut <peter_e@gmx.net> writes:
> This called for an extensive test ... :-)

> My glibc installation supplies 668 locales (locale -a), which appear to 
> represent about 225 distinct language/country combinations.  (The rest are 
> encoding variants.)

> I ran the regression tests with all of them, and got 95 failures (out of 668).

Fascinating data.  I assume you did not remove the existing
locale-variant expected files?  IOW this isn't "all the locale
dependencies", but "all the ones we didn't fix previously"?

> We could easily get rid of the aa, ch, and v/w failures by adjusting the test
> data, since the data is completely coincidental anyway.  I propose to do 
> that, and document these issues so that they can be avoided in future tests.

I have no confidence in the ability of some documentation to keep the
tests clean.  However, if we had buildfarm members testing in locales
that exercise each of those cases, it'd be all right.

If we try to fix those cases I think we should try to fix Turkish i
as well ... but I concur that first requires determining if it's
behaving wrong or not. Devrim, or someone?

> Also, considering that some of these alternative sorting rules appear to be 
> controversial even among users of the language (e.g., we have had actual bug 
> reports that the es_EC rule is wrong, and the sv_SE rule is also obsolete 
> according to the language regulators), it might be interesting to write a 
> small test program that can tell users how their current locale behaves in 
> known corner cases.

Considering the number of people who complain about en_US (expecting C
sort order instead), I'm not sure you should consider this a corner
case.
        regards, tom lane

Re: foreign_data test fails with non-C locale

From

Devrim GÜNDÜZ

Date:

11 January 2009, 15:45:03

On Sun, 2009-01-11 at 12:54 +0200, Peter Eisentraut wrote:
> The "Turkish i" failures are in the tsearch tests.  I'm not completely
> comfortable that it's doing the right thing there.

AFAIK, ISO-8859-9 is broken in a way, and the Turkish maintainers are
not interested in fixing them -- they ask us to move to tr_TR.UTF-8.
--
Devrim GÜNDÜZ, RHCE
devrim~gunduz.org, devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr                  http://www.gunduz.org

Re: foreign_data test fails with non-C locale

From

Devrim GÜNDÜZ

Date:

11 January 2009, 16:01:59

On Sun, 2009-01-11 at 11:46 -0500, Tom Lane wrote:

> If we try to fix those cases I think we should try to fix Turkish i
> as well ... but I concur that first requires determining if it's
> behaving wrong or not. Devrim, or someone?


What exactly do you want to see?

Regards,
--
Devrim GÜNDÜZ, RHCE
devrim~gunduz.org, devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr                  http://www.gunduz.org

Re: foreign_data test fails with non-C locale

From

Peter Eisentraut

Date:

12 January 2009, 06:06:42

Devrim GÜNDÜZ wrote:
> On Sun, 2009-01-11 at 11:46 -0500, Tom Lane wrote:
> 
>> If we try to fix those cases I think we should try to fix Turkish i
>> as well ... but I concur that first requires determining if it's
>> behaving wrong or not. Devrim, or someone?
> 
> What exactly do you want to see?

Using a glibc system, initdb with --locale=tr_TR (or tr_TR.utf8 or 
whatever) and run make installcheck.  You should see test failures in 
the tsearch and tsdicts tests that appear to relate to issues with 
lowercasing the "I" letter correctly.  And then use your language skills 
to determine what the correct behavior is. ;-)

Note that on Mac OS X with tr_TR locales, the tests do not fail.

I actually suspect that both current answers are wrong.

Re: foreign_data test fails with non-C locale

From

Devrim GÜNDÜZ

Date:

12 January 2009, 07:26:32

Hi,

On Mon, 2009-01-12 at 12:06 +0200, Peter Eisentraut wrote:
> Using a glibc system, initdb with --locale=tr_TR (or tr_TR.utf8 or
> whatever) and run make installcheck.  You should see test failures in
> the tsearch and tsdicts tests that appear to relate to issues with
> lowercasing the "I" letter correctly.

Yep, I ran them already, and as you wrote, I'm getting 3 errors (tsearch
tests + foreign_data test).

>  And then use your language skills to determine what the correct
> behavior is. ;-)

SKIES would be skıes (dotless i).

Here is the conversion table:

I (capital) <-> ı
İ (capital) <-> i

We also have a few more chars, but I did not test them yet:

ş <-> Ş (capital) (S with a tail)
ü <-> Ü (capital) (U with dots)
ç <-> Ç (capital) (C with a tail)
ğ <-> Ğ (capital) (G with a hat)
ö <-> Ö (capital) (O with dots)

Regards,
--
Devrim GÜNDÜZ, RHCE
devrim~gunduz.org, devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr                  http://www.gunduz.org
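
The conversion table above can be written down as a small Python sketch. Python's `str.lower()` and `str.upper()` do not consult the locale, so the two Turkish pairs are mapped explicitly first; `turkish_lower` and `turkish_upper` are hypothetical helper names, not part of PostgreSQL or tsearch.

```python
# Map the dotted/dotless pairs before applying the locale-independent
# Unicode case conversion, which would otherwise pair "I" with "i".
_TR_LOWER = str.maketrans({"I": "ı", "İ": "i"})
_TR_UPPER = str.maketrans({"i": "İ", "ı": "I"})


def turkish_lower(text):
    """Lowercase text using the Turkish I/ı and İ/i pairing."""
    return text.translate(_TR_LOWER).lower()


def turkish_upper(text):
    """Uppercase text using the Turkish i/İ and ı/I pairing."""
    return text.translate(_TR_UPPER).upper()


print(turkish_lower("SKIES"))  # Turkish rules
print("SKIES".lower())         # locale-independent rules
```

The remaining letters in the table (ş, ü, ç, ğ, ö) need no special handling here because their default Unicode case mappings already match.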

tsearch with Turkish locale ( was Re: foreign_data test fails with non-C locale)

From

Peter Eisentraut

Date:

19 January 2009, 11:03:54

Devrim GÜNDÜZ wrote:
> Yep, I ran them already, and as you wrote, I'm getting 3 errors (tsearch
> tests + foreign_data test).
> 
>>  And then use your language skills to determine what the correct
>> behavior is. ;-)
> 
> SKIES would be skıes (dotless i). 
> 
> Here is the conversion table:
> 
> I (capital) <-> ı 
> İ (capital) <-> i

I think the test shows that there is a bug in the tsearch support for 
Turkish.  Here is the test diff:


--- expected/tsearch.out        2008-10-18 12:56:29.000000000 +0300
+++ results/tsearch.out 2009-01-19 16:26:51.000000000 +0200
@@ -962,38 +962,38 @@
 SELECT to_tsvector('SKIES My booKs');
         to_tsvector
 ----------------------------
- 'books':3 'my':2 'skies':1
+ 'books':3 'my':2 'skIes':1
 (1 row)
[and more of the same]

This is not correct under either Turkish or non-Turkish language rules.

Note that

postgres=# select lower('SKIES');
 lower
-------
 skıes
(1 row)

Re: tsearch with Turkish locale ( was Re: foreign_data test fails with non-C locale)

From

Teodor Sigaev

Date:

19 January 2009, 13:45:28

> I think the test shows that there is a bug in the tsearch support for 
> Turkish.  Here is the test diff:
How to reproduce that?

% psql -l
                               List of databases
    Name    | Owner  | Encoding |  Collation  |    Ctype    | Access privileges
------------+--------+----------+-------------+-------------+-------------------
 postgres   | pgsql  | UTF8     | tr_TR.UTF-8 | tr_TR.UTF-8 |
 regression | teodor | UTF8     | tr_TR.UTF-8 | tr_TR.UTF-8 |

% ./pg_regress --inputdir=. --dlpath=. --multibyte=UTF8 --load-language=plpgsql   --top-builddir=../../.. --schedule=./parallel_schedule
...
=======================
 All 120 tests passed.
=======================


-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
  WWW: http://www.sigaev.ru/

Re: tsearch with Turkish locale ( was Re: foreign_data test fails with non-C locale)

From

Devrim GÜNDÜZ

Date:

19 January 2009, 14:20:51

On Mon, 2009-01-19 at 20:45 +0300, Teodor Sigaev wrote:
> How to reproduce that?

-bash-3.2$ psql -l
                                        List of databases
    Name    |  Owner   | Encoding |  Collation  |    Ctype    |          Access Privileges
------------+----------+----------+-------------+-------------+-------------------------------------
 postgres   | postgres | UTF8     | tr_TR.UTF-8 | tr_TR.UTF-8 |
 regression | postgres | UTF8     | tr_TR.UTF-8 | tr_TR.UTF-8 |
 template0  | postgres | UTF8     | tr_TR.UTF-8 | tr_TR.UTF-8 | {=c/postgres,postgres=CTc/postgres}
 template1  | postgres | UTF8     | tr_TR.UTF-8 | tr_TR.UTF-8 | {=c/postgres,postgres=CTc/postgres}
(4 rows)

-bash-3.2$  ./pg_regress --inputdir=. --dlpath=. --multibyte=UTF8 --load-language=plpgsql --top-builddir=../../.. --schedule=./parallel_schedule
(using postmaster on Unix socket, default port)

<snip>
    timestamp            ... FAILED
    timestamptz          ... FAILED
<snip>
    tsearch              ... FAILED
    tsdicts              ... FAILED
    foreign_data         ... FAILED

========================
 5 of 120 tests failed.
========================

This is on a Fedora-9 x86 box, and:

-bash-3.2$ rpm -qv glibc
glibc-2.8-8.i686

Regards,
--
Devrim GÜNDÜZ, RHCE
devrim~gunduz.org, devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr                  http://www.gunduz.org

Re: tsearch with Turkish locale ( was Re: foreign_data test fails with non-C locale)

From

Teodor Sigaev

Date:

19 January 2009, 14:45:41

> ========================
>  5 of 120 tests failed. 
> ========================
> 
> This is on a Fedora-9 x86 box, and:
> 
> -bash-3.2$ rpm -qv glibc
> glibc-2.8-8.i686

Interesting. On my notebook all is ok.
% uname -a
FreeBSD ... 7.1-RELEASE-p2 FreeBSD 7.1-RELEASE-p2

Is there any possibility of a broken locale?
-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
  WWW: http://www.sigaev.ru/

Re: foreign_data test fails with non-C locale

From

Zdenek Kotala

Date:

19 January 2009, 16:13:23

Andrew Dunstan wrote on Fri, 09 Jan 2009 at 12:16 -0500:
> 
> Guillaume Smet wrote:
> > On Fri, Jan 9, 2009 at 5:24 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >   
> >> However, the
> >> de facto policy is that we try to keep them passing in locales that
> >> are used by any of the regular developers.  I think it would be useful
> >> to have buildfarm members testing in a few common locales.
> >>     
> >
> > If you define common locales, I can set up as many new animals as
> > needed to cover the locales needed for any branch we'd like to test.
> >
> > Perhaps we should add a parameter to the buildfarm config file so that
> > the buildfarm script can check the locale is accepted and set it
> > directly. Considering that we won't have the locale information in the
> > animal description, it's a good way to have it in the report.
> >
> >
> >   
> 
> Sure, we can easily have buildfarm's initdb step set any locale (and 
> encoding, for that matter) we like. That's a simple change.

Will it be possible to set several locales and run the tests on all of
them without recompiling? For example, I have installed all of the
Solaris locales on my animal, but currently that means I need to perform
the whole cycle for each locale.
    Zdenek

Re: foreign_data test fails with non-C locale

From

Zdenek Kotala

Date:

19 January 2009, 16:39:11

Peter Eisentraut wrote on Sun, 11 Jan 2009 at 12:54 +0200:

> The remaining 80 failures are more-or-less linguistic issues that belong to 
> the following 26 language/country combinations:
> 

> cs_CZ    sorts ch separately; sorts st = s

s < st
Zdenek

Re: foreign_data test fails with non-C locale

From

Peter Eisentraut

Date:

20 January 2009, 01:45:15

On Monday 19 January 2009 22:39:30 Zdenek Kotala wrote:
> Peter Eisentraut wrote on Sun, 11 Jan 2009 at 12:54 +0200:
> > The remaining 80 failures are more-or-less linguistic issues that belong
> > to the following 26 language/country combinations:
> >
> >
> > cs_CZ    sorts ch separately; sorts st = s
>
> s < st

I had initially misinterpreted the failures.  The real difference is that
Czech sorts numbers after letters, whereas most other locales do it the
other way around.

Re: tsearch with Turkish locale ( was Re: foreign_data test fails with non-C locale)

From

Peter Eisentraut

Date:

20 January 2009, 03:26:11

Teodor Sigaev wrote:
>> ========================
>>  5 of 120 tests failed.
>> ========================
>>
>> This is on a Fedora-9 x86 box, and:
>>
>> -bash-3.2$ rpm -qv glibc
>> glibc-2.8-8.i686
> 
> Interesting. On my notebook all is ok.
> % uname -a
> FreeBSD ... 7.1-RELEASE-p2 FreeBSD 7.1-RELEASE-p2
> 
> Is there any possibility of a broken locale?

Assuming that the locales on FreeBSD are the same or closely related to 
the ones on Mac OS X, I would rather say that the BSD locales are 
broken, because they don't actually support the Turkish case conversion 
rules:

regression=# show lc_ctype;
  lc_ctype
-------------
 tr_TR.utf-8
(1 row)

regression=# select lower('SKIES');
 lower
-------
 skies
(1 row)

regression=# select upper('skies');
 upper
-------
 SKIES
(1 row)


Thus, the problem that the glibc locales appear to expose is masked here.

Re: foreign_data test fails with non-C locale

From

Andrew Dunstan

Date:

24 January 2009, 00:57:15


Zdenek Kotala wrote:
> Andrew Dunstan wrote on Fri, 09 Jan 2009 at 12:16 -0500:
>   
>   
>> Sure, we can easily have buildfarm's initdb step set any locale (and 
>> encoding, for that matter) we like. That's a simple change.
>>     
>
> Will it be possible to set several locales and run the tests on all of
> them without recompiling? For example, I have installed all of the
> Solaris locales on my animal, but currently that means I need to perform
> the whole cycle for each locale.
>   

I'm working on this. Yes, you will be able to specify a list of locales 
to check. For each locale the following tests will be run: 
installcheck,  pl-installcheck, and contrib-installcheck.

However, our tests are still a bit short of working across locales.

PL-check gives the diff below on PLTCL tests under en_US locale. I guess 
the simplest answer is to add an alternative result file.

cheers

andrew

  select * from T_pkey1 order by key1 using @<, key2;
   key1 |         key2         |                   txt
  ------+----------------------+------------------------------------------
-     1 | KEY1-3               | should work
      1 | key1-1               | test key
      1 | key1-2               | test key
      1 | key1-3               | test key
      2 | key2-3               | test key
      2 | key2-9               | test key
  (6 rows)

--- 166,175 ---
  select * from T_pkey1 order by key1 using @<, key2;
   key1 |         key2         |                   txt
  ------+----------------------+------------------------------------------
      1 | key1-1               | test key
      1 | key1-2               | test key
      1 | key1-3               | test key
+     1 | KEY1-3               | should work
      2 | key2-3               | test key
      2 | key2-9               | test key
  (6 rows)

Re: foreign_data test fails with non-C locale

From

Zdenek Kotala

Date:

24 January 2009, 07:18:43

Andrew Dunstan wrote on Fri 23. 01. 2009 at 23:57 -0500:
> 
> Zdenek Kotala wrote:
> > Andrew Dunstan wrote on Fri 09. 01. 2009 at 12:16 -0500:
> >   
> >   
> >> Sure, we can easily have buildfarm's initdb step set any locale (and 
> >> encoding, for that matter) we like. That's a simple change.
> >>     
> >
> > Will it be possible to set several locales and run the tests on all of
> > them without recompilation? For example, I have installed all of the
> > Solaris locales on my animal, but currently that means I need to perform
> > the whole cycle for each locale.
> >   
> 
> I'm working on this. Yes, you will be able to specify a list of locales 
> to check. For each locale the following tests will be run: 
> installcheck,  pl-installcheck, and contrib-installcheck.

thanks

> However, our tests are still a bit short of working across locales.

Yes, they are. Peter cleaned up some of them, but there are still open
issues. And MacOS has a broken locale, which is a different problem.

> PL-check gives the diff below on PLTCL tests under en_US locale. I guess 
> the simplest answer is to add an alternative result file.

Yes, I thought about adding a locale suffix for alternative result files,
but it could be useless overhead.

But some tests can be modified. For example 
select * from T_pkey1 order by key1 using @<, key2;

can be rewritten as
 select * from T_pkey1 order by key1 using @<, key2::name;

    Zdenek
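
The effect of the proposed cast can be pictured with the key2 values from the PL/Tcl diff. The Python sketch below is an illustration only. It assumes that comparisons on the name type are byte-wise and ignore the database locale, so they behave like the C locale. The function rough_en_us_key is a hypothetical and deliberately crude stand-in for en_US collation (case-insensitive first, lowercase before uppercase on ties), not the real collation algorithm.

```python
# The four key2 values that have key1 = 1 in the diff above.
keys = ["key1-1", "key1-2", "key1-3", "KEY1-3"]

# Code-point order: what a C-locale comparison produces, and (by the
# assumption stated above) what ordering by key2::name would produce.
# Uppercase ASCII letters sort before lowercase ones.
c_order = sorted(keys)


def rough_en_us_key(s: str):
    # Hypothetical approximation of en_US collation: compare without
    # regard to case first, then put lowercase before uppercase on ties.
    return (s.lower(), s.swapcase())


en_us_like_order = sorted(keys, key=rough_en_us_key)

print(c_order)           # ['KEY1-3', 'key1-1', 'key1-2', 'key1-3']
print(en_us_like_order)  # ['key1-1', 'key1-2', 'key1-3', 'KEY1-3']
```

The first list matches the existing expected output and the second matches the result reported under en_US, which is why a locale-independent comparison would let a single expected file cover both cases.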

Re: foreign_data test fails with non-C locale

From

Andrew Dunstan

Date:

26 January 2009, 12:09:13


Zdenek Kotala wrote:
> Andrew Dunstan wrote on Fri 23. 01. 2009 at 23:57 -0500:
>> Zdenek Kotala wrote:
>>> Andrew Dunstan wrote on Fri 09. 01. 2009 at 12:16 -0500:
>>>> Sure, we can easily have buildfarm's initdb step set any locale (and 
>>>> encoding, for that matter) we like. That's a simple change.
>>>
>>> Will it be possible to set several locales and run the tests on all of
>>> them without recompilation? For example, I have installed all of the
>>> Solaris locales on my animal, but currently that means I need to perform
>>> the whole cycle for each locale.
>>
>> I'm working on this. Yes, you will be able to specify a list of locales 
>> to check. For each locale the following tests will be run: 
>> installcheck, pl-installcheck, and contrib-installcheck.
>
> thanks
>

Example run with locales C, en_US.utf8 and french: 
http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=dungbeetle&dt=2009-01-26%2012:44:01

cheers

andrew
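
The per-locale run described in this subthread (build once, then repeat only the install-check steps for each locale) can be sketched as a plan of commands. This is not the buildfarm client's code. The function name plan_locale_runs, the data directory root, and the mapping of the step names pl-installcheck and contrib-installcheck onto `make -C src/pl installcheck` and `make -C contrib installcheck` are assumptions made for illustration. The sketch only builds the command lists and does not execute anything.

```python
from typing import Dict, List


def plan_locale_runs(
    locales: List[str],
    datadir_root: str = "/tmp/locale-check",  # hypothetical location
) -> Dict[str, List[List[str]]]:
    """Return, per locale, the commands a run might issue (not executed)."""
    plan: Dict[str, List[List[str]]] = {}
    for loc in locales:
        datadir = f"{datadir_root}/{loc}"
        plan[loc] = [
            # A fresh cluster per locale; the installed binaries are reused,
            # so no recompilation is involved.
            ["initdb", f"--locale={loc}", "-D", datadir],
            ["pg_ctl", "-D", datadir, "-w", "start"],
            # Assumed make targets behind the three step names.
            ["make", "installcheck"],
            ["make", "-C", "src/pl", "installcheck"],
            ["make", "-C", "contrib", "installcheck"],
            ["pg_ctl", "-D", datadir, "-w", "stop"],
        ]
    return plan


for loc, commands in plan_locale_runs(["C", "en_US.utf8", "french"]).items():
    print(f"== locale {loc} ==")
    for argv in commands:
        print("  " + " ".join(argv))
```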

Re: foreign_data test fails with non-C locale

From

Andrew Dunstan

Date:

31 January 2009, 18:08:26


Zdenek Kotala wrote:
>> PL-check gives the diff below on PLTCL tests under en_US locale. I guess 
>> the simplest answer is to add an alternative result file.
>>     
>
> Yes, I thought about adding a locale suffix for alternative result files,
> but it could be useless overhead.
>
> But some tests can be modified. For example 
>
>  select * from T_pkey1 order by key1 using @<, key2;
>
> can be rewritten as
>
>   select * from T_pkey1 order by key1 using @<, key2::name;
>
>
>
>   

Is that the preferred solution? I want to fix this so I can re-enable 
building with TCL in dungbeetle.

cheers

andrew

Re: foreign_data test fails with non-C locale

From

Zdenek Kotala

Date:

02 February 2009, 03:51:24

Andrew Dunstan wrote on Sat 31. 01. 2009 at 17:08 -0500:
> 
> Zdenek Kotala wrote:
> >> PL-check gives the diff below on PLTCL tests under en_US locale. I guess 
> >> the simplest answer is to add an alternative result file.
> >>     
> >
> > Yes, I thought about adding a locale suffix for alternative result files,
> > but it could be useless overhead.
> >
> > But some tests can be modified. For example 
> >
> >  select * from T_pkey1 order by key1 using @<, key2;
> >
> > can be rewritten as
> >
> >   select * from T_pkey1 order by key1 using @<, key2::name;
> >
> >
> >
> >   
> 
> Is that the preferred solution? I want to fix this so I can re-enable 
> building with TCL in dungbeetle.

Probably not in all cases.

    Zdenek