Re: foreign_data test fails with non-C locale - Mailing list pgsql-hackers
From | Peter Eisentraut |
---|---|
Subject | Re: foreign_data test fails with non-C locale |
Date | |
Msg-id | 200901111254.03722.peter_e@gmx.net |
In response to | Re: foreign_data test fails with non-C locale (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: foreign_data test fails with non-C locale
Re: foreign_data test fails with non-C locale Re: foreign_data test fails with non-C locale |
List | pgsql-hackers |
On Friday 09 January 2009 18:24:55 Tom Lane wrote:
> I don't think we are prepared to buy into a general policy that the
> regression tests should pass in *any* locale; maintaining a large
> number of variant expected-files isn't very practical. However, the
> de facto policy is that we try to keep them passing in locales that
> are used by any of the regular developers. I think it would be useful
> to have buildfarm members testing in a few common locales.

This called for an extensive test ... :-)

My glibc installation supplies 668 locales (locale -a), which appear to represent about 225 distinct language/country combinations. (The rest are encoding variants.) I ran the regression tests with all of them, and got 95 failures (out of 668).

15 out of the 95 failures are initdb not completing because the encoding specified by the locale is not supported by PostgreSQL. But it appears that at least xx_XX.utf8 works for each of these cases, so the language is supported in some way.

The remaining 80 failures are more-or-less linguistic issues that belong to the following 26 language/country combinations:

az_AZ    sorts k < q < l; Turkish i
br_FR    sorts ch separately
crh_UA   Turkish i
cs_CZ    sorts ch separately; sorts st = s
cy_GB    sorts ch separately
da_DK    sorts aa = å > z
es_EC    sorts ch separately
es_US    sorts ch separately
et_EE    sorts v = w
fo_FO    sorts aa = å > z
ha_NG    sorts sh separately
hsb_DE   sorts ch separately
ig_NG    sorts ch separately; sorts sh separately
ik_CA    sorts ch separately
kl_GL    sorts aa = å > z
nb_NO    sorts aa = å > z
nn_NO    sorts aa = å > z
om_ET    sorts ch separately (> z); sorts sh separately
om_KE    sorts ch separately (> z); sorts sh separately
pl_PL    (some other inexplicable sorting regression)
sk_SK    sorts ch separately; sorts st = s
sv_SE    sorts v = w
tk_TM    sorts v = w
tr_CY    Turkish i
tr_TR    Turkish i
tt_RU    sorts k < q < l

The "Turkish i" failures are in the tsearch tests. I'm not completely comfortable that it's doing the right thing there.

We could easily get rid of the aa, ch, and v/w failures by adjusting the test data, since the data is completely coincidental anyway. I propose to do that, and document these issues so that they can be avoided in future tests. I'm not so worried about the other cases.

Also, considering that some of these alternative sorting rules appear to be controversial even among users of the language (e.g., we have had actual bug reports that the es_EC rule is wrong, and the sv_SE rule is also obsolete according to the language regulators), it might be interesting to write a small test program that can tell users how their current locale behaves in known corner cases.
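
A minimal sketch of what such a test program might look like, written in Python and assuming the platform's collation is reachable through locale.strcoll. The names PROBES and probe_collation and the specific string pairs are illustrative choices, not part of PostgreSQL or any existing tool. Each pair is picked so that plain codepoint order puts the first string before the second; a rule from the list above is reported as active when the locale reverses that order. The "Turkish i" cases concern case conversion rather than sort order, so they are left out here.

```python
import locale

# Hypothetical probe table: (label, a, b).
# In plain codepoint order a < b for every pair; the named rule is
# reported as active when the locale sorts a AFTER b instead.
PROBES = [
    ("ch sorts as a separate letter", "ch", "cz"),
    ("sh sorts as a separate letter", "sh", "sz"),
    ("aa sorts after z", "aa", "z"),
    ("w sorts like v", "vb", "wa"),
    ("q sorts before l", "l", "q"),
]


def probe_collation(compare=locale.strcoll):
    """Return {label: bool} telling which corner-case rules are active.

    compare is any strcoll-style function returning a negative, zero,
    or positive number; it defaults to the current LC_COLLATE setting.
    """
    return {label: compare(a, b) > 0 for label, a, b in PROBES}


def main():
    try:
        # Adopt the collation from the environment (LANG, LC_ALL, ...).
        locale.setlocale(locale.LC_COLLATE, "")
    except locale.Error:
        # Environment names a locale that is not installed; keep the default.
        pass
    print("LC_COLLATE =", locale.setlocale(locale.LC_COLLATE))
    for label, active in probe_collation().items():
        print("yes" if active else "no ", label)


if __name__ == "__main__":
    main()
```

Running it under different settings, for example with LC_ALL set to one of the locales listed above, would show which rules that installation applies. The pairs only detect the rules already identified in this message; anything like the pl_PL regression would need its own probe once the cause is understood.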