plpython_unicode test (was Re: buildfarm / handling (undefined) locales) - Mailing list pgsql-hackers

From	Tom Lane
Subject	plpython_unicode test (was Re: buildfarm / handling (undefined) locales)
Date	June 1, 2014 20:45:28
Msg-id	6789.1401655517@sss.pgh.pa.us
In response to	Re: buildfarm / handling (undefined) locales (Tomas Vondra <tv@fuzzy.cz>)
Responses	Re: plpython_unicode test (was Re: buildfarm / handling (undefined) locales)
List	pgsql-hackers

Tomas Vondra <tv@fuzzy.cz> writes:
> On 13.5.2014 20:58, Tom Lane wrote:
>> Tomas Vondra <tv@fuzzy.cz> writes:
>>> Yeah, not really what we were shooting for. I've fixed this by
>>> defining the missing locales, and indeed - magpie now fails in
>>> plpython tests.

>> I saw that earlier today (tho right now the buildfarm server seems
>> to not be responding :-().  Probably we should use some
>> more-widely-used character code in that specific test?

> Any idea what other character could be used in those tests? ISTM fixing
> this universally would mean using ASCII characters - the subset of UTF-8
> common to all the encodings. But I'm afraid that'd contradict the very
> purpose of those tests ...

We really ought to resolve this issue so that we can get rid of some of
the red in the buildfarm.  ISTM there are three possible approaches:

1. Decide that we're not going to support running the plpython regression
tests under "weird" server encodings, in which case Tomas should just
remove cs_CZ.WIN-1250 from the set of encodings his buildfarm animals
test.  Don't much care for this, but it has the attraction of being
minimal work.

2. Change the plpython_unicode test to use some ASCII character in place
of \u0080.  We could keep on using the \u syntax to create the character,
but as stated above, this still seems like it's losing a significant
amount of test coverage.

3. Try to select some "more portable" non-ASCII character, perhaps U+00A0
(non-breaking space) or U+00E1 (a-acute).  I think this would probably
work for most encodings but it might still fail in the Far East.  Another
objection is that the expected/plpython_unicode.out file would contain
that character in UTF8 form.  In principle that would work, since the test
sets client_encoding = utf8 explicitly, but I'm worried about accidental
corruption of the expected file by text editors, file transfers, etc.
(The current usage of U+0080 doesn't suffer from this risk because psql
special-cases printing of multibyte UTF8 control characters, so that we
get exactly "\u0080".)
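
To make the trade-off between the three options concrete, here is a rough Python sketch that checks which of the candidate characters can be represented in a handful of encodings. It is only an approximation: the codec names (ascii, latin-1, cp1250, euc_jp, utf-8) are Python's, picked here for illustration, and Python's codec tables are assumed to be a stand-in for PostgreSQL's own conversion tables, which may differ in detail. The helper names encodable() and survey() are hypothetical.

```python
# Candidate characters named in the message above.
CANDIDATES = {
    "U+0080": "\u0080",  # C1 control character used by the current test
    "U+00A0": "\u00a0",  # no-break space
    "U+00E1": "\u00e1",  # a-acute
}

# Python codec names chosen for illustration (assumption: they only
# approximate the corresponding PostgreSQL server encodings).
ENCODINGS = ["ascii", "latin-1", "cp1250", "euc_jp", "utf-8"]


def encodable(ch: str, encoding: str) -> bool:
    """Return True if Python's codec can represent ch in the given encoding."""
    try:
        ch.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


def survey(candidates=CANDIDATES, encodings=ENCODINGS):
    """Map each candidate label to the list of encodings that accept it."""
    return {
        label: [enc for enc in encodings if encodable(ch, enc)]
        for label, ch in candidates.items()
    }


if __name__ == "__main__":
    for label, accepted in survey().items():
        print(f"{label}: {', '.join(accepted) or 'none'}")
```

With Python's cp1250 codec, U+0080 is rejected while U+00A0 and U+00E1 are accepted, which matches the failure seen with cs_CZ.WIN-1250. The Far East case is left to whatever the script reports for euc_jp rather than asserted here.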

Thoughts?
        regards, tom lane
