I wrote:
> 3. Try to select some "more portable" non-ASCII character, perhaps U+00A0
> (non breaking space) or U+00E1 (a-acute). I think this would probably
> work for most encodings but it might still fail in the Far East. Another
> objection is that the expected/plpython_unicode.out file would contain
> that character in UTF8 form. In principle that would work, since the test
> sets client_encoding = utf8 explicitly, but I'm worried about accidental
> corruption of the expected file by text editors, file transfers, etc.
> (The current usage of U+0080 doesn't suffer from this risk because psql
> special-cases printing of multibyte UTF8 control characters, so that we
> get exactly "\u0080".)

I did a bit of experimentation and determined that none of the LATIN1
letters are significantly more portable than what we've got: for
instance, a-acute (U+00E1) fails to convert into 16 of the 33 supported
server-side encodings (versus 17 failures for U+0080). However,
non-breaking space (U+00A0) is significantly better: it converts into
all of our supported server encodings except EUC_CN, EUC_JP, EUC_KR,
and EUC_TW. It seems unlikely that we can do better than that without
falling back to a plain ASCII character.
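
If you want to check a particular character yourself, convert() gives a
quick way to do it; queries roughly like these (just a sketch, feeding
in the character's UTF8 byte sequence) either succeed or fail with a
"has no equivalent" error:

    -- U+00A0 (non-breaking space) is C2 A0 in UTF8
    SELECT convert('\xc2a0'::bytea, 'UTF8', 'LATIN1');  -- succeeds
    SELECT convert('\xc2a0'::bytea, 'UTF8', 'EUC_JP');  -- fails, no equivalent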
In principle we could make the test "pass" even in these encodings
by adding variant expected files, but I doubt it's worth it. I'd
be inclined to just add a comment to the regression test file indicating
that that's a known failure case, and move on.
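
If we do switch to U+00A0, the comment might read roughly like this
(wording is just a sketch):

    -- Note: this test uses a non-ASCII character (U+00A0), so it is
    -- expected to fail with a conversion error in server encodings that
    -- cannot represent it (currently EUC_CN, EUC_JP, EUC_KR, and EUC_TW).
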
regards, tom lane