Re: TM format can mix encodings in to_char() - Mailing list pgsql-hackers

From Tom Lane
Subject Re: TM format can mix encodings in to_char()
Date
Msg-id 22661.1555960563@sss.pgh.pa.us
In response to Re: TM format can mix encodings in to_char()  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: TM format can mix encodings in to_char()
List pgsql-hackers
Peter Geoghegan <pg@bowt.ie> writes:
> On Sun, Apr 21, 2019 at 6:26 AM Andrew Dunstan
> <andrew.dunstan@2ndquadrant.com> wrote:
>> How does one do that? Just set a Turkish locale?

> tr_TR is, in a sense, special among locales:
> http://blog.thetaphi.de/2012/07/default-locales-default-charsets-and.html
> The Turkish dotless i has apparently been implicated in all kinds of
> bugs in quite a variety of contexts.

Yeah, we've had our share of those :-(.  But the dotless i is not the
problem here --- it happens to not trigger an encoding conversion
issue, it seems.  Amusingly, the existing test case for lc_time = tr_TR
in collate.linux.utf8.sql is specifically coded to check what happens
with dotted/dotless i, and yet it manages to not trip over this problem.
(I suspect the reason is that what comes out of strftime is "Nis" which
is ASCII, and the non-ASCII characters only arise from subsequent case
conversion within PG.)
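
A minimal sketch of that suspicion, not PostgreSQL code: it assumes a
locale encoding of ISO-8859-9 and a database encoding of UTF-8, and the
helper names (survives_encoding_mismatch, turkish_upper) are hypothetical.
It shows that an ASCII string such as "Nis" passes through mismatched
encodings unchanged, that a Turkish month name with a non-ASCII letter
(here "Şubat", chosen for illustration) does not, and that Turkish-style
upper-casing of "Nis" only introduces a non-ASCII character afterwards.

```python
def survives_encoding_mismatch(name: str,
                               locale_encoding: str = "iso8859_9",
                               db_encoding: str = "utf-8") -> bool:
    """Return True if bytes produced in locale_encoding read back
    correctly when naively treated as db_encoding (assumed encodings)."""
    raw = name.encode(locale_encoding)
    try:
        return raw.decode(db_encoding) == name
    except UnicodeDecodeError:
        return False


def turkish_upper(text: str) -> str:
    """Hypothetical stand-in for locale-aware upper-casing under tr_TR:
    dotted i maps to dotted capital I, dotless i maps to plain I."""
    return text.replace("i", "İ").replace("ı", "I").upper()


# "Nis" is pure ASCII, so the encoding mismatch goes unnoticed.
print(survives_encoding_mismatch("Nis"))
# A month name with a non-ASCII letter exposes the mismatch.
print(survives_encoding_mismatch("Şubat"))
# Case conversion happens later and is what adds the non-ASCII character.
print(turkish_upper("Nis").isascii())
```

Under these assumptions, a test that only exercises "Nis" would never see
bytes that need converting, which is consistent with the existing tr_TR
test passing.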

            regards, tom lane


