Re: Encoding/collation question - Mailing list pgsql-general

From Karsten Hilbert
Subject Re: Encoding/collation question
Date
Msg-id 20191212093713.GA3164@hermes.hilbert.loc
In response to Re: Encoding/collation question  (Andrew Gierth <andrew@tao11.riddles.org.uk>)
Responses Re: Encoding/collation question
List pgsql-general
On Thu, Dec 12, 2019 at 05:03:59AM +0000, Andrew Gierth wrote:

>  Rich> I doubt that my use will notice meaningful differences. Since
>  Rich> there are only two or three databases in UTF8 and its collation
>  Rich> perhaps I'll convert those to LATIN1 and C.
>
> Note that it's perfectly fine to use UTF8 encoding and C collation (this
> has the effect of sorting strings in Unicode codepoint order); this is
> as fast for comparisons as LATIN1/C is.
>
> For those cases where you need data to be sorted in a
> culturally-meaningful order rather than in codepoint order, you can set
> collations on specific columns or in individual queries.

Nice, thanks for pointing that out. One addition: while this
may look like a silver bullet, note that one needs an additional
index per collation for culturally-meaningful ORDER BY sorts
to be fast. With a non-C default collation, by contrast,
ordinary indexes are already built in that one locale's
culturally-meaningful order.
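
To make the index point concrete, here is a minimal sketch. The table
and index names are hypothetical, and "de_DE" is only an assumed
collation name: which collations exist depends on the platform and on
how the cluster was built (pg_collation lists them).

```sql
-- Hypothetical table; the column uses the C collation, so comparisons
-- and the default sort order follow codepoint order.
CREATE TABLE patients (
    id       integer PRIMARY KEY,
    lastname text COLLATE "C"
);

-- Plain index: built in the column's own collation ("C"),
-- so it can serve a plain ORDER BY lastname.
CREATE INDEX patients_lastname_c_idx
    ON patients (lastname);

-- Additional index for one culturally-meaningful order.
-- "de_DE" is an assumption; substitute a collation that exists locally.
CREATE INDEX patients_lastname_de_idx
    ON patients (lastname COLLATE "de_DE");

-- A per-query collation; the planner can use the second index here
-- instead of sorting explicitly.
SELECT lastname
  FROM patients
 ORDER BY lastname COLLATE "de_DE";
```

Each further locale one wants fast sorts for needs its own such
index, which is the extra cost compared to a non-C default collation.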

Question: is the C collation expected to be future-proof /
rock-solid / stable -- like UTF8 as an encoding choice -- or
could it end up the way the SQL_ASCII encoding did: yes, it
is supported, it has been in use for a long time, and it
should work, but one doesn't really want to choose it over
UTF8 if at all possible, or one should at least know
*exactly* what one is doing (and, BTW, YMMV)?

Karsten
--
GPG  40BE 5B0E C98E 1713 AFA6  5BC0 3BEA AC80 7D4F C89B


