On Tue, 2026-03-10 at 11:12 -0400, Robert Haas wrote:
> I don't know if this is exactly the right proposal, but I think it's
> probably appropriate to start gently pushing people towards UTF-8
> rather than anything else. Unicode has largely won, AFAICT, and the
> use cases for anything else are increasingly narrow. I don't think we
> should try to be coercive, but there's a reasonable presumption that
> people who haven't said what they want probably want UTF8.
If their environment's LC_CTYPE is UTF8-based, they already get UTF-8.
If it isn't, we can either:
(a) Fall back to LC_CTYPE=C, which is the only UTF8-compatible locale
available everywhere. C is not a terrible fallback: it doesn't affect
many things, because I have moved almost everything to use the
database default locale.
(b) Warn or error unless they explicitly specify the encoding with -E.
But a warning is likely to be ignored and an error is not what I'd
call "gentle".
Which of these do you think is the right approach?
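To make the choice concrete, here is a rough sketch of the decision in Python. It is only an illustration of options (a) and (b) as described above, not initdb's actual logic; the function name, the policy names, and the return shape are all made up for this example.

```python
def choose_defaults(env_lc_ctype, explicit_encoding=None, policy="fallback"):
    """Hypothetical sketch of the options above; NOT initdb's real code.

    Returns (encoding, lc_ctype, warning). An encoding of None means
    "derive it from the locale, as happens today".
    """
    # "en_US.UTF-8@modifier" -> codeset "UTF-8"
    codeset = env_lc_ctype.partition(".")[2].partition("@")[0]
    is_utf8 = codeset.replace("-", "").upper() == "UTF8"

    if explicit_encoding is not None:
        # The user said what they want (-E); nothing to decide.
        return explicit_encoding, env_lc_ctype, None
    if is_utf8:
        # Environment is already UTF8-based: they already get UTF-8.
        return "UTF8", env_lc_ctype, None
    if policy == "fallback":
        # Option (a): UTF8 with LC_CTYPE=C.
        return "UTF8", "C", None
    if policy == "warn":
        # Option (b), warning flavor: keep today's behavior but complain.
        return None, env_lc_ctype, "encoding not specified; consider -E UTF8"
    # Option (b), error flavor: refuse to guess.
    raise ValueError("encoding must be specified explicitly with -E")
```

For example, `choose_defaults("de_DE.ISO-8859-1")` yields `("UTF8", "C", None)` under the fallback policy, while the same call with `policy="error"` raises.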
There's a narrower question about what we do with LC_CTYPE=C. Currently
we use SQL_ASCII encoding, which doesn't seem like a great default, and
we could change that to default to UTF8. And another question about
whether we change the meaning of --no-locale.
>
> I'm much less convinced about this idea. I think the number of people
> who will be unhappy about the less-user-friendly sort order changes is
> probably quite high. It's reasonable to want something more stable and
> better version-controlled than libc, but switching to a simple
> code-point sort seems like a high price to pay for that.
Surely inconsistent indexes and poor performance are also a high price,
so how do you weigh the prices against each other?
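For anyone following along who hasn't looked at the ordering difference itself, a small Python illustration follows. It uses `str.casefold` as a crude stand-in for a linguistic collation; that is an assumption for demonstration only and does not reproduce what libc or ICU actually do.

```python
words = ["apple", "Banana", "cherry", "Zebra"]

# Code-point order: uppercase ASCII letters have lower code points than
# lowercase ones, so capitalized words sort first.
code_point_order = sorted(words)

# Rough stand-in for a "user-friendly" order: compare case-insensitively.
# Real linguistic collations are far more involved than this.
friendly_order = sorted(words, key=str.casefold)

print(code_point_order)  # ['Banana', 'Zebra', 'apple', 'cherry']
print(friendly_order)    # ['apple', 'Banana', 'cherry', 'Zebra']
```

The code-point comparison needs no locale data at all, which is where both the stability and the speed come from.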
We sweat over single-digit performance regressions in fairly specific
cases all the time, but here we're 3X slower for index builds:
https://www.depesz.com/2024/06/11/how-much-speed-youre-leaving-at-the-table-if-you-use-default-locale/
and 2-5X slower for Sort:
https://www.postgresql.org/message-id/64039a2dbcba6f42ed2f32bb5f0371870a70afda.camel@j-davis.com
and others don't seem very concerned, so I feel like I'm missing
something.
Regards,
Jeff Davis