On Fri, 2025-10-17 at 15:02 -0700, Jeff Davis wrote:
> On Fri, 2025-10-17 at 17:23 +0200, Peter Eisentraut wrote:
> > I remain violently opposed to this idea. I don't understand how it
> > could be acceptable to just not provide a good display order by
> > default and have everyone rewrite their queries.
>
> I assume that you favor alternative 3 listed here[1], which is to use
> ICU "und" as the default. Is that correct? Or do you prefer to get
> the locale from the environment at initdb time?

Right now we're still stuck with the worst possible default: libc. Can
you make a more concrete counter-proposal here that sorts through some
of the details?

* Should we base the ICU locale on the environment, or just default
  everyone to the "und" locale?
* If ICU support is disabled, how does that affect the defaults?
* If using the environment, what happens if the locale is not supported
  by ICU (in particular "C" or "C.UTF-8")?
* What would be the default encoding, or should that come from the
  environment?
* The ICU provider has some weaknesses around non-UTF8 encodings
  because of casts from wchar_t and the use of tolower() in
  downcase_identifier(). Are those potential blockers, and if so, are
  they fixable?
* Can we try harder to find an acceptable way to use memcmp() for the
  indexes by default, at least primary keys, even if the database
  collation is ICU? I know that I've argued for this in the past and it's
  been soundly rejected[1], but some variation on this idea could be
  worthy of consideration.
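
To illustrate the tension in that last point: a memcmp()-based comparison
orders strings purely by byte value, which is cheap and stable for index
lookups but is not a good display order (for example, "Zebra" sorts before
"apple" in ASCII). A minimal sketch, where bytewise_cmp() is a hypothetical
helper written for this example and not existing PostgreSQL code:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only: compare two byte strings
 * the way a memcmp()-based ordering would. Returns <0, 0, or >0.
 */
int
bytewise_cmp(const char *a, size_t alen, const char *b, size_t blen)
{
    size_t      minlen = (alen < blen) ? alen : blen;
    int         cmp = memcmp(a, b, minlen);

    if (cmp != 0)
        return cmp;

    /* The common prefix is equal, so the shorter string sorts first. */
    return (alen > blen) - (alen < blen);
}
```

With this ordering, bytewise_cmp("Zebra", 5, "apple", 5) is negative
because 'Z' (0x5A) is less than 'a' (0x61), whereas a linguistic collation
would normally place "apple" first.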
Regards,
Jeff Davis
[1] https://www.postgresql.org/message-id/b7a9f32eee8d24518f791168bc6fb653d1f95f4d.camel@j-davis.com