Re: Change initdb default to the builtin collation provider - Mailing list pgsql-hackers

From Jeff Davis
Subject Re: Change initdb default to the builtin collation provider
Date
Msg-id c216300bde738e98c231d5273b4be410912ccdb6.camel@j-davis.com
In response to Re: Change initdb default to the builtin collation provider  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Thu, 2026-03-12 at 15:58 -0400, Robert Haas wrote:
>
>
> Back when I wrote web applications, before starting at EDB, this is
> the kind of thing that I did all the time, for like ten years
> straight. I had plenty of text fields that could have used collate
> "C", because they contained things like part numbers or account
> numbers or whatever. But anything that contained a person's name or a
> company name or any other kind of name that is assigned by humans
> rather than generated by a computer could contain any of the
> characters that humans use, and should be sorted the way humans like.
> And isn't this a totally normal kind of application for somebody to
> write? It sure was for me.

Yes, I agree that's a perfectly normal application. I'm just not sure how
useful it is that the index order matches the expected display order by
default. While it's plausible that it could benefit from a few indexes
with a natural language collation, there are many practical reasons why
it might not.

And if it's a mix of fields, some of which are ASCII and some natural
language, then that's not a particularly strong argument that the
indexes should default to natural language. That default leaves you unable to
use the indexes for prefix search on any field, which is also a pretty
normal thing to want to do in that kind of application.
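
To make the prefix-search point concrete, here is a minimal Python sketch. It is not PostgreSQL's implementation: the sorted lists stand in for index order, and toy_linguistic_key is a hypothetical stand-in for a natural language collation that ignores hyphens and case at the first comparison level. It only illustrates why a range scan for a prefix depends on code point ordering.

```python
import bisect

def toy_linguistic_key(s):
    # Hypothetical stand-in for a natural language collation (an assumption
    # for illustration only): ignore hyphens and case first, then fall back
    # to the raw string as a tie-breaker. Real collations are far richer.
    return (s.lower().replace("-", ""), s)

def prefix_scan(sorted_keys, prefix):
    # Range scan that is valid only when sorted_keys is in code point order:
    # every string starting with a non-empty prefix lies in [prefix, upper).
    # (Ignores the edge case where the last character is the maximum code point.)
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    lo = bisect.bisect_left(sorted_keys, prefix)
    hi = bisect.bisect_left(sorted_keys, upper)
    return sorted_keys[lo:hi]

names = ["a-1", "a15", "a-2", "ab", "b-1"]

# Code point order: the matches for "a-" are adjacent, so two binary
# searches bound the whole result.
c_order = sorted(names)
c_matches = prefix_scan(c_order, "a-")

# Toy linguistic order: "a15" sorts between "a-1" and "a-2", so the
# matches for "a-" no longer form one contiguous range.
ling_order = sorted(names, key=toy_linguistic_key)
ling_positions = [i for i, s in enumerate(ling_order) if s.startswith("a-")]

print(c_order, c_matches)
print(ling_order, ling_positions)
```

In the code point ordering the two matching entries sit next to each other, while in the toy linguistic ordering an unrelated entry lands between them, so a single range scan cannot answer the prefix query.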

I guess what I'm saying is that I agree that users want an appealing
final result order. But even assuming that's a requirement, pushing
that down into all text indexes by default is a bad trade-off: the cost
side is too high, and to see a net performance benefit, there are too
many "ifs".

Regards,
    Jeff Davis



