On Fri, 2024-11-01 at 14:08 +0100, Andreas Karlsson wrote:
> > Agreed -- a lot of work has gone into optimizing the regex code, and we
> > don't want a perf regression there. But I'm also not sure exactly which
> > kinds of tests I should be running for that.
>
> I think we should at least try to find the worst case to see how big the
> performance hit for that is. And then after that try to figure out a
> more typical case benchmark.

What I had in mind was:

  * a large table with a single ~100KiB text field
  * a scan with a case-insensitive regex that uses some character
    classes

Does that sound like a worst case?
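
One way such a test could be set up is sketched below. This is only an illustration of the two bullet points above: the table name (regex_bench), column name, row count, and the pattern itself are assumptions chosen for the example, not something agreed on in this thread. The script just prints SQL that can be fed to psql.

```python
# Sketch only: emits SQL for the benchmark outlined above.
# The table name, column name, row count, and regex are hypothetical.

def benchmark_sql(rows: int = 1000, field_bytes: int = 100 * 1024) -> str:
    # md5() returns 32 hex characters, so repeating it this many times
    # gives a text value of roughly field_bytes per row.
    repeats = field_bytes // 32
    return "\n".join([
        "CREATE TABLE regex_bench (t text);",
        "INSERT INTO regex_bench",
        f"  SELECT repeat(md5(g::text), {repeats})",
        f"  FROM generate_series(1, {rows}) AS g;",
        # psql meta-command: report the elapsed time of each statement.
        "\\timing on",
        # ~* is the case-insensitive regex match operator; the pattern
        # uses POSIX character classes.
        "SELECT count(*) FROM regex_bench",
        "  WHERE t ~* '[[:alpha:]]+[[:digit:]]+[[:space:]]';",
    ])


if __name__ == "__main__":
    print(benchmark_sql())
```

Because md5 output contains no whitespace, this particular pattern never matches, so every row is searched to the end. Whether that is actually the worst case for the regex code is exactly the open question.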

> The painful part was mostly just a reference to that without a catalog
> table where new providers can be added we would need to add collations
> for our new custom provider on some already existing provider and then
> do for example some pattern matching on the name of the new collation.
> Really ugly but works.

To add a catalog table for the locale providers, the main challenge is
around the database default collation and, relatedly, initdb. Do you
have some ideas around that?

Regards,
	Jeff Davis