On Sun, 2025-03-02 at 23:33 +0300, Alexander Borisov wrote:
> Did you have a time for review this?
>
> I'd like to continue improving Unicode in Postgres, as I previously
> wrote, next in my plans are Normalization forms, and more.
> But now I am blocked by this patch.
Hi,
I have refactored unicode_case.c a bit (v3j-0001) and rebased your v3
work on top of that (v3j-0002).
The refactoring is so that the optimizations do not need to modify
convert_case, which is already complex and I'd like to avoid adding
more to that function. Instead, I created a casemap() function, which
maps a single character, and convert_case() calls that.
I didn't test the refactoring for performance, but it looks as
optimizable as what was there before.
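To illustrate the shape of that split, here is a minimal sketch, not the code from the patch. Every name in it (sketch_wchar, SketchCaseKind, sketch_casemap, sketch_convert_case) is hypothetical, it handles only ASCII, and it works on an array of code points rather than on UTF-8 strings as the real convert_case() does. The point is only that the per-character mapping sits in one small function that can be optimized without touching the loop around it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a code point type; not the PostgreSQL typedef. */
typedef uint32_t sketch_wchar;

typedef enum
{
	SKETCH_LOWER,
	SKETCH_UPPER
} SketchCaseKind;

/*
 * Map a single code point. Only ASCII letters are handled in this sketch;
 * a real implementation would consult generated Unicode tables here.
 */
static sketch_wchar
sketch_casemap(sketch_wchar c, SketchCaseKind kind)
{
	if (kind == SKETCH_UPPER && c >= 'a' && c <= 'z')
		return c - ('a' - 'A');
	if (kind == SKETCH_LOWER && c >= 'A' && c <= 'Z')
		return c + ('a' - 'A');
	return c;
}

/*
 * Convert a buffer of code points by calling sketch_casemap() on each one.
 * Writes at most dstsize results and returns the number of code points the
 * full result needs, so a caller can detect truncation.
 */
static size_t
sketch_convert_case(sketch_wchar *dst, size_t dstsize,
					const sketch_wchar *src, size_t srclen,
					SketchCaseKind kind)
{
	size_t		i;

	for (i = 0; i < srclen; i++)
	{
		if (i < dstsize)
			dst[i] = sketch_casemap(src[i], kind);
	}
	return srclen;
}
```

With this layout, a faster lookup only changes the body of the single-character function; the conversion loop stays as it is.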
A couple questions:
* Is there a reason the fast-path for codepoints < 0x80 is in
unicode_case.c rather than unicode_case_func.h?
* Is there a reason you defined case_index() as static rather than
static inline?
* Is there a reason to have a new file unicode_case_func.h rather than
just add it to unicode_case_table.h?
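For concreteness, the first two questions amount to something like the following sketch of a header-style lookup. This is an assumption about the general shape, not the patch's case_index(): the name sketch_case_index, the index numbering (0 meaning "no mapping entry"), and the four-entry non-ASCII table are all made up for illustration.

```c
#include <stdint.h>

/* Hypothetical sorted table of non-ASCII code points that have an entry. */
static const uint32_t sketch_non_ascii[] = {0x00C0, 0x00E0, 0x0391, 0x03B1};
#define SKETCH_NON_ASCII_LEN 4

/*
 * Return a table index for a code point, or 0 if it has no entry.
 * Declared static inline so that it can live in a header and callers get
 * the ASCII fast path without a function call.
 */
static inline uint16_t
sketch_case_index(uint32_t cp)
{
	int			lo = 0;
	int			hi = SKETCH_NON_ASCII_LEN - 1;

	/* Fast path: code points below 0x80 need no table search. */
	if (cp < 0x80)
	{
		if (cp >= 'A' && cp <= 'Z')
			return (uint16_t) (1 + (cp - 'A'));
		if (cp >= 'a' && cp <= 'z')
			return (uint16_t) (27 + (cp - 'a'));
		return 0;
	}

	/* Slow path: binary search over the sorted non-ASCII table. */
	while (lo <= hi)
	{
		int			mid = lo + (hi - lo) / 2;

		if (sketch_non_ascii[mid] == cp)
			return (uint16_t) (53 + mid);
		if (sketch_non_ascii[mid] < cp)
			lo = mid + 1;
		else
			hi = mid - 1;
	}
	return 0;
}
```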
I'm looking at a few more details, but this is a low-risk change
because there are exhaustive tests, so I intend to commit something
like this soon.
Regards,
Jeff Davis