Re: Optimization for lower(), upper(), casefold() functions. - Mailing list pgsql-hackers

From Alexander Borisov
Subject Re: Optimization for lower(), upper(), casefold() functions.
Date
Msg-id e9057c71-93ec-4a21-b02e-979c7da2b665@gmail.com
Whole thread Raw
In response to Re: Optimization for lower(), upper(), casefold() functions.  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: Optimization for lower(), upper(), casefold() functions.
List pgsql-hackers
19.02.2025 01:02, Jeff Davis пишет:
> On Tue, 2025-02-11 at 23:08 +0300, Alexander Borisov wrote:
>> I tried the approach via a range table. The result was worse than
>> without the table. With branching in a function, the result is
>> better.
>>
>> Patch v3 — ranges binary search by branches.
>> Patch v4 — ranges binary search by table.
> 
> Thoughts on v3:
> 
> It looks like the top 5 bits of the offset are unused. What if we used
> those bits for flags to indicate:
> 
>     HAS_LOWER
>     HAS_UPPER
>     HAS_FOLD
>     HAS_SPECIAL
>     HAS_TITLE
> 
> That way, we only need to look in the corresponding table if it
> actually has an entry other than the codepoint itself.
> 
> It doesn't leave a lot of room if the tables get larger, but if we are
> worried about that, we could eliminate HAS_TITLE, because I don't think
> the performance for INITCAP() is as important as LOWER/UPPER/CASEFOLD.
> 
> Regards,
>     Jeff Davis
> 

It seems to be micro-optimizations. In the sense that the code is very
clear and understandable now.
These micro-optimizations make the code more complex, and there will be
no performance gain.

In proposing the patch for v3, I struck a balance between improving
performance and reducing binary size, without sacrificing code clarity.

We can go into micro-optimizations now and lose the essence of
the proposed improvements. And ahead of us we have
Unicode Normalization Forms.

I'm willing to try different approaches, but there's confidence that
we've found a great optimization right now.


--
Alexander Borisov




pgsql-hackers by date:

Previous
From: Jelte Fennema-Nio
Date:
Subject: Re: Commitfest app release on Feb 17 with many improvements
Next
From: Jeff Davis
Date:
Subject: Re: Optimization for lower(), upper(), casefold() functions.