Thread: Re: Optimization for lower(), upper(), casefold() functions.

Re: Optimization for lower(), upper(), casefold() functions.

From
Jeff Davis
Date:
On Tue, 2025-02-11 at 23:08 +0300, Alexander Borisov wrote:
> I tried the approach via a range table. The result was worse than
> without the table. With branching in a function, the result is
> better.
>
> Patch v3 — ranges binary search by branches.
> Patch v4 — ranges binary search by table.

Thoughts on v3:

It looks like the top 5 bits of the offset are unused. What if we used
those bits for flags to indicate:

   HAS_LOWER
   HAS_UPPER
   HAS_FOLD
   HAS_SPECIAL
   HAS_TITLE

That way, we only need to look in the corresponding table if it
actually has an entry other than the codepoint itself.

It doesn't leave a lot of room if the tables get larger, but if we are
worried about that, we could eliminate HAS_TITLE, because I don't think
the performance for INITCAP() is as important as LOWER/UPPER/CASEFOLD.

Regards,
    Jeff Davis




Re: Optimization for lower(), upper(), casefold() functions.

From
Alexander Borisov
Date:
19.02.2025 01:02, Jeff Davis пишет:
> On Tue, 2025-02-11 at 23:08 +0300, Alexander Borisov wrote:
>> I tried the approach via a range table. The result was worse than
>> without the table. With branching in a function, the result is
>> better.
>>
>> Patch v3 — ranges binary search by branches.
>> Patch v4 — ranges binary search by table.
> 
> Thoughts on v3:
> 
> It looks like the top 5 bits of the offset are unused. What if we used
> those bits for flags to indicate:
> 
>     HAS_LOWER
>     HAS_UPPER
>     HAS_FOLD
>     HAS_SPECIAL
>     HAS_TITLE
> 
> That way, we only need to look in the corresponding table if it
> actually has an entry other than the codepoint itself.
> 
> It doesn't leave a lot of room if the tables get larger, but if we are
> worried about that, we could eliminate HAS_TITLE, because I don't think
> the performance for INITCAP() is as important as LOWER/UPPER/CASEFOLD.
> 
> Regards,
>     Jeff Davis
> 

It seems to be micro-optimizations. In the sense that the code is very
clear and understandable now.
These micro-optimizations make the code more complex, and there will be
no performance gain.

In proposing the patch for v3, I struck a balance between improving
performance and reducing binary size, without sacrificing code clarity.

We can go into micro-optimizations now and lose the essence of
the proposed improvements. And ahead of us we have
Unicode Normalization Forms.

I'm willing to try different approaches, but there's confidence that
we've found a great optimization right now.


--
Alexander Borisov




Re: Optimization for lower(), upper(), casefold() functions.

From
Jeff Davis
Date:
On Wed, 2025-02-19 at 01:54 +0300, Alexander Borisov wrote:
> In proposing the patch for v3, I struck a balance between improving
> performance and reducing binary size, without sacrificing code
> clarity.

Fair enough. I will continue reviewing v3.

Regards,
    Jeff Davis