Re: speed up unicode decomposition and recomposition - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: speed up unicode decomposition and recomposition
Msg-id 20201015002523.GA2305@paquier.xyz
In response to Re: speed up unicode decomposition and recomposition  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: speed up unicode decomposition and recomposition  (John Naylor <john.naylor@enterprisedb.com>)
List pgsql-hackers
On Wed, Oct 14, 2020 at 01:06:40PM -0400, Tom Lane wrote:
> John Naylor <john.naylor@enterprisedb.com> writes:
>> Some other considerations:
>> - As I alluded above, this adds ~26kB to libpq because of SASLPrep. Since
>> the decomp array was reordered to optimize linear search, it can no longer
>> be used for binary search.  It's possible to build two arrays, one for
>> frontend and one for backend, but that's additional complexity. We could
>> also force frontend to do a linear search all the time, but that seems
>> foolish. I haven't checked if it's possible to exclude the hash from
>> backend's libpq.
>
> IIUC, the only place libpq uses this is to process a password-sized string
> or two during connection establishment.  It seems quite silly to add
> 26kB in order to make that faster.  Seems like a nice speedup on the
> backend side, but I'd vote for keeping the frontend as-is.

Agreed.  Let's only use the perfect hash in the backend.  It would be
nice to avoid generating a second copy of the decomposition table for
that, and a table ordered by codepoint is easier to read.  How much of
a performance impact do you think we would see if the linear search did
not use the most-optimized ordering of the decomposition table?
--
Michael

