Andrew Dunstan <andrew@dunslane.net> writes:
> ITAGAKI Takahiro wrote:
>>          | SQL_ASCII | LATIN1 |  UTF8 | EUC_JP
>> ---------+-----------+--------+-------+--------
>> HEAD     |      8017 |   8029 | 16928 |  18213
>> Patched  |      7899 |   7887 |  9985 |  10370  [ms]
>>
>> It improved the performance not only for UTF8, but also for other
>> multi-byte encodings and a bit for single-byte encodings.
> Interesting. I infer from these results that the biggest bang here comes
> from abandoning CHAREQ and doing all comparisons byte-wise.

It looks like CHAREQ and NextChar are both pretty expensive, no doubt
due to having to drill down through the MB encoding vectoring mechanism
to find out what to do.

A technique we might want to apply in future patches is to have an API
whereby we can get a direct function pointer to the appropriate mblen
or other encoding-dependent function, and then call directly to the
right place in the inner loops instead of having to go through the
intermediate vectoring function every time.
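
A rough sketch of that idea, in standalone C rather than against the
actual backend code. Every name here (demo_encoding, demo_mblen_table,
demo_mblen, demo_get_mblen, demo_count_chars) is hypothetical and made
up for illustration, and the UTF8 length function is deliberately
simplified. The point is only that the lookup by encoding happens once,
outside the loop, and the inner loop then calls through the pointer:

```c
#include <stddef.h>

/* Hypothetical signature for an encoding-specific "length of next char" routine. */
typedef int (*mblen_fn) (const unsigned char *s);

/* Single-byte encodings: every character is one byte. */
static int
ascii_mblen(const unsigned char *s)
{
	(void) s;
	return 1;
}

/* Simplified UTF8 length from the lead byte; invalid lead bytes count as 1. */
static int
utf8_mblen(const unsigned char *s)
{
	if ((*s & 0x80) == 0)
		return 1;
	if ((*s & 0xe0) == 0xc0)
		return 2;
	if ((*s & 0xf0) == 0xe0)
		return 3;
	if ((*s & 0xf8) == 0xf0)
		return 4;
	return 1;
}

/* Hypothetical encoding IDs and dispatch table (not the backend's). */
enum demo_encoding
{
	DEMO_SQL_ASCII,
	DEMO_UTF8
};

static const mblen_fn demo_mblen_table[] = {ascii_mblen, utf8_mblen};

/* The "vectoring" style: look up the encoding on every single call. */
static int
demo_mblen(enum demo_encoding enc, const unsigned char *s)
{
	return demo_mblen_table[enc] (s);
}

/* The proposed style: hand the caller the function pointer itself. */
static mblen_fn
demo_get_mblen(enum demo_encoding enc)
{
	return demo_mblen_table[enc];
}

/* Count characters in s[0..len), resolving the encoding once up front. */
static size_t
demo_count_chars(enum demo_encoding enc, const unsigned char *s, size_t len)
{
	mblen_fn	mblen = demo_get_mblen(enc);	/* one lookup, outside the loop */
	size_t		nchars = 0;
	size_t		i = 0;

	while (i < len)
	{
		i += (size_t) mblen(s + i);		/* direct call in the inner loop */
		nchars++;
	}
	return nchars;
}
```

Whether this buys much in practice would have to be measured; the
indirect call remains, and what goes away is the per-character lookup
of which encoding is in use.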

			regards, tom lane