
Re: Speeding up text_position_next with multibyte encodings - Mailing list pgsql-hackers

From	Heikki Linnakangas
Subject	Re: Speeding up text_position_next with multibyte encodings
Date	January 25, 2019 17:33:54
Msg-id	b65df3d8-1f59-3bd7-ebbe-68b81d5a76a4@iki.fi
In response to	Re: Speeding up text_position_next with multibyte encodings (John Naylor <jcnaylor@gmail.com>)
Responses	Re: Speeding up text_position_next with multibyte encodings (Bruce Momjian <bruce@momjian.us>)
List	pgsql-hackers


On 15/01/2019 02:52, John Naylor wrote:
> The majority of cases are measurably faster, and the best case is at
> least 20x faster. On the whole I'd say this patch is a performance win
> even without further optimization. I'm marking it ready for committer.

I read through the patch one more time, tweaked the comments a little 
bit, and committed. Thanks for the review!

I did a little profiling of the worst case, where this is slower than 
the old approach. There's a lot of function call overhead coming from 
walking the string with pg_mblen(). That could be improved. If we 
inlined pg_mblen() into the loop, it would become much faster, and I 
think this code would then be faster even in the worst case. (Except for 
the very worst cases, where the hash table in the new code happens to 
have a collision at a different point than before, but that doesn't seem 
like a fair comparison.)
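
To make the inlining idea concrete, here is a rough sketch of a 
character-walking loop where the per-character length computation is 
visible to the compiler instead of going through a function call. This 
is only an illustration under stated assumptions: it handles UTF-8 only, 
and sketch_utf8_mblen() and sketch_utf8_strlen() are hypothetical names 
made up for this example, not the PostgreSQL functions or the committed 
code.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for an inlinable, UTF-8-only length function.
 * Returns the byte length implied by the lead byte; an invalid lead
 * byte is treated as a single byte so the caller always advances.
 */
static inline int
sketch_utf8_mblen(const unsigned char *s)
{
    if ((*s & 0x80) == 0)
        return 1;               /* 0xxxxxxx: ASCII */
    else if ((*s & 0xe0) == 0xc0)
        return 2;               /* 110xxxxx */
    else if ((*s & 0xf0) == 0xe0)
        return 3;               /* 1110xxxx */
    else if ((*s & 0xf8) == 0xf0)
        return 4;               /* 11110xxx */
    return 1;                   /* invalid lead byte */
}

/*
 * Count the characters in the first 'len' bytes of 'str'.  Because
 * sketch_utf8_mblen() is static inline, the compiler can fold it into
 * the loop body rather than emitting a call per character.
 */
static size_t
sketch_utf8_strlen(const char *str, size_t len)
{
    const unsigned char *p = (const unsigned char *) str;
    const unsigned char *end = p + len;
    size_t      nchars = 0;

    while (p < end)
    {
        size_t      l = (size_t) sketch_utf8_mblen(p);

        /* Do not step past the end on a truncated final character. */
        if (l > (size_t) (end - p))
            l = (size_t) (end - p);
        p += l;
        nchars++;
    }
    return nchars;
}
```

A real version would still have to dispatch on the database encoding 
somewhere; the point of the sketch is only that the dispatch happens 
outside the per-character loop.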

I think this is good enough as it is, but if I have the time, I'm going 
to try optimizing the pg_mblen() loop, as well as similar loops e.g. in 
pg_mbstrlen(). Or if someone else wants to give that a go, feel free.

- Heikki
