Heikki Linnakangas писал 2017-05-15 12:06:
> On 05/14/2017 09:47 PM, Sokolov Yura wrote:
>> Good day, everyone.
>>
>> I've been playing a bit with unlogged tables - just random updates on
>> simple
>> key-value table. I've noticed amount of cpu spent in a
>> compactify_tuples
>> (called by PageRepareFragmentaion). Most of time were spent in qsort
>> of
>> itemidbase items.
>
> Ah, I played with this too a couple of years ago, see
> https://www.postgresql.org/message-id/546B89DE.7030906%40vmware.com,
> but got distracted by other things and never got around to commit
> that.
>
>> itemidbase array is bounded by number of tuples in a page, and
>> itemIdSortData
>> structure is simple, so specialized version could be a better choice.
>>
>> Attached patch adds combination of one pass of prefix sort with
>> insertion
>> sort for larger array and shell sort for smaller array.
>> Insertion sort and shell sort are implemented as macros and could be
>> reused.
>
> Cool! Could you compare that against the bucket sort I posted in the
> above thread, please?
>
> At a quick glance, your "prefix sort" seems to be the the same
> algorithm as the bucket sort that I implemented. You chose 256
> buckets, where I picked 32. And you're adding a shell sort
> implementation, for small arrays, while I used a straight insertion
> sort. Not sure what these differences mean in practice.
>
> - Heikki
Thank you for attention.
My first version of big page sort was almost exactly same to yours.
I had a bug in my insertion sort, so I had to refactor it.
(bug were fixed)
I found that items in itemidbase are almost sorted, so it is important
to try keep its order in prefix sort. So I've changed --count[i] to
count[i+1]++.
And it looks like it is better to have more buckets:
- with 256 buckets, size of single bucket is almost always less than 2,
so array is almost always sorted after prefix sort pass.
But it looks like it is better to sort each bucket separately, as you
did, and as it was in my early version.
Also I used your names for functions and some comments.
I attached new version of the patch.
I left memcpy intact cause it looks like it doesn't take noticable
cpu time.
--
Sokolov Yura aka funny.falcon
Postgres Professional: https://postgrespro.ru
The Russian PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers