Home > mailing lists

Re: [HACKERS] Small improvement to compactify_tuples - Mailing list pgsql-hackers

From	Юрий Соколов
Subject	Re: [HACKERS] Small improvement to compactify_tuples
Date	November 6, 2017 17:50:34
Msg-id	CAL-rCA1Rk=hU_vEiX5s3_-qCV+mx6JnpsvCr5AeBh57igMvBuw@mail.gmail.com Whole thread
In response to	Re: [HACKERS] Small improvement to compactify_tuples (Claudio Freire <klaussfreire@gmail.com>)
Responses	Re: [HACKERS] Small improvement to compactify_tuples
List	pgsql-hackers

Tree view

2017-11-05 20:44 GMT+03:00 Claudio Freire <klaussfreire@gmail.com>:
>
> On Sat, Nov 4, 2017 at 8:07 PM, Юрий Соколов <funny.falcon@gmail.com> wrote:
> > 2017-11-03 5:46 GMT+03:00 Tom Lane <tgl@sss.pgh.pa.us>:
> >>
> >> Sokolov Yura <funny.falcon@postgrespro.ru> writes:
> >> > [ 0001-Improve-compactify_tuples.patch, v5 or thereabouts ]
> >>
> >> I went to check the shellsort algorithm against Wikipedia's entry,
> >> and found that this appears to be an incorrect implementation of
> >> shellsort: where pg_shell_sort_pass has
> >>
> >> for (_i = off; _i < _n; _i += off) \
> >>
> >> it seems to me that we need to have
> >>
> >> for (_i = off; _i < _n; _i += 1) \
> >>
> >> or maybe just _i++.
> >
> >
> > Shame on me :-(
> > I've wrote shell sort several times, so I forgot to recheck myself once
> > again.
> > And looks like best gap sequence from wikipedia is really best
> > ( {301, 132, 57, 23, 10 , 4} in my notation),
> >
> >
> > 2017-11-03 17:37 GMT+03:00 Claudio Freire <klaussfreire@gmail.com>:
> >> On Thu, Nov 2, 2017 at 11:46 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >>> BTW, the originally given test case shows no measurable improvement
> >>> on my box.
> >>
> >> I did manage to reproduce the original test and got a consistent
> >> improvement.
> >
> > I've rechecked my self using my benchmark.
> > Without memmove, compactify_tuples comsumes:
> > - with qsort 11.66% cpu (pg_qsort + med3 + swapfunc + itemoffcompare +
> > compactify_tuples = 5.97 + 0.51 + 2.87 + 1.88 + 0.44)
> > - with just insertion sort 6.65% cpu (sort is inlined, itemoffcompare also
> > inlined, so whole is compactify_tuples)
> > - with just shell sort 5,98% cpu (sort is inlined again)
> > - with bucket sort 1,76% cpu (sort_itemIds + compactify_tuples = 1.30 +
> > 0.46)
>
> Is that just insertion sort without bucket sort?

Yes. Just to show that inlined insertion sort is better than non-inlined qsort

in this particular use-case.

> Because I think shell sort has little impact in your original patch
> because it's rarely exercised. With bucket sort, most buckets are very
> small, too small for shell sort to do any useful work.

Yes. In the patch, buckets are sorted with insertion sort. Shell sort is used

only on full array if its size less than 48.

Bucket sort has constant overhead of traversing all buckets, even if they

are empty. That is why I think, shell sort for small arrays is better. Though,

I didn't measure that carefully. And probably insertion sort for small arrays

will be just enough.

> Maybe leave a fallback to qsort if some corner case produces big buckets?

For 8kb pages, each bucket is per 32 bytes. So, for heap pages it is at

most 1 heap-tuple per bucket, and for index pages it is at most 2 index

tuples per bucket. For 32kb pages it is 4 heap-tuples and 8 index-tuples

per bucket.

It will be unnecessary overhead to call non-inlineable qsort in this cases

So, I think, shell sort could be removed, but insertion sort have to remain.

I'd prefer shell sort to remain also. It could be useful in other places also,

because it is easily inlinable, and provides comparable to qsort performance

up to several hundreds of elements.

With regards,

Sokolov Yura aka funny_falcon.

pgsql-hackers by date:

From: Stephen Frost
Date: 06 November 2017, 17:30:16
Subject: Re: [HACKERS] Early locking option to parallel backup

From: Claudio Freire
Date: 06 November 2017, 17:55:14
Subject: Re: [HACKERS] Small improvement to compactify_tuples

Re: [HACKERS] Small improvement to compactify_tuples - Mailing list pgsql-hackers

Previous

Next