Re: [HACKERS] Small improvement to compactify_tuples - Mailing list pgsql-hackers

From Claudio Freire
Subject Re: [HACKERS] Small improvement to compactify_tuples
Date
Msg-id CAGTBQpZgs4hVmDtHg4F6MDJuAE9-U=z6d+yJzjPsZup7AZhV0Q@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Small improvement to compactify_tuples  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [HACKERS] Small improvement to compactify_tuples  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Wed, Nov 8, 2017 at 12:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Tue, Nov 7, 2017 at 4:39 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> What I'm getting from the standard pgbench measurements, on both machines,
>>> is that this patch might be a couple percent slower than HEAD, but that is
>>> barely above the noise floor so I'm not too sure about it.
>
>> Hmm.  It seems like slowing down single client performance by a couple
>> of percent is something that we really don't want to do.
>
> I do not think there is any change here that can be proven to always be a
> win.  Certainly the original patch, which proposes to replace an O(n log n)
> sort algorithm with an O(n^2) one, should not be thought to be that.
> The question to focus on is what's the average case, and I'm not sure how
> to decide what the average case is.  But more than two test scenarios
> would be a good start.
>
>                         regards, tom lane

Doing no change to the overall algorithm and replacing qsort with an
inlineable type-specific one should be a net win in all cases.

Doing bucket sort with a qsort of large buckets (or small tuple
arrays) should also be a net win in all cases.

Using shell sort might not seem clear, but lets not forget the
original patch only uses it in very small arrays and very infrequently
at that.

What's perhaps not clear is whether there are better ideas. Like
rebuilding the page as Tom proposes, which doesn't seem like a bad
idea. Bucket sort already is O(bytes), just as memcopy, only it has a
lower constant factor (it's bytes/256 in the original patch), which
might make copying the whole page an extra time lose against bucket
sort in a few cases.

Deciding that last point does need more benchmarking. That doesn't
mean the other improvements can't be pursued in the meanwhile, right?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: [HACKERS] Small improvement to compactify_tuples
Next
From: Peter Eisentraut
Date:
Subject: Re: [HACKERS] taking stdbool.h into use