Re: Inlining comparators as a performance optimisation - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Inlining comparators as a performance optimisation
Date
Msg-id CA+U5nM+Et7jdrRo7Pq5Crcb5c+n3EiOMaY2Ji1uzDwtnO59P5g@mail.gmail.com
Whole thread Raw
In response to Re: Inlining comparators as a performance optimisation  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Inlining comparators as a performance optimisation
Re: Inlining comparators as a performance optimisation
List pgsql-hackers
On Fri, Nov 18, 2011 at 5:20 AM, Robert Haas <robertmhaas@gmail.com> wrote:

> I think that we should really consider doing with this patch what Tom
> suggested upthread; namely, looking for a mechanism to allow
> individual datatypes to offer up a comparator function that doesn't
> require bouncing through FunctionCall2Coll().  It seems to me that
> duplicating the entire qsort() algorithm is a little iffy.  Sure, in
> this case it works out to a win.  But it's only a small win -
> three-quarters of it is in the uncontroversial activity of reducing
> the impedance mismatch - and how do we know it will always be a win?
> Adding more copies of the same code can be an anti-optimization if it
> means that a smaller percentage of it fits in the instruction cache,
> and sometimes small changes in runtime are caused by random shifts in
> the layout of memory that align things more or less favorably across
> cache lines rather than by real effects.  Now it may well be that this
> is a real effect, but will it still look as good when we do this for
> 10 data types?  For 100 data types?
>
> In contrast, it seems to me that reducing the impedance mismatch is
> something that we could go and do across our entire code base, and
> every single data type would benefit from it.  It would also be
> potentially usable by other sorting algorithms, not just quick sort.

I don't think its credible to implement that kind of generic
improvement at this stage of the release cycle. That has a much bigger
impact since it potentially effects all internal datatypes and
external ones also. Definitely a longer term way forward.

If individual datatypes offer up a comparator function that is easily
going to result in more code than is being suggested here. So the
argument about flooding the CPU cache works against your alternate
proposal, not in favour of it.

We have no proof that we need to do this for 10 or 100 data types. We
only currently have proof that there is gain for the most common
types. Of course, it sounds like it might be useful to allow any data
type to gain an advantage, but we shouldn't be blind to the point that
almost nobody will use such a facility, and if they do the code won't
be written for a long time yet. If this came as a request from custom
datatype authors complaining of slow sorts it would be different, but
it didn't so we don't even know if anybody would ever write user
defined comparator routines. Rejecting a patch because of a guessed
user requirement is not good.

Peter's suggested change adds very few lines of code and those compile
to some very terse code, a few hundred instructions at very most.
Requesting an extra few cachelines to improve qsort by so much is
still an easy overall win.

The OP change improves qsort dramatically, and is a small, isolated
patch. There is no significant downside. We also have it now, so lets
commit this, chalk up another very good performance improvement and
use our time on something else this commitfest, such as the flexlocks
idea.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


pgsql-hackers by date:

Previous
From: Jeff Davis
Date:
Subject: range_adjacent and discrete ranges
Next
From: Greg Smith
Date:
Subject: Re: Core Extensions relocation