Re: Inlining comparators as a performance optimisation - Mailing list pgsql-hackers
From | Simon Riggs |
---|---|
Subject | Re: Inlining comparators as a performance optimisation |
Date | |
Msg-id | CA+U5nM+Et7jdrRo7Pq5Crcb5c+n3EiOMaY2Ji1uzDwtnO59P5g@mail.gmail.com Whole thread Raw |
In response to | Re: Inlining comparators as a performance optimisation (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: Inlining comparators as a performance optimisation
Re: Inlining comparators as a performance optimisation |
List | pgsql-hackers |
On Fri, Nov 18, 2011 at 5:20 AM, Robert Haas <robertmhaas@gmail.com> wrote: > I think that we should really consider doing with this patch what Tom > suggested upthread; namely, looking for a mechanism to allow > individual datatypes to offer up a comparator function that doesn't > require bouncing through FunctionCall2Coll(). It seems to me that > duplicating the entire qsort() algorithm is a little iffy. Sure, in > this case it works out to a win. But it's only a small win - > three-quarters of it is in the uncontroversial activity of reducing > the impedance mismatch - and how do we know it will always be a win? > Adding more copies of the same code can be an anti-optimization if it > means that a smaller percentage of it fits in the instruction cache, > and sometimes small changes in runtime are caused by random shifts in > the layout of memory that align things more or less favorably across > cache lines rather than by real effects. Now it may well be that this > is a real effect, but will it still look as good when we do this for > 10 data types? For 100 data types? > > In contrast, it seems to me that reducing the impedance mismatch is > something that we could go and do across our entire code base, and > every single data type would benefit from it. It would also be > potentially usable by other sorting algorithms, not just quick sort. I don't think its credible to implement that kind of generic improvement at this stage of the release cycle. That has a much bigger impact since it potentially effects all internal datatypes and external ones also. Definitely a longer term way forward. If individual datatypes offer up a comparator function that is easily going to result in more code than is being suggested here. So the argument about flooding the CPU cache works against your alternate proposal, not in favour of it. We have no proof that we need to do this for 10 or 100 data types. We only currently have proof that there is gain for the most common types. Of course, it sounds like it might be useful to allow any data type to gain an advantage, but we shouldn't be blind to the point that almost nobody will use such a facility, and if they do the code won't be written for a long time yet. If this came as a request from custom datatype authors complaining of slow sorts it would be different, but it didn't so we don't even know if anybody would ever write user defined comparator routines. Rejecting a patch because of a guessed user requirement is not good. Peter's suggested change adds very few lines of code and those compile to some very terse code, a few hundred instructions at very most. Requesting an extra few cachelines to improve qsort by so much is still an easy overall win. The OP change improves qsort dramatically, and is a small, isolated patch. There is no significant downside. We also have it now, so lets commit this, chalk up another very good performance improvement and use our time on something else this commitfest, such as the flexlocks idea. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
pgsql-hackers by date: