Re: Remove hacks for old bad qsort() implementations? - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: Remove hacks for old bad qsort() implementations?
Date
Msg-id 200805072350.m47NoDE20060@momjian.us
Whole thread Raw
In response to Remove hacks for old bad qsort() implementations?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Remove hacks for old bad qsort() implementations?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Tom, are you intending to remove this part of the sort code?

---------------------------------------------------------------------------

Tom Lane wrote:
> There are several places in tuplesort.c (and perhaps elsewhere) where
> we explicitly work around limitations of various platforms' qsort()
> functions.  Notably, there's this bit in comparetup_index_btree
> 
>     /*
>      * If key values are equal, we sort on ItemPointer.  This does not affect
>      * validity of the finished index, but it offers cheap insurance against
>      * performance problems with bad qsort implementations that have trouble
>      * with large numbers of equal keys.
>      */
> 
> which I unquestioningly copied into comparetup_index_hash yesterday.
> However, oprofile is telling me that doing this is costing
> *significantly* more than just returning zero would do:
> 
>   9081  0.3050 :    tuple1 = (IndexTuple) a->tuple;
>   3759  0.1263 :    tuple2 = (IndexTuple) b->tuple;
>                :
>                :    {
> 130409  4.3800 :        BlockNumber blk1 = ItemPointerGetBlockNumber(&tuple1->t_tid);
>  34539  1.1601 :        BlockNumber blk2 = ItemPointerGetBlockNumber(&tuple2->t_tid);
>                :
>   3281  0.1102 :        if (blk1 != blk2)
>    812  0.0273 :            return (blk1 < blk2) ? -1 : 1;
>                :    }
>                :    {
>     28 9.4e-04 :        OffsetNumber pos1 = ItemPointerGetOffsetNumber(&tuple1->t_tid);
>      1 3.4e-05 :        OffsetNumber pos2 = ItemPointerGetOffsetNumber(&tuple2->t_tid);
>                :
>      1 3.4e-05 :        if (pos1 != pos2)
>  48757  1.6376 :            return (pos1 < pos2) ? -1 : 1;
>                :    }
>                :
>                :    return 0;
>  56705  1.9045 :}
> 
> Looks to me like we're eating more than seven percent of the total
> runtime to do this :-(
> 
> Now as far as I can see, the original motivation for this (as stated in
> the comment) is entirely dead anymore, since we always use our own qsort
> implementation in preference to whatever bogus version a given libc
> might supply.  What do people think of removing this bit of code in
> favor of just returning 0?
> 
> I can see a couple of possible objections:
> 
> 1. Someday we might go back to using platform qsort.  (But surely we
> could insist on qsort behaving sanely for equal keys.)
> 
> 2. If you've got lots of equal keys, it's conceivable that having the
> index entries sorted by TID offers some advantage in indexscan speed.
> I'm dubious that that's useful, mainly because the planner should prefer
> a bitmap scan in such a case; and anyway the ordering is unlikely to
> be preserved for long.  But it's something to think about.
> 
> Comments?
> 
>             regards, tom lane
> 
> -- 
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


pgsql-hackers by date:

Previous
From: David Fetter
Date:
Subject: Auto-updated fields
Next
From: Bruce Momjian
Date:
Subject: Re: minimal update