Re: A worst case for qsort - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: A worst case for qsort
Date
Msg-id CAM3SWZTTSUUa0SWhDPqi9QR5TUqF5mhBcDsLAUdrXR6_4k9vzQ@mail.gmail.com
Whole thread Raw
In response to Re: A worst case for qsort  (Rod Taylor <rod.taylor@gmail.com>)
List pgsql-hackers
On Thu, Aug 7, 2014 at 2:23 PM, Rod Taylor <rod.taylor@gmail.com> wrote:
> While I'm sure it's not common, I've seen a couple of ten-million tuple
> tables having a URL column as primary key where 98% of the entries begin
> with 'http://www.'
>
> So, that exact scenario is out there.

Sure, that scenario is relatively common. My point was that that
scenario, alongside a perfect logical/physical heap correlation, and
alongside a frequent need to sort is uncommon. If you're able to get
away with a "bubble sort best case" there, then the sort is going to
be extremely fast relative to any other database system anyway, and
you don't have too much to complain about. It's exactly the same
wasted cost as in the general case (where the tuples aren't in order),
except that it looks more expensive next to your unrealistically fast
sort that you were very lucky to be able to get away with.

I think the special pre-check for already sorted tuples within qsort()
is at best only justified by scenarios like where a serial primary key
index is reindexed. That's about it.

You can totally alter the outcome of this case by inserting a single
tuple, so the top-level pre-check fails.

-- 
Peter Geoghegan



pgsql-hackers by date:

Previous
From: David G Johnston
Date:
Subject: Re: Fixed redundant i18n strings in json
Next
From: Rod Taylor
Date:
Subject: Re: A worst case for qsort