Re: Memory usage during sorting - Mailing list pgsql-hackers

From Greg Stark
Subject Re: Memory usage during sorting
Date
Msg-id CAM-w4HOU_shOEwi=sDQMxh96eKt_xZyPWvW5h8DVxJFh8PcGrQ@mail.gmail.com
Whole thread Raw
In response to Re: Memory usage during sorting  (Peter Geoghegan <peter@2ndquadrant.com>)
Responses Re: Memory usage during sorting  (Jim Nasby <jim@nasby.net>)
List pgsql-hackers
On Mon, Apr 16, 2012 at 10:42 PM, Peter Geoghegan <peter@2ndquadrant.com> wrote:
> All but 4 regression tests pass, but they don't really count
> as failures, since they're down to an assumption in the tests that the
> order certain tuples appear should be the same as our current
> quicksort implementation returns them, even though, in these
> problematic cases, that is partially dictated by implementation - our
> quicksort isn't stable, but timsort is.

This is an interesting point. If we use a stable sort we'll probably
be stuck with stable sorts indefinitely. People will start depending
on the stability and then we'll break their apps if we find a faster
sort that isn't stable.

Notably though tapesort is not stable (because heapsort is not stable
so neither the runs nor the merge steps are stable). So people's apps
would appear to work when they're in memory and fail only on large
data sets. It's easily possible for a user's query to never need to go
to tape though.

We don't have the luxury of having a separate sort and stable_sort
though due to the ORDER BY clause.

All in all I think it's handier to have a stable ORDER BY sort than an
unstable one though. So I'm not necessarily opposed to it even if it
means we're stuck using a stable sort indefinitely.
-- 
greg


pgsql-hackers by date:

Previous
From: Qi Huang
Date:
Subject: Re: Gsoc2012 idea, tablesample
Next
From: Kyotaro HORIGUCHI
Date:
Subject: Re: [BUG] Checkpointer on hot standby runs without looking checkpoint_segments