Re: Using quicksort for every external sort run - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Using quicksort for every external sort run
Date
Msg-id 267c57d1-b0fc-aa9b-ecb6-cadfe1a36d8d@2ndquadrant.com
Whole thread Raw
In response to Re: Using quicksort for every external sort run  (Peter Geoghegan <pg@heroku.com>)
Responses Re: Using quicksort for every external sort run  (Peter Geoghegan <pg@heroku.com>)
List pgsql-hackers
On 04/03/2016 09:41 PM, Peter Geoghegan wrote:
> Hi Tomas,
...
>> 3) replacement_sort_mem GUC
>>
>> I'm not quite sure what's the plan with this GUC. It was useful for
>> development, but it seems to me it's pretty difficult to tune it in practice
>> (especially if you don't know the internals, which users generally don't).
>
> I agree.
>
>> So I think we should either remove the GUC entirely, or move it to the
>> developer section next to trace_sort (and removing it from the conf).
>
> I'll let Robert decide what's best here, but I see your point.
>
> Side note: trace_sort actually is documented. It's a bit weird that we
> have those TRACE_SORT macros at all IMV. I think we should rip those
> out, and assume every build enables TRACE_SORT, because that's
> probably true anyway.

What do you mean by documented? I thought this might be a good place is:

http://www.postgresql.org/docs/devel/static/runtime-config-developer.html

which is where trace_sort is documented.

>
> I do think that replacement selection could be put to good use for
> CREATE INDEX if the CREATE INDEX utility command had a "presorted"
> parameter. Specifically, an implementation of the "presorted" idea
> that I recently sketched [1] could do better than any presorted
> replacement selection case we've seen so far because it allows the
> implementation to optimistically create the index on-the-fly (if that
> isn't possible, throw an error), without a second pass over tuples
> sorted on tape. Nothing needs to be stored on a tape/temp file *at
> all*; the only thing that is stored externally is the index itself.
> But this patch doesn't add that feature, which can be worked on
> without the user needing to know about replacement_sort_mem in 9.6.
>
> So, I'm not in favor of ripping out the replacement selection code,
> but think it could make sense to effectively disable it entirely for
> the time being (with some developer feature to turn it back on for
> testing). In general, I share your misgivings about the new GUC,
> though.

OK.

>
>> I'm wondering whether 16MB default is not a bit too much, actually. As
>> explained before, that's not the amount of cache we should expect per
>> process, so maybe ~2-4MB would be a better default value?
>
> The obvious presorted case is where we have a SERIAL column, but as I
> mentioned even that isn't helped by RS. Moreover, it will be
> significantly hurt with a default maintenance_work_mem of 64MB. Your
> int4 CREATE INDEX cases clearly show this.
>
>> BTW couldn't we tune the value automatically for each sort, using the
>> pg_stats.correlation for the sort keys, when available (increasing the
>> replacement_sort_mem when correlation is close to 1.0)? Wouldn't that
>> improve at least some of the regressions?
>
> Maybe, but that seems hard. That information isn't conveniently
> available to the executor/tuplesort, and as we've seen with CREATE
> INDEX int4 cases, it's far from clear that we'll win even when there
> definitely is presorted input. Replacement selection needs more than a
> simple correlation to win, so you'll end up building a cost model with
> many new problems if this is to work.

Sure, that's non-trivial and definitely not a 9.6 material. I'm also 
wondering whether we need to do choose replacement_sort_mem at planning 
time, or whether it could be done in the executor based on actually 
observed data ...

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



pgsql-hackers by date:

Previous
From: Simon Riggs
Date:
Subject: Re: Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby
Next
From: Tom Lane
Date:
Subject: Re: More stable query plans via more predictable column statistics