Re: maintenance_work_mem and CREATE INDEX time - Mailing list pgsql-hackers

From Jeff Janes
Subject Re: maintenance_work_mem and CREATE INDEX time
Date
Msg-id CAMkU=1zLr03igfRbfr+GqousCR5CExzUcRZp7tCwDWfQcypzLw@mail.gmail.com
Whole thread Raw
In response to Re: maintenance_work_mem and CREATE INDEX time  (Amit Langote <amitlangote09@gmail.com>)
Responses Re: maintenance_work_mem and CREATE INDEX time
List pgsql-hackers
On Tue, Jul 23, 2013 at 10:56 PM, Amit Langote <amitlangote09@gmail.com> wrote:
> On Wed, Jul 24, 2013 at 3:20 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
>>
>> The heap structure used in external sorts is cache-unfriendly.  The
>> bigger the heap used, the more this unfriendliness becomes apparent.
>> And the bigger maintenance_work_mem, the bigger the heap used.
>>
>> The bigger heap also means you have fewer "runs" to merge in the
>> external sort.  However, as long as the number of runs still fits in
>> the same number of merge passes, this is generally not a meaningful
>> difference.
>
> Does fewer runs mean more time (in whichever phase of external sort)?

That's complicated.  In general fewer runs are faster, as the heap
used at that stage is smaller.  But this difference is small.  If you
can get the number of runs down to a level that needs fewer passes
over the data, that will make things faster.  But this is rare.  If
the sort isn't already being done in a single pass, then your sort
must be huge or your working memory setting is pathologically tiny.

There is a rough conservation of total heap layers between the two
phases: the initial tuple heap, and the merge stage heap-of-tapes.
Say for example that by increasing work_mem, you can increase the
initial heap from 25 layers to 27 layers, while decreasing the merge
phase heap from 5 layers to 3 layers.  The total number of comparisons
for the entire sort will be about the same, but the comparisons across
the 27 layer heap are much more likely to need to go to main RAM,
rather than come from L3 cache (or whatever the cache level is).

Cheers,

Jeff


pgsql-hackers by date:

Previous
From: Andrew Dunstan
Date:
Subject: Re: Bison 3.0 updates
Next
From: Marti Raudsepp
Date:
Subject: Re: Bison 3.0 updates