Re: Tuplesort merge pre-reading - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Tuplesort merge pre-reading
Date
Msg-id CAM3SWZTv6TOqtAE9n-o7+b=xF-TWfvo0nznSjRX1xX3SBgvETw@mail.gmail.com
Whole thread Raw
In response to Re: Tuplesort merge pre-reading  (Heikki Linnakangas <hlinnaka@iki.fi>)
Responses Re: Tuplesort merge pre-reading  (Peter Geoghegan <pg@heroku.com>)
Re: Tuplesort merge pre-reading  (Heikki Linnakangas <hlinnaka@iki.fi>)
List pgsql-hackers
On Wed, Sep 28, 2016 at 5:04 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> Not sure that I understand. I agree that each merge pass tends to use
>> roughly the same number of tapes, but the distribution of real runs on
>> tapes is quite unbalanced in earlier merge passes (due to dummy runs).
>> It looks like you're always using batch memory, even for non-final
>> merges. Won't that fail to be in balance much of the time because of
>> the lopsided distribution of runs? Tapes have an uneven amount of real
>> data in earlier merge passes.
>
>
> How does the distribution of the runs on the tapes matter?

The exact details are not really relevant to this discussion (I think
it's confusing that we simply say "Target Fibonacci run counts",
FWIW), but the simple fact that it can be quite uneven is.

This is why I never pursued batch memory for non-final merges. Isn't
that what you're doing here? You're pretty much always setting
"state->batchUsed = true".

>> I'm basically repeating myself here, but: I think it's incorrect that
>> LogicalTapeAssignReadBufferSize() is called so indiscriminately (more
>> generally, it is questionable that it is called in such a high level
>> routine, rather than the start of a specific merge pass -- I said so a
>> couple of times already).
>
>
> You can't release the tape buffer at the end of a pass, because the buffer
> of a tape will already be filled with data from the next run on the same
> tape.

Okay, but can't you just not use batch memory for non-final merges,
per my initial approach? That seems far cleaner.

-- 
Peter Geoghegan



pgsql-hackers by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Re: Tuplesort merge pre-reading
Next
From: Peter Geoghegan
Date:
Subject: Re: Tuplesort merge pre-reading