Re: Logical tape pause/resume - Mailing list pgsql-hackers

From	Heikki Linnakangas
Subject	Re: Logical tape pause/resume
Date	October 4, 2016 19:00:14
Msg-id	682429bb-608b-337b-ee8e-c915669aeb70@iki.fi
In response to	Re: Logical tape pause/resume (Simon Riggs <simon@2ndquadrant.com>)
List	pgsql-hackers

On 10/04/2016 05:58 PM, Simon Riggs wrote:
> On 4 October 2016 at 12:47, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>
>>> Why not just make each new run start at a block boundary?
>>> That way we waste on average BLCKSZ/2 disk space per run, which is
>>> negligible but we avoid any need to have code to read back in the last
>>> block.
>>
>> Hmm. You'd still have to read back the last block, so that you can update
>> its next-pointer.
>
> If each run is in its own file, then you can skip that bit.

Then you need to remember the names of the files, in memory. That's 
closer to my idea of having only one run on each tape, but you're taking 
it further by also saying that each tape is a separate file.

That might be a good idea for building the initial runs. Each file would 
then be written and read sequentially, so we could perhaps get rid of 
the whole pre-reading code, and rely completely on OS read-ahead.
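
As a rough illustration of relying on OS read-ahead, here is a minimal sketch of reading one run file front to back. It assumes a POSIX system; the function name read_run_sequentially and the RUN_BUFSZ constant are hypothetical and do not come from tuplesort.c or logtape.c. The posix_fadvise() call is only a hint, so the sketch ignores its result.

```c
#define _XOPEN_SOURCE 700   /* for posix_fadvise() on strict compilers */
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

#define RUN_BUFSZ 8192      /* hypothetical buffer size, one 8 kB block */

/*
 * Read one run file from start to end and return its size in bytes,
 * or -1 on error. No application-level prefetching is done; the kernel's
 * read-ahead is expected to keep the disk busy.
 */
long read_run_sequentially(const char *path)
{
    char    buf[RUN_BUFSZ];
    long    total = 0;
    ssize_t n;
    int     fd;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

#ifdef POSIX_FADV_SEQUENTIAL
    /* Advisory only: announce a front-to-back access pattern. */
    (void) posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);
#endif

    while ((n = read(fd, buf, sizeof(buf))) > 0)
        total += n;         /* a real sort would hand buf to the merge here */

    close(fd);
    return (n < 0) ? -1 : total;
}
```

Whether the kernel's read-ahead is actually sufficient would have to be measured; the sketch only shows that the per-file access pattern becomes trivially sequential.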

However, you can't really do that for multi-pass merges, because you want 
to reuse the space of the input tapes for the output tape, as you go.
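
The space argument can be made concrete with a toy block counter. This is a deliberately simplified model, not the logtape.c free-block logic: it assumes every input block consumed yields exactly one output block, and the name merge_peak_blocks is hypothetical.

```c
#include <assert.h>

/*
 * Toy model of one merge pass over two input runs stored in one temp file.
 * Returns the number of blocks the file ends up needing. With recycle != 0,
 * each consumed input block is put on a free list and reused for output;
 * with recycle == 0 (e.g. inputs and output in separate files), the output
 * always needs new space.
 */
long merge_peak_blocks(long blocks_a, long blocks_b, int recycle)
{
    long file_blocks;       /* blocks allocated so far */
    long free_blocks = 0;   /* freed input blocks available for reuse */
    long remaining;         /* input blocks not yet consumed */

    assert(blocks_a >= 0 && blocks_b >= 0);
    file_blocks = blocks_a + blocks_b;
    remaining = blocks_a + blocks_b;

    while (remaining > 0)
    {
        /* Consume one input block. */
        remaining--;
        if (recycle)
            free_blocks++;

        /* Write one output block, reusing a freed block if possible. */
        if (free_blocks > 0)
            free_blocks--;
        else
            file_blocks++;  /* nothing to reuse: extend the storage */
    }
    return file_blocks;
}
```

In this model, merging two 10-block runs needs 20 blocks with recycling and 40 without, which is the extra disk space a one-file-per-tape scheme would pay on every merge pass.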

> And we do want the sort to disk to use multiple files so we can
> parallelize I/O as well as CPU.

Huh? Why can't you parallelize I/O on a single file?
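
At the system-call level, at least, a single file does not force serialized access: POSIX pread() and pwrite() take an explicit offset and do not touch the shared file position, so several workers can issue block I/O on one descriptor independently. A minimal sketch follows, assuming 8 kB blocks; the helper names open_tape_file, write_block_at and read_block_at are hypothetical, not existing PostgreSQL functions.

```c
#define _XOPEN_SOURCE 700   /* for pread()/pwrite() on strict compilers */
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define BLCKSZ 8192         /* PostgreSQL's default block size */

/* Create a fresh file that will hold the blocks of every tape. */
int open_tape_file(const char *path)
{
    assert(path != NULL);
    return open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
}

/* Write one block at an explicit block number; no lseek() is involved. */
ssize_t write_block_at(int fd, long blkno, const void *buf)
{
    assert(blkno >= 0);
    return pwrite(fd, buf, BLCKSZ, (off_t) blkno * BLCKSZ);
}

/* Read one block at an explicit block number; returns 0 past end of file. */
ssize_t read_block_at(int fd, long blkno, void *buf)
{
    assert(blkno >= 0);
    return pread(fd, buf, BLCKSZ, (off_t) blkno * BLCKSZ);
}
```

Because neither call depends on a shared seek position, callers in different threads or processes can read and write disjoint blocks without coordinating on the descriptor. How well the underlying storage overlaps those requests is a separate question that this sketch does not answer.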

- Heikki
