Re: About tapes - Mailing list pgsql-hackers

From Robert Haas
Subject Re: About tapes
Msg-id AANLkTimEuAyvn34NykLpbMISZGLIZc8sEffsZy5SeMKP@mail.gmail.com
In response to Re: About tapes  ("mac_man2005@hotmail.it" <mac_man2005@hotmail.it>)
Responses Re: About tapes  ("mac_man2005@hotmail.it" <mac_man2005@hotmail.it>)
List pgsql-hackers
On Fri, Jun 18, 2010 at 3:46 PM, mac_man2005@hotmail.it
<mac_man2005@hotmail.it> wrote:
> Which is the difference between having more than one tape into a file and
> having one tape per file?

It makes it easier to recycle space a little at a time.  Suppose
you're merging two runs of 100 blocks each.  You read in a block from
each run and write out two output blocks.  Now that you've done that,
the first block of each of the input runs is garbage and can be
recycled - but if the input runs and the output run are in three
separate files, there's no easy way to do that.  You can truncate a
file (and throw away the end) but there's no easy way to throw away
the BEGINNING of a file.  So you'll probably have to hold on to the
entirety of both inputs until you've written the entirety of the
output.

On the other hand, suppose you have all the blocks in one big file.
The first input run is in blocks 1-100; the second is in blocks
101-200.  You can read blocks 1 and 101, say, and write the results to
blocks 201 and 202, using extra storage, but only a little bit.  When
you then read blocks 2 and 102, you write the results to blocks 1 and
101, which are no longer needed, because you've already merged them.
When you get done with that, blocks 2 and 102 are now no longer needed
and can be used to write the next part of the output.  Of course, you
have to keep track of which order to reread the blocks in when the
sort is done: 201, 202, 1, 101, ... but that's a manageable problem.
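
The bookkeeping above can be sketched in a few lines of Python. This is
only a toy model under the same simplification (each step reads one block
from each run and writes exactly two output blocks), not PostgreSQL's
actual tape code. The names merge_in_one_file, free, and output_order are
made up for illustration, and the "file" is just a dict keyed by block
number.

```python
from collections import deque
from heapq import merge


def merge_in_one_file(run_a, run_b):
    """Merge two runs whose blocks live in one shared 'file'.

    run_a and run_b are lists of blocks; each block is a sorted list of
    records. Assumes (as in the simplified description) that merging
    block i of each run yields exactly the next two output blocks.
    """
    n = len(run_a)
    assert len(run_b) == n

    # Block numbers are 1-based: run A in 1..n, run B in n+1..2n.
    file = {}
    for i in range(n):
        file[1 + i] = run_a[i]
        file[n + 1 + i] = run_b[i]

    free = deque()         # recycled block numbers, oldest first
    next_new = 2 * n + 1   # next never-used block at the end of the file
    output_order = []      # where each output block landed, in sorted order

    for i in range(n):
        a_blk, b_blk = 1 + i, n + 1 + i
        merged = list(merge(file[a_blk], file[b_blk]))
        half = len(merged) // 2
        for chunk in (merged[:half], merged[half:]):
            if free:
                dest = free.popleft()   # reuse a block that is now garbage
            else:
                dest = next_new         # no garbage yet: grow the file
                next_new += 1
            file[dest] = chunk
            output_order.append(dest)
        # The input blocks become garbage only once their contents
        # have been written out.
        free.extend((a_blk, b_blk))

    highest_block_used = next_new - 1
    return file, output_order, highest_block_used
```

With two runs of 100 blocks each, the file in this model never grows
past block 202, and output_order begins 201, 202, 1, 101, 2, 102.
Reading the blocks back in output_order returns the merged data. With
three separate files you would instead need room for both inputs plus
the whole output, about 400 blocks, at the end of the merge.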

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

