Re: [HACKERS] sorting big tables :( - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: [HACKERS] sorting big tables :(
Date
Msg-id 199805170422.AAA03532@candle.pha.pa.us
In response to Re: [HACKERS] sorting big tables :(  (Michael Richards <miker@scifair.acadiau.ca>)
Responses Re: [HACKERS] sorting big tables :(
Re: [HACKERS] sorting big tables :(
List pgsql-hackers
>
> On Fri, 15 May 1998, Bruce Momjian wrote:
>
> > > I have a big table. 40M rows.
> > > On disk, its size is:
> > >  2,090,369,024 bytes. So 2 gigs. On a 9 gig drive I can't sort this table.
> > > How should one decide based on table size how much room is needed?
>
> > It is taking so much disk space because it is using a TAPE sorting
> > method, by breaking the file into tape chunks and sorting in pieces, the
> The files grow until I have 6 files of almost a gig each. At that point, I
> start running out of space...
> This TAPE sorting method: is it a simple merge sort? Do you know of a way
> this could be done using constant space and no more complexity in
> the algorithm? Even if it is a little slower, the DBMS could decide based
> on the table size whether it should use the tape sort or another one...
> Bubble sort would not be my first choice tho :)

Tape sort is a standard sorting method from Knuth.  It basically sorts the
data in pieces and then merges them.  If you don't do this, access locality
gets very poor as you page fault all over the file, and the cache becomes
useless.
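
To make "sorts in pieces, and merges" concrete, here is a minimal sketch
in Python.  It is not the backend's sort code, and it uses a plain k-way
merge through heapq.merge rather than any particular tape-merge scheme
from Knuth.  The names external_sort, write_run, read_run and run_size
are made up for illustration, and the input is assumed to be integers.

```python
import heapq
import os
import tempfile


def write_run(sorted_items, tmp_dir):
    """Write one sorted run to its own temporary file, one integer per line."""
    fd, path = tempfile.mkstemp(dir=tmp_dir, suffix=".run")
    with os.fdopen(fd, "w") as f:
        for item in sorted_items:
            f.write(f"{item}\n")
    return path


def read_run(path):
    """Stream a run back from disk sequentially, one integer at a time."""
    with open(path) as f:
        for line in f:
            yield int(line)


def external_sort(items, run_size, tmp_dir=None):
    """Yield the integers in sorted order, buffering at most run_size
    items in memory while the runs are being created."""
    run_paths = []
    buffer = []
    for item in items:
        buffer.append(item)
        if len(buffer) >= run_size:
            # The buffer is full: sort it in memory and spill it as one run.
            run_paths.append(write_run(sorted(buffer), tmp_dir))
            buffer = []
    if buffer:
        run_paths.append(write_run(sorted(buffer), tmp_dir))
    try:
        # Each run is read front to back, so disk access stays sequential.
        yield from heapq.merge(*(read_run(p) for p in run_paths))
    finally:
        # Remove the temporary run files once the merge is done.
        for p in run_paths:
            os.remove(p)


if __name__ == "__main__":
    data = [5, 3, 9, 1, 7, 2, 8, 6, 4, 0]
    print(list(external_sort(data, run_size=3)))
```

In this sketch the run files together hold one full copy of the input
until the merge finishes, so the temporary space needed is at least the
size of the data being sorted.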

There is something optimal about having seven sort files.  Not sure what
to suggest.  No one has complained about this before.

--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)
