Re: [HACKERS] A Better External Sort? - Mailing list pgsql-performance

From Josh Berkus
Subject Re: [HACKERS] A Better External Sort?
Date
Msg-id 200509291103.27344.josh@agliodbs.com
Whole thread Raw
In response to Re: [HACKERS] A Better External Sort?  ("Jeffrey W. Baker" <jwbaker@acm.org>)
List pgsql-performance
Jeff,

> Josh, do you happen to know how many passes are needed in the multiphase
> merge on your 60GB table?

No, any idea how to test that?

> I think the largest speedup will be to dump the multiphase merge and
> merge all tapes in one pass, no matter how large M.  Currently M is
> capped at 6, so a sort of 60GB with 1GB sort memory needs 13 passes over
> the tape.  It could be done in a single pass heap merge with N*log(M)
> comparisons, and, more importantly, far less input and output.

Yes, but the evidence suggests that we're actually not using the whole 1GB
of RAM ... maybe using only 32MB of it which would mean over 200 passes
(I'm not sure of the exact match).  Just fixing our algorithm so that it
used all of the work_mem permitted might improve things tremendously.

> I would also recommend using an external processes to asynchronously
> feed the tuples into the heap during the merge.
>
> What's the timeframe for 8.2?

Too far out to tell yet.  Probably 9mo to 1 year, that's been our history.

--
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

pgsql-performance by date:

Previous
From: "Jeffrey W. Baker"
Date:
Subject: Re: [HACKERS] A Better External Sort?
Next
From:
Date:
Subject: SQL Function performance