Re: Sorting Improvements for 8.4 - Mailing list pgsql-hackers

From	Jeff Davis
Subject	Re: Sorting Improvements for 8.4
Date	December 3, 2007 21:35:32
Msg-id	1196721318.22428.477.camel@dogma.ljc.laika.com
In response to	Re: Sorting Improvements for 8.4 (Gregory Stark <stark@enterprisedb.com>)
List	pgsql-hackers

On Mon, 2007-12-03 at 20:40 +0000, Gregory Stark wrote:
> So the question is just how many seeks are we doing during sorting. If we're
> doing 0.1% seeks and 99.9% sequential i/o then eliminating the 0.1% entirely
> (which we can't do) isn't going to speed up seeking all that much. If we're
> doing 20% seeks and can get that down to 10% it might be worthwhile.

It's not just about eliminating seeks; it's about being able to merge
more runs at one time.

If you are merging 10 runs at once, and only two of those runs overlap
and the rest are much greater values, you might be spending 99% of the
time in sequential I/O. 

But the point is, we're wasting the memory that holds those other 8 runs
(80% of the memory you're using), so we really could be merging a lot
more than 10 runs at once. This might eliminate stages from the merge
process.
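
To put rough numbers on "eliminate stages", here is a minimal sketch that
counts merge passes for a given fan-in. It assumes a plain balanced merge
in which every pass merges fan_in runs into one; the function name
merge_passes and the run counts are made up for illustration, and the real
tuplesort merge pattern is more involved than this.

```python
def merge_passes(num_runs: int, fan_in: int) -> int:
    """Count the passes needed to reduce num_runs sorted runs to one run,
    merging at most fan_in runs at a time (simplified balanced merge)."""
    if num_runs < 1 or fan_in < 2:
        raise ValueError("need num_runs >= 1 and fan_in >= 2")
    passes = 0
    while num_runs > 1:
        # Each pass turns every group of fan_in runs into a single run.
        num_runs = -(-num_runs // fan_in)  # ceiling division
        passes += 1
    return passes


# Hypothetical workload: 100 initial runs on tape.
print(merge_passes(100, 10))   # fan-in of 10
print(merge_passes(100, 100))  # same memory, spread over more runs
```

Under these assumptions, 100 runs need two passes at a fan-in of 10 but
only one pass if the same memory can serve 100 runs at once.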

My point is just that "how many seeks are we doing" is not the only
question. We could be doing 99% sequential I/O and still make huge wins.

In reality, of course, the runs aren't going to be disjoint completely,
but they may be partially disjoint. That's where forecasting comes in:
you preread from the tapes you will actually need tuples from soonest.
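
A minimal sketch of the forecasting idea, not of anything in the current
code: each run is a sorted list cut into fixed-size blocks, and the next
block to preread comes from the run whose most recently read block ends
with the smallest key, since that block will be drained first. The names
split_into_blocks and forecast_read_order, and the sample runs, are
hypothetical.

```python
import heapq


def split_into_blocks(run, block_size):
    """Cut one sorted run into consecutive blocks of block_size items."""
    return [run[i:i + block_size] for i in range(0, len(run), block_size)]


def forecast_read_order(runs, block_size):
    """Return (run_index, block_index) pairs in the order the blocks
    should be preread, assuming block 0 of every run is already loaded."""
    blocked = [split_into_blocks(run, block_size) for run in runs]
    heap = []
    for run_idx, blocks in enumerate(blocked):
        if blocks:
            # Key on the LAST item of the loaded block: the block with the
            # smallest last key is the one the merge will empty first.
            heapq.heappush(heap, (blocks[0][-1], run_idx, 0))
    order = []
    while heap:
        _, run_idx, block_idx = heapq.heappop(heap)
        next_idx = block_idx + 1
        if next_idx < len(blocked[run_idx]):
            order.append((run_idx, next_idx))
            last_key = blocked[run_idx][next_idx][-1]
            heapq.heappush(heap, (last_key, run_idx, next_idx))
    return order


# Two overlapping runs and one run of much greater values.
runs = [[1, 2, 3, 4, 5, 6], [3, 5, 50, 60], [100, 101, 102, 103]]
print(forecast_read_order(runs, block_size=2))
```

With these sample runs, every remaining block of the two overlapping runs
is scheduled before the second block of the third run, so that run only
needs its first block in memory for most of the merge.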

Regards,
	Jeff Davis
