On Fri, 2005-09-23 at 11:31 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > Since we know the predicted size of the sort set prior to starting the
> > sort node, could we not use that information to allocate memory
> > appropriately? i.e. if sort size is predicted to be more than twice the
> > size of work_mem, then just move straight to the external sort algorithm
> > and set the work_mem down at the lower limit?
>
> Have you actually read the sort code?

Yes, and Knuth too. Your research and code are incredible, almost
untouchable. Yet sort performance is important, and empirical evidence
suggests that it can be improved upon significantly, so I am and will
be spending time trying to do that. Another time...

This thread was aiming to plug a problem I saw with 8.1's ability to use
very large work_mem settings. I felt that either my performance numbers
were wrong or we needed to do something; nobody has yet shown me
performance numbers that cast doubt on mine.

> During the run-forming phase it's definitely useful to eat all the
> memory you can: that translates directly to longer initial runs and
> hence fewer merge passes.

Sounds good, but maybe that is not the dominant effect. I'll retest on
the assumption that there is a benefit and that something is wrong with
my earlier tests.

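To make the run-length argument concrete, here is a rough back-of-envelope
sketch in Python; it is not taken from the PostgreSQL sort code. It assumes
that replacement selection produces initial runs averaging twice the sort
memory on random input (Knuth's result) and it models a simple balanced
merge with a fixed merge order of 6. The function name and the merge order
are illustrative assumptions, and a polyphase merge would behave somewhat
differently.

```python
def estimate_merge_passes(input_bytes, work_mem_bytes, merge_order=6):
    """Rough estimate of merge passes for an external sort.

    Assumptions (not PostgreSQL's actual logic):
      - initial runs average 2 * work_mem (replacement selection, random input)
      - each pass merges `merge_order` runs into one (balanced merge model)
    """
    if input_bytes <= work_mem_bytes:
        return 0  # fits in memory, no external sort needed

    avg_run_bytes = 2 * work_mem_bytes
    # Integer ceiling division: number of initial runs written to tape.
    runs = -(-input_bytes // avg_run_bytes)

    passes = 0
    while runs > 1:
        runs = -(-runs // merge_order)  # each pass shrinks the run count
        passes += 1
    return passes


if __name__ == "__main__":
    MB = 1024 * 1024
    GB = 1024 * MB
    for mem in (1 * MB, 16 * MB, 1 * GB):
        print(mem // MB, "MB ->", estimate_merge_passes(10 * GB, mem), "passes")
```

In this model a 10 GB sort needs 5 passes with 1 MB, 4 with 16 MB and 1 with
1 GB, so the pass count falls only logarithmically as memory grows. Those
are outputs of the model, not measurements.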
> During the run-merging phase it's possible
> that using less memory would not hurt performance any, but as already
> stated, I don't think it will actually end up cutting the backend's
> memory footprint --- the sbrk point will be established during the run
> forming phase and it's unlikely to move back much until transaction end.
> Also, if I recall the development of that code correctly, the reason for
> using more than minimum memory during the merge phase is that writing or
> reading lots of tuples at once improves sequentiality of access to the
> temp files. So I'm not sure that cutting down the memory wouldn't hurt
> performance.

Cutting memory below about 16 MB definitely does hurt external sort
performance; I attribute that to the effect of sequential access. I
haven't tried to nail down the breakpoint exactly, since it seemed more
important simply to say that there appears to be one. It's just
that raising memory above that mark doesn't help much, according to my
current results.
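
For reference, the heuristic I proposed at the top of this thread could be
sketched roughly as follows. This is Python for illustration only, not a
patch; the function name, the return values and the 16 MB floor (taken from
my rough observation above, not from a measured breakpoint) are all
assumptions.

```python
MB = 1024 * 1024


def choose_sort_memory(predicted_bytes, work_mem_bytes, floor_bytes=16 * MB):
    """Pick a sort strategy and memory budget from the planner's size estimate.

    Hypothetical sketch of the proposal in this thread:
      - if the predicted sort set exceeds twice work_mem, go straight to
        an external sort and drop memory to the lower limit
      - otherwise keep the full work_mem and start as an in-memory sort
    """
    if predicted_bytes > 2 * work_mem_bytes:
        # Never raise memory above what the user configured.
        return ("external", min(work_mem_bytes, floor_bytes))
    return ("in_memory_first", work_mem_bytes)


if __name__ == "__main__":
    print(choose_sort_memory(10 * 1024 * MB, 1024 * MB))  # large sort
    print(choose_sort_memory(100 * MB, 1024 * MB))        # small sort
```

Whether the lower budget actually costs nothing in the run-forming phase is
exactly what the retest needs to establish.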

I'll get some more test results and repost them next week. I will be
very happy if the results show that more memory helps.

Best Regards,
Simon Riggs