Re: Merge algorithms for large numbers of "tapes" - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Merge algorithms for large numbers of "tapes"
Date
Msg-id 1141835704.27729.749.camel@localhost.localdomain
In response to Re: Merge algorithms for large numbers of "tapes"  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Wed, 2006-03-08 at 10:21 -0500, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > 1. Earlier we had some results that showed that the heapsorts got slower
> > when work_mem was higher and that concerns me most of all right now.
> 
> Fair enough, but that's completely independent of the merge algorithm.
> (I don't think the Nyberg results necessarily apply to our situation
> anyway, as we are not sorting arrays of integers, and hence the cache
> effects are far weaker for us.  I don't mind trying alternate sort
> algorithms, but I'm not going to believe an improvement in advance of
> direct evidence in our own environment.)

Of course, this would be prototyped first... and I agree that those
results may not carry over to our situation.
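
As a very small illustration of why the Nyberg numbers may not carry over
(this is not code from the server, just a standalone toy; FakeTuple and
everything else in it are made up): a tuplesort compares through pointers
into scattered memory rather than through a flat integer array, so each
comparison can miss the cache no matter how large work_mem is.

/* Standalone illustration only; nothing here comes from the backend. */
#include <stdlib.h>

/* Comparing ints reads only the contiguous array: cache-friendly. */
static int
cmp_int(const void *a, const void *b)
{
    int ia = *(const int *) a;
    int ib = *(const int *) b;

    return (ia > ib) - (ia < ib);
}

typedef struct
{
    int  key;
    char payload[128];    /* stand-in for the rest of a tuple */
} FakeTuple;

/* Comparing tuple pointers dereferences into scattered memory. */
static int
cmp_tuple_ptr(const void *a, const void *b)
{
    const FakeTuple *ta = *(FakeTuple *const *) a;
    const FakeTuple *tb = *(FakeTuple *const *) b;

    return (ta->key > tb->key) - (ta->key < tb->key);
}

int
main(void)
{
    enum { N = 100000 };
    static int ints[N];
    static FakeTuple tuples[N];
    static FakeTuple *ptrs[N];
    int i;

    for (i = 0; i < N; i++)
    {
        ints[i] = rand();
        tuples[i].key = rand();
        ptrs[i] = &tuples[i];
    }

    qsort(ints, N, sizeof(int), cmp_int);                /* integer-array case */
    qsort(ptrs, N, sizeof(FakeTuple *), cmp_tuple_ptr);  /* tuple-pointer case */
    return 0;
}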

> > 2. Improvement in the way we do overall memory allocation, so we would
> > not have the problem of undersetting work_mem that we currently
> > experience. If we solved this problem we would have faster sorts in
> > *all* cases, not just extremely large ones. Dynamically setting work_mem
> > higher when possible would be very useful.
> 
> I think this would be extremely dangerous, as it would encourage
> processes to take more than their fair share of available resources.

Fair share is the objective. I was trying to describe the general case
so we could discuss a solution that would allow a dynamic approach
rather than the static one we have now.

We want to handle these cases: "How much should we allocate when..."
A. we have the predicted number of users
B. we have a busy system - more than the predicted number of users
C. we have a quiet system - fewer than the predicted number of users

In B and C we have to be careful not to under- or over-allocate resources
only to find that the situation changes immediately afterwards.
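
To make the dynamic idea concrete, here is a rough sketch, not taken from
any patch: the pool size, variable names and function below are all
hypothetical, and a real backend would need locking and periodic
re-checking around any shared counter. It only shows how a per-sort
allocation could be derived from a shared budget and the number of
backends currently sorting, covering A, B and C above.

/*
 * Hypothetical sketch only: nothing here exists in the PostgreSQL sources.
 * total_sort_mem_kb stands in for a shared memory budget and the argument
 * to choose_work_mem_kb() for a count of backends currently sorting.
 */
#include <stdio.h>

static int total_sort_mem_kb = 1048576;   /* shared budget, e.g. 1 GB */
static int predicted_backends = 16;       /* case A: the load we planned for */
static int min_work_mem_kb = 1024;        /* floor of 1 MB per sort */

/* Pick a per-sort allocation given how many backends are sorting right now. */
static int
choose_work_mem_kb(int active_sort_backends)
{
    int share;

    if (active_sort_backends < 1)
        active_sort_backends = 1;

    /*
     * Case B (busy): dividing by a larger count gives each sort less than
     * the static setting would. Case C (quiet): dividing by a smaller
     * count lets each sort use more, up to the whole budget.
     */
    share = total_sort_mem_kb / active_sort_backends;

    return (share > min_work_mem_kb) ? share : min_work_mem_kb;
}

int
main(void)
{
    printf("A, predicted load: %d kB per sort\n",
           choose_work_mem_kb(predicted_backends));
    printf("B, busy system:    %d kB per sort\n", choose_work_mem_kb(64));
    printf("C, quiet system:   %d kB per sort\n", choose_work_mem_kb(4));
    return 0;
}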

In many cases static allocation is actually essential, since you may be
more interested in guaranteeing a conservative run time than in producing
occasional/unpredictable bursts of speed. But in many cases people want
certain tasks to go faster when it's quiet and slower when it's not.

> Also, to the extent that you believe the problem is insufficient L2
> cache, it seems increasing work_mem to many times the size of L2 will
> always be counterproductive.  

Sorry for the confusion: (1) and (2) were completely separate, so no
interaction between L2 cache and memory allocation was intended.

> (Certainly there is no value in increasing
> work_mem until we are in a regime where it consistently improves
> performance significantly, which it seems we aren't yet.)

Very much agreed.

Best Regards, Simon Riggs


