Re: Synchronized Scan update - Mailing list pgsql-hackers

From Jeff Davis
Subject Re: Synchronized Scan update
Msg-id 1173119974.13722.277.camel@dogma.v10.wvs
In response to Re: Synchronized Scan update  ("Simon Riggs" <simon@2ndquadrant.com>)
List pgsql-hackers
On Sun, 2007-03-04 at 11:54 +0000, Simon Riggs wrote:
> > (2) sync_scan_offset: Start a new scan this many pages before a
> > currently running scan to take advantage of the pages
> > that are likely already in cache.
> 
> I'm somewhat dubious about this parameter, I have to say, even though I
> am eager for this feature. It seems like a "magic" parameter that works
> only when we have the right knowledge to set it correctly.
> 

That was my concern about this parameter also.

> How will we know what to default it to and how will we know whether to
> set it higher or lower for better performance? Does that value vary
> according to the workload on the system? How?
> 

Perhaps people would only set this parameter when they know it will
help, and for more complex (or varied) usage patterns they'd set
sync_scan_offset to 0 to be safe.
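
To make the parameter concrete, here is a minimal sketch (not the patch's
code) of where a new scan would start relative to a running one. The
function name and arguments are hypothetical, and it assumes the scan wraps
around the end of the relation and that the offset is clamped so it never
exceeds one full pass.

```python
def sync_scan_start_page(current_page: int, offset_pages: int, rel_pages: int) -> int:
    """Return the page where a new scan would start (illustrative only).

    current_page -- page the already-running scan is reading (assumed known)
    offset_pages -- the sync_scan_offset setting, in pages
    rel_pages    -- total number of pages in the relation
    """
    if rel_pages <= 0:
        raise ValueError("rel_pages must be positive")
    # Clamp the offset to [0, rel_pages - 1] so it stays within one pass.
    offset = max(0, min(offset_pages, rel_pages - 1))
    # Start 'offset' pages behind the running scan, wrapping at page 0.
    return (current_page - offset) % rel_pages
```

With sync_scan_offset = 0 the new scan simply joins at the running scan's
current page, which is the "safe" setting mentioned above.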

My thinking on the subject (and this is only backed up by very basic
tests) is that there are basically two situations where setting this
parameter too high can hurt:
(1) The offset is too close to the limits of your physical memory, and the
scans end up diverging when they could have been kept together.
(2) You're using a lot of CPU and the backends aren't processing the
buffers as fast as your I/O system is delivering them. This will prevent
the scans from converging.

If your CPUs are well below capacity and you choose a size significantly
less than your effective cache size, I don't think it will hurt.

> I'm worried that we get a feature that works well on simple tests and
> not at all in real world circumstances. I don't want to cast doubt on
> what could be a great patch or be negative: I just see that the feature
> relies on the dynamic behaviour of the system. I'd like to see some
> further studies on how this works to make sure that we can realistically
> know how to set this knob, that it's the correct knob and it is the
> only one we need.

I will do some better tests on some better hardware this week and next
week. I hope that sheds some light.

> Further thoughts: It sounds like sync_scan_offset is related to
> effective_cache_size. Can you comment on whether that might be
> something we can use as well/instead? (i.e. set the scan offset to say K
> * effective_cache_size, 0.1 <= K <= 0.5)???
> 
> Might we do roughly the same thing with sync_scan_threshold as well, and
> just have enable_sync_scan instead? i.e. sync_scan_threshold =
> effective_cache_size? When would those two parameters not be connected
> directly to each other?
> 

Originally, these parameters were expressed in terms of effective_cache_size.
Somebody else convinced me that it was too confusing to have the
variables dependent on each other, so I made them independent. I don't
have a strong opinion either way.
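
For illustration, the dependent variant suggested above (offset = K *
effective_cache_size, 0.1 <= K <= 0.5) could be sketched as follows. This is
only a sketch of that suggestion, not what the patch does: the function
name, the default K of 0.25, and the assumption that both values are
measured in pages are all hypothetical.

```python
def derived_sync_scan_offset(effective_cache_size_pages: int, k: float = 0.25) -> int:
    """Derive a scan offset (in pages) from effective_cache_size (illustrative only).

    k is the fraction of the cache to use; the 0.1..0.5 range comes from the
    suggestion quoted above, and the default of 0.25 is an arbitrary choice.
    """
    if not 0.1 <= k <= 0.5:
        raise ValueError("k must be between 0.1 and 0.5")
    if effective_cache_size_pages < 0:
        raise ValueError("effective_cache_size_pages must be non-negative")
    # Truncate to a whole number of pages.
    return int(k * effective_cache_size_pages)
```

The trade-off is the one described above: a derived value means one fewer
knob to set, but it makes the parameters dependent on each other.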

Regards,
	Jeff Davis





