Re: Synchronized Scan update - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Synchronized Scan update
Date
Msg-id 1173009271.3760.1772.camel@silverbirch.site
Whole thread Raw
In response to Re: Synchronized Scan update  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: Synchronized Scan update  ("Joshua D. Drake" <jd@commandprompt.com>)
Re: Synchronized Scan update  (Jeff Davis <pgsql@j-davis.com>)
List pgsql-hackers
On Fri, 2007-03-02 at 15:03 -0800, Jeff Davis wrote:
> Is there any consensus about whether to include these two parameters as
> GUCs or constants if my patch is to be accepted?
> 
> (1) sync_scan_threshold: Use synchronized scanning for tables greater
> than this many pages; smaller tables will not be affected.

That sounds OK.

> (2) sync_scan_offset: Start a new scan this many pages before a
> currently running scan to take advantage of the pages
>  that are likely already in cache.

I'm somewhat dubious about this parameter, I have to say, even though I
am eager for this feature. It seems like a "magic" parameter that works
only when we have the right knowledge to set it correctly.

How will we know what to default it to and how will we know whether to
set it higher or lower for better performance? Does that value vary
according to the workload on the system? How?

I'm worried that we get a feature that works well on simple tests and
not at all in real world circumstances. I don't want to cast doubt on
what could be a great patch or be negative: I just see that the feature
relies on the dynamic behaviour of the system. I'd like to see some
further studies on how this works to make sure that we can realistically
set know how to set this knob, that its the correct knob and it is the
only one we need.

Further thoughts: It sounds like sync_scan_offset is related to
effective_cache_size. Can you comment on whether that might be a
something we can use as well/instead? (i.e. set the scan offset to say K
* effective_cache_size, 0.1 <= K <= 0.5)???

Might we do roughly the same thing with sync_scan_threshold as well, and
just have enable_sync_scan instead? i.e. sync_scan_threshold =
effective_cache_size? When would those two parameters not be connected
directly to each other?

--  Simon Riggs              EnterpriseDB   http://www.enterprisedb.com




pgsql-hackers by date:

Previous
From: Andrew - Supernews
Date:
Subject: Re: ERROR: operator does not exist: integer !=- integer
Next
From: Hannu Krosing
Date:
Subject: Re: UPSERT