Re: Synchronized Scan update - Mailing list pgsql-hackers

From Jeff Davis
Subject Re: Synchronized Scan update
Date
Msg-id 1173746777.23455.90.camel@dogma.v10.wvs
In response to Re: Synchronized Scan update  ("Simon Riggs" <simon@2ndquadrant.com>)
List pgsql-hackers
On Mon, 2007-03-12 at 13:21 +0000, Simon Riggs wrote:
> So based on those thoughts, sync_scan_offset should be fixed at 16,
> rather than being variable. In addition, ss_report_loc() should only
> report its position every 16 blocks, rather than do this every time,
> which will reduce overhead of this call.

If we fix sync_scan_offset at 16, we might as well just get rid of it.
Sync scans are only useful on large tables, and getting a free 16 pages
over a scan isn't worth the trouble. However, even without
sync_scan_offset, sync scans are still a valuable feature.

I agree that ss_report_loc() doesn't need to report on every call. If
there's any significant overhead, it should report less often. Do you
think that the overhead is significant for such a simple function?
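
For illustration, here is a minimal sketch of what reporting every 16 blocks could look like. It is not code from the patch: ss_report_loc_throttled, SS_REPORT_INTERVAL, ss_shared_hint and ss_report_count are hypothetical names, and the shared hint is modelled as a plain variable rather than shared memory.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names throughout; this is not the patch's actual code. */
#define SS_REPORT_INTERVAL 16   /* the interval suggested in the quoted text */

typedef uint32_t BlockNumber;

/* Stand-in for the shared position hint that a newly starting scan reads. */
static BlockNumber ss_shared_hint = 0;
static unsigned    ss_report_count = 0;  /* counts writes, for illustration */

/* Publish the scan position only on every SS_REPORT_INTERVAL-th block. */
void
ss_report_loc_throttled(BlockNumber blkno)
{
    /* Guard the modulo below against a zero interval. */
    assert(SS_REPORT_INTERVAL > 0);

    if (blkno % SS_REPORT_INTERVAL != 0)
        return;                 /* cheap early exit on most calls */

    /* In a real backend, any locking cost would be paid here. */
    ss_shared_hint = blkno;
    ss_report_count++;
}
```

With this shape, most calls cost one modulo and one branch, so the question becomes whether the unthrottled write is expensive enough to justify the added staleness of the hint.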

> 
> To match that, scan_recycle_buffers should be fixed at 32. So GUCs for
> sync_scan_offset and scan_recycle_buffers would not be required at all.
> 
> IMHO we can also remove sync_scan_threshold and just use NBuffers
> instead. That way we get the benefit of both patches or neither, making
> it easier to understand what's going on.

I like the idea of reducing tuning parameters, but we should, at a
minimum, still allow an on/off button for sync scans. My tests revealed
that the wrong combination of OS/FS/IO-Scheduler/Controller could result
in bad I/O behavior.

> If need be, the value of scan_recycle_buffers can be varied upwards
> should the scans drift apart, as a way of bringing them back together.

If the scans aren't being brought together, that means that one of the
scans is CPU bound or outside the combined cache trail (shared_buffers
+ OS buffer cache). 

> We aren't tracking whether they are together or apart, so I would like
> to see some debug output from synch scans to allow us to assess how far
> behind the second scan is as it progresses. e.g.
> LOG:  synch scan currently on block N, trailing pathfinder by M blocks
> issued every 128 blocks as we go through the scans. 
> 
> Thoughts?
> 

It's hard to track where all the scans are currently. One of the
advantages of my patch is its simplicity: the scans don't need to know
about other specific scans, and there is no concept in the code of a
"head" scan or a "pack".

There is no easy way to tell which scan is ahead and which is behind.
There was a discussion when I submitted this proposal at the beginning
of 8.3, but I didn't see enough benefit to justify all of the costs and
risks associated with scans communicating with each other. I
certainly can't implement that kind of thing before feature freeze, and
I think there's a risk of lock contention for the communication
required. I'm also concerned that -- if the scans are too
interdependent -- it would make postgres less robust against the
disappearance of a single backend (i.e. what if the backend that is
leading a scan dies?).
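
The design described above, where scans share only a position hint and know nothing about each other, can be sketched roughly as follows. This is an assumption-laden illustration, not the patch itself: SyncScanHint, ss_start_block, ss_next_block and ss_publish are hypothetical names, and locking and shared-memory placement are left out.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t BlockNumber;

/* Hypothetical per-table hint; a real one would live in shared memory. */
typedef struct SyncScanHint
{
    BlockNumber last_reported;  /* most recent position any scan published */
} SyncScanHint;

/* A new scan starts wherever the hint points, without knowing who wrote it. */
BlockNumber
ss_start_block(const SyncScanHint *hint, BlockNumber nblocks)
{
    assert(nblocks > 0);
    return hint->last_reported % nblocks;
}

/* Advance with wraparound so the scan still visits every block once. */
BlockNumber
ss_next_block(BlockNumber current, BlockNumber nblocks)
{
    assert(nblocks > 0);
    return (current + 1) % nblocks;
}

/* Any scan may overwrite the hint; there is no head scan and no pack. */
void
ss_publish(SyncScanHint *hint, BlockNumber blkno)
{
    hint->last_reported = blkno;
}
```

Because the hint is only advisory, a backend that disappears mid-scan leaves behind a stale position at worst; no other scan is waiting on it.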

Regards,
Jeff Davis




