Re: Synchronized Scan update - Mailing list pgsql-hackers
From | Jeff Davis |
---|---|
Subject | Re: Synchronized Scan update |
Date | |
Msg-id | 1173746777.23455.90.camel@dogma.v10.wvs |
In response to | Re: Synchronized Scan update ("Simon Riggs" <simon@2ndquadrant.com>) |
Responses |
Re: Synchronized Scan update
Re: Synchronized Scan update |
List | pgsql-hackers |
On Mon, 2007-03-12 at 13:21 +0000, Simon Riggs wrote:
> So based on those thoughts, sync_scan_offset should be fixed at 16,
> rather than being variable. In addition, ss_report_loc() should only
> report its position every 16 blocks, rather than do this every time,
> which will reduce overhead of this call.

If we fix sync_scan_offset at 16, we might as well just get rid of it.
Sync scans are only useful on large tables, and getting a free 16 pages
over a scan isn't worth the trouble. However, even without
sync_scan_offset, sync scans are still a valuable feature.

I agree that ss_report_loc() doesn't need to report on every call. If
there's any significant overhead I agree that it should report less
often. Do you think that the overhead is significant on such a simple
function?

>
> To match that, scan_recycle_buffers should be fixed at 32. So GUCs for
> sync_scan_offset and scan_recycle_buffers would not be required at all.
>
> IMHO we can also remove sync_scan_threshold and just use NBuffers
> instead. That way we get the benefit of both patches or neither, making
> it easier to understand what's going on.

I like the idea of reducing tuning parameters, but we should, at a
minimum, still allow an on/off button for sync scans. My tests revealed
that the wrong combination of OS/FS/IO-Scheduler/Controller could result
in bad I/O behavior.

> If need be, the value of scan_recycle_buffers can be varied upwards
> should the scans drift apart, as a way of bringing them back together.

If the scans aren't being brought together, that means that one of the
scans is CPU bound or outside the combined cache trail (shared_buffers
+ OS buffer cache).

> We aren't tracking whether they are together or apart, so I would like
> to see some debug output from synch scans to allow us to assess how far
> behind the second scan is as it progresses. e.g.
> LOG: synch scan currently on block N, trailing pathfinder by M blocks
> issued every 128 blocks as we go through the scans.
>
> Thoughts?
>

It's hard to track where all the scans are currently. One of the
advantages of my patch is its simplicity: the scans don't need to know
about other specific scans, and there is no concept in the code of a
"head" scan or a "pack". There is no easy way to tell which scan is
ahead and which is behind. There was a discussion when I submitted this
proposal at the beginning of 8.3, but I didn't see enough benefit to
justify all of the costs and risks associated with scans communicating
with each other.

I certainly can't implement that kind of thing before feature freeze,
and I think there's a risk of lock contention for the communication
required. I'm also concerned that -- if the scans are too
interdependent -- it would make postgres less robust against the
disappearance of a single backend (i.e. what if the backend that is
leading a scan dies?).

Regards,
	Jeff Davis
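
For illustration only, here is a minimal sketch in C of the two ideas discussed above: a single shared location hint that no scan "owns" (so there is no head scan or pack), and a report call that writes the hint only every 16th block. This is not code from the patch. The names `SyncScanHint`, `sketch_start_block`, `sketch_report_loc`, `sketch_scan_table` and `SKETCH_REPORT_INTERVAL` are hypothetical, locking is omitted, and the interval of 16 is simply the value suggested in the quoted text.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: report position only on every 16th block. */
#define SKETCH_REPORT_INTERVAL 16

typedef uint32_t BlockNumber;

/*
 * Hypothetical shared hint for one table. Any scan may overwrite it;
 * nothing records which scan wrote it, so no scan is a "leader".
 */
typedef struct SyncScanHint
{
    BlockNumber location;   /* last reported block number */
    unsigned    reports;    /* number of writes, kept for illustration */
} SyncScanHint;

/* A new scan starts wherever the hint currently points. */
static BlockNumber
sketch_start_block(const SyncScanHint *hint, BlockNumber nblocks)
{
    assert(nblocks > 0);
    return hint->location % nblocks;
}

/* Update the hint, but skip all blocks that are not on the interval. */
static void
sketch_report_loc(SyncScanHint *hint, BlockNumber blkno)
{
    if (blkno % SKETCH_REPORT_INTERVAL != 0)
        return;
    hint->location = blkno;
    hint->reports++;
}

/*
 * Visit every block exactly once, starting at the hint and wrapping
 * around at the end of the table. Returns the number of blocks visited.
 */
static unsigned
sketch_scan_table(SyncScanHint *hint, BlockNumber nblocks)
{
    BlockNumber start = sketch_start_block(hint, nblocks);
    unsigned    visited = 0;

    for (BlockNumber i = 0; i < nblocks; i++)
    {
        BlockNumber blkno = (start + i) % nblocks;

        /* a real scan would read block blkno here */
        sketch_report_loc(hint, blkno);
        visited++;
    }
    return visited;
}
```

Because the hint carries no identity, a backend that disappears mid-scan leaves nothing behind to clean up: the next scan just starts from the last reported block.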