Re: old synchronized scan patch - Mailing list pgsql-hackers

From Jeff Davis
Subject Re: old synchronized scan patch
Date
Msg-id 1165444056.2048.45.camel@dogma.v10.wvs
Whole thread Raw
In response to Re: old synchronized scan patch  ("Jim C. Nasby" <jim@nasby.net>)
List pgsql-hackers
On Wed, 2006-12-06 at 12:48 -0600, Jim C. Nasby wrote:
> On Tue, Dec 05, 2006 at 09:09:39AM -0800, Jeff Davis wrote:
> > That being said, I can just lock the hint table (the shared memory hint
> > table, not the relation) and only update the hint every K pages, as Niel
> > Conway suggested when I first proposed it. If we find a K small enough
> > so the feature is useful, but large enough that we're sure there won't
> > be contention, this might be a good option. However, I don't know that
> > we would eliminate the contention, because if K is a constant (rather
> > than random), the backends would still all want to update that shared
> > memory table at the same time.
> 
> What about some algorithm where only one backend will update the hint
> entry (perhaps the first one, or the slowest one (ie: lowest page
> number))? ISTM that would eliminate a lot of contention, and if you get
> clever with the locking scheme you could probably allow other backends
> to do non-blocking reads except when the page number passes a 4-byte
> value (assuming 4-byte int updates are atomic).
> 

If we have one backend in charge, how does it pass the torch when it
finishes the scan? I think you're headed back in the direction of an
independent "scanning" process. That's not unreasonable, but there are a
lot of other issues to deal with.

One thought of mine goes something like this: A scanner process starts
up and scans with a predefined length of a cache trail in the
shared_buffers, perhaps a chunk of buffers used like a circular list (so
it doesn't interfere with caching). When a new scan starts, it could
request a block from this scanner process and begin the scan there. If
the new scan keeps up with the scanner process, it will always be
getting cached data. If it falls behind, the request turns into a new
block request. In theory, the scan could actually catch back up to the
scanner process after falling behind.

We could use a meaningful event (like activity on a particular relation)
to start/stop the scanner process.

It's just another idea, but I'm still not all that sure that
synchronization is necessary.

Does anyone happen to have an answer on whether OS-level readahead is
system-wide, or per-process? I expect that it's system wide, but Tom
raised the issue and it may be a drawback if some OSs do per-process
readahead.

Regards,Jeff Davis




pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: [BUGS] 8.2 bug with outer join reordering
Next
From: Jeff Davis
Date:
Subject: Re: old synchronized scan patch