Re: cheaper snapshots redux - Mailing list pgsql-hackers

From Jim Nasby
Subject Re: cheaper snapshots redux
Date
Msg-id F995A034-1AFC-482E-A7AC-AEDF0880CC2A@nasby.net
In response to cheaper snapshots redux  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: cheaper snapshots redux
List pgsql-hackers
On Aug 22, 2011, at 4:25 PM, Robert Haas wrote:
> What I'm thinking about
> instead is using a ring buffer with three pointers: a start pointer, a
> stop pointer, and a write pointer.  When a transaction ends, we
> advance the write pointer, write the XIDs or a whole new snapshot into
> the buffer, and then advance the stop pointer.  If we wrote a whole
> new snapshot, we advance the start pointer to the beginning of the
> data we just wrote.
>
> Someone who wants to take a snapshot must read the data between the
> start and stop pointers, and must then check that the write pointer
> hasn't advanced so far in the meantime that the data they read might
> have been overwritten before they finished reading it.  Obviously,
> that's a little risky, since we'll have to do the whole thing over if
> a wraparound occurs, but if the ring buffer is large enough it
> shouldn't happen very often.  And a typical snapshot is pretty small
> unless massive numbers of subxids are in use, so it seems like it
> might not be too bad.  Of course, it's pretty hard to know for sure
> without coding it up and testing it.
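
To check my reading of the three-pointer scheme, here is a minimal single-process sketch. It is in Python for brevity rather than C, and every name in it (SnapshotRing, end_transaction, write_full_snapshot, read_range, try_read) is made up for illustration. It assumes the pointers are ever-increasing positions that get reduced modulo the buffer size, and it ignores the atomic pointer updates and memory barriers a real shared-memory version would need.

```python
class SnapshotRing:
    """Toy model of a ring buffer with start, stop and write pointers."""

    def __init__(self, size):
        self.size = size
        self.buf = [0] * size
        # Positions only ever grow; the slot used is position % size.
        self.start = 0   # beginning of the most recent full snapshot
        self.stop = 0    # end of the data that is safe to read
        self.write = 0   # end of the data that has been reserved

    def _append(self, items):
        if len(items) > self.size:
            raise ValueError("data does not fit in the ring")
        begin = self.write
        self.write = begin + len(items)          # 1. advance write pointer
        for i, item in enumerate(items):         # 2. write the data
            self.buf[(begin + i) % self.size] = item
        self.stop = self.write                   # 3. advance stop pointer
        return begin

    def end_transaction(self, xid):
        """Record one ended XID after the current snapshot."""
        self._append([xid])

    def write_full_snapshot(self, running_xids):
        """Write a whole new snapshot and move start to where it begins."""
        self.start = self._append(list(running_xids))

    def read_range(self, start, stop):
        """Copy [start, stop), then verify it cannot have been overwritten."""
        data = [self.buf[pos % self.size] for pos in range(start, stop)]
        # If write has moved more than one buffer length past start, the
        # slots we copied may have been reused, so the caller must retry.
        if self.write - start > self.size:
            return None
        return data

    def try_read(self):
        """Take a snapshot using the current start and stop pointers."""
        return self.read_range(self.start, self.stop)
```

A reader that gets None back simply calls try_read() again. Note that the check compares against the write pointer rather than the stop pointer, because a writer reserves space before it fills it in. The sketch also assumes writers emit a full snapshot often enough that the span from start to stop never exceeds the buffer size.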

Something that would be really nice to fix is our reliance on a fixed size of shared memory, and I'm wondering if this
could be an opportunity to start in a new direction. My thought is that we could maintain two distinct shared memory
snapshots and alternate between them. That would allow us to actually resize them as needed. We would still need
something like what you suggest to allow for adding to the list without locking, but with this scheme we wouldn't need
to worry about extra locking when taking a snapshot since we'd be doing that in a new segment that no one is using yet.
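
Very roughly, and only as a single-process illustration, the alternating idea might look like the following. Python lists stand in for shared memory segments, and AlternatingSnapshots, publish and read are hypothetical names, not anything that exists in the tree.

```python
class AlternatingSnapshots:
    """Toy model of two snapshot segments that take turns being current."""

    def __init__(self):
        self.segments = [[], []]  # stand-ins for two shared memory segments
        self.active = 0           # index of the segment readers should use

    def publish(self, running_xids):
        """Build the new snapshot in the idle segment, then switch to it."""
        spare = 1 - self.active
        # The idle segment is rebuilt from scratch, so it can be any size.
        self.segments[spare] = list(running_xids)
        # Readers only see the new segment after this single switch.
        self.active = spare

    def read(self):
        """Return a copy of the snapshot in the active segment."""
        return list(self.segments[self.active])
```

What this leaves out is the hard part: before a segment is reused, something has to establish that no backend is still reading it, and the switch itself has to be atomic with respect to readers.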

The downside is such a scheme does add non-trivial complexity on top of what you proposed. I suspect it would be much
better if we had a separate mechanism for dealing with shared memory requirements (shalloc?). But if it's just not
practical to make a generic shared memory manager it would be good to start thinking about ways we can work around fixed
shared memory size issues.
--
Jim C. Nasby, Database Architect                   jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net



