Re: cheaper snapshots redux - Mailing list pgsql-hackers
From: Robert Haas
Subject: Re: cheaper snapshots redux
Msg-id: CA+TgmoZHt4iGyVp3vxOTOi8ev4J_faWZNQ-3OXgcabed7uFaDA@mail.gmail.com
In response to: Re: cheaper snapshots redux (Jim Nasby <jim@nasby.net>)
List: pgsql-hackers
On Mon, Aug 22, 2011 at 6:45 PM, Jim Nasby <jim@nasby.net> wrote:
> Something that would be really nice to fix is our reliance on a fixed size of shared memory, and I'm wondering if this could be an opportunity to start in a new direction. My thought is that we could maintain two distinct shared memory snapshots and alternate between them. That would allow us to actually resize them as needed. We would still need something like what you suggest to allow for adding to the list without locking, but with this scheme we wouldn't need to worry about extra locking when taking a snapshot, since we'd be doing that in a new segment that no one is using yet.
>
> The downside is such a scheme adds non-trivial complexity on top of what you proposed. I suspect it would be much better if we had a separate mechanism for dealing with shared memory requirements (shalloc?). But if it's just not practical to make a generic shared memory manager, it would be good to start thinking about ways we can work around fixed shared memory size issues.

Well, the system I'm proposing is actually BETTER than having two distinct shared memory snapshots. For example, right now we cache up to 64 subxids per backend. I'm imagining that going away and using that memory for the ring buffer. Out of the box, that would imply a ring buffer of 64 * 103 = 6592 slots. If the average snapshot lists 100 XIDs, you could rewrite the snapshot dozens of times before the buffer wraps around, which is obviously a lot more than two. Even if subtransactions are being heavily used and each snapshot lists 1000 XIDs, you still have enough space to rewrite the snapshot several times over before wraparound occurs. Of course, at some point the snapshot gets too big and you have to switch to retaining only the toplevel XIDs, which is more or less the equivalent of what happens under the current implementation when any single transaction's subxid cache overflows.
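The sizing arithmetic above can be sketched as follows. This is a hypothetical illustration, not PostgreSQL code: the macro names and the backend-slot count of 103 are assumptions taken from the numbers in the message (64 cached subxids per backend, 64 * 103 = 6592 slots), and integer division gives the number of complete snapshot rewrites that fit before wraparound.

```c
#include <assert.h>

/* Hypothetical constants, matching the figures in the message:
 * 64 subxid cache slots per backend, 103 backend slots in a
 * default build (assumed, not taken from actual source). */
#define SUBXID_SLOTS_PER_BACKEND 64
#define BACKEND_SLOTS            103

static int ring_buffer_slots(void)
{
    return SUBXID_SLOTS_PER_BACKEND * BACKEND_SLOTS;    /* 6592 */
}

/* How many complete snapshots of a given size can be written
 * before the ring buffer wraps around. */
static int snapshots_before_wrap(int xids_per_snapshot)
{
    return ring_buffer_slots() / xids_per_snapshot;
}
```

With 100 XIDs per snapshot this gives 65 rewrites before wraparound ("dozens of times"); with 1000 XIDs it still gives 6, several times over and well more than the two copies a double-buffering scheme provides.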
With respect to a general-purpose shared memory allocator, I think that there are cases where that would be useful to have, but I don't think there are as many of them as many people seem to think. I wouldn't choose to implement this using a general-purpose allocator even if we had it, both because it's undesirable to allow this or any subsystem to consume an arbitrary amount of memory (nor can it fail... especially in the abort path) and because a ring buffer is almost certainly faster than a general-purpose allocator. We have enough trouble with palloc overhead already.

That having been said, I do think there are cases where it would be nice to have... and it wouldn't surprise me if I end up working on something along those lines in the next year or so. It turns out that memory management is a major issue in lock-free programming; you can't assume that it's safe to recycle an object once the last pointer to it has been removed from shared memory - because someone may have fetched the pointer just before you removed it and still be using it to examine the object. An allocator with some built-in capabilities for handling such problems seems like it might be very useful....

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
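The reclamation hazard described in the message (an object cannot be freed the moment the last shared pointer is cleared, because a reader may have fetched that pointer just beforehand) can be sketched with one of the standard remedies, a reference count. This is a minimal C11 illustration of the general technique, not anything from PostgreSQL; all names here are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <stdatomic.h>

/* A shared object whose lifetime is governed by a reference count.
 * Unlinking it from shared memory drops one reference; a reader that
 * acquired it beforehand still holds another, so the free is deferred
 * until the reader is done. */
typedef struct
{
    atomic_int  refcount;
    int         payload;
} shared_obj;

static shared_obj *
obj_create(int payload)
{
    shared_obj *o = malloc(sizeof(shared_obj));

    atomic_init(&o->refcount, 1);   /* the reference held by shared memory */
    o->payload = payload;
    return o;
}

/* A reader takes a reference before examining the object. */
static shared_obj *
obj_acquire(shared_obj *o)
{
    atomic_fetch_add(&o->refcount, 1);
    return o;
}

/* Drop a reference; free the object only when the last one goes away.
 * Returns 1 if the object was actually freed. */
static int
obj_release(shared_obj *o)
{
    if (atomic_fetch_sub(&o->refcount, 1) == 1)
    {
        free(o);
        return 1;
    }
    return 0;
}
```

The point of the sketch is the deferred free: removing the object from shared memory calls obj_release() but does not free it while a concurrent reader still holds a reference. Hazard pointers and epoch-based reclamation solve the same problem without per-access atomic updates, which is the kind of built-in capability such an allocator might provide.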