Re: Proposal for CSN based snapshots - Mailing list pgsql-hackers

From Markus Wanner
Subject Re: Proposal for CSN based snapshots
Msg-id 51B6C97B.1070000@bluegap.ch
In response to Re: Proposal for CSN based snapshots  (Ants Aasma <ants@cybertec.at>)
List pgsql-hackers
Ants,

the more I think about this, the more I start to like it.

On 06/07/2013 02:50 PM, Ants Aasma wrote:
> On Fri, Jun 7, 2013 at 2:59 PM, Markus Wanner <markus@bluegap.ch> wrote:
>> Agreed. Postgres-R uses a CommitOrderId, which is very similar in
>> concept, for example.
> 
> Do you think having this snapshot scheme would be helpful for Postgres-R?

Yeah, it could help to reduce patch size, after a rewrite to use such a CSN.

>> Or why do you need to tell apart aborted from in-progress transactions
>> by CSN?
> 
> I need to detect aborted transactions so they can be discarded during
> the eviction process, otherwise the sparse array will fill up. They
> could also be filtered out by cross-referencing uncommitted slots with
> the procarray. Having the abort case do some additional work to make
> xid assignment cheaper looks like a good tradeoff.

I see.

>>> Sparse buffer needs to be at least big enough to fit CSN slots for the
>>> xids of all active transactions and non-overflowed subtransactions. At
>>> the current level PGPROC_MAX_CACHED_SUBXIDS=64, the minimum comes out
>>> at 16 bytes * (64 + 1) slots * 100 backends = 101.6KB per buffer,
>>> or 203KB total in the default configuration.
>>
>> A CSN is 8 bytes, the XID 4, resulting in 12 bytes per slot. So I guess
>> the given 16 bytes includes alignment to 8 byte boundaries. Sounds good.
> 
> 8 byte alignment for CSNs is needed for atomic access, if nothing else.

Oh, right, atomic writes.
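
For reference, the sizing arithmetic quoted above can be checked with a few lines of Python. This is only a sketch of the numbers in this thread, not code from any patch: the constant names other than PGPROC_MAX_CACHED_SUBXIDS are made up for illustration, and the count of two buffers is inferred from the quoted total being twice the per-buffer figure.

```python
# Sketch of the slot and buffer sizing discussed above.
# All names except PGPROC_MAX_CACHED_SUBXIDS are hypothetical.

CSN_BYTES = 8                    # a CSN is 8 bytes
XID_BYTES = 4                    # an XID is 4 bytes
ALIGNMENT = 8                    # CSNs must sit on 8-byte boundaries
PGPROC_MAX_CACHED_SUBXIDS = 64   # value given in the quoted text
BACKENDS = 100                   # backend count used in the quoted text
BUFFERS = 2                      # assumption: inferred from the quoted total


def align_up(size, alignment):
    """Round size up to the next multiple of alignment."""
    return (size + alignment - 1) // alignment * alignment


# 12 bytes of payload, padded to 16 so every CSN stays 8-byte aligned.
slot_bytes = align_up(CSN_BYTES + XID_BYTES, ALIGNMENT)

# One slot for the top-level xid plus one per cached subtransaction.
slots_per_backend = PGPROC_MAX_CACHED_SUBXIDS + 1

buffer_bytes = slot_bytes * slots_per_backend * BACKENDS
total_bytes = buffer_bytes * BUFFERS

print(f"slot: {slot_bytes} bytes")
print(f"per buffer: {buffer_bytes / 1024:.1f} KB")
print(f"total: {total_bytes / 1024:.1f} KB")
```

The 4 bytes of padding per slot are the price of keeping each CSN aligned for atomic access.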

> I think the size could be cut in half by using a base value for CSNs
> if we assume that no xid is active for longer than 2B transactions as
> is currently the case. I didn't want to include the complication in
> the first iteration, so I didn't verify if that would have any
> gotchas.

In Postgres-R, I effectively used a 32-bit order id which wraps around.

In this case, I guess adjusting the base value will get tricky. Wrapping
could probably be used as well, instead.
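
To illustrate what I mean by wrapping: a 32-bit counter can still be ordered correctly across a wraparound as long as any two live values are less than 2^31 apart, which matches the "no xid active for longer than 2B transactions" assumption above. The following is a minimal sketch of that comparison, not the actual Postgres-R code; the function names are hypothetical.

```python
# Sketch of ordering 32-bit ids that wrap around.
# Function names are hypothetical, chosen for illustration only.

MASK32 = 0xFFFFFFFF
HALF_RANGE = 0x80000000  # 2**31


def next_id(current):
    """Advance a 32-bit id, wrapping from 2**32 - 1 back to 0."""
    return (current + 1) & MASK32


def precedes(a, b):
    """Return True if id a is logically before id b.

    Only valid while a and b are less than 2**31 apart. The difference
    is taken modulo 2**32; a result in the upper half of the range
    corresponds to a negative signed 32-bit difference.
    """
    diff = (a - b) & MASK32
    return diff >= HALF_RANGE


before_wrap = 0xFFFFFFF0
after_wrap = 5  # assigned after the counter wrapped

print(precedes(before_wrap, after_wrap))  # the older id still sorts first
print(precedes(after_wrap, before_wrap))
```

With this kind of comparison there is no base value to adjust, but nothing may stay referenced for more than 2^31 ids.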

> The number of times each cache line can be invalidated is
> bounded by 8.

Hm.. good point.

Regards

Markus Wanner


