On Tue, Dec 17, 2013 at 4:31 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2013-12-16 23:01:16 -0500, Robert Haas wrote:
>> On Sat, Dec 14, 2013 at 12:37 PM, Andres Freund <andres@2ndquadrant.com> wrote:
>> > On 2013-12-14 11:50:00 -0500, Robert Haas wrote:
>> >> Well, it still seems to me that the right way to think about this is
>> >> that the change stream begins at a certain point, and then once you
>> >> cross a certain threshold (all transactions in progress at that time
>> >> have ended) any subsequent snapshot is a possible point from which to
>> >> roll forward.
>> >
>> > Unfortunately it's not possible to build exportable snapshots at any
>> > time - it requires keeping far more state around since we need to care
>> > about all transactions, not just transactions touching the
>> > catalog. Currently you can only export the snapshot in the one point we
>> > become consistent, after that we stop maintaining that state.
>>
>> I don't get it. Once all the old transactions are gone, I don't see
>> why you need any state at all to build an exportable snapshot. Just
>> take a snapshot.
>
> The state we're currently decoding, somewhere in already fsynced WAL,
> won't correspond to the state in the procarray. There might be
> situations where it will, but we can't guarantee that we ever reach that
> point without taking locks that will be problematic.

You don't need to guarantee that. Just take a current snapshot and
then throw away (or don't decode in the first place) any transactions
that would be visible to that snapshot. This is simpler and more
flexible, and possibly more performant, too, because with your design
you'll have to hold back xmin to the historical snapshot you build
while copying the table rather than to a current snapshot.

I really think we should consider whether we can't get by with ripping
out the build-an-exportable-snapshot code altogether. I don't see
that it's really buying us much. We need a way for the client to know
when decoding has reached the point where it is guaranteed complete -
i.e. all transactions in progress at the time decoding was initiated
have ended. We also need a way for a backend performing decoding to
take a current MVCC snapshot, export it, and send the identifier to
the client. And we need a way for the client to know whether any
given one of those snapshots includes a particular XID we may have
decoded. But I think all of that might still be simpler than what you
have now, and it's definitely more flexible.
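
To make the pieces above concrete, here is a rough sketch in Python
rather than backend C. Everything in it is hypothetical: Snapshot,
DecodingStart and changes_to_replay are names made up for illustration
and are not PostgreSQL's SnapshotData or any existing API. It assumes
XIDs are plain increasing integers (no wraparound), ignores
subtransactions, and treats "ended" as either commit or abort.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Simplified MVCC snapshot (illustrative model only)."""
    xmin: int                      # every XID below this had ended at snapshot time
    xmax: int                      # every XID at or above this had not started yet
    xip: frozenset = frozenset()   # XIDs in [xmin, xmax) still in progress

    def includes(self, xid: int) -> bool:
        """True if xid had already ended from this snapshot's point of view."""
        if xid < self.xmin:
            return True
        if xid >= self.xmax:
            return False
        return xid not in self.xip


class DecodingStart:
    """Tracks the transactions that were in progress when decoding was initiated."""

    def __init__(self, running_at_start):
        self.pending = set(running_at_start)

    def transaction_ended(self, xid: int) -> None:
        # discard() is a no-op for XIDs that started after decoding began.
        self.pending.discard(xid)

    @property
    def complete(self) -> bool:
        """True once every transaction running at start has ended."""
        return not self.pending


def changes_to_replay(decoded_xids, snapshot: Snapshot):
    """Drop decoded transactions whose effects the snapshot already includes."""
    return [xid for xid in decoded_xids if not snapshot.includes(xid)]
```

With a snapshot of xmin=100, xmax=110 and 103 and 107 in progress, the
decoded XIDs 98 and 105 would be thrown away because the table copy
taken under that snapshot already reflects them, while 103, 107 and
112 would be kept and applied on top of the copy.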
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company