Re: [HACKERS] logical decoding of two-phase transactions - Mailing list pgsql-hackers

From Craig Ringer
Subject Re: [HACKERS] logical decoding of two-phase transactions
Msg-id CAMsr+YE5UC7MeZFt7+rhNrcacAne8bogftJH3GtqGQwMN4a1Gg@mail.gmail.com
In response to Re: [HACKERS] logical decoding of two-phase transactions  (Stas Kelvich <s.kelvich@postgrespro.ru>)
Responses Re: [HACKERS] logical decoding of two-phase transactions  (Robert Haas <robertmhaas@gmail.com>)
Re: [HACKERS] logical decoding of two-phase transactions  (Petr Jelinek <petr.jelinek@2ndquadrant.com>)
List pgsql-hackers
On 17 March 2017 at 08:10, Stas Kelvich <s.kelvich@postgrespro.ru> wrote:

> While working on this I've spotted quite a nasty corner case with an aborted prepared
> transaction. I have some not-so-great ideas for how to fix it, but maybe I've lost
> perspective and missed something, so I want to ask here first.
>
> Suppose we create a table, then alter it in a 2PC tx, and then abort that tx.
> pg_class will then have something like this:
>
> xmin | xmax | relname
> 100  | 200  | mytable
> 200  | 0    | mytable
>
> After that abort, tuple (100,200,mytable) becomes visible again, and if we alter the table
> again, the xmax of the first tuple will be set to the current xid, resulting in the following table:
>
> xmin | xmax | relname
> 100  | 300  | mytable
> 200  | 0    | mytable
> 300  | 0    | mytable
>
> At that moment we've lost the information that the first tuple was deleted by our prepared tx.

Right. And while the prepared xact has aborted, we don't control when
it aborts and when those overwrites can start happening. We can and
should check if a 2pc xact is aborted before we start decoding it so
we can skip decoding it if it's already aborted, but it could be
aborted *while* we're decoding it, then have data needed for its
snapshot clobbered.
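
To make the clobbering concrete, here is a toy Python model of the
quoted pg_class example. It is only a sketch, not PostgreSQL code:
TupleVersion, alter_table and replay_example are names made up for
illustration, and visibility is simplified so that every xid not in
the "aborted" set counts as visible.

```python
from dataclasses import dataclass


@dataclass
class TupleVersion:
    xmin: int      # xid that created this row version
    xmax: int      # xid that superseded it, 0 if none
    relname: str


def alter_table(heap, xid, aborted):
    """Toy ALTER TABLE: supersede the live row version and append a new one.

    Assumption: a version is "live" if its creator did not abort and it
    either has no xmax or its xmax belongs to an aborted xid.
    """
    for old in heap:
        creator_ok = old.xmin not in aborted
        deleter_gone = old.xmax == 0 or old.xmax in aborted
        if creator_ok and deleter_gone:
            # This overwrites whatever xmax an aborted xact left behind.
            old.xmax = xid
            heap.append(TupleVersion(xmin=xid, xmax=0, relname=old.relname))
            return
    raise LookupError("no live version to alter")


def replay_example():
    """Replay the scenario from the quoted mail: xids 100, 200 and 300."""
    aborted = set()
    heap = [TupleVersion(xmin=100, xmax=0, relname="mytable")]
    alter_table(heap, 200, aborted)   # ALTER inside the prepared xact 200
    aborted.add(200)                  # ROLLBACK PREPARED
    alter_table(heap, 300, aborted)   # a later ALTER by xid 300
    return heap


if __name__ == "__main__":
    for t in replay_example():
        print(t.xmin, t.xmax, t.relname)
```

After the second ALTER the first row version carries xmax 300, so
nothing in the heap records that xid 200 once superseded it.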

This hasn't mattered in the past because prepared xacts (and
especially aborted 2pc xacts) have never needed snapshots; we've never
needed to do anything from the perspective of a prepared xact.

I think we'll probably need to lock the 2PC xact so it cannot be
aborted or committed while we're decoding it, until we finish decoding
it. So we lock it, then check if it's already aborted/already
committed/in progress. If it's aborted, treat it like any normal
aborted xact. If it's committed, treat it like any normal committed
xact. If it's in progress, keep the lock and decode it.
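
Roughly, the lock-then-check order above could look like the following
Python sketch. It is a toy model under stated assumptions, not a
proposal for the actual implementation: PreparedXact, XactState and
decode_prepared are hypothetical names, and a plain threading.Lock
stands in for whatever lock we'd really use.

```python
import threading
from enum import Enum


class XactState(Enum):
    IN_PROGRESS = "in progress"   # prepared, not yet resolved
    COMMITTED = "committed"
    ABORTED = "aborted"


class PreparedXact:
    def __init__(self, gid):
        self.gid = gid
        self.state = XactState.IN_PROGRESS
        self.lock = threading.Lock()   # hypothetical per-xact lock

    def finish(self, new_state):
        # COMMIT PREPARED / ROLLBACK PREPARED take the same lock, so they
        # wait until any decoder holding it has finished.
        with self.lock:
            self.state = new_state


def decode_prepared(xact, decode_changes):
    """Lock first, then check the state; decode only while holding the lock."""
    xact.lock.acquire()
    try:
        state = xact.state
        if state is XactState.IN_PROGRESS:
            decode_changes(xact)       # cannot be resolved underneath us
            return "decoded as prepared"
    finally:
        xact.lock.release()
    # Already resolved: no lock needed, fall back to the normal paths.
    if state is XactState.ABORTED:
        return "handled as aborted"
    return "handled as committed"
```

The point of taking the lock before the status check is that there is
no window between "it's still in progress" and "start decoding" in
which the xact could be aborted.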

People using logical decoding for 2PC will presumably want to control
2PC via logical decoding, so they're not so likely to mind such a
lock.

> * First try to scan the catalog, filtering out tuples with xmax greater than snapshot->xmax,
> as they were possibly deleted by our tx. Then, if nothing is found, scan in the usual way.

I don't think that'll be at all viable with the syscache/relcache
machinery. Way too intrusive.

> * Do not decode such transactions at all.

Yes, that's what I'd like to do, per above.

-- 
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


