
Re: [HACKERS] logical decoding of two-phase transactions - Mailing list pgsql-hackers

From	Craig Ringer
Subject	Re: [HACKERS] logical decoding of two-phase transactions
Date	January 27, 2017 05:52:45
Msg-id	CAMsr+YHos95GFyfd2Gk61-66wpxFTgXJh=sBV3Rtfhe3xzG1HQ@mail.gmail.com
In response to	Re: [HACKERS] logical decoding of two-phase transactions (Stas Kelvich <s.kelvich@postgrespro.ru>)
Responses	Re: [HACKERS] logical decoding of two-phase transactions (Michael Paquier <michael.paquier@gmail.com>)
List	pgsql-hackers


On 26 January 2017 at 19:34, Stas Kelvich <s.kelvich@postgrespro.ru> wrote:

> Imagine the following scenario:
>
> 1. PREPARE happened
> 2. PREPARE decoded and sent where it should be sent
> 3. We got all responses from participating nodes and issue COMMIT/ABORT
> 4. COMMIT/ABORT decoded and sent
>
> After step 3 there is no more in-memory state associated with that prepared tx, so if we fail
> between 3 and 4 then we can’t know the GID unless we wrote it to the commit record (or a table).

If the decoding session crashes/disconnects and restarts between 3 and
4, we know the xact is now committed or rolled back and we don't care
about its gid anymore; we can decode it as a normal committed xact or
skip over it if aborted. If Pg crashes between 3 and 4 the same
applies, since all decoding sessions must restart.

No decoding session can ever start up between 3 and 4 without passing
through 1 and 2, since we always restart decoding at restart_lsn and
restart_lsn cannot be advanced past the assignment (BEGIN) of a given
xid until we pass its commit record and the downstream confirms it has
flushed the results.
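
As a rough illustration of that invariant, here is a simplified
standalone model. It is not the actual slot code: the type and function
names are made up, LSNs are plain integers, and I'm assuming the limit
is simply the oldest start point among xacts whose commit/abort record
the downstream hasn't confirmed as flushed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t SketchLsn;     /* stand-in for an LSN, not XLogRecPtr itself */

/* Hypothetical, simplified view of one xact as seen by a decoding session. */
typedef struct SketchXact
{
    SketchLsn   first_lsn;      /* where the xid was assigned (BEGIN) */
    SketchLsn   end_lsn;        /* end of commit/abort record, 0 if still open */
} SketchXact;

/*
 * Furthest point restart_lsn could move to in this model: never past the
 * start of an xact whose commit/abort is not yet covered by confirmed_flush.
 */
static SketchLsn
sketch_restart_limit(const SketchXact *xacts, size_t n, SketchLsn confirmed_flush)
{
    SketchLsn   limit = confirmed_flush;

    assert(xacts != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
    {
        bool        finished = xacts[i].end_lsn != 0 &&
                               xacts[i].end_lsn <= confirmed_flush;

        if (!finished && xacts[i].first_lsn < limit)
            limit = xacts[i].first_lsn;
    }
    return limit;
}
```

So a prepared-but-unresolved xact that began at LSN 100 pins the restart
point at 100 in this model, however far the downstream has confirmed,
and any session that restarts will see its PREPARE again.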

The reorder buffer doesn't even really need to keep track of the gid
between 3 and 4, though it should, to save the output plugin and
downstream the hassle of keeping an xid to gid mapping. All it needs
is to know whether we sent a given xact's data to the output plugin at
PREPARE time, so we can suppress sending it again at COMMIT time,
and we can store that info on the ReorderBufferTXN. We can store the
gid there too.
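
Something along these lines is what I have in mind. This is a
standalone sketch only: the struct is a made-up stand-in for the real
per-transaction reorder buffer entry, and the field names, the helper
functions and the gid buffer size are all assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define SKETCH_GIDSIZE 200      /* assumed upper bound on gid length */

/* Hypothetical stand-in for the per-xact reorder buffer entry. */
typedef struct SketchReorderBufferTXN
{
    unsigned int xid;
    bool        prepared_sent;          /* changes already sent at PREPARE? */
    char        gid[SKETCH_GIDSIZE];    /* empty unless the xact was prepared */
} SketchReorderBufferTXN;

/* Record that the xact's changes went to the output plugin at PREPARE time. */
static void
sketch_mark_prepared(SketchReorderBufferTXN *txn, const char *gid)
{
    assert(txn != NULL && gid != NULL);
    txn->prepared_sent = true;
    /* snprintf truncates and NUL-terminates if the gid is too long */
    snprintf(txn->gid, sizeof(txn->gid), "%s", gid);
}

/* At COMMIT time: replay the changes only if they were not sent at PREPARE. */
static bool
sketch_send_changes_at_commit(const SketchReorderBufferTXN *txn)
{
    return !txn->prepared_sent;
}
```

An ordinary xact never has the flag set, so its changes go out at COMMIT
exactly as they do today; a prepared one only gets the commit
notification, with the gid still available from the same struct.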

We'll need two new output plugin callbacks:
  prepare_cb  rollback_cb

since an xact can roll back after we decode PREPARE TRANSACTION (or
during it, even) and we have to be able to tell the downstream to
throw the data away.

I don't think the rollback callback should be called
abort_prepared_cb, because we'll later want to add the ability to
decode interleaved xacts' changes as they are made, before commit, and
in that case will also need to know if they abort. We won't care if
they were prepared xacts or not, but we'll know based on the
ReorderBufferTXN anyway.

We don't need a separate commit_prepared_cb, the existing commit_cb is
sufficient. The gid will be accessible on the ReorderBufferTXN.
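
To show how the pieces could fit together, here's a standalone sketch
of the callback set. None of this is the real output plugin API: the
types, the signatures and the dispatch function are invented for the
example, and skipping rollback_cb for xacts whose data was never sent
is my assumption about the sensible behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-xact state handed to the callbacks. */
typedef struct SketchCbTxn
{
    const char *gid;            /* NULL unless the xact was prepared */
    bool        prepared_sent;  /* data already sent at PREPARE time? */
} SketchCbTxn;

typedef void (*SketchTxnCB) (void *plugin_private, const SketchCbTxn *txn);

typedef struct SketchOutputPluginCallbacks
{
    SketchTxnCB commit_cb;      /* existing callback, also covers COMMIT PREPARED */
    SketchTxnCB prepare_cb;     /* new: PREPARE TRANSACTION was decoded */
    SketchTxnCB rollback_cb;    /* new: xact rolled back after data was sent */
} SketchOutputPluginCallbacks;

typedef enum SketchEvent
{
    SKETCH_PREPARE, SKETCH_COMMIT, SKETCH_ROLLBACK
} SketchEvent;

static void
sketch_dispatch(const SketchOutputPluginCallbacks *cbs, void *priv,
                SketchCbTxn *txn, SketchEvent ev)
{
    assert(cbs != NULL && txn != NULL);

    switch (ev)
    {
        case SKETCH_PREPARE:
            cbs->prepare_cb(priv, txn);
            txn->prepared_sent = true;
            break;
        case SKETCH_COMMIT:
            /* the gid, if any, is reachable through txn; no extra callback */
            cbs->commit_cb(priv, txn);
            break;
        case SKETCH_ROLLBACK:
            /* only tell the downstream if it has seen this xact's data */
            if (txn->prepared_sent)
                cbs->rollback_cb(priv, txn);
            break;
    }
}

/* Toy plugin: counts calls in an int[3] (commit, prepare, rollback). */
static void toy_commit(void *p, const SketchCbTxn *t)   { (void) t; ((int *) p)[0]++; }
static void toy_prepare(void *p, const SketchCbTxn *t)  { (void) t; ((int *) p)[1]++; }
static void toy_rollback(void *p, const SketchCbTxn *t) { (void) t; ((int *) p)[2]++; }
```

The rollback path here keys only off whether data was already sent, not
off whether the xact was prepared, which is the reason for preferring
the name rollback_cb over abort_prepared_cb.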

Now, if it's simpler to just xlog the gid at COMMIT PREPARED time when
wal_level >= logical I don't think that's the end of the world. But
since we already have almost everything we need in memory, why not
just stash the gid on ReorderBufferTXN?

-- 
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
