Re: ERROR: subtransaction logged without previous top-level txn record - Mailing list pgsql-bugs

From Amit Kapila
Subject Re: ERROR: subtransaction logged without previous top-level txn record
Date
Msg-id CAA4eK1L=MDbmGu5-+BmY7Svc07jr+ZabiH8C_qo3RSc8pgUpDQ@mail.gmail.com
In response to Re: ERROR: subtransaction logged without previous top-level txn record  (Arseny Sher <a.sher@postgrespro.ru>)
Responses Re: ERROR: subtransaction logged without previous top-level txn record
List pgsql-bugs
On Fri, Oct 25, 2019 at 12:26 PM Arseny Sher <a.sher@postgrespro.ru> wrote:
>
>
> Andres Freund <andres@anarazel.de> writes:
>
> > Hi,
> >
> > On 2019-10-24 12:59:30 +0300, Arseny Sher wrote:
> >> Our customer also encountered this issue and I've looked into it. The problem is
> >> reproduced well enough using the instructions in the previous message.
> >
> > Is this with
> >
> > commit bac2fae05c7737530a6fe8276cd27d210d25c6ac
> > Author: Alvaro Herrera <alvherre@alvh.no-ip.org>
> > Date:   2019-09-13 16:36:28 -0300
> >
> >     logical decoding: process ASSIGNMENT during snapshot build
> >
> >     Most WAL records are ignored in early SnapBuild snapshot build phases.
> >     But it's critical to process some of them, so that later messages have
> >     the correct transaction state after the snapshot is completely built; in
> >     particular, XLOG_XACT_ASSIGNMENT messages are critical in order for
> >     sub-transactions to be correctly assigned to their parent transactions,
> >     or at least one assert misbehaves, as reported by Ildar Musin.
> >
> >     Diagnosed-by: Masahiko Sawada
> >     Author: Masahiko Sawada
> >     Discussion: https://postgr.es/m/CAONYFtOv+Er1p3WAuwUsy1zsCFrSYvpHLhapC_fMD-zNaRWxYg@mail.gmail.com
> >
> > applied?
>
> Yeah, I tried fresh master. See below: skipped xl_xact_assignment is
> beyond restart_lsn at all (and thus not read), so this doesn't matter.
>
>
> >> The check leading to this ERROR is too strict, it forbids legit behaviours. Say
> >> we have in WAL
> >>
> >> [ <xl_xact_assignment_1> <restart_lsn> <subxact_change> <xl_xact_assignment_1> <commit> <confirmed_flush_lsn> ]
> >>
> >> - First xl_xact_assignment record is beyond reading, i.e. earlier
> >>   restart_lsn, where ready snapshot will be taken from disk.
> >> - After restart_lsn there is some change of a subxact.
> >> - After that, there is second xl_xact_assignment (for another subxact)
> >>   revealing relationship between top and first subxact, where this ERROR fires.
> >>
> >> Such transaction won't be streamed because we hadn't seen it in full. It must be
> >> finished before streaming will start, i.e. before confirmed_flush_lsn.
> >>
> >> Of course, the easiest fix is to just throw the check out.
> >
> > I don't think that'd actually be a fix, and just hiding a bug.
>
> I don't see a bug here. At least in reproduced scenario I see false
> alert, as explained above: transaction with skipped xl_xact_assignment
> won't be streamed as it finishes before confirmed_flush_lsn.
>

Does this guarantee come from the fact that we need to wait for such a
transaction before reaching a consistent snapshot state?  If not, can
you explain a bit more what makes you say so?
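
To make the scenario under discussion concrete, here is a minimal toy model
of it in Python. It is only a sketch under stated assumptions, not
PostgreSQL's decoding code: Record, decode, StrictCheckError and the LSN
values are all hypothetical, and the check is assumed to fire when an
assignment record introduces a top-level xid for the first time while one of
its subxacts already has an entry. The second assignment record is simplified
to name the already-seen subxact directly. Whether the commit really must
precede confirmed_flush_lsn is exactly the open question; the model simply
takes that ordering as given.

```python
from dataclasses import dataclass


@dataclass
class Record:
    lsn: int
    kind: str            # "assignment", "change", or "commit"
    xid: int             # xid the record is logged under
    subxids: tuple = ()  # for "assignment": subxacts tied to xid


class StrictCheckError(Exception):
    """Toy stand-in for the ERROR discussed in this thread."""


def decode(wal, restart_lsn, confirmed_flush_lsn, strict=True):
    """Replay records from restart_lsn; return xids that would be streamed."""
    known = set()    # xids that already have a (toy) transaction entry
    parent = {}      # subxid -> top-level xid
    streamed = []
    for rec in wal:
        if rec.lsn < restart_lsn:
            continue  # before restart_lsn: never read by the decoder
        if rec.kind == "change":
            known.add(rec.xid)
        elif rec.kind == "assignment":
            new_top = rec.xid not in known
            for sub in rec.subxids:
                new_sub = sub not in known
                # Assumed form of the check: parent unseen, subxact already seen.
                if strict and new_top and not new_sub:
                    raise StrictCheckError(
                        "subtransaction logged without previous "
                        "top-level txn record")
                known.add(sub)
                parent[sub] = rec.xid
            known.add(rec.xid)
        elif rec.kind == "commit":
            # Assumption: only commits at or after confirmed_flush_lsn are sent.
            if rec.lsn >= confirmed_flush_lsn:
                streamed.append(rec.xid)
    return streamed


# The WAL layout from the quoted message, with made-up LSNs.
WAL = [
    Record(10, "assignment", 100, (101,)),      # before restart_lsn, skipped
    Record(30, "change", 101),                  # subxact change
    Record(40, "assignment", 100, (101, 102)),  # second assignment record
    Record(50, "commit", 100),
]
RESTART_LSN = 20
CONFIRMED_FLUSH_LSN = 60
```

With strict=True the call decode(WAL, RESTART_LSN, CONFIRMED_FLUSH_LSN)
raises StrictCheckError at the second assignment record. With strict=False it
returns an empty list, because in this layout the commit falls before
confirmed_flush_lsn.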

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


