Re: ERROR: subtransaction logged without previous top-level txn record - Mailing list pgsql-bugs

From Amit Kapila
Subject Re: ERROR: subtransaction logged without previous top-level txn record
Msg-id CAA4eK1LYzrZ_+8VhD_N_dsQwjxA9t+AyGKT-Wjnc8S7jCwAcBw@mail.gmail.com
In response to Re: ERROR: subtransaction logged without previous top-level txn record  (Arseny Sher <a.sher@postgrespro.ru>)
Responses Re: ERROR: subtransaction logged without previous top-level txn record  (Arseny Sher <a.sher@postgrespro.ru>)
List pgsql-bugs
On Mon, Feb 3, 2020 at 7:16 PM Arseny Sher <a.sher@postgrespro.ru> wrote:
> Amit Kapila <amit.kapila16@gmail.com> writes:
>
> > So, doesn't this mean that it started occurring after the fix done in
> > commit 96b5033e11 [1]?  Because before that fix we wouldn't have
> > allowed processing XLOG_XACT_ASSIGNMENT records unless we are in
> > SNAPBUILD_FULL_SNAPSHOT state.  I am not saying the fix in that
> > commit is wrong, but just trying to understand the situation here.
>
> Nope. Consider again the example WAL above that triggers the error:
>
> [ <xl_xact_assignment_1> <restart_lsn> <subxact_change> <xl_xact_assignment_2> <commit> <confirmed_flush_lsn> ]
>
> The decoder starts reading WAL at <restart_lsn>, where it immediately
> reads from disk a snapshot serialized earlier, which makes it jump to
> SNAPBUILD_CONSISTENT right away.
>

Sure, I understand that this problem can occur if we read a serialized
snapshot from disk, and we can probably fix it by ignoring serialized
snapshots during slot creation or something along those lines.  However,
what I am trying to understand is whether this can occur via another
path as well.  I think it might occur even without serialized
snapshots, because we allow decoding of xl_xact_assignment records
even when the snapshot state is not full.  Say, in your example above,
the snapshot state is not SNAPBUILD_CONSISTENT because we haven't
used the serialized snapshot; even then, decoding xl_xact_assignment
could lead to the same problem.  I have not tried to
generate a test case for this, so I could easily be wrong here.
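
To make the record ordering in the quoted example concrete, here is a
toy model of it.  This is only a sketch, not PostgreSQL code: the class
ToyReorderBuffer, the decode() helper, the tuple encoding of WAL records
and the xids 100/101/102 are all made up for illustration, and the check
in assign_child() is an assumption about how the error in
ReorderBufferAssignChild is triggered (top-level entry newly created
while the subtransaction entry already exists).  It does not model
snapshot states at all, so it says nothing about whether the problem is
reachable without serialized snapshots.

```python
class ToyReorderBuffer:
    """Toy stand-in for a reorder buffer; NOT PostgreSQL's ReorderBuffer."""

    def __init__(self):
        self.known = set()   # xids that already have a transaction entry
        self.parent = {}     # subxid -> top-level xid

    def _get_or_create(self, xid):
        """Create an entry for xid if missing; return True if it was new."""
        is_new = xid not in self.known
        self.known.add(xid)
        return is_new

    def change(self, xid):
        # A change for an unknown xid creates an entry for it.
        self._get_or_create(xid)

    def assign_child(self, top_xid, sub_xid):
        new_top = self._get_or_create(top_xid)
        new_sub = self._get_or_create(sub_xid)
        # Assumed condition: the subxact was seen before its top-level txn.
        if new_top and not new_sub:
            raise RuntimeError(
                "subtransaction logged without previous top-level txn record")
        self.parent[sub_xid] = top_xid


# Hypothetical WAL matching the shape of the quoted example:
# [ <assignment_1> <restart_lsn> <subxact_change> <assignment_2> <commit> ]
WAL = [
    ("assign", 100, 101),   # xl_xact_assignment_1, before restart_lsn
    ("change", 102),        # subxact_change
    ("assign", 100, 102),   # xl_xact_assignment_2
    ("commit", 100),
]


def decode(wal, start):
    """Replay records from index `start`, as if restart_lsn pointed there."""
    rb = ToyReorderBuffer()
    for rec in wal[start:]:
        if rec[0] == "assign":
            rb.assign_child(rec[1], rec[2])
        elif rec[0] == "change":
            rb.change(rec[1])
        # "commit" needs no bookkeeping in this toy model
    return rb


if __name__ == "__main__":
    for start in (0, 1):
        try:
            decode(WAL, start)
            print("start index", start, "-> decoded without error")
        except RuntimeError as exc:
            print("start index", start, "-> ERROR:", exc)
```

In this model, replaying from index 0 sees the first assignment record
and succeeds, while replaying from index 1 (just after it) first meets
the subtransaction's change and then fails on the second assignment
record.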

What I am trying to get at is this: if the problem can only occur when
using serialized snapshots, and we fix it by somehow not using them
during initial slot creation, then ideally we don't need to remove the
error in ReorderBufferAssignChild.  But if that is not the case, then we
need to discuss the other cases as well.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


