Re: Parallel scan with SubTransGetTopmostTransaction assert coredump - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Parallel scan with SubTransGetTopmostTransaction assert coredump
Date
Msg-id CA+TgmobW=Zcr8HvP1vjyTE5vg5Qy+FUiM2GhfLT3mghGE6ZyuQ@mail.gmail.com
Whole thread Raw
In response to Re: Parallel scan with SubTransGetTopmostTransaction assert coredump  (Greg Nancarrow <gregn4422@gmail.com>)
Responses Re: Parallel scan with SubTransGetTopmostTransaction assert coredump
List pgsql-hackers
On Wed, Jul 14, 2021 at 10:34 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
> Unfortunately there is currently no test, code-comment, README or
> developer-discussion that definitively determines which approach (v2
> vs  v3/v4) is a valid fix for this issue.
> We don't know if having both the transaction and active snapshots in a
> parallel worker is intentional or not, and if so, why so?
> (certainly in the non-parallel case of the same statement execution,
> there is only one snapshot in question here - the obtained transaction
> snapshot is pushed as the active snapshot, as it is done in 95% of
> cases in the code)
> It seems that only the original code authors know how the snapshot
> handling in parallel-workers is MEANT to work, and they have yet to
> speak up about it here.
> At this point, we can only all agree that there is a problem to be fixed here.

Hi.

Thanks to Thomas Munro for drawing my attention to this thread. I
wasn't intentionally ignoring it, but there's a lot of email in the
world and only so much time.

When I wrote this code originally, the idea that I had in mind was
simply this: whatever state we have in the leader ought to be
reproduced in each worker. So if there's an active snapshot in the
leader, we ought to make that active in all the workers, and if
there's a transaction snapshot in the leader, we ought to make that
the transaction snapshot in all of the workers.

But I see now that my thinking was fuzzy, and I'm going to blame that
on the name GetTransactionSnapshot() being slightly misleading. If
IsolationUsesXactSnapshot() is true, then there's really such a thing
as a transaction snapshot and reproducing that in the worker is a
sensible thing to do. But when !IsolationUsesXactSnapshot(),
GetTransactionSnapshot() doesn't just "get the transaction snapshot",
because there isn't any such thing. It takes a whole new snapshot, on
the theory that you wouldn't be calling this function unless you had
finished up with the snapshot you got the last time you called this
function. And in the case of initiating parallel query, that is the
wrong thing.

I think that, at least in the case where IsolationUsesXactSnapshot()
is true, we need to make sure that calling GetTransactionSnapshot() in
a worker produces the same result that it would have produced in the
leader. Say one of the workers calls an sql or plpgsql function and
that function runs a bunch of SQL statements. It seems to me that
there's probably a way for this to result in calls inside the worker
to GetTransactionSnapshot(), and if that doesn't return the same
snapshot as in the leader, then we've broken MVCC.

What about when IsolationUsesXactSnapshot() is false? Perhaps it's OK
to just skip this altogether in that case. Certainly what we're doing
can't be right, because copying a snapshot that wouldn't have been
taken without parallel query can't ever be the right thing to do.
Perhaps we need to copy something else instead. I'm not really sure.

So I think v2 is probably on the right track, but wrong when the
transaction isolation level is REPEATABLE READ or SERIALIZABLE, and v3
and v4 just seem like unprincipled hacks that try to avoid the
assertion failure by lying about whether there's a problem.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Vladimir Sitnikov
Date:
Subject: Re: speed up verifying UTF-8
Next
From: Robert Haas
Date:
Subject: Re: refactoring basebackup.c