Re: Parallel scan with SubTransGetTopmostTransaction assert coredump - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Parallel scan with SubTransGetTopmostTransaction assert coredump
Date
Msg-id CA+TgmoaZLxM+=L_MZ+Mm-iLBUDHWcqEuLv7Ami_oqxU3W2kZug@mail.gmail.com
Whole thread Raw
In response to Re: Parallel scan with SubTransGetTopmostTransaction assert coredump  (Greg Nancarrow <gregn4422@gmail.com>)
Responses Re: Parallel scan with SubTransGetTopmostTransaction assert coredump  (Greg Nancarrow <gregn4422@gmail.com>)
List pgsql-hackers
On Wed, Aug 4, 2021 at 10:03 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
> In setting up the snapshot for the execution state used in command
> execution, GetTransactionSnapshot() is called and (possibly a copy of)
> the returned snapshot is pushed as the ActiveSnapshot.

I mean, there are LOTS of PushActiveSnapshot() calls in the code. A
lot of those specifically say
PushActiveSnapshot(GetTransactionSnapshot()) but others are pushing
snapshots obtained from various other places. I really don't think it
can possibly be correct in general to assume that the snapshot on top
of the active snapshot stack is the same as the transaction snapshot.

> So why (current Postgres code, no patches applied) in setting up for
> parallel-worker execution (in InitializeParallelDSM) does the Postgres
> code then acquire ANOTHER TransactionSnapshot (by calling
> GetTransactionSnashot(), which could return CurrentSnapshot or a new
> snapshot) and serialize that, as well as serializing what the
> ActiveSnapshot points to, and then restore those in the workers as two
> separate snapshots? Is it a mistake? Or if intentional and correct,
> how so?

Well, I already agreed that in cases where GetTransactionSnapshot()
will acquire an altogether new snapshot, we shouldn't call it, but
beyond that I don't see why you think this is wrong. I mean, suppose
we only call GetTransactionSnapshot() at parallel worker when
IsolationUsesXactSnapshot(). In that case, CurrentSnapshot is a
durable, transaction-lifetime piece of backend-local state that can
affect the results of future calls to GetTransactionSnapshot(), and
therefore seems to need to be replicated to workers. Separately,
regardless of IsolationUsesXactSnapshot(), the active snapshot is
accessible via calls to GetActiveSnapshot() and therefore should also
be replicated to workers. Now, I don't know of any necessary reason
why those two things need to be the same, because again, there are
PushActiveSnapshot() calls all over the place, and they're not all
pushing the transaction snapshot. So therefore it makes sense to me
that we need to replicate those two things separately - the active
snapshot in the leader becomes the active snapshot in the workers and
the transaction snapshot in the leader becomes the transaction
snapshot in the worker.

Now there is evidently something wrong with this line of reasoning
because the code is buggy and my proposed fix doesn't work, but I
don't know what is wrong with it. You seem to think that it's crazy
that we try to replicate the active snapshot to the active snapshot
and the transaction snapshot to the transaction snapshot, but that
did, and still does, seem really sane to me. The only part that now
seems clearly wrong to me is that !IsolationUsesXactSnapshot() case
takes an *extra* snapshot, but since fixing that didn't fix the bug,
there's evidently more to the problem than that.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Extension updates and pg_upgrade
Next
From: Robert Haas
Date:
Subject: Re: EXEC_BACKEND vs bgworkers without BGWORKER_SHMEM_ACCESS