Re: Parallel scan with SubTransGetTopmostTransaction assert coredump - Mailing list pgsql-hackers
From | Greg Nancarrow |
---|---|
Subject | Re: Parallel scan with SubTransGetTopmostTransaction assert coredump |
Date | |
Msg-id | CAJcOf-dbeg0CfJgo03SXD1=qzM650DrTRJDKhtp_=zvDPoxxfw@mail.gmail.com Whole thread Raw |
In response to | Re: Parallel scan with SubTransGetTopmostTransaction assert coredump (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: Parallel scan with SubTransGetTopmostTransaction assert coredump
|
List | pgsql-hackers |
On Tue, Aug 10, 2021 at 12:35 AM Robert Haas <robertmhaas@gmail.com> wrote: > > Now there is evidently something wrong with this line of reasoning > because the code is buggy and my proposed fix doesn't work, but I > don't know what is wrong with it. You seem to think that it's crazy > that we try to replicate the active snapshot to the active snapshot > and the transaction snapshot to the transaction snapshot, but that > did, and still does, seem really sane to me. The only part that now > seems clearly wrong to me is that !IsolationUsesXactSnapshot() case > takes an *extra* snapshot, but since fixing that didn't fix the bug, > there's evidently more to the problem than that. > I traced through snapshot processing during a parallel SELECT, up to the point of the existing GetTransactionSnapshot() and GetCurrentSnapshot() calls in InitializeParallelDSM(). I'm seeing the following sequence of calls, numbered below: PortalStart(): case PORTAL_ONE_SELECT: (1) PushActiveSnapshot(GetTransactionSnapshot()); ... queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts), portal->sourceText, (2) GetActiveSnapshot(), InvalidSnapshot, None_Receiver, params, portal->queryEnv, 0); ... (3) PopActiveSnapshot(); PortalRunSelect(): (4) PushActiveSnapshot(queryDesc->snapshot); ExecutorRun(queryDesc, direction, (uint64) count, portal->run_once); InitializeParallelDSM(): (5) Snapshot transaction_snapshot = GetTransactionSnapshot(); (6) Snapshot active_snapshot = GetActiveSnapshot(); nprocessed = queryDesc->estate->es_processed; (7) PopActiveSnapshot(); The snapshot used in execution of the query is clearly the ActiveSnapshot at the time of creating the QueryDesc [at (2)] which is a copy of the TransactionSnapshot originally acquired [at (1)]. In InitializeParallelDSM() it acquires both the TransactionSnapshot [at (5)] and the ActiveSnapshot [at (6)], to be serialized in the DSM for the workers (each of which will deserialize and restore these). But the problem I see is that the GetTransactionSnapshot() call [at (5)] may acquire a new snapshot (i.e. a later snapshot than the ActiveSnapshot used in the execution of the query), for example, if a concurrent transaction has completed since GetTransactionSnapshot() was last called [in (1)]. In this case, GetTransactionSnapshot() calls GetSnapshotDataReuse() and it returns false, causing a new snapshot to be built by GetTransactionSnapshot(). curXactCompletionCount = ShmemVariableCache->xactCompletionCount; if (curXactCompletionCount != snapshot->snapXactCompletionCount) return false; When this TransactionSnapshot is restored in a worker process, it accordingly sets TransactionXmin, and if we look back at the coredump stacktrace and the Assert condition that failed in the worker, we see that the xid was expected to be >= TransactionXmin, but the Assert fired because the xid was < TransactionXmin. Assert(TransactionIdFollowsOrEquals(xid, TransactionXmin)); This is explained by the TransactionSnapshot being a later snapshot in this case. So this is why it seems to be wrong to call GetTransactionSnapshot() in InitializeParallelDSM() and use a separate, potentially later, snapshot than that used in the execution state for the query. Regards, Greg Nancarrow Fujitsu Australia
pgsql-hackers by date: