Re: [HACKERS] snapbuild woes - Mailing list pgsql-hackers

From Andres Freund
Subject Re: [HACKERS] snapbuild woes
Date
Msg-id 20170511221217.dqok4ixviv6hlp3n@alap3.anarazel.de
Whole thread Raw
In response to Re: [HACKERS] snapbuild woes  (Petr Jelinek <petr.jelinek@2ndquadrant.com>)
List pgsql-hackers
On 2017-04-15 05:18:49 +0200, Petr Jelinek wrote:
> >>> From 3318a929e691870f3c1ca665bec3bfa8ea2af2a8 Mon Sep 17 00:00:00 2001
> >>> From: Petr Jelinek <pjmodos@pjmodos.net>
> >>> Date: Sun, 26 Feb 2017 01:07:33 +0100
> >>> Subject: [PATCH 3/5] Prevent snapshot builder xmin from going backwards
> >>
> >> A bit more commentary would be good. What does that protect us against?
> >>
> > 
> > I think I explained that in the email. We might export snapshot with
> > xmin smaller than global xmin otherwise.
> > 
> 
> Updated commit message with explanation as well.


> From ae60b52ae0ca96bc14169cd507f101fbb5dfdf52 Mon Sep 17 00:00:00 2001
> From: Petr Jelinek <pjmodos@pjmodos.net>
> Date: Sun, 26 Feb 2017 01:07:33 +0100
> Subject: [PATCH 3/5] Prevent snapshot builder xmin from going backwards
> 
> Logical decoding snapshot builder may encounter xl_running_xacts with
> older xmin than the xmin of the builder. This can happen because
> LogStandbySnapshot() sometimes sees already committed transactions as
> running (there is difference between "running" in terms for WAL and in
> terms of ProcArray). When this happens we must make sure that the xmin
> of snapshot builder won't go back otherwise the resulting snapshot would
> show some transaction as running even though they have already
> committed.
> ---
>  src/backend/replication/logical/snapbuild.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
> index ada618d..3e34f75 100644
> --- a/src/backend/replication/logical/snapbuild.c
> +++ b/src/backend/replication/logical/snapbuild.c
> @@ -1165,7 +1165,8 @@ SnapBuildProcessRunningXacts(SnapBuild *builder, XLogRecPtr lsn, xl_running_xact
>       * looking, it's correct and actually more efficient this way since we hit
>       * fast paths in tqual.c.
>       */
> -    builder->xmin = running->oldestRunningXid;
> +    if (TransactionIdFollowsOrEquals(running->oldestRunningXid, builder->xmin))
> +        builder->xmin = running->oldestRunningXid;
>  
>      /* Remove transactions we don't need to keep track off anymore */
>      SnapBuildPurgeCommittedTxn(builder);
> -- 
> 2.7.4

I still don't understand.  The snapshot's xmin is solely managed via
xl_running_xacts, so I don't see how the WAL/procarray difference can
play a role here.  ->committed isn't pruned before xl_running_xacts
indicates it can be done, so I don't understand your explanation above.

I'd be ok with adding this as paranoia check, but I still don't
understand when it could practically be hit.

- Andres



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: [HACKERS] snapbuild woes
Next
From: Alvaro Herrera
Date:
Subject: Re: [HACKERS] WITH clause in CREATE STATISTICS