Re: [COMMITTERS] pgsql: Avoid SnapshotResetXmin() during AtEOXact_Snapshot() - Mailing list pgsql-hackers

From Andres Freund
Subject Re: [COMMITTERS] pgsql: Avoid SnapshotResetXmin() during AtEOXact_Snapshot()
Date
Msg-id 20170324175602.yaxb633jpgcyc5vx@alap3.anarazel.de
In response to Re: [COMMITTERS] pgsql: Avoid SnapshotResetXmin() during AtEOXact_Snapshot()  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 2017-03-24 13:50:54 -0400, Robert Haas wrote:
> On Fri, Mar 24, 2017 at 12:27 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> > On Fri, Mar 24, 2017 at 12:14 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> >> On Fri, Mar 24, 2017 at 10:23 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> >>> Avoid SnapshotResetXmin() during AtEOXact_Snapshot()
> >>>
> >>> For normal commits and aborts we already reset PgXact->xmin
> >>> Avoiding touching highly contented shmem improves concurrent
> >>> performance.
> >>>
> >>> Simon Riggs
> >>
> >> I'm getting occasional crashes with backtraces that look like this:
> >>
> >> #4  0x0000000107e4be2b in AtEOXact_Snapshot (isCommit=<value
> >> temporarily unavailable, due to optimizations>, isPrepare=0 '\0') at
> >> snapmgr.c:1154
> >> #5  0x0000000107a76c06 in CleanupTransaction () at xact.c:2643
> >>
> >> I suspect that is the fault of this patch.  Please fix or revert.
> >
> > Also, the entire buildfarm is turning red.
> >
> > longfin, spurfowl, and magpie all show this assertion failure in the
> > log.  I haven't checked the others.
> >
> > TRAP: FailedAssertion("!(MyPgXact->xmin == 0)", File: "snapmgr.c", Line: 1154)
> 
> Another thing that is interesting is that when I run make -j8
> check-world, the overall tests appear to succeed even though there are
> failures mid-way through:
> 
> test tablefunc                ... FAILED (test process exited with exit code 2)
> 
> ...but then later we end with:
> 
> ok
> All tests successful.
> Files=11, Tests=80, 251 wallclock secs ( 0.07 usr  0.02 sys + 19.77
> cusr 14.45 csys = 34.31 CPU)
> Result: PASS
>
> real    4m27.421s
> user    3m50.047s
> sys    1m31.937s
>
> That's unrelated to the current problem of course, but it seems to
> suggest that make's -j option doesn't entirely do what you'd expect
> when used with make check-world.
> 

That's likely the output of a different test from the one that failed.
It's a lot easier to see the overall result if you append something like
&& echo success || echo failure
to the make command.
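
A minimal sketch of that idiom, for illustration only. The `report` helper is hypothetical and not part of the original message; `true` and `false` stand in for a passing and a failing `make` run. With parallel jobs, output from tests that are still running can scroll past after a failure has been printed, so the final line produced this way reflects the overall exit status rather than whichever test happened to finish last.

```shell
# Hypothetical helper (an assumption, not from the thread): run a command
# and print an explicit verdict based on its exit status.
report() {
    "$@" && echo success || echo failure
}

# Against a real source tree the equivalent would be roughly:
#   make -j8 check-world && echo success || echo failure

report true     # stand-in for a run that exits with status 0
report false    # stand-in for a run that exits with a non-zero status
```

The `cmd && a || b` form is only a shorthand: `b` also runs if `a` itself fails, which is harmless here because `echo` is not expected to fail.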

- Andres


