Re: cannot abort transaction 2737414167, it was already committed - Mailing list pgsql-hackers

From Noah Misch
Subject Re: cannot abort transaction 2737414167, it was already committed
Date
Msg-id 20240703171749.7d.nmisch@google.com
In response to Re: cannot abort transaction 2737414167, it was already committed  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers
On Thu, May 09, 2024 at 05:19:47PM +1200, Thomas Munro wrote:
> On Thu, Dec 28, 2023 at 11:42 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Thomas Munro <thomas.munro@gmail.com> writes:
> > > In CommitTransaction() there is a stretch of code beginning s->state =
> > > TRANS_COMMIT and ending s->state = TRANS_DEFAULT, from which we call
> > > out to various subsystems' AtEOXact_XXX() functions.  There is no way
> > > to roll back in that state, so anything that throws ERROR from those
> > > routines is going to get something much like $SUBJECT.  Hmm, we'd know
> > > which exact code path got that EIO from your smoldering core if we'd
> > > put an explicit critical section there (if we're going to PANIC
> > > anyway, it might as well not be from a different stack after
> > > longjmp()...).
> >
> > +1, there's basically no hope of debugging this sort of problem
> > as things stand.
> 
> I was reminded of this thread by Justin's other file system snafu thread.
> 
> Naively defining a critical section to match the extent of the
> TRANS_COMMIT state doesn't work, as a bunch of code under there uses
> palloc().  That reminds me of the nearby RelationTruncate() thread,
> and there is possibly even some overlap, plus more in this case...
> ugh.
> 
> Hmm, AtEOXact_RelationMap() is one of those steps, but lives just
> outside the crypto-critical-section created by TRANS_COMMIT, though
> has its own normal CS for logging.  I wonder, given that "updating the
> map file is effectively commit of the relocation", why wouldn't it
> have a variant of the problem solved by DELAY_CHKPT_START for normal
> commit records, under diabolical scheduling?  It's a stretch, but: You
> log XLOG_RELMAP_UPDATE, a concurrent checkpoint runs with REDO after
> that record, you crash before/during durable_rename(), and then you
> perform crash recovery.

See the CheckPointRelationMap() header comment for how relmapper behaves like
DELAY_CHKPT_START without using that flag.  I think its mechanism suffices.
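
For readers without the source at hand, here is a toy, single-threaded model of the interleaving described in the quoted text. It assumes my reading of that header comment is right: the updater holds RelationMappingLock across both the WAL insertion and the map file write, and the checkpointer, having already chosen its REDO point, takes and releases the same lock before it can complete. Every name below (Sim, relmap_log_update, and so on) is hypothetical and exists only for illustration; none of this is PostgreSQL code.

```c
#include <stdbool.h>

/* Toy model only; all names are hypothetical, not PostgreSQL source. */
typedef struct
{
    bool lock_held;      /* stands in for RelationMappingLock held exclusively */
    long wal_end;        /* position of the last "WAL record" written */
    long pending_lsn;    /* update logged but map file not yet written; 0 if none */
    long map_file_lsn;   /* newest update reflected in the on-disk map file */
    long completed_redo; /* REDO point of the last completed checkpoint */
} Sim;

/* Updater, step 1: take the lock, then log the relmap update. */
static void
relmap_log_update(Sim *s)
{
    s->lock_held = true;
    s->pending_lsn = ++s->wal_end;
}

/* Updater, step 2: make the map file durable, then release the lock. */
static void
relmap_write_file(Sim *s)
{
    s->map_file_lsn = s->pending_lsn;
    s->pending_lsn = 0;
    s->lock_held = false;
}

/*
 * Checkpointer: REDO was chosen earlier.  Completing requires taking and
 * releasing the lock; return false where a real checkpointer would block.
 */
static bool
checkpoint_try_complete(Sim *s, long redo)
{
    if (s->lock_held)
        return false;
    s->completed_redo = redo;
    return true;
}

/*
 * If we crashed right now, would recovery be correct?  An update missing
 * from the map file must still be replayed, i.e. it must lie after the
 * REDO point of the last completed checkpoint.
 */
static bool
crash_is_recoverable(const Sim *s)
{
    return s->pending_lsn == 0 || s->pending_lsn > s->completed_redo;
}
```

In this model, a checkpoint whose REDO point falls after the logged update cannot complete until relmap_write_file() has run, so a crash in between still replays the update from the previous checkpoint's REDO point.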


