Re: md.c vs elog.c vs smgrreleaseall() in barrier - Mailing list pgsql-hackers

From Andres Freund
Subject Re: md.c vs elog.c vs smgrreleaseall() in barrier
Date
Msg-id x6nmuzi5m6yzukp7lifuhfsnhyao3rse6aptatgg7llmz2gom6@h65qu63hqnli
Whole thread Raw
In response to Re: md.c vs elog.c vs smgrreleaseall() in barrier  (Noah Misch <noah@leadboat.com>)
Responses Re: md.c vs elog.c vs smgrreleaseall() in barrier
List pgsql-hackers
Hi,

I updated the patch with the following changes:

- Remove the assertion from smgrtruncate() - it would need to assert that it's
  called in a critical section.

  Not sure why it's not already asserting that?

  The function header says:
   * ... This function should normally
   * be called in a critical section, but the current size must be checked
   * outside the critical section, and no interrupts or smgr functions relating
   * to this relation should be called in between.

  The "should normally" is bit weird imo, when would it be safe to *not* use
  it in a critical section?


- added comments about the reason for HOLD_INTERRUPTS to smgrdounlinkall(),
  smgrdestroyall() and smgrreleaseall()


- moved the HOLD_INTERRUPTS after FlushRelationsAllBuffers() in smgrdosyncall


- updated the comment & commit message talking about the problem being due to
  debug elogs/ereports - it's also LOG/WARNING.


I was writing a remark in the commit message, explaining that the only < ERROR
elog/ereport that can be reached is very unlikely to be reachable, and
therefore it's not worth the risk of backpatching.  But it turns out there
also are WARNINGs in mdunlinkfork(), which seem a lot easier to reach than the
DEBUG1 in register_dirty_segment().

I still am leaning against backpatching, but I'm not sure that's not just
laziness.


On 2025-03-19 17:45:14 -0700, Noah Misch wrote:
> On Wed, Mar 19, 2025 at 06:45:20PM -0400, Andres Freund wrote:
> > On 2025-03-19 12:55:53 -0700, Noah Misch wrote:
> > > On Mon, Mar 17, 2025 at 07:52:02PM -0400, Andres Freund wrote:
> > > > @@ -434,6 +481,8 @@ smgrdosyncall(SMgrRelation *rels, int nrels)
> > > >      if (nrels == 0)
> > > >          return;
> > > >
> > > > +    HOLD_INTERRUPTS();
> > > > +
> > > >      FlushRelationsAllBuffers(rels, nrels);
> > >
> > > FlushRelationsAllBuffers() isn't part of smgr or md.c, so it's unlikely to
> > > become sensitive to smgrrelease().  It may do a ton of I/O.  Hence, I'd
> > > HOLD_INTERRUPTS() after FlushRelationsAllBuffers(), not before.
> >
> > Hm - we never would want to process interrupts while in
> > FlushRelationsAllBuffers() or such, would we?  I'm ok with changing it, I
> > guess I just didn't see a reason not to use a wider scope.
>
> If we get a query cancel or fast shutdown, it's better for the user to abort
> the transaction rather than keep flushing.  smgrDoPendingSyncs() calls here
> rather late in the pre-commit actions, so failing is still supposed to be
> fine.  I think the code succeeds at making it fine to fail here.

But we don't actually intentionally accept interrupts in
FlushRelationsAllBuffers()? It would only happen as a side-effect of a
non-error elog/ereport() processing interrupts, right?

It also looks like we couldn't accept interrupts when called by
AbortTransaction(), because there we already are in a HOLD_INTERRUPTS()
region. I'm pretty sure an error would trigger at least an assertion. But
that's really an independent issue.

Moved.



> > > > @@ -471,6 +522,8 @@ smgrdounlinkall(SMgrRelation *rels, int nrels, bool isRedo)
> > > >      if (nrels == 0)
> > > >          return;
> > > >
> > > > +    HOLD_INTERRUPTS();
> > >
> > > I would move this below DropRelationsAllBuffers(), for reasons like
> > > FlushRelationsAllBuffers() above.
> >
> > I think that'd be unsafe. Once we dropped buffers from the buffer pool we
> > can't just continue without also unlinking the underlying relation, otherwise
> > an older view of the data later can be "revived from the dead" after an error,
> > causing all manner of corruption.
> >
> > I suspect it's always called with interrupts held already though.
>
> Ah, confirmed.  If I put this assert at the top of smgrdounlinkall(),
> check-world passes:
>
>     Assert(IsBinaryUpgrade || InRecovery || !INTERRUPTS_CAN_BE_PROCESSED());

I just made it hold interrupts for now, hope that makes sense?


> > > I struggle to speculate about the merits of this, because mdopen() can't fail.
> > > If mdopen() started to do things that could fail, mdnblocks() would be
> > > reasonable in assuming those things are done.  Hence, the long-term direction
> > > should be more like destroying the new smgr entry in the event of error.
> > >
> > > I would not make this change.  I'd maybe add a comment that smgr_open
> > > callbacks currently aren't allowed to fail, since smgropen() isn't ready to
> > > clean up smgr-level state from a failed open.  How do you see it?
> >
> > I see no disadvantage in the change - it seems strictly better to initialize
> > pincount earlier. I agree that it might be a good idea to explicitly handle
> > errors eventually, but that'd not be made harder by this change...
>
> Okay.  I suppose if mdopen() gained the ability to fail and mdnblocks() also
> gained the ability to cure said failure, this change would make that okay.

FWIW, if mdopen() could fail, it should probably only do fallible operations
after doing the non-fallible initialization. Then mdnblocks() wouldn't need to
cure anything.

Greetings,

Andres Freund

Attachment

pgsql-hackers by date:

Previous
From: Noah Misch
Date:
Subject: Re: AIO v2.5
Next
From: Matheus Alcantara
Date:
Subject: Re: dblink: Add SCRAM pass-through authentication