Re: md.c vs elog.c vs smgrreleaseall() in barrier - Mailing list pgsql-hackers
From | Noah Misch |
---|---|
Subject | Re: md.c vs elog.c vs smgrreleaseall() in barrier |
Date | |
Msg-id | 20250320004514.95.nmisch@google.com Whole thread Raw |
In response to | Re: md.c vs elog.c vs smgrreleaseall() in barrier (Andres Freund <andres@anarazel.de>) |
Responses |
Re: md.c vs elog.c vs smgrreleaseall() in barrier
|
List | pgsql-hackers |
On Wed, Mar 19, 2025 at 06:45:20PM -0400, Andres Freund wrote: > On 2025-03-19 12:55:53 -0700, Noah Misch wrote: > > On Mon, Mar 17, 2025 at 07:52:02PM -0400, Andres Freund wrote: > > > @@ -362,12 +397,16 @@ smgrreleaseall(void) > > > if (SMgrRelationHash == NULL) > > > return; > > > > > > + HOLD_INTERRUPTS(); > > > > Likely not important, but it's not clear to me why smgrdestroyall() and > > smgrreleaseall() get HOLD_INTERRUPTS(), as opposed to relying on the holds in > > smgrdestroy() and smgrrelease(). In contrast, smgrreleaserellocator() does > > rely on smgrrelease() for the hold. > > It didn't seem particularly safe to allow interrupts, which in turn could > change the list of open relations, while iterating over a linked list / a > hashtable. Fair. > > > @@ -434,6 +481,8 @@ smgrdosyncall(SMgrRelation *rels, int nrels) > > > if (nrels == 0) > > > return; > > > > > > + HOLD_INTERRUPTS(); > > > + > > > FlushRelationsAllBuffers(rels, nrels); > > > > FlushRelationsAllBuffers() isn't part of smgr or md.c, so it's unlikely to > > become sensitive to smgrrelease(). It may do a ton of I/O. Hence, I'd > > HOLD_INTERRUPTS() after FlushRelationsAllBuffers(), not before. > > Hm - we never would want to process interrupts while in > FlushRelationsAllBuffers() or such, would we? I'm ok with changing it, I > guess I just didn't see a reason not to use a wider scope. If we get a query cancel or fast shutdown, it's better for the user to abort the transaction rather than keep flushing. smgrDoPendingSyncs() calls here rather late in the pre-commit actions, so failing is still supposed to be fine. I think the code succeeds at making it fine to fail here. > I guess I am a bit paranoid because gave me flashbacks to issues around > smgrtruncate() failing after doing DropRelationBuffers(), that Thomas recently > fixed (and I had worked on a few years before). But of course > DropRelationBuffers() is way more dangerous than FlushRelationsAllBuffers(). Fair. > > > } > > > } > > > + > > > + RESUME_INTERRUPTS(); > > > } > > > > > > /* > > > @@ -471,6 +522,8 @@ smgrdounlinkall(SMgrRelation *rels, int nrels, bool isRedo) > > > if (nrels == 0) > > > return; > > > > > > + HOLD_INTERRUPTS(); > > > > I would move this below DropRelationsAllBuffers(), for reasons like > > FlushRelationsAllBuffers() above. > > I think that'd be unsafe. Once we dropped buffers from the buffer pool we > can't just continue without also unlinking the underlying relation, otherwise > an older view of the data later can be "revived from the dead" after an error, > causing all manner of corruption. > > I suspect it's always called with interrupts held already though. Ah, confirmed. If I put this assert at the top of smgrdounlinkall(), check-world passes: Assert(IsBinaryUpgrade || InRecovery || !INTERRUPTS_CAN_BE_PROCESSED()); > > > Subject: [PATCH v2 2/2] smgr: Make SMgrRelation initialization safer against > > > errors > > > > > > In case the smgr_open callback failed, the ->pincount field would not be > > > initialized and the relation would not be put onto the unpinned_relns list. > > > > > > This buglet was introduced in 21d9c3ee4ef7. As that commit is only in HEAD, no > > > need to backpatch. > > > > > > Discussion: https://postgr.es/m/3vae7l5ozvqtxmd7rr7zaeq3qkuipz365u3rtim5t5wdkr6f4g@vkgf2fogjirl > > > --- > > > src/backend/storage/smgr/smgr.c | 6 +++--- > > > 1 file changed, 3 insertions(+), 3 deletions(-) > > > > > > diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c > > > index 53a09fe4aaa..24971304b85 100644 > > > --- a/src/backend/storage/smgr/smgr.c > > > +++ b/src/backend/storage/smgr/smgr.c > > > @@ -255,12 +255,12 @@ smgropen(RelFileLocator rlocator, ProcNumber backend) > > > reln->smgr_cached_nblocks[i] = InvalidBlockNumber; > > > reln->smgr_which = 0; /* we only have md.c at present */ > > > > > > - /* implementation-specific initialization */ > > > - smgrsw[reln->smgr_which].smgr_open(reln); > > > - > > > /* it is not pinned yet */ > > > reln->pincount = 0; > > > dlist_push_tail(&unpinned_relns, &reln->node); > > > + > > > + /* implementation-specific initialization */ > > > + smgrsw[reln->smgr_which].smgr_open(reln); > > > } > > > > I struggle to speculate about the merits of this, because mdopen() can't fail. > > If mdopen() started to do things that could fail, mdnblocks() would be > > reasonable in assuming those things are done. Hence, the long-term direction > > should be more like destroying the new smgr entry in the event of error. > > > > I would not make this change. I'd maybe add a comment that smgr_open > > callbacks currently aren't allowed to fail, since smgropen() isn't ready to > > clean up smgr-level state from a failed open. How do you see it? > > I see no disadvantage in the change - it seems strictly better to initialize > pincount earlier. I agree that it might be a good idea to explicitly handle > errors eventually, but that'd not be made harder by this change... Okay. I suppose if mdopen() gained the ability to fail and mdnblocks() also gained the ability to cure said failure, this change would make that okay.
pgsql-hackers by date: