Re: Extending SMgrRelation lifetimes - Mailing list pgsql-hackers
From | Heikki Linnakangas |
---|---|
Subject | Re: Extending SMgrRelation lifetimes |
Date | |
Msg-id | 621a52fd-3cd8-4f5d-a561-d510b853bbaf@iki.fi Whole thread Raw |
In response to | Re: Extending SMgrRelation lifetimes (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: Extending SMgrRelation lifetimes
Re: Extending SMgrRelation lifetimes |
List | pgsql-hackers |
I spent some more time digging into this, experimenting with different approaches. Came up with pretty significant changes; see below: On 18/09/2023 18:19, Robert Haas wrote: > I think that if you believe 0001 to be correct you should go ahead and > commit it sooner rather than later. If you're wrong and something > weird starts happening we'll then have a chance to notice that before > too much other stuff gets changed on top of this and confuses the > matter. +1 > On Wed, Aug 23, 2023 at 12:55 AM Thomas Munro <thomas.munro@gmail.com> wrote: >> Right, let's one find one good place. I think smgropen() would be best. > > I think it would be a good idea to give this comment a bit more oomph. > In lieu of this: > > + * This does not attempt to actually open the underlying files. The returned > + * object remains valid at least until AtEOXact_SMgr() is called, or until > + * smgrdestroy() is called in non-transactional backends. > > I would leave the existing "This does not attempt to actually open the > underlying files." comment as a separate comment, and add something > like this as a new paragraph: > > In versions of PostgreSQL prior to 17, this function returned an > object with no defined lifetime. Now, however, the object remains > valid for the lifetime of the transaction, up to the point where > AtEOXact_SMgr() is called, making it much easier for callers to know > for how long they can hold on to a pointer to the returned object. If > this function is called outside of a transaction, the object remains > valid until smgrdestroy() or smgrdestroyall() is called. Background > processes that use smgr but not transactions typically do this once > per checkpoint cycle. +1 > Apart from that, the main thing that is bothering me is that the > justification for redefining smgrclose() to do what smgrrelease() now > does isn't really spelled out anywhere. You mentioned some reasons and > Heikki mentioned the benefit to extensions, but I feel like somebody > should be able to understand the reasoning clearly from reading the > commit message or the comments in the patch, rather than having to > visit the mailing list discussion, and I'm not sure we're there yet. I > feel like I understood why we were doing this and was convinced it was > a good idea at some point, but now the reasoning has gone out of my > head and I can't recover it. If somebody does smgropen() .. stuff ... > smgrclose() as in heapam_relation_copy_data() or index_copy_data(), > this change has the effect of making the SmgrRelation remain valid > until eoxact whereas before it would have been destroyed instantly. Is > that what we want? Presumably yes, or this patch wouldn't be shaped > like this, but I don't know why we want that... Fair. I tried to address that by adding an overview comment at top of smgr.c, explaining how this stuff work. I hope that helps. > Another thing that seems a bit puzzling is how this is intended to, or > does, interact with the ownership mechanism. Intuitively, I feel like > a Relation owns an SMgrRelation *because* the SMgrRelation has no > defined lifetime. But I think that's not quite right. I guess the > ownership mechanism doesn't guarantee anything about the lifetime of > the object, just that the pointer in the Relation won't hang around > any longer than the object to which it's pointing. So does that mean > that we're free to redefine the object lifetime to be pretty much > anything we want and that mechanism doesn't really care? Huh, > interesting. Yeah that owner mechanism is weird. It guarantees that the pointer to the SMgrRelation is cleared when the SMgrRelation is destroyed. But it also prevents the SMgrRelation from being destroyed at end of transaction. That's how it is in 'master' too. But with this patch, we don't normally call smgrdestroy() on an SMgrRelation that came from the relation cache. We do call smgrdestroyall() in the aux processes, but they don't have a relcache. So the real effect of setting the owner now is to prevent the SMgrRelation from being destroyed at end of transaction; the mechanism of clearing the pointer is unused. I found two exceptions to that, though, by adding some extra assertions and running the regression tests: 1. The smgrdestroyall() in a single-user backend in RequestCheckpoint(). It destroys SMgrRelations belonging to relcache entries, and the owner mechanism clears the pointers from the relcache. I think smgrcloseall(), or doing nothing, would actually be more appropriate there. 2. A funny case with foreign tables: ANALYZE on a foreign table calls visibilitymap_count(). A foreign table has no visibility map so it returns 0, but before doing so it calls RelationGetSmgr on the foreign table, which has 0/0/0 rellocator. That creates an SMgrRelation for 0/0/0, and sets the foreign table's relcache entry as its owner. If you then call ANALYZE on another foreign table, it also calls RelationGetSmgr with 0/0/0 rellocator, returning the same SMgrRelation entry, and changes its owner to the new relcache entry. That doesn't make much sense and it's pretty accidental that it works at all, so attached is a patch to avoid calling visibilitymap_count() on foreign tables. I propose that we replace the single owner with a "pin counter". One SMgrRelation can have more than one pin on it, and the guarantee is that as long as the pin counter is non-zero, the SMgrRelation cannot be destroyed and the pointer remains valid. We don't really need the capability for more than one pin at the moment (the regression tests pass with an extra assertion that pincount <= 1 after fixing the foreign table issue), but it seems more straightforward than tracking an owner. Here's another reason to do that: I noticed this at the end of swap_relation_files(): > /* > * Close both relcache entries' smgr links. We need this kluge because > * both links will be invalidated during upcoming CommandCounterIncrement. > * Whichever of the rels is the second to be cleared will have a dangling > * reference to the other's smgr entry. Rather than trying to avoid this > * by ordering operations just so, it's easiest to close the links first. > * (Fortunately, since one of the entries is local in our transaction, > * it's sufficient to clear out our own relcache this way; the problem > * cannot arise for other backends when they see our update on the > * non-transient relation.) > * > * Caution: the placement of this step interacts with the decision to > * handle toast rels by recursion. When we are trying to rebuild pg_class > * itself, the smgr close on pg_class must happen after all accesses in > * this function. > */ > RelationCloseSmgrByOid(r1); > RelationCloseSmgrByOid(r2); If RelationCloseSmgrByOid() merely closes the underlying file descriptor but doesn't destroy the SMgrRelation object - as it does with these patches - I think we reintroduce the dangling reference problem that the comment mentions. But if we allow the same SMgrRelation to be pinned by two different relcache entries, the problem goes away and we can remove that kluge. I think we're missing test coverage for that though. I commented out those calls in 'master' and ran the regression tests, but got no failures. I don't fully understand the problem anyway. Or does it not exist anymore? Is there a moment where we have two relcache entries pointing to the same SMgrRelation? I don't see it. In any case, with a pin counter mechanism, I believe it would be fine. Summary of the changes to the attached main patch: * Added an overview comment at top of smgr.c * Added the comment Robert suggested to smgropen() * Replaced the single owner with a pin count and smgrpin() / smgrunpin() functions. smgrdestroyall() now only destroys unpinned entries * Removed that kluge from swap_relation_files(). It should not be needed anymore with the pin counter. * Changed a few places in bufmgr.c where we called RelationGetSmgr() on every smgr call to keep the SMgrRelation in a local variable. That was not safe before, but is now. I don't think we should go on a spree to change all callers - RelationGetSmgr() is still cheap - but in these few places it seems worthwhile. * I kept the separate smgrclose() and smgrrelease() functions. I know I suggested to just change smgrclose() to do what smgrrelease() did, but on second thoughts keeping them separate seems nicer. However, sgmgrclose() just calls smgrrelease() now, so the distinction is just pro forma. The idea is that if you call smgrclose(), you hint that you don't need that SMgrRelation pointer anymore, although there might be other pointers to the same object and they stay valid. -- Heikki Linnakangas Neon (https://neon.tech)
Attachment
pgsql-hackers by date: