Re: Why mark empty pages all visible? - Mailing list pgsql-hackers
From | Peter Geoghegan |
---|---|
Subject | Re: Why mark empty pages all visible? |
Date | |
Msg-id | CAH2-Wz=OjN_jQuJaF7f_iF8YVHbAFP9cKx1eJRf4XAdTsj0raw@mail.gmail.com Whole thread Raw |
In response to | Re: Why mark empty pages all visible? (Andres Freund <andres@anarazel.de>) |
Responses |
Re: Why mark empty pages all visible?
|
List | pgsql-hackers |
On Mon, Mar 27, 2023 at 9:32 PM Andres Freund <andres@anarazel.de> wrote: > On 2023-03-27 20:12:11 -0700, Peter Geoghegan wrote: > > On Mon, Mar 27, 2023 at 6:48 PM Andres Freund <andres@anarazel.de> wrote: > > > It seems odd that we enter the page into the VM at this point. That means that > > > use of that page will now require a bit more work (including > > > RelationGetBufferForTuple() pinning it). > > > > I think that it's fairly obvious that it's *not* odd at all. If it > > didn't do this then the pages would have to be scanned by VACUUM. > > Yes - just like in the case of new pages. I'm not saying that the status quo is free of contradictions. Only that there seem to be contradictions in what you're saying now. > > You haven't said anything about the leading cause of marking empty > > pages all-frozen, by far: lazy_vacuum_heap_page(). Should it also stop > > marking empty pages all-frozen? > > It's not obvious that it should - but it's not as clear a case as the > ConditionalLockBufferForCleanup() -> lazy_scan_new_or_empty() one. In the > latter, we know > a) that we don't have to do any work to be able to advance the horizon > b) we know that somebody else has the page pinned > > What's the point in marking it all-visible at that point? In quite likely be > from RelationGetBufferForTuple() having extended the relation and then briefly > needed to release the lock (to acquire the lock on otherBuffer or in > GetVisibilityMapPins()). I think that there is significant value in avoiding special cases, on general principle. If we stopped doing this in lazy_scan_new_or_empty() we'd be inventing a new thing that cleanup locks are supposed to protect against. Maybe something like that would make sense, but if so then make that argument, and make it explicitly represented in the code. > I don't follow. In the ConditionalLockBufferForCleanup() -> > lazy_scan_new_or_empty() case we are dealing with an new or empty > page. Whereas lazy_vacuum_heap_page() deals with a page that definitely has > dead tuples on it. How are those two cases comparable? It doesn't have dead tuples anymore, though. ISTM that there is an issue here with the definition of an empty page. You're concerned about PageIsEmpty() pages. Which actually aren't quite the same thing as an empty page left behind by lazy_vacuum_heap_page(). It's just that this distinction isn't quite acknowledged anywhere, and probably didn't exist at all at some point. Maybe that should be addressed. > > > It seems pretty clear that we shouldn't enter a currently-in-use page into the > > > VM or freespacemap. All that's going to do is to "disturb" the backend trying > > > to use that page (by directing other backends to it) and to make its job more > > > expensive. > > > > I don't think that it's clear. What about the case where there is only > > one tuple, on a page that we cannot cleanup lock? Where do you draw > > the line? > > I don't see how that's comparable? For one, we might need to clean up that > tuple for vacuum to be able to advance the horizon. And as far as the > non-cleanup lock path goes, it actually can perform work there. And it doesn't > even need to acquire an exclusive lock. So we should put space in the FSM if it has one tuple, but not if it has zero tuples? Though not if it has zero tuples following processing by lazy_vacuum_heap_page()? -- Peter Geoghegan
pgsql-hackers by date: