Re: Why mark empty pages all visible? - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Why mark empty pages all visible?
Date
Msg-id 20230328043216.qg7zrotxbapzy7oa@awork3.anarazel.de
Whole thread Raw
In response to Re: Why mark empty pages all visible?  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: Why mark empty pages all visible?  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
Hi,

On 2023-03-27 20:12:11 -0700, Peter Geoghegan wrote:
> On Mon, Mar 27, 2023 at 6:48 PM Andres Freund <andres@anarazel.de> wrote:
> > It seems odd that we enter the page into the VM at this point. That means that
> > use of that page will now require a bit more work (including
> > RelationGetBufferForTuple() pinning it).
> 
> I think that it's fairly obvious that it's *not* odd at all. If it
> didn't do this then the pages would have to be scanned by VACUUM.

Yes - just like in the case of new pages.


> You haven't said anything about the leading cause of marking empty
> pages all-frozen, by far: lazy_vacuum_heap_page(). Should it also stop
> marking empty pages all-frozen?

It's not obvious that it should - but it's not as clear a case as the
ConditionalLockBufferForCleanup() -> lazy_scan_new_or_empty() one. In the
latter, we know
a) that we don't have to do any work to be able to advance the horizon
b) we know that somebody else has the page pinned

What's the point in marking it all-visible at that point? In quite likely be
from RelationGetBufferForTuple() having extended the relation and then briefly
needed to release the lock (to acquire the lock on otherBuffer or in
GetVisibilityMapPins()).


> > This got a bit stranger with 44fa84881fff, because now we add the page into
> > the VM even if it currently is pinned:
> 
> > It seems quite odd to set a page to all visible that we could not currently
> > get a cleanup lock on - obviously evidence of another backend trying to to do
> > something with the page.
> 
> You can say the same thing about lazy_vacuum_heap_page(), too.
> Including the part about cleanup locking.

I don't follow. In the ConditionalLockBufferForCleanup() ->
lazy_scan_new_or_empty() case we are dealing with an new or empty
page. Whereas lazy_vacuum_heap_page() deals with a page that definitely has
dead tuples on it.  How are those two cases comparable?


> > It seems pretty clear that we shouldn't enter a currently-in-use page into the
> > VM or freespacemap. All that's going to do is to "disturb" the backend trying
> > to use that page (by directing other backends to it) and to make its job more
> > expensive.
> 
> I don't think that it's clear. What about the case where there is only
> one tuple, on a page that we cannot cleanup lock? Where do you draw
> the line?

I don't see how that's comparable? For one, we might need to clean up that
tuple for vacuum to be able to advance the horizon. And as far as the
non-cleanup lock path goes, it actually can perform work there. And it doesn't
even need to acquire an exclusive lock.

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: "Hayato Kuroda (Fujitsu)"
Date:
Subject: RE: PGdoc: add missing ID attribute to create_subscription.sgml
Next
From: Justin Pryzby
Date:
Subject: Re: ALTER TABLE SET ACCESS METHOD on partitioned tables