On Tue, Feb 4, 2025 at 12:44 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
> On Mon, Feb 3, 2025 at 9:09 PM Andres Freund <andres@anarazel.de> wrote:
> >
> > On 2025-01-29 14:12:52 -0500, Melanie Plageman wrote:
> > > From 71f32189aad510b73d221fb0478ffd916e5e5dde Mon Sep 17 00:00:00 2001
> > > From: Melanie Plageman <melanieplageman@gmail.com>
>
> > > @@ -1064,7 +1327,45 @@ lazy_scan_heap(LVRelState *vacrel)
> > > if (got_cleanup_lock)
> > > lazy_scan_prune(vacrel, buf, blkno, page,
> > > vmbuffer, all_visible_according_to_vm,
> > > - &has_lpdead_items);
> > > + &has_lpdead_items, &vm_page_frozen);
> > > +
> > > + /*
> > > + * Count an eagerly scanned page as a failure or a success.
> > > + */
> > > + if (was_eager_scanned)
> >
> > Hm - how should pages be counted that we couldn't get a lock on? I think
> > right now they'll be counted as a failure, but that doesn't seem quite right.
>
> Yea, I thought that counting them as failures made sense because we
> did fail to freeze them. However, now that you mention it, we didn't
> fail to freeze them because of age, so maybe we don't want to count
> them as failures. I don't expect us to have a bunch of contended
> all-visible pages, so I think the question is about what makes it more
> clear in the code. What do you think? Should I reset was_eager_scanned
> to false if we don't get the cleanup lock?
>
I feel like if we are making the trade-off in resources to attempt
eager scanning, and we weren't making progress for whatever reason
(and in the lock failure cases, wouldn't some of those be things that
would prevent us from freezing?) then it would generally be ok to bias
towards bailing sooner rather than later.
Robert Treat
https://xzilla.net