At Mon, 01 Nov 2021 18:01:18 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote in
> "Bossart, Nathan" <bossartn@amazon.com> writes:
> > On 10/31/21, 1:55 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
> >> 2. I think we almost certainly have a problem in SyncPostCheckpoint.
>
> > This one doesn't look as straightforward. It looks like we might need
> > a list_delete_first_n() to delete the first N entries all at once to
> > improve this one.
>
> Yeah. We don't absolutely need a new list primitive: we could use
> list_copy_tail() and then free the old list. But the extra palloc
> traffic involved makes this sound like a bad idea. It does have
> the advantage that we could shorten the List's storage even when
> it doesn't go to empty, but I'm not sure that's worth anything.
> If the List isn't going to empty, that implies that we're getting
> a steady stream of unlink requests, meaning we'd probably just
> fill it up again.
I agree to that. In general I think it's better not to resize storage
on truncation (or shortning) of a list except for a few special cases.
(for example, for a list that temporarily grows prominently but won't
go empty)
> The minimum-change patch would have us truncating the list before
> each AbsorbSyncRequests call, so that the list state meets that
> function's expectations. However, as long as UNLINKS_PER_ABSORB
> is only 10, I don't think that gets us out of the O(N^2) woods.
Agreed.
> So what I did in the attached is add a "canceled" flag to
> PendingUnlinkEntry, which lets us deal with canceled or finished
> entries without having to delete them from the list right away.
> Then we only need to physically clean up the list once per
> SyncPostCheckpoint call.
We don't loop over so many canceled elements usually so I think it
works well. However, shouldn't we consider canceled before checking
cycle_ctr? A canceled elements should be invisible from later
accesses at all. I vaguely feel the name "cancel" might be better be
"consumed" or such but I don't object to "cancel".
I feel that we might need to wipe_mem for the memmove case as well
(together with list_delete_nth_cell) but that is another thing even if
that's correct.
Otherwise it looks good to me.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center