Hi,
On 2022-08-17 10:18:14 +0530, Amit Kapila wrote:
> > Looking at the non-recovery code makes me even more suspicious:
> >
> > /*
> > * Physically allocate the new bucket's primary page. We want to do this
> > * before changing the metapage's mapping info, in case we can't get the
> > * disk space. Ideally, we don't need to check for cleanup lock on new
> > * bucket as no other backend could find this bucket unless meta page is
> > * updated. However, it is good to be consistent with old bucket locking.
> > */
> > buf_nblkno = _hash_getnewbuf(rel, start_nblkno, MAIN_FORKNUM);
> > if (!IsBufferCleanupOK(buf_nblkno))
> > {
> > 	_hash_relbuf(rel, buf_oblkno);
> > 	_hash_relbuf(rel, buf_nblkno);
> > 	goto fail;
> > }
> >
> >
> > _hash_getnewbuf() calls _hash_pageinit() which calls PageInit(), which
> > memset(0)s the whole page. What does it even mean to check whether you
> > effectively have a cleanup lock after you zeroed out the page?
> >
> > Reading the README and the comment above makes me wonder if this whole cleanup
> > lock business here is just cargo culting and could be dropped?
> >
>
> I think it is okay to not acquire a cleanup lock on the new bucket
> page in both the recovery and non-recovery paths. It is primarily
> required on the old bucket page to avoid concurrent scans/inserts. As
> mentioned in the comments, and as far as my memory serves, it is
> mainly for keeping it consistent with the old bucket locking.
Zeroing out a page and only afterwards checking for a cleanup lock is not
"keeping it consistent with the old bucket locking", hopefully at least. This
code is just broken on multiple fronts, and consistency isn't a defense.
Greetings,
Andres Freund