Re: hash_xlog_split_allocate_page: failed to acquire cleanup lock - Mailing list pgsql-hackers

From Andres Freund
Subject Re: hash_xlog_split_allocate_page: failed to acquire cleanup lock
Date
Msg-id 20221013205535.cknud3qrq6woe2o7@awork3.anarazel.de
Whole thread Raw
In response to Re: hash_xlog_split_allocate_page: failed to acquire cleanup lock  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: hash_xlog_split_allocate_page: failed to acquire cleanup lock  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
Hi,

On 2022-10-06 12:44:24 +0530, Amit Kapila wrote:
> On Sat, Oct 1, 2022 at 12:35 AM Andres Freund <andres@anarazel.de> wrote:
> > My problem with this approach is that the whole cleanup lock is hugely
> > misleading as-is. As I noted in
> > https://www.postgresql.org/message-id/20220817193032.z35vdjhpzkgldrd3%40awork3.anarazel.de
> > we take the cleanup lock *after* re-initializing the page. Thereby
> > completely breaking the properties that a cleanup lock normally tries to
> > guarantee.
> >
> > Even if that were to achieve something useful (doubtful in this case),
> > it'd need a huge comment explaining what's going on.
> >
>
> Attached are two patches. The first patch is what Robert has proposed
> with some changes in comments to emphasize the fact that cleanup lock
> on the new bucket is just to be consistent with the old bucket page
> locking as we are initializing it just before checking for cleanup
> lock. In the second patch, I removed the acquisition of cleanup lock
> on the new bucket page and changed the comments/README accordingly.
>
> I think we can backpatch the first patch and the second patch can be
> just a HEAD-only patch. Does that sound reasonable to you?

Not particularly, no. I don't understand how "overwrite a page and then get a
cleanup lock" can sensibly be described by this comment:

> +++ b/src/backend/access/hash/hashpage.c
> @@ -807,7 +807,8 @@ restart_expand:
>       * before changing the metapage's mapping info, in case we can't get the
>       * disk space.  Ideally, we don't need to check for cleanup lock on new
>       * bucket as no other backend could find this bucket unless meta page is
> -     * updated.  However, it is good to be consistent with old bucket locking.
> +     * updated and we initialize the page just before it.  However, it is just
> +     * to be consistent with old bucket locking.
>       */
>      buf_nblkno = _hash_getnewbuf(rel, start_nblkno, MAIN_FORKNUM);
>      if (!IsBufferCleanupOK(buf_nblkno))

This is basically saying "I am breaking basic rules of locking just to be
consistent", no?

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: remove redundant memset() call
Next
From: Nathan Bossart
Date:
Subject: Re: libpq support for NegotiateProtocolVersion