Re: hash_xlog_split_allocate_page: failed to acquire cleanup lock - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: hash_xlog_split_allocate_page: failed to acquire cleanup lock |
Date | |
Msg-id | 20221014182140.l5woiudnp7i3zyzb@awork3.anarazel.de Whole thread Raw |
In response to | Re: hash_xlog_split_allocate_page: failed to acquire cleanup lock (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
Re: hash_xlog_split_allocate_page: failed to acquire cleanup lock
(Amit Kapila <amit.kapila16@gmail.com>)
Re: hash_xlog_split_allocate_page: failed to acquire cleanup lock (Robert Haas <robertmhaas@gmail.com>) |
List | pgsql-hackers |
Hi, On 2022-10-14 10:40:11 +0530, Amit Kapila wrote: > On Fri, Oct 14, 2022 at 2:25 AM Andres Freund <andres@anarazel.de> wrote: > > > > > > > > Attached are two patches. The first patch is what Robert has proposed > > > with some changes in comments to emphasize the fact that cleanup lock > > > on the new bucket is just to be consistent with the old bucket page > > > locking as we are initializing it just before checking for cleanup > > > lock. In the second patch, I removed the acquisition of cleanup lock > > > on the new bucket page and changed the comments/README accordingly. > > > > > > I think we can backpatch the first patch and the second patch can be > > > just a HEAD-only patch. Does that sound reasonable to you? > > > > Not particularly, no. I don't understand how "overwrite a page and then get a > > cleanup lock" can sensibly be described by this comment: > > > > > +++ b/src/backend/access/hash/hashpage.c > > > @@ -807,7 +807,8 @@ restart_expand: > > > * before changing the metapage's mapping info, in case we can't get the > > > * disk space. Ideally, we don't need to check for cleanup lock on new > > > * bucket as no other backend could find this bucket unless meta page is > > > - * updated. However, it is good to be consistent with old bucket locking. > > > + * updated and we initialize the page just before it. However, it is just > > > + * to be consistent with old bucket locking. > > > */ > > > buf_nblkno = _hash_getnewbuf(rel, start_nblkno, MAIN_FORKNUM); > > > if (!IsBufferCleanupOK(buf_nblkno)) > > > > This is basically saying "I am breaking basic rules of locking just to be > > consistent", no? > > > > Fair point. How about something like: "XXX Do we really need to check > for cleanup lock on the new bucket? Here, we initialize the page, so > ideally we don't need to perform any operation that requires such a > check."?. This still seems to omit that the code is quite broken. > Feel free to suggest something better. How about something like: XXX: This code is wrong, we're overwriting the buffer before "acquiring" the cleanup lock. Currently this is not known to have bad consequences because XYZ and the fix seems a bit too risky for the backbranches. Greetings, Andres Freund
pgsql-hackers by date: