Re: hash_xlog_split_allocate_page: failed to acquire cleanup lock - Mailing list pgsql-hackers
From: Amit Kapila
Subject: Re: hash_xlog_split_allocate_page: failed to acquire cleanup lock
Msg-id: CAA4eK1JMGp+s22utSkyWp15ybidV65f_A28H=-yUWg4d4xJjZw@mail.gmail.com
In response to: Re: hash_xlog_split_allocate_page: failed to acquire cleanup lock (Andres Freund <andres@anarazel.de>)
List: pgsql-hackers
On Wed, Aug 17, 2022 at 6:27 AM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2022-08-16 19:55:18 -0400, Tom Lane wrote:
> > Robert Haas <robertmhaas@gmail.com> writes:
> > > What sort of random things would someone do with pageinspect functions
> > > that would hold buffer pins on one buffer while locking another one?
> > > The functions in hashfuncs.c don't even seem like they would access
> > > multiple buffers in total, let alone at overlapping times. And I don't
> > > think that a query pageinspect could realistically be suspended while
> > > holding a buffer pin either. If you wrapped it in a cursor it'd be
> > > suspended before or after accessing any given buffer, not right in the
> > > middle of that operation.
> >
> > pin != access.  Unless things have changed really drastically since
> > I last looked, a seqscan will sit on a buffer pin throughout the
> > series of fetches from a single page.
>
> That's still the case. But for heap that shouldn't be a problem, because we'll
> never try to take a cleanup lock while holding another page locked (nor even
> another heap page pinned, I think).
>
> I find it *highly* suspect that hash needs to acquire a cleanup lock while
> holding another buffer locked. The recovery aspect alone makes that seem quite
> unwise. Even if there's possibly no deadlock here for some reason or another.
>
> Looking at the non-recovery code makes me even more suspicious:
>
>     /*
>      * Physically allocate the new bucket's primary page.  We want to do this
>      * before changing the metapage's mapping info, in case we can't get the
>      * disk space.  Ideally, we don't need to check for cleanup lock on new
>      * bucket as no other backend could find this bucket unless meta page is
>      * updated.  However, it is good to be consistent with old bucket locking.
>      */
>     buf_nblkno = _hash_getnewbuf(rel, start_nblkno, MAIN_FORKNUM);
>     if (!IsBufferCleanupOK(buf_nblkno))
>     {
>         _hash_relbuf(rel, buf_oblkno);
>         _hash_relbuf(rel, buf_nblkno);
>         goto fail;
>     }
>
> _hash_getnewbuf() calls _hash_pageinit() which calls PageInit(), which
> memset(0)s the whole page. What does it even mean to check whether you
> effectively have a cleanup lock after you zeroed out the page?
>
> Reading the README and the comment above makes me wonder if this whole cleanup
> lock business here is just cargo culting and could be dropped?
>

I think it is okay not to acquire a cleanup lock on the new bucket page in
both the recovery and non-recovery paths. The cleanup lock is primarily
required on the old bucket page, to avoid concurrent scans/inserts. As the
comment notes, and as far as my memory serves, the check on the new bucket
is mainly there to keep its locking consistent with the old bucket's.

--
With Regards,
Amit Kapila.