Re: ReadRecentBuffer() doesn't scale well - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: ReadRecentBuffer() doesn't scale well
Msg-id CAH2-WzkdXYPQ7ZeYD00x64TmEQjMXGOsa5wzLEFTeV7BibWh7w@mail.gmail.com
In response to Re: ReadRecentBuffer() doesn't scale well  (Andres Freund <andres@anarazel.de>)
Responses Re: ReadRecentBuffer() doesn't scale well
List pgsql-hackers
On Mon, Jun 26, 2023 at 11:27 PM Andres Freund <andres@anarazel.de> wrote:
> On 2023-06-26 21:53:12 -0700, Peter Geoghegan wrote:
> > It should be safe to allow searchers to see a version of the root page
> > that is out of date. The Lehman & Yao design is very permissive about
> > these things. There aren't any special cases where the general rules
> > are weakened in some way that might complicate this approach.
> > Searchers need to check the high key to determine if they need to move
> > right -- same as always.
>
> Wouldn't we at least need a pin on the root page, or hold a snapshot, to
> defend against page deletions?

You need to hold a snapshot to prevent concurrent page recycling --
though not page deletion itself (I did say "anything that you'd
usually think of as an interlock"). I'm pretty sure that a concurrent
page deletion is possible, even when you hold a pin on the page.
(Perhaps not, but if not then it's just an accident -- a side-effect
of the interlock that protects against concurrent heap TID recycling.)

You can't delete a rightmost page (on any level). Every root page is a
rightmost page. So the root would have to be split, and then once
again emptied before it could be deleted -- only then would there be a
danger of some backend with a locally cached root page having an
irredeemably bad picture of what's going on with the index. That's
another angle that you could approach the problem from, I suppose.

--
Peter Geoghegan
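
The "check the high key, move right" rule from the message above can be illustrated with a toy model. This is only a sketch under stated assumptions, not PostgreSQL's nbtree code: the names `Leaf`, `RootCopy`, and `search` are hypothetical, keys are plain integers, and the model deliberately ignores page deletion and recycling (the cases the snapshot discussion is about). It shows a searcher descending from an out-of-date copy of the root and still finding keys that moved to a new right sibling after a split.

```python
import bisect
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Leaf:
    """Hypothetical leaf page: keys plus Lehman & Yao style high key and right-link."""
    keys: List[int]
    high_key: Optional[int] = None  # None means rightmost page (no upper bound)
    right: Optional[int] = None     # block number of the right sibling, if any


@dataclass
class RootCopy:
    """Hypothetical locally cached root: sorted (lower bound, child block) downlinks."""
    downlinks: List[Tuple[int, int]]


def search(root: RootCopy, leaves: Dict[int, Leaf], key: int) -> bool:
    """Descend from a possibly stale root copy, then move right as needed."""
    bounds = [lower for lower, _ in root.downlinks]
    # Pick the last downlink whose lower bound is <= key.
    idx = max(bisect.bisect_right(bounds, key) - 1, 0)
    leaf = leaves[root.downlinks[idx][1]]
    # The leaf is read "live", so it may have split since the root was cached.
    # If the key is beyond the high key, follow the right-link.
    while leaf.high_key is not None and key > leaf.high_key:
        leaf = leaves[leaf.right]
    return key in leaf.keys


# Root copy cached while there were only two leaves (blocks 1 and 2).
stale_root = RootCopy(downlinks=[(0, 1), (21, 2)])

# Current state of the leaf level: block 2 has since split into blocks 2 and 3.
leaves = {
    1: Leaf(keys=[10, 20], high_key=20, right=2),
    2: Leaf(keys=[30, 40], high_key=40, right=3),
    3: Leaf(keys=[50, 60]),  # new rightmost page, unknown to the stale root
}

found_moved_key = search(stale_root, leaves, 60)    # reached via block 2's right-link
found_missing_key = search(stale_root, leaves, 45)  # moves right, key is absent
```

The stale root never learns about block 3, yet the search for 60 succeeds because block 2's high key (40) tells the searcher to follow the right-link. What the sketch cannot show is why a recycled page would break this; that is the interlock question discussed in the message.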
