Re: Avoiding superfluous buffer locking during nbtree backwards scans - Mailing list pgsql-hackers
From | Matthias van de Meent |
---|---|
Subject | Re: Avoiding superfluous buffer locking during nbtree backwards scans |
Date | |
Msg-id | CAEze2Wh45vj6gwLQi5NjL9BdgGNDd6UgNkxuedZgHbRpq4W7UQ@mail.gmail.com Whole thread Raw |
In response to | Avoiding superfluous buffer locking during nbtree backwards scans (Peter Geoghegan <pg@bowt.ie>) |
Responses |
Re: Avoiding superfluous buffer locking during nbtree backwards scans
|
List | pgsql-hackers |
On Wed, 16 Oct 2024 at 20:03, Peter Geoghegan <pg@bowt.ie> wrote: > > On Fri, Oct 11, 2024 at 7:29 PM Peter Geoghegan <pg@bowt.ie> wrote: > > Attached is v5 > > Now I'm attaching a v6, which further polishes things. Current plan is to > commit something very close to this in the next day or two. > > v6 is mostly just further comment polishing. But it also merges together > the two existing _bt_readnextpage loops (for forward and backward scan > directions) into one single loop that handles everything. This is > definitely a win for brevity, and might also be a win for clarity -- > but I'm not 100% sure which way I prefer it just yet. (patch v6) > - BlockNumber btps_scanPage; /* latest or next page to be scanned */ > + BlockNumber btps_nextScanPage; /* next page to be scanned */ > + BlockNumber btps_lastCurrPage; /* page whose sibling link was copied into > + * btps_scanPage */ This reference to btps_scanPage in the comment needs to be updated to its new name. (from your v5 mail) > * Now nbtree has only one PredicateLockPage call, inside _bt_readpage. > This saves us an extra BufferGetBlockNumber call. (v4 pushed > PredicateLockPage calls down one layer, v5 pushes them down another > layer still.) (and seen in patch v6) > + /* allow next page to be processed by parallel worker */ > + if (scan->parallel_scan) > + { > + if (ScanDirectionIsForward(dir)) > + _bt_parallel_release(scan, so->currPos.nextPage, > + so->currPos.currPage); > + else > + _bt_parallel_release(scan, so->currPos.prevPage, > + so->currPos.currPage); > + } > + > + PredicateLockPage(rel, so->currPos.currPage, scan->xs_snapshot); I'm a little bit concerned about this change: I'm not familiar with predicate locks, PLP, or anything related to SSI, but previously we called PLP in parallel scans while we still had hold of the scan, while we now call PLP only after letting other backends do things, allowing PLPs for this scan to happen concurrently with other backends of this scan. In principle, that looks like an improvement in parallelism by reducing the work done in the critical path. However, before, we would always get predicate locks in ascending (or descending for backward scans) value order, but now that strict keyspace order access has been released an unfortunately timed descheduling of the process between _bt_p_release and PLP's guts means mean lock acquisition would be only approximately in index leaf page order. If the lock acquisition order matters in the predicate lock system, then this will likely be a new cause of lock conflicts/deadlocks. If that's a real issue (by which I mean predicate locking is indeed sensitive to acquisition order) then this call to PredicateLockPage must happen before _bt_p_release, so that users don't get conflicts caused only by bad timing issues in single-directional index scans. Apart from these two comments, LGTM. Kind regards, Matthias van de Meent Neon (https://neon.tech)
pgsql-hackers by date: