Locking B-tree leafs immediately in exclusive mode - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Locking B-tree leafs immediately in exclusive mode
Date
Msg-id CAPpHfduAMDFMNYTCN7VMBsFg_hsf0GqiqXnt+bSeaJworwFoig@mail.gmail.com
Whole thread Raw
Responses Re: Locking B-tree leafs immediately in exclusive mode
Re: Locking B-tree leafs immediately in exclusive mode
List pgsql-hackers
Hi!

Currently _bt_search() always locks leaf buffer in shared mode (aka BT_READ),
while caller can relock it later.  However, I don't see what prevents
_bt_search()
from locking leaf immediately in exclusive mode (aka BT_WRITE) when required.
When traversing downlink from non-leaf page of level 1 (lowest level of non-leaf
pages just prior to leaf pages), we know that next page is going to be leaf.  In
this case, we can immediately lock next page in exclusive mode if required.
For sure, it might happen that we didn't manage to exclusively lock leaf in this
way when _bt_getroot() points us to leaf page.  But this case could be handled
in _bt_search() by relock.  Please, find implementation of this approach in the
attached patch.

I've run following simple test of this patch on 72-cores machine.

# postgresql.conf
max_connections = 300
shared_buffers = 32GB
fsync = off
synchronous_commit = off

#  DDL:
CREATE UNLOGGED TABLE ordered (id serial primary key, value text not null);
CREATE UNLOGGED TABLE unordered (i integer not null, value text not null);

# script_ordered.sql
INSERT INTO ordered (value) VALUES ('abcdefghijklmnoprsqtuvwxyz');

# script_unordered.sql
\set i random(1, 1000000)
INSERT INTO unordered VALUES (:i, 'abcdefghijklmnoprsqtuvwxyz');

# commands
pgbench -T 60 -P 1 -M prepared -f script_ordered.sql -c 150 -j 150 postgres
pgbench -T 60 -P 1 -M prepared -f script_unordered.sql -c 150 -j 150 postgres

# results
ordered, master: 157473 TPS
ordered, patched 231374 TPS
unordered, master: 232372 TPS
unordered, patched: 232535 TPS

As you can see, difference in unordered case is negligible  Due to random
inserts, concurrency for particular leafs is low. But ordered
insertion is almost
50% faster on patched version.

I wonder how could we miss such a simple optimization till now, but I also don't
see this patch to brake anything.

In patched version, it might appear that we have to traverse
rightlinks in exclusive
mode due to splits concurrent to downlink traversal.  However, the same might
happen in current master due to splits concurrent to relocks.  So, I
don't expect
performance regression to be caused by this patch.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

pgsql-hackers by date:

Previous
From: serge@rielau.com
Date:
Subject: RE: Re: Spilling hashed SetOps and aggregates to disk
Next
From: Tomas Vondra
Date:
Subject: Re: Spilling hashed SetOps and aggregates to disk