Re: B-tree descend for insertion locking - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: B-tree descend for insertion locking
Msg-id CA+U5nMLJ4pvAH8bhAn5HTmWu8OqxPVx83-tRVu4q=DcO86jHjg@mail.gmail.com
In response to B-tree descend for insertion locking  (Heikki Linnakangas <hlinnakangas@vmware.com>)
List pgsql-hackers
On 18 March 2014 11:12, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
> When inserting into a B-tree index, all the pages are read-locked when
> descending the tree. When we reach the leaf page, the read-lock is exchanged
> for a write-lock.
>
> There's nothing wrong with that, but why don't we just directly grab a
> write-lock on the leaf page? When descending, we know the level we're on,
> and what level the child page is. The only downside I can see is that we
> would unnecessarily hold a write-lock when a read-lock would suffice, if the
> page was just split and we have to move right. But that seems like a really
> bad bet - hitting the page when it was just split is highly unlikely.

Sounds good.

Grabbing the write lock directly will reduce contention on the buffer,
not just shorten the code path.
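
To make the contention point concrete, here is a toy model of the two
descent schemes. It is only a sketch: Page, descend and
leaf_lock_acquisitions are made-up names, not nbtree code, and it
assumes the parent is unlocked before the child is locked. It just
records which lock operations one insert performs.

```python
import bisect
from dataclasses import dataclass, field


@dataclass
class Page:
    level: int                                     # 0 = leaf page
    keys: list = field(default_factory=list)       # separator keys (internal) or items (leaf)
    children: list = field(default_factory=list)   # empty on leaf pages


def descend(root, key, direct_write_lock):
    """Return the (action, mode, level) lock events for one insert descent.

    direct_write_lock=False models the current scheme: read-lock every
    page, then swap the leaf's read lock for a write lock.
    direct_write_lock=True models the proposal: since the child's level
    is known, take the write lock on the leaf straight away.
    """
    def mode_for(page):
        return "write" if (direct_write_lock and page.level == 0) else "read"

    events = []
    page = root
    mode = mode_for(page)
    events.append(("lock", mode, page.level))
    while page.level > 0:
        child = page.children[bisect.bisect_right(page.keys, key)]
        events.append(("unlock", mode, page.level))
        page = child
        mode = mode_for(page)
        events.append(("lock", mode, page.level))
    if mode == "read":
        # Current scheme: the leaf lock has to be released and re-taken.
        events.append(("unlock", "read", 0))
        events.append(("lock", "write", 0))
    return events


def leaf_lock_acquisitions(events):
    """Count how many times the leaf's buffer lock was acquired."""
    return sum(1 for action, _, level in events if action == "lock" and level == 0)
```

On a two-level tree the current scheme acquires the leaf's lock twice
per insert and the proposed one only once, which is the saving I mean.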

If we have a considerable number of duplicates, we would normally step
right through the pages until we found a place to insert. Presumably
that stepping would now happen while holding a write lock rather than a
read lock. Would this slow down the insertion of highly duplicated keys
under concurrent load? In other words, is this a benefit for
nearly-unique indexes but not for other cases?
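
A rough way to frame that question, again as a hypothetical sketch
rather than a description of what nbtree actually does:
step_right_for_space is an invented helper, and whether the stepping
happens under read or write locks is exactly the assumption it takes
as a parameter.

```python
def step_right_for_space(pages_free, step_mode):
    """Walk a run of leaf pages holding the same key, left to right.

    pages_free -- free slots on each page in the run (hypothetical input)
    step_mode  -- "read" or "write": the lock mode assumed while stepping

    Returns (index of the page chosen for the insert, lock modes acquired).
    The last page is chosen even if it is full, since a real insert would
    split it; splitting is not modelled here.
    """
    if not pages_free:
        raise ValueError("the run must contain at least one page")
    locks = []
    for i, free in enumerate(pages_free):
        is_last = i == len(pages_free) - 1
        if free > 0 or is_last:
            if step_mode == "read":
                # Page was inspected under a read lock, then re-locked for writing.
                locks.extend(["read", "write"])
            else:
                locks.append("write")
            return i, locks
        locks.append(step_mode)
```

With three full pages ahead of one that has room, stepping in write
mode takes four exclusive locks where read mode takes one, which is the
cost I am wondering about under concurrent load.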

-- 
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


