Home > mailing lists

Re: pgsql: Optimize btree insertions for common case of increasing values - Mailing list pgsql-committers

From	Pavan Deolasee
Subject	Re: pgsql: Optimize btree insertions for common case of increasing values
Date	April 5, 2018 19:27:40
Msg-id	CABOikdNoffgt3XkSKPipawi1QYXxEgs67ZV9OsTVEb6jpQN+iw@mail.gmail.com Whole thread Raw
In response to	Re: pgsql: Optimize btree insertions for common case of increasing values (Peter Geoghegan <pg@bowt.ie>)
Responses	Re: pgsql: Optimize btree insertions for common case of increasing values (Peter Geoghegan <pg@bowt.ie>)
List	pgsql-committers

Tree view

On Wed, Apr 4, 2018 at 10:46 PM, Peter Geoghegan <pg@bowt.ie> wrote:

> You mean passing "fastpath" to _bt_insertonpg and then checking it's false
> if page split is needed? But isn't page split is only needed if the page
> doesn't have enough free space? If so, we have checked for that before
> setting "fastpath".

That's not exactly what I meant. I meant that if:

1. This is an insertion to the leaf level in _bt_insertonpg().

and

2. We don't have space on the page, and so must do a split (or at
least free LP_DEAD items).

and

3. RelationGetTargetBlock(rel) != InvalidBlockNumber

There should be an assertion failure. This new assertion within
_bt_insertonpg() makes it much less likely that the assumption breaks
later.

I came up with something like this:

+ /*

+ * If we're here then a pagesplit is needed. We should never reach here

+ * if we're using the fastpath since we should have checked for all the

+ * required conditions, including the fact that this page has enough

+ * freespace. Note that this routine can in-theory deal with the

+ * situation where a NULL stack pointer is passed (that's what would

+ * happen if the fastpath is taken), like it does during crash

+ * recovery. But that path is much slower, defeating the very purpose

+ * of the optimisation. The following assertion should protect us from

+ * any future code changes that invalidate those assumptions.

+ *

+ * Note the fact that whenever we fail to take the fastpath, we clear

+ * the cached block. So checking for a valid cached block at this point

+ * is enough to decide whether we're in a fastpath or not.

+ */

+ Assert(!(P_ISLEAF(lpageop) &&

+ BlockNumberIsValid(RelationGetTargetBlock(rel))));

But then I started thinking can _bt_insertonpg() be called from a code path which doesn't start at _bt_doinsert()? I traced one such path _bt_getstackbuf() -> _bt_finish_split() -> _bt_insert_parent(), but that can only happen in case of a non-leaf page. The assertion which checks for a leaf page, should be fine, right?

Finally, I'd like to see a small paragraph in the nbtree README, about
the high level theory behind this optimization and page recycling. The
assumption is that there can only be one non-ignorable leaf rightmost
page, and so even a RecentGlobalXmin style interlock isn't required.
We cannot fail to detect that our hint was invalidated, because there
can only be one such page in the B-Tree at any time. It's possible
that the page will be deleted and recycled without a backend's cached
page also being detected as invalidated, but only when we happen to
recycle a page that once again becomes the rightmost leaf page.

Can I borrow that almost verbatim, after adding details about the optimisation itself? Looks like I'm too tired to think sensibly.

Once those changes are made, this should be fine to commit.

Ok, thanks.

Thanks,

Pavan

Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

pgsql-committers by date:

From: Alexander Korotkov
Date: 05 April 2018, 18:29:41
Subject: Re: pgsql: Skip full index scan during cleanup of B-tree indexes when possi

From: Teodor Sigaev
Date: 05 April 2018, 20:56:18
Subject: pgsql: Fix handling of non-upgraded B-tree metapages

Re: pgsql: Optimize btree insertions for common case of increasing values - Mailing list pgsql-committers

Previous

Next