Home > mailing lists

Re: [HACKERS] GSoC 2017: weekly progress reports (week 6) - Mailing list pgsql-hackers

From	Alexander Korotkov
Subject	Re: [HACKERS] GSoC 2017: weekly progress reports (week 6)
Date	April 9, 2018 17:57:19
Msg-id	CAPpHfdtPMFVWGXpQzKi4+y9ST1bKdsFW_0JRVv_yB_0s=_ttKA@mail.gmail.com Whole thread
In response to	Re: [HACKERS] GSoC 2017: weekly progress reports (week 6) (Heikki Linnakangas <hlinnaka@iki.fi>)
List	pgsql-hackers

Tree view

Hi!

Thank you for taking a look at this patch. I really appreciate your

attention over complex subjects like this.

On Mon, Apr 9, 2018 at 1:33 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:

On 28/03/18 19:53, Teodor Sigaev wrote:
As I understand, scan should lock any visited page, but now it's true only for
posting tree. Seems, it also should lock pages in entry tree because concurrent
procesess could add new entries which could be matched by partial search, for
example. BTW, partial search (see collectMatchBitmap()) locks correctly entry
tree, but regular startScanEntry() doesn't lock entry page in case of posting
tree, only in case of posting list.
I think this needs some high-level comments or README to explain how the locking works. It seems pretty ad hoc at the moment. And incorrect.

I agree that explanation in README in insufficient.

1. Why do we lock all posting tree pages, even though they all represent the same value? Isn't it enough to lock the root of the posting tree?

2. Why do we lock any posting tree pages at all, if we lock the entry tree page anyway? Isn't the lock on the entry tree page sufficient to cover the key value?

I already have similar concerns in [1]. The idea of locking posting tree leafs was to

get more granular locks. I think you've correctly describe it in the commit message

here:

With a very large posting tree, it would

possibly be better to lock the posting tree leaf pages instead, so that a

"skip scan" with a query like "A & B", you could avoid unnecessary conflict

if a new tuple is inserted with A but !B. But let's keep this simple.

However, it's very complex and error prone. So, +1 for simplify it for v11.

3. Why do we *not* lock the entry leaf page, if there is no match? We still need a lock to remember that we probed for that value and there was no match, so that we conflict with a tuple that might be inserted later.

At least #3 is a bug. The attached patch adds an isolation test that demonstrates it. #1 and #2 are weird, and cause unnecessary locking, so I think we should fix those too, even if they don't lead to incorrect results.

Remember, the purpose of predicate locks is to lock key ranges, not physical pages or tuples in the index. We use leaf pages as handy shortcut for "any key value that would belong on this page", but it is just an implementation detail.

I took a stab at fixing those issues, as well as the bug when fastupdate is turned on concurrently. Does the attached patch look sane to you?

Teodor has already answered you, and I'd like to mention that your patch

looks good for me too.

1. https://www.postgresql.org/message-id/CAPpHfdvED_-7KbJp-ei4zRZu1brLgkJt4CA-uxF0iRO9WX2Sqw%40mail.gmail.com

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

pgsql-hackers by date:

From: Chapman Flack
Date: 09 April 2018, 17:57:01
Subject: Re: lazy detoasting

From: ilmari@ilmari.org (Dagfinn Ilmari Mannsåker)
Date: 09 April 2018, 18:02:23
Subject: Re: Transform for pl/perl

Re: [HACKERS] GSoC 2017: weekly progress reports (week 6) - Mailing list pgsql-hackers

Previous

Next