Thread: [HACKERS] GSoC 2017 : Proposal for predicate locking in gist index

[HACKERS] GSoC 2017 : Proposal for predicate locking in gist index

From
Shubham Barai
Date:

Hello everyone,

 

I have been accepted as GSoC student for the project "Explicitly support predicate locks in index access methods besides b-tree". I want to share my approach for implementation of page level predicate locking in gist index. Any suggestions will be appreciated.

 

Proposal

 

The main difference between b-tree and gist index while searching for a target tuple is that in gist index, we can determine if there is a match or not at any level of the index. In gist, index entry of internal nodes contains a predicate which is used as a search key to search all reachable tuples from that node. To insert a tuple in the index, we first check the key representing a target subtree. If it doesn't already cover the key we are inserting, we have to replace it with the union of old key and the key we are inserting. After considering all these points, it seems logical to acquire a predicate lock at each level of the index.

The simplest way to do that will be by inserting a call for prdicatelockpage() in gistscanpage().

Insertion algorithm also needs to check for conflicting predicate locks at each level of the index.

We can insert a call for CheckForSerializableConflictIn() at two places in gistdoinsert().

1. after acquiring an exclusive lock on internal page (in case we are trying to replace an old search key)

2. after acquiring an exclusive lock on leaf page

If there is not enough space for insertion, we have to copy predicate lock from an old page to all new pages generated after a successful split operation. For that, we can insert a call for PredicateLockPageSplit() in gistplacetopage().

 

Regards,

Shubham


Re: [HACKERS] GSoC 2017 : Proposal for predicate locking in gist index

From
Kevin Grittner
Date:
On Wed, May 31, 2017 at 3:02 PM, Shubham Barai <shubhambaraiss@gmail.com> wrote:

> I have been accepted as GSoC student for the project "Explicitly support
> predicate locks in index access methods besides b-tree". I want to share my
> approach for implementation of page level predicate locking in gist index.

For the benefit of all following the discussion, implementing
support in an index AM conceptually consists of two things:
(1) Any scan with user-visible results must create SIRead predicate
locks on "gaps" scanned.  (For example, a scan just to find an
insertion spot for an index entry does not generally count, whereas
a scan to satisfy a user "EXISTS" test does.)
(2) Any insert into the index must CheckForSerializableConflictIn()
at enough points that at least one of them will detect an overlap
with a predicate lock from a preceding scan if the inserted index
entry might have changed the results of that preceding scan.

Detecting such a conflict does not automatically result in a
serialization failure, but is only part of tracking the information
necessary to make such a determination.  All that is handled by the
existing machinery if the AM does the above.

With a btree index, locking gaps at the leaf level is normally
sufficient, because both scan and insertion of an index entry must
descend that far.

> The main difference between b-tree and gist index while searching for a
> target tuple is that in gist index, we can determine if there is a match or
> not at any level of the index.

Agreed.  A gist scan can fail at any level, but for that scan to
have produced a different result due to insertion of an index entry,
*some* page that the scan looked at must be modified.

> The simplest way to do that will be by inserting a call for
> prdicatelockpage() in gistscanpage().

Yup.

> Insertion algorithm also needs to check for conflicting predicate locks at
> each level of the index.

Yup.

> We can insert a call for CheckForSerializableConflictIn() at two places in
> gistdoinsert().
>
> 1. after acquiring an exclusive lock on internal page (in case we are trying
> to replace an old search key)
>
> 2. after acquiring an exclusive lock on leaf page
>
> If there is not enough space for insertion, we have to copy predicate lock
> from an old page to all new pages generated after a successful split
> operation. For that, we can insert a call for PredicateLockPageSplit() in
> gistplacetopage().

That all sounds good.  When you code a patch, don't forget to add
tests to src/test/isolation.

Do you need any help getting a development/build/test environment
set up?

--
Kevin Grittner
VMware vCenter Server
https://www.vmware.com/



Re: [HACKERS] GSoC 2017 : Proposal for predicate locking in gist index

From
Andrew Borodin
Date:
Hi, hackers!

2017-06-01 1:50 GMT+05:00 Kevin Grittner <kgrittn@gmail.com>:
> > The main difference between b-tree and gist index while searching for a
> > target tuple is that in gist index, we can determine if there is a match or
> > not at any level of the index.
>
> Agreed.  A gist scan can fail at any level, but for that scan to
> have produced a different result due to insertion of an index entry,
> *some* page that the scan looked at must be modified.

Two things seem non-obvious to me.
First, I just do not know, can VACUUM erase page with predicate lock?
Right now, GiST never deletes pages, as far as I know, so this
question is only matter of future compatibility.

Second, when we are doing GiST inserts, we can encounter unfinished
split. That's not a frequent situation, but still. Should we skip
finishing split or should we add it to collision check too?

Best regards, Andrey Borodin, Octonica.



Re: [HACKERS] GSoC 2017 : Proposal for predicate locking in gist index

From
Shubham Barai
Date:
Hi, Kevin sir!

On 1 June 2017 at 02:20, Kevin Grittner <kgrittn@gmail.com> wrote:
On Wed, May 31, 2017 at 3:02 PM, Shubham Barai <shubhambaraiss@gmail.com> wrote:

> I have been accepted as GSoC student for the project "Explicitly support
> predicate locks in index access methods besides b-tree". I want to share my
> approach for implementation of page level predicate locking in gist index.

For the benefit of all following the discussion, implementing
support in an index AM conceptually consists of two things:
(1) Any scan with user-visible results must create SIRead predicate
locks on "gaps" scanned.  (For example, a scan just to find an
insertion spot for an index entry does not generally count, whereas
a scan to satisfy a user "EXISTS" test does.)
(2) Any insert into the index must CheckForSerializableConflictIn()
at enough points that at least one of them will detect an overlap
with a predicate lock from a preceding scan if the inserted index
entry might have changed the results of that preceding scan.

Detecting such a conflict does not automatically result in a
serialization failure, but is only part of tracking the information
necessary to make such a determination.  All that is handled by the
existing machinery if the AM does the above.

With a btree index, locking gaps at the leaf level is normally
sufficient, because both scan and insertion of an index entry must
descend that far.

> The main difference between b-tree and gist index while searching for a
> target tuple is that in gist index, we can determine if there is a match or
> not at any level of the index.

Agreed.  A gist scan can fail at any level, but for that scan to
have produced a different result due to insertion of an index entry,
*some* page that the scan looked at must be modified.

> The simplest way to do that will be by inserting a call for
> prdicatelockpage() in gistscanpage().

Yup.

> Insertion algorithm also needs to check for conflicting predicate locks at
> each level of the index.

Yup.

> We can insert a call for CheckForSerializableConflictIn() at two places in
> gistdoinsert().
>
> 1. after acquiring an exclusive lock on internal page (in case we are trying
> to replace an old search key)
>
> 2. after acquiring an exclusive lock on leaf page
>
> If there is not enough space for insertion, we have to copy predicate lock
> from an old page to all new pages generated after a successful split
> operation. For that, we can insert a call for PredicateLockPageSplit() in
> gistplacetopage().

That all sounds good.  When you code a patch, don't forget to add
tests to src/test/isolation. 

Do you need any help getting a development/build/test environment
set up?
 
 Yes, any link to the documentation will be helpful.

--
Kevin Grittner
VMware vCenter Server
https://www.vmware.com/

Re: [HACKERS] GSoC 2017 : Proposal for predicate locking in gist index

From
Kevin Grittner
Date:
On Thu, Jun 1, 2017 at 12:49 AM, Andrew Borodin <borodin@octonica.com> wrote:

> First, I just do not know, can VACUUM erase page with predicate lock?

For handling in btree, see:


https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/access/nbtree/nbtpage.c;h=f815fd40b20e98a88cb2fb5c71005ea125a459c9;hb=refs/heads/master#l1406

Note also this discussion:

https://www.postgresql.org/message-id/4D669122.80904@enterprisedb.com

It doesn't look like we ever got to the optimization Heikki
suggested in that post, so on rare occasions we could see a false
positive from this.  Perhaps we should give this another look while
we're in the AMs.

> Right now, GiST never deletes pages, as far as I know, so this
> question is only matter of future compatibility.

ok

> Second, when we are doing GiST inserts, we can encounter unfinished
> split. That's not a frequent situation, but still. Should we skip
> finishing split or should we add it to collision check too?

When a page is split, I think you need to call
PredicateLockPageSplit() before it gets to the point that an insert
to get to it.

--
Kevin Grittner
VMware vCenter Server
https://www.vmware.com/