Re: GSoC 2017 Proposal for "Explicitly support predicate locks inindex access methods besides btree" - Mailing list pgsql-hackers

From anant khandelwal
Subject Re: GSoC 2017 Proposal for "Explicitly support predicate locks inindex access methods besides btree"
Date
Msg-id CAD=a8SDTRodxh0yD_NVzUjbULYY2ETADQNEoViEbgigavO+WJw@mail.gmail.com
Whole thread Raw
In response to GSoC 2017 Proposal for "Explicitly support predicate locks in indexaccess methods besides btree"  (anant khandelwal <anantbietec@gmail.com>)
List pgsql-hackers
Also can i use the testing tool reside at src/test/isolation.

or i should use the testing tool dtester from markus wanner

On Sat, Apr 1, 2017 at 7:33 PM, anant khandelwal <anantbietec@gmail.com> wrote:
Hi guys,

My name is Anant Khandelwal currently i am pursuing masters from IIT - Delhi and previously i am a software engineer.

I am particularly interested in working on the project "Explicitly support predicate locks in index access methods besides b tree".I have gone through

http://hdl.handle.net/2123/5353
http://drkp.net/papers/ssi-vldb12.pdf

to understand how serializable implemented through the use of Snapshot isolation in PostgreSQL also gone through the codebase get some idea of different access methods for gin, gist, and hash indexes.
I want to discuss my idea.Then i move to make the proposal

Sorry for late i am busy with the academics.

From my findings the dependencies which create the anomalies in Snapshot isolation are:

wr-dependencies: if T1 writes a version of an object, and T2
reads that version, then T1 appears to have executed before T2
 ww-dependencies: if T1 writes a version of some object, and
T2 replaces that version with the next version, then T1 appears
to have executed before T2
 ww-dependencies: if T1 writes a version of some object, and
T2 replaces that version with the next version, then T1 appears
to have executed before T2

where T1 and T2 are the transaction

but due to the rw dependency caused by any insert into the index indexes other than the btree acquires relation level lock which causes serialization failure.So the main task is to implement page-level predicate locking in the remaining core index AMs

* Gist Index

B+tree index acquires predicate locks only on leaf pages as index tuples
with record pointers are stored on leaf pages. But for Gist index, we have
to consider locking at each index level as index tuples can be stored in
buffers associated with internal pages or in leaf pages.
So, the functions where we need to insert a call for

1. PredicateLockPage()

->gistScanPage()
after acquiring a shared lock on a buffer

2.CheckForSerializableConflictIn()

-> gistdoinsert()
after acquiring an exclusive lock on a target buffer and before inserting a
tuple


3. PredicateLockPageSplit()

->gistplacetopage()

If there is not enough space for insertion, we need to copy predicate lock
from an old page to all new pages generated after a successful split
operation.

PredicateLockPageSplit(Relation relation, BlockNumber oldblkno, BlockNumber
a lot) is used by b+-tree where two pages are involved in a split
operation. For  Gist index, we can define a similar function where more
than one page can be generated after split operation.

* Gin Index

Gin index consists of a B-tree index over key values, where each key is an
element of some indexed items and each tuple in a leaf page contains either
a posting list if the list is small enough or a pointer to posting tree.

1. PredicateLockPage()

->startscanentry()
   before calling collectMatchBitmap()

->scanPostingTree()
   after acquiring a shared lock on a leaf page

2.CheckForSerializableConflictIn()

-> ginentryinsert()

->gininsertitempointers()
   in case of insertion in a posting tree


3. PredicateLockPageSplit()

-> dataBeginPlacetoPageLeaf()

 after calling dataPlacetoPageLeafSplit()

* Hash Index

1. PredicateLockPage()

->hashgettuple()
->_hash_first()
->_hash_readnext()
->_hash_readprev()

2.CheckForSerializableConflictIn()

-> _hash_doinsert()

3. PredicateLockPageSplit()
 

This is just an idea also i understand that  Page level predicate locking exists in the btree AM, where it required the addition of 17 function calls to implement, but remains unimplemented in the gin, gist, spgist, brin, and hash index AMs. So we nned to insert function calls at other places also.

 Also tell me can we evaluate the performance on PostgreSQL on the following workloads 

  • SIBENCH microbenchMark
  • TPC-c++ 






pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: brin autosummarization -- autovacuum "work items"
Next
From: Alvaro Herrera
Date:
Subject: Re: BRIN de-summarize ranges