Thread: Deadlock possibility in _bt_check_unique?

Deadlock possibility in _bt_check_unique?

From
Gokulakannan Somasundaram
Date:
Hi,<br />   With the implementation of deferred unique constraints, we need to go back to the index second time to
checkwhether the unique check is valid. Say a situation occurs like this<br />a) the first session doing the unique
checkfinds out that there is a unique check required second time and just makes its entry and comes back<br /> b) the
secondsession doing the unique check finds out that there is a unique check required second time and just makes its
entryand comes back<br /><br />While they do the second check, first session will wait for the session to complete and
viceversa. Won't this result in a deadlock? Isn't this a realistic scenario?<br /><br />Thanks,<br />Gokul.<br /> 

Re: Deadlock possibility in _bt_check_unique?

From
Heikki Linnakangas
Date:
Gokulakannan Somasundaram wrote:
> Hi,
>    With the implementation of deferred unique constraints, we need to go
> back to the index second time to check whether the unique check is valid.
> Say a situation occurs like this
> a) the first session doing the unique check finds out that there is a unique
> check required second time and just makes its entry and comes back
> b) the second session doing the unique check finds out that there is a
> unique check required second time and just makes its entry and comes back
> 
> While they do the second check, first session will wait for the session to
> complete and vice versa. Won't this result in a deadlock? Isn't this a
> realistic scenario?

Yes, that can happen. The deadlock detector will kick in and abort one
of the sessions.

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com


Re: Deadlock possibility in _bt_check_unique?

From
Gokulakannan Somasundaram
Date:
Can you also explain how are we avoiding duplicates in this scenario?<br /><br />a) Say there are three pages(P,Q, R)
fullof duplicate tuples, that are deleted but not dead of id x(due to some long running transaction).<br />b) Now
SessionA gets in and checks the duplicate tuples for their liveliness with the HeapTuple for id x with shared lock on
allthe pages P, Q and R. Since all are deleted, it will get the message, that it need not come back to check again for
uniquenessFinally it again starts from P to check for freespace to insert its tuple. Say it inserts the tuple at page
Q.<br/> c) Now Session B(with same id x) starts after Session A, but it passes Q before the insertion of the tuple by
SessionA. It will also get the response from _bt_check_unique, that it need not comeback for second time unique check.
Nowit checks for freespace from P and it finds freespace at P. Then it will insert the new record at P itself.<br /><br
/>Sowe have two duplicate records, eventhough there is a unique constraint. Is this a possible scenario?<br /><br
/>Thanks,<br/>Gokul.<br /><br /><br /> 

Re: Deadlock possibility in _bt_check_unique?

From
Tom Lane
Date:
Gokulakannan Somasundaram <gokul007@gmail.com> writes:
> Can you also explain how are we avoiding duplicates in this scenario?
> a) Say there are three pages(P,Q, R) full of duplicate tuples, that are
> deleted but not dead of id x(due to some long running transaction).
> b) Now Session A gets in and checks the duplicate tuples for their
> liveliness with the HeapTuple for id x with shared lock on all the pages P,
> Q and R. Since all are deleted, it will get the message, that it need not
> come back to check again for uniqueness Finally it again starts from P to
> check for freespace to insert its tuple. Say it inserts the tuple at page Q.
> c) Now Session B(with same id x) starts after Session A, but it passes Q
> before the insertion of the tuple by Session A. It will also get the
> response from _bt_check_unique, that it need not comeback for second time
> unique check. Now it checks for freespace from P and it finds freespace at
> P. Then it will insert the new record at P itself.

> So we have two duplicate records, eventhough there is a unique constraint.
> Is this a possible scenario?

Are you talking about exclusion constraints or btree uniqueness
constraints?  This doesn't seem to be a particularly accurate
description of the implementation of either one.  The way btree
deals with this is explained in _bt_doinsert:
    * NOTE: obviously, _bt_check_unique can only detect keys that are already    * in the index; so it cannot defend
againstconcurrent insertions of the    * same key.  We protect against that by means of holding a write lock on    *
thetarget page.  Any other would-be inserter of the same key must    * acquire a write lock on the same target page, so
onlyone would-be    * inserter can be making the check at one time.  Furthermore, once we are    * past the check we
holdwrite locks continuously until we have performed    * our insertion, so no later inserter can fail to see our
insertion.   * (This requires some care in _bt_insertonpg.)
 
        regards, tom lane


Re: Deadlock possibility in _bt_check_unique?

From
Gokulakannan Somasundaram
Date:


Are you talking about exclusion constraints or btree uniqueness
constraints?  This doesn't seem to be a particularly accurate
description of the implementation of either one.  The way btree
deals with this is explained in _bt_doinsert:

Unique constraints
 

    * NOTE: obviously, _bt_check_unique can only detect keys that are already
    * in the index; so it cannot defend against concurrent insertions of the
    * same key.  We protect against that by means of holding a write lock on
    * the target page.  Any other would-be inserter of the same key must
    * acquire a write lock on the same target page, so only one would-be
    * inserter can be making the check at one time.  Furthermore, once we are
    * past the check we hold write locks continuously until we have performed
    * our insertion, so no later inserter can fail to see our insertion.
    * (This requires some care in _bt_insertonpg.)

 
This is fine, if the second session has to pass through the page, where the first session inserted the record. But as i said if the second session finds a free slot before hitting on the page where the first session inserted, then it will never hit the page with write lock. The comment says that,

"    Furthermore, once we are
    * past the check we hold write locks continuously until we have performed
    * our insertion, so no later inserter can fail to see our insertion.
    * (This requires some care in _bt_insertonpg.) "

But in the case, i suggested(page1, page2, page3 with non-dead duplicate tuples, which are deleted), the first session checks page1, finds no freespace, moves to page2, finds freespace and inserts. But when the second session checks page1, say some of the tuples become dead. Then it gets freespace there and inserts, but never sees the insertion of the first session. Maybe, i am missing something. Just thought of raising the flag..

Gokul.


 

Re: Deadlock possibility in _bt_check_unique?

From
Tom Lane
Date:
Gokulakannan Somasundaram <gokul007@gmail.com> writes:
> This is fine, if the second session has to pass through the page, where the
> first session inserted the record. But as i said if the second session finds
> a free slot before hitting on the page where the first session inserted,
> then it will never hit the page with write lock. The comment says that,

No, you don't understand how it works.  All would-be inserters will hit
the same target page to begin with, ie, the first one that the new key
could legally be inserted on.  The lock that protects against this
problem is the lock on that page, regardless of which page the key
actually ends up on.
        regards, tom lane


Re: Deadlock possibility in _bt_check_unique?

From
Gokulakannan Somasundaram
Date:

No, you don't understand how it works.  All would-be inserters will hit
the same target page to begin with, ie, the first one that the new key
could legally be inserted on.  The lock that protects against this
problem is the lock on that page, regardless of which page the key
actually ends up on.


Consider Time instances T1, T2, T3, T4

T1 : session 1 holds the write lock on page p1 and completes the unique check on p1, p2 and p3.

T2 : session 1 releases the lock on p1 (its waiting to acquire a ex lock on p2)

T3 : session 2 acquires write lock on p1 and completes the unique check on p1, p2 and p3. Here, i believe the Session 2
has a chance of getting the read lock before session 1 gets the write lock. Is it not possible?

T4 : session 1 gets the write lock on p2 and inserts and session 2 gets the write lock on p1 and inserts.

OK. I have stated my assumptions. Is my assumption at time instant T3 wrong/ never happen?

Thanks,
Gokul.

Re: Deadlock possibility in _bt_check_unique?

From
Tom Lane
Date:
Gokulakannan Somasundaram <gokul007@gmail.com> writes:
> Consider Time instances T1, T2, T3, T4

> T1 : session 1 holds the write lock on page p1 and completes the unique
> check on p1, p2 and p3.

> T2 : session 1 releases the lock on p1 (its waiting to acquire a ex lock on
> p2)

That's not what we do.  See _bt_findinsertloc.
        regards, tom lane


Re: Deadlock possibility in _bt_check_unique?

From
Gokulakannan Somasundaram
Date:
<br /><div class="gmail_quote"><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid
rgb(204,204, 204); padding-left: 1ex;"><div class="im"> > T2 : session 1 releases the lock on p1 (its waiting to
acquirea ex lock on<br /> > p2)<br /><br /></div>That's not what we do.  See _bt_findinsertloc.<br /><br />        
              regards, tom lane<br /></blockquote></div><br />      I am really confused. Please keep the cool and
explainme, if i am wrong. I could see this code in _bt_findinsertloc. There is a _bt_relandgetbuf, which releases lock
onp1 and tries to acquire a lock on p2. I wrote ex lock in the place of BT_WRITE.<br /><i><font size="2"><br
style="font-family:arial,helvetica,sans-serif;" /></font></i><pre class="fragment"><i><font size="2"><span
style="font-family:arial,helvetica,sans-serif;">00589         </span><span class="keywordflow" style="font-family:
arial,helvetica,sans-serif;">for</span><spanstyle="font-family: arial,helvetica,sans-serif;"> (;;)</span><br
style="font-family:arial,helvetica,sans-serif;" /> 
<a name="l00590" style="font-family: arial,helvetica,sans-serif;"></a><span style="font-family:
arial,helvetica,sans-serif;">00590        {</span><br style="font-family: arial,helvetica,sans-serif;" /><a
name="l00591"style="font-family: arial,helvetica,sans-serif;"></a><span style="font-family:
arial,helvetica,sans-serif;">00591            </span><a class="code"
href="http://doxygen.postgresql.org/block_8h.html#0be1c1ab88d7f8120e2cd2e8ac2697a1"style="font-family:
arial,helvetica,sans-serif;">BlockNumber</a><spanstyle="font-family: arial,helvetica,sans-serif;"> rblkno =
lpageop-></span><aclass="code"
href="http://doxygen.postgresql.org/structBTPageOpaqueData.html#0e96302f6e2aa4203cef0e243362b692"style="font-family:
arial,helvetica,sans-serif;">btpo_next</a><spanstyle="font-family: arial,helvetica,sans-serif;">;</span><br
style="font-family:arial,helvetica,sans-serif;" /> 
<a name="l00592" style="font-family: arial,helvetica,sans-serif;"></a><span style="font-family:
arial,helvetica,sans-serif;">00592</span><br style="font-family: arial,helvetica,sans-serif;" /><a name="l00593"
style="font-family:arial,helvetica,sans-serif;"></a><span style="font-family: arial,helvetica,sans-serif;">00593
    rbuf = </span><a class="code" href="http://doxygen.postgresql.org/nbtpage_8c.html#023261cd645fc5e8b4e8517fe9027bd6"
style="font-family:arial,helvetica,sans-serif;">_bt_relandgetbuf</a><span style="font-family:
arial,helvetica,sans-serif;">(rel,rbuf, rblkno, </span><a class="code"
href="http://doxygen.postgresql.org/nbtree_8h.html#e494b1ec6ecbe7251dfc412a1ec53c1b"style="font-family:
arial,helvetica,sans-serif;">BT_WRITE</a><spanstyle="font-family: arial,helvetica,sans-serif;">);</span><br
style="font-family:arial,helvetica,sans-serif;" /> 
<a name="l00594" style="font-family: arial,helvetica,sans-serif;"></a><span style="font-family:
arial,helvetica,sans-serif;">00594            page = </span><a class="code"
href="http://doxygen.postgresql.org/bufmgr_8h.html#fb570c83a17839dabeb75dba7ea8e1a5"style="font-family:
arial,helvetica,sans-serif;">BufferGetPage</a><spanstyle="font-family: arial,helvetica,sans-serif;">(rbuf);</span><br
style="font-family:arial,helvetica,sans-serif;" /> 
<a name="l00595" style="font-family: arial,helvetica,sans-serif;"></a><span style="font-family:
arial,helvetica,sans-serif;">00595            lpageop = (</span><a class="code"
href="http://doxygen.postgresql.org/structBTPageOpaqueData.html"style="font-family:
arial,helvetica,sans-serif;">BTPageOpaque</a><spanstyle="font-family: arial,helvetica,sans-serif;">) </span><a
class="code"href="http://doxygen.postgresql.org/bufpage_8h.html#3be45495654ca1ff61f6ae45805e25f6" style="font-family:
arial,helvetica,sans-serif;">PageGetSpecialPointer</a><spanstyle="font-family:
arial,helvetica,sans-serif;">(page);</span><brstyle="font-family: arial,helvetica,sans-serif;" /> 
<a name="l00596" style="font-family: arial,helvetica,sans-serif;"></a><span style="font-family:
arial,helvetica,sans-serif;">00596            </span><span class="keywordflow" style="font-family:
arial,helvetica,sans-serif;">if</span><spanstyle="font-family: arial,helvetica,sans-serif;"> (!</span><a class="code"
href="http://doxygen.postgresql.org/nbtree_8h.html#a8df35238449d00d7e14c3f1ccd3121c"style="font-family:
arial,helvetica,sans-serif;">P_IGNORE</a><spanstyle="font-family: arial,helvetica,sans-serif;">(lpageop))</span><br
style="font-family:arial,helvetica,sans-serif;" /> 
<a name="l00597" style="font-family: arial,helvetica,sans-serif;"></a><span style="font-family:
arial,helvetica,sans-serif;">00597                </span><span class="keywordflow" style="font-family:
arial,helvetica,sans-serif;">break</span><spanstyle="font-family: arial,helvetica,sans-serif;">;</span><br
style="font-family:arial,helvetica,sans-serif;" /> 
<a name="l00598" style="font-family: arial,helvetica,sans-serif;"></a><span style="font-family:
arial,helvetica,sans-serif;">00598            </span><span class="keywordflow" style="font-family:
arial,helvetica,sans-serif;">if</span><spanstyle="font-family: arial,helvetica,sans-serif;"> (</span><a class="code"
href="http://doxygen.postgresql.org/nbtree_8h.html#8b5e4857926b514c0f97592bf3344293"style="font-family:
arial,helvetica,sans-serif;">P_RIGHTMOST</a><spanstyle="font-family: arial,helvetica,sans-serif;">(lpageop))</span><br
style="font-family:arial,helvetica,sans-serif;" /> 
<a name="l00599" style="font-family: arial,helvetica,sans-serif;"></a><span style="font-family:
arial,helvetica,sans-serif;">00599                </span><a class="code"
href="http://doxygen.postgresql.org/elog_8h.html#850290250a1449fc513da20ae2ce5ec5"style="font-family:
arial,helvetica,sans-serif;">elog</a><spanstyle="font-family: arial,helvetica,sans-serif;">(</span><a class="code"
href="http://doxygen.postgresql.org/elog_8h.html#8fe83ac76edc595f6b98cd4a4127aed5"style="font-family:
arial,helvetica,sans-serif;">ERROR</a><spanstyle="font-family: arial,helvetica,sans-serif;">, </span><span
class="stringliteral"style="font-family: arial,helvetica,sans-serif;">"fell off the end of index \"%s\""</span><span
style="font-family:arial,helvetica,sans-serif;">,</span><br style="font-family: arial,helvetica,sans-serif;" /> 
<a name="l00600" style="font-family: arial,helvetica,sans-serif;"></a><span style="font-family:
arial,helvetica,sans-serif;">00600                     </span><a class="code"
href="http://doxygen.postgresql.org/rel_8h.html#5e9d450c92f70171110e22ffc678ed04"style="font-family:
arial,helvetica,sans-serif;">RelationGetRelationName</a><spanstyle="font-family:
arial,helvetica,sans-serif;">(rel));</span><brstyle="font-family: arial,helvetica,sans-serif;" /> 
<a name="l00601" style="font-family: arial,helvetica,sans-serif;"></a><span style="font-family:
arial,helvetica,sans-serif;">00601        }</span></font></i><br /><br /><br /></pre>What is that, i am missing
here?<br/><br />Gokul.<br /> 

Re: Deadlock possibility in _bt_check_unique?

From
Gokulakannan Somasundaram
Date:
Oh! yeah, i got it.  Thanks....

On Wed, Mar 24, 2010 at 12:26 AM, Gokulakannan Somasundaram <gokul007@gmail.com> wrote:

> T2 : session 1 releases the lock on p1 (its waiting to acquire a ex lock on
> p2)

That's not what we do.  See _bt_findinsertloc.

                       regards, tom lane

      I am really confused. Please keep the cool and explain me, if i am wrong. I could see this code in _bt_findinsertloc. There is a _bt_relandgetbuf, which releases lock on p1 and tries to acquire a lock on p2. I wrote ex lock in the place of BT_WRITE.

00589         for (;;)
00590 {
00591 BlockNumber rblkno = lpageop->btpo_next;
00592
00593 rbuf = _bt_relandgetbuf(rel, rbuf, rblkno, BT_WRITE);
00594 page = BufferGetPage(rbuf);
00595 lpageop = (BTPageOpaque) PageGetSpecialPointer(page);
00596 if (!P_IGNORE(lpageop))
00597 break;
00598 if (P_RIGHTMOST(lpageop))
00599 elog(ERROR, "fell off the end of index \"%s\"",
00600 RelationGetRelationName(rel));
00601 }



What is that, i am missing here?

Gokul.

Re: Deadlock possibility in _bt_check_unique?

From
Tom Lane
Date:
Gokulakannan Somasundaram <gokul007@gmail.com> writes:
>       I am really confused. Please keep the cool and explain me, if i am
> wrong. I could see this code in _bt_findinsertloc. There is a
> _bt_relandgetbuf, which releases lock on p1 and tries to acquire a lock on
> p2.

No, read it again.  The only locks that get released inside that loop
are ones on intermediate dead pages (rbuf is not buf).  The lock on the
original page is only released after the loop.
        regards, tom lane