Re: Hash Indexes - Mailing list pgsql-hackers

From Mithun Cy
Subject Re: Hash Indexes
Msg-id CAD__OuhLq=Meu3LsST2r4Lxdk5ersOJFGp7HQby1rsG12-rUtA@mail.gmail.com
In response to Re: Hash Indexes  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Hash Indexes  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
I did some basic testing of the same, and found one issue with cursors.

+BEGIN;
+SET enable_seqscan = OFF;
+SET enable_bitmapscan = OFF;
+CREATE FUNCTION declares_cursor(int)
+ RETURNS void
+ AS 'DECLARE c CURSOR FOR SELECT * from con_hash_index_table WHERE keycol = $1;'
+LANGUAGE SQL;
+
+SELECT declares_cursor(1);
+MOVE FORWARD ALL FROM c;
+MOVE BACKWARD 10000 FROM c;
+CLOSE c;

WARNING: buffer refcount leak: [5835] (rel=base/16384/30537, blockNum=327, flags=0x93800000, refcount=1 1)

ROLLBACK;

Closing the cursor produces a warning saying that we forgot to unpin the buffer.
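To make the warning concrete, here is a toy model of buffer pinning (illustrative only; the class and function names are not PostgreSQL's API). A scan pins each page it visits and must unpin it before the cursor is closed; a code path (here, the backward MOVE) that skips the matching unpin leaves a nonzero refcount, which is exactly what the "buffer refcount leak" warning reports at close time:

```python
# Toy model of buffer pin/unpin accounting (names are illustrative).
class Buffer:
    def __init__(self, blockno):
        self.blockno = blockno
        self.refcount = 0

    def pin(self):
        self.refcount += 1

    def unpin(self):
        self.refcount -= 1

def close_scan(pinned_buffers):
    """On cursor close, every buffer the scan pinned must be unpinned;
    any buffer with a leftover refcount is reported as a leak."""
    return [b.blockno for b in pinned_buffers if b.refcount != 0]

buf = Buffer(327)
buf.pin()  # the scan pins the bucket page ...
# ... but the backward-scan path forgets the matching unpin,
# so refcount is still 1 when the cursor is closed.
print(close_scan([buf]))  # → [327]: the block the warning complains about
```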

I have also added tests [1] for coverage improvements.

[1] Some tests to cover hash_index.


On Thu, Jul 14, 2016 at 4:33 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Jun 22, 2016 at 8:48 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Jun 22, 2016 at 5:14 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> We can do it in the way as you are suggesting, but there is another thing
>> which we need to consider here.  As of now, the patch tries to finish the
>> split if it finds split-in-progress flag in either old or new bucket.  We
>> need to lock both old and new buckets to finish the split, so it is quite
>> possible that two different backends try to lock them in opposite order
>> leading to a deadlock.  I think the correct way to handle is to always try
>> to lock the old bucket first and then new bucket.  To achieve that, if the
>> insertion on new bucket finds that split-in-progress flag is set on a
>> bucket, it needs to release the lock and then acquire the lock first on old
>> bucket, ensure pincount is 1 and then lock new bucket again and ensure that
>> pincount is 1. I have already maintained the order of locks in scan (old
>> bucket first and then new bucket; refer changes in _hash_first()).
>> Alternatively, we can try to  finish the splits only when someone tries to
>> insert in old bucket.
>
> Yes, I think locking buckets in increasing order is a good solution.
> I also think it's fine to only try to finish the split when the insert
> targets the old bucket.  Finishing the split enables us to remove
> tuples from the old bucket, which lets us reuse space instead of
> accelerating more.  So there is at least some potential benefit to the
> backend inserting into the old bucket.  On the other hand, a process
> inserting into the new bucket derives no direct benefit from finishing
> the split.
>
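The ordered-locking rule discussed above can be sketched in miniature, with Python threads standing in for backends (all names here are illustrative, not PostgreSQL internals). Because every "backend" acquires the old-bucket lock before the new-bucket lock, no two of them can ever hold one lock each while waiting on the other, so the deadlock described above cannot occur:

```python
import threading

# Hypothetical stand-ins for the per-bucket locks discussed above.
old_bucket_lock = threading.Lock()
new_bucket_lock = threading.Lock()

def finish_split(results, tag):
    """Always lock the old bucket first, then the new bucket.

    A consistent acquisition order means no cycle of waiters can
    form, which rules out deadlock between concurrent backends."""
    with old_bucket_lock:       # lock old bucket first ...
        with new_bucket_lock:   # ... then the new bucket
            results.append(tag)

results = []
threads = [threading.Thread(target=finish_split, args=(results, i))
           for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(results))  # → [0, 1, 2, 3]: all four "backends" finished
```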

Okay, following this suggestion, I have updated the patch so that only
insertion into the old bucket can try to finish the splits.  Apart
from that, I have fixed the issue reported by Mithun upthread, and I
have updated the README to explain the locking used in the patch.  I
have also changed the locking around vacuum so that it can work with
concurrent scans whenever possible.  In the previous patch version,
vacuum took a cleanup lock on a bucket to remove dead tuples and
moved-due-to-split tuples and to perform the squeeze operation, and it
held the lock on the bucket until the end of cleanup.  Now it still
takes a cleanup lock on a bucket to out-wait scans, but it releases
the lock as it proceeds to clean the overflow pages.  The idea is that
we first lock the next bucket page and only then release the lock on
the current bucket page.  This ensures that any concurrent scan
started after we begin cleaning the bucket will always stay behind the
cleanup; allowing scans to overtake vacuum would let vacuum remove
tuples that are required for the sanctity of the scan.  Also, for the
squeeze phase we proceed only if the pin count of the buffer is one
(we already have an exclusive lock on the bucket's buffer by that
time); otherwise we will try to squeeze the next time cleanup is
required for that bucket.
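The "lock the next page before releasing the current one" traversal is the classic hand-over-hand (lock-coupling) pattern. A minimal sketch, with a plain Python list standing in for a bucket's page chain (the page layout and names are illustrative, not the patch's actual data structures):

```python
import threading

# Hypothetical bucket chain: each "page" has its own lock, and the
# even-numbered pages carry dead tuples for cleanup to remove.
pages = [{"lock": threading.Lock(), "dead": (i % 2 == 0)} for i in range(6)]

def cleanup_bucket(pages):
    """Hand-over-hand traversal: acquire the next page's lock *before*
    releasing the current page's lock.  A scan that starts after
    cleanup begins can therefore never overtake it in the chain."""
    cleaned = []
    pages[0]["lock"].acquire()
    for i, page in enumerate(pages):
        if page["dead"]:
            cleaned.append(i)                # clean this page's dead tuples
        if i + 1 < len(pages):
            pages[i + 1]["lock"].acquire()   # lock next page first ...
        page["lock"].release()               # ... then release the current one
    return cleaned

print(cleanup_bucket(pages))  # → [0, 2, 4]: pages that held dead tuples
```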

Thoughts/Suggestions?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com






--
Thanks and Regards
Mithun C Y
