Re: [HACKERS] Page Scan Mode in Hash Index - Mailing list pgsql-hackers

From Ashutosh Sharma
Subject Re: [HACKERS] Page Scan Mode in Hash Index
Date
Msg-id CAE9k0P=JzsP07sz+-1Ujnzfxn96JhEW+6dq4-HXsKa906CE9qw@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Page Scan Mode in Hash Index  (Ashutosh Sharma <ashu.coek88@gmail.com>)
Responses Re: [HACKERS] Page Scan Mode in Hash Index  (Ashutosh Sharma <ashu.coek88@gmail.com>)
List pgsql-hackers
While doing the code coverage testing of v7 patch shared with - [1], I
found that there are few lines of code in _hash_next() that are
redundant and needs to be removed. I basically came to know this while
testing the scenario where a hash index scan starts when a split is in
progress. I have removed those lines and attached is the newer version
of patch.

[1] - https://www.postgresql.org/message-id/CAE9k0Pmn92Le0iOTU5b6087eRXzUnkq1PKcihxtokHJ9boXCBg%40mail.gmail.com

On Mon, May 8, 2017 at 6:55 PM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> Hi,
>
> On Tue, Apr 4, 2017 at 3:56 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> On Sun, Apr 2, 2017 at 4:14 AM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>>>
>>> Please note that these patches needs to be applied on top of  [1].
>>>
>>
>> Few more review comments:
>>
>> 1.
>> - page = BufferGetPage(so->hashso_curbuf);
>> + blkno = so->currPos.currPage;
>> + if (so->hashso_bucket_buf == so->currPos.buf)
>> + {
>> + buf = so->currPos.buf;
>> + LockBuffer(buf, BUFFER_LOCK_SHARE);
>> + page = BufferGetPage(buf);
>> + }
>>
>> Here, you are assuming that only bucket page can be pinned, but I
>> think that assumption is not right.  When _hash_kill_items() is called
>> before moving to next page, there could be a pin on the overflow page.
>> You need some logic to check if the buffer is pinned, then just lock
>> it.  I think once you do that, it might not be convinient to release
>> the pin at the end of this function.
>
> Yes, there are few cases where we might have pin on overflow pages too
> and we need to handle such cases in _hash_kill_items. I have taken
> care of this in the attached v7 patch. Thanks.
>
>>
>> Add some comments on top of _hash_kill_items to explain the new
>> changes or say some thing like "See _bt_killitems for details"
>
> Added some more comments on the new changes.
>
>>
>> 2.
>> + /*
>> + * We save the LSN of the page as we read it, so that we know whether it
>> + * safe to apply LP_DEAD hints to the page later.  This allows us to drop
>> + * the pin for MVCC scans, which allows vacuum to avoid blocking.
>> + */
>> + so->currPos.lsn = PageGetLSN(page);
>> +
>>
>> The second part of above comment doesn't make sense because vacuum's
>> will still be blocked because we hold the pin on primary bucket page.
>
> That's right. It doesn't make sense because we won't allow vacuum to
> start. I have removed it.
>
>>
>> 3.
>> + {
>> + /*
>> + * No more matching tuples were found. return FALSE
>> + * indicating the same. Also, remember the prev and
>> + * next block number so that if fetching tuples using
>> + * cursor we remember the page from where to start the
>> + * scan.
>> + */
>> + so->currPos.prevPage = (opaque)->hasho_prevblkno;
>> + so->currPos.nextPage = (opaque)->hasho_nextblkno;
>>
>> You can't read opaque without having lock and by this time it has
>> already been released.
>
> I have corrected it. Please refer to the attached v7 patch.
>
> Also, I think if you want to save position for
>> cursor movement, then you need to save the position of last bucket
>> when scan completes the overflow chain, however as you have written it
>> will be always invalid block number. I think there is similar problem
>> with prevblock number.
>
> Did you mean last bucket or last page. If it is last page, then I am
> already storing it.
>
>>
>> 4.
>> +_hash_load_qualified_items(IndexScanDesc scan, Page page, OffsetNumber offnum,
>> +   OffsetNumber maxoff, ScanDirection dir)
>> +{
>> + HashScanOpaque so = (HashScanOpaque) scan->opaque;
>> + IndexTuple      itup;
>> + int itemIndex;
>> +
>> + if (ScanDirectionIsForward(dir))
>> + {
>> + /* load items[] in ascending order */
>> + itemIndex = 0;
>> +
>> + /* new page, relocate starting position by binary search */
>> + offnum = _hash_binsearch(page, so->hashso_sk_hash);
>>
>> What is the need to find offset number when this function already has
>> that as an input parameter?
>
> It's not required. I have removed it.
>
>>
>> 5.
>> +_hash_load_qualified_items(IndexScanDesc scan, Page page, OffsetNumber offnum,
>> +   OffsetNumber maxoff, ScanDirection dir)
>>
>> I think maxoff is not required to be passed a parameter to this
>> function as you need it only for forward scan. You can compute it
>> locally.
>
> It is required for both forward and backward scan. However, I am not
> passing it to _hash_load_qualified_items() now.
>
>>
>> 6.
>> +_hash_load_qualified_items(IndexScanDesc scan, Page page, OffsetNumber offnum,
>> +   OffsetNumber maxoff, ScanDirection dir)
>> {
>> ..
>> + if (ScanDirectionIsForward(dir))
>> + {
>> ..
>> + while (offnum <= maxoff)
>> {
>> ..
>> + if (so->hashso_sk_hash == _hash_get_indextuple_hashkey(itup) &&
>> + _hash_checkqual(scan, itup))
>> + {
>> + /* tuple is qualified, so remember it */
>> + _hash_saveitem(so, itemIndex, offnum, itup);
>> + itemIndex++;
>> + }
>> +
>> + offnum = OffsetNumberNext(offnum);
>> ..
>>
>> Why are you traversing the whole page even when there is no match?
>> There is a similar problem in backward scan. I think this is blunder.
>
> Fixed. Please check the attached
> '0001-Rewrite-hash-index-scans-to-work-a-page-at-a-timev7.patch'
>
>>
>> 7.
>> + if (so->currPos.buf == so->hashso_bucket_buf ||
>> + so->currPos.buf == so->hashso_split_bucket_buf)
>> + {
>> + so->currPos.prevPage = InvalidBlockNumber;
>> + LockBuffer(so->currPos.buf, BUFFER_LOCK_UNLOCK);
>> + }
>> + else
>> + {
>> + so->currPos.prevPage = (opaque)->hasho_prevblkno;
>> + _hash_relbuf(rel, so->currPos.buf);
>> + }
>> +
>> + so->currPos.nextPage = (opaque)->hasho_nextblkno;
>>
>> What makes you think it is safe read opaque after releasing the lock?
>
> Nothing makes me think to read opaque after releasing lock. It's a
> mistake. I have corrected it. Please check attached v7 patch.
>
> --
> With Regards,
> Ashutosh Sharma
> EnterpriseDB:http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

pgsql-hackers by date:

Previous
From: Petr Jelinek
Date:
Subject: Re: [HACKERS] snapbuild woes
Next
From: Rahila Syed
Date:
Subject: Re: [HACKERS] Adding support for Default partition in partitioning