Re: Page Scan Mode in Hash Index - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Page Scan Mode in Hash Index
Date
Msg-id CAA4eK1JrePM0_ai64Dt5WAiaERET=Kuow-FyOGLDaNLSzBF6rw@mail.gmail.com
Whole thread Raw
In response to [HACKERS] Page Scan Mode in Hash Index  (Ashutosh Sharma <ashu.coek88@gmail.com>)
List pgsql-hackers
On Sun, Apr 2, 2017 at 4:14 AM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>
> Please note that these patches needs to be applied on top of  [1].
>

Few more review comments:

1.
- page = BufferGetPage(so->hashso_curbuf);
+ blkno = so->currPos.currPage;
+ if (so->hashso_bucket_buf == so->currPos.buf)
+ {
+ buf = so->currPos.buf;
+ LockBuffer(buf, BUFFER_LOCK_SHARE);
+ page = BufferGetPage(buf);
+ }

Here, you are assuming that only bucket page can be pinned, but I
think that assumption is not right.  When _hash_kill_items() is called
before moving to next page, there could be a pin on the overflow page.
You need some logic to check if the buffer is pinned, then just lock
it.  I think once you do that, it might not be convinient to release
the pin at the end of this function.

Add some comments on top of _hash_kill_items to explain the new
changes or say some thing like "See _bt_killitems for details"

2.
+ /*
+ * We save the LSN of the page as we read it, so that we know whether it
+ * safe to apply LP_DEAD hints to the page later.  This allows us to drop
+ * the pin for MVCC scans, which allows vacuum to avoid blocking.
+ */
+ so->currPos.lsn = PageGetLSN(page);
+

The second part of above comment doesn't make sense because vacuum's
will still be blocked because we hold the pin on primary bucket page.

3.
+ {
+ /*
+ * No more matching tuples were found. return FALSE
+ * indicating the same. Also, remember the prev and
+ * next block number so that if fetching tuples using
+ * cursor we remember the page from where to start the
+ * scan.
+ */
+ so->currPos.prevPage = (opaque)->hasho_prevblkno;
+ so->currPos.nextPage = (opaque)->hasho_nextblkno;

You can't read opaque without having lock and by this time it has
already been released.  Also, I think if you want to save position for
cursor movement, then you need to save the position of last bucket
when scan completes the overflow chain, however as you have written it
will be always invalid block number. I think there is similar problem
with prevblock number.

4.
+_hash_load_qualified_items(IndexScanDesc scan, Page page, OffsetNumber offnum,
+   OffsetNumber maxoff, ScanDirection dir)
+{
+ HashScanOpaque so = (HashScanOpaque) scan->opaque;
+ IndexTuple      itup;
+ int itemIndex;
+
+ if (ScanDirectionIsForward(dir))
+ {
+ /* load items[] in ascending order */
+ itemIndex = 0;
+
+ /* new page, relocate starting position by binary search */
+ offnum = _hash_binsearch(page, so->hashso_sk_hash);

What is the need to find offset number when this function already has
that as an input parameter?

5.
+_hash_load_qualified_items(IndexScanDesc scan, Page page, OffsetNumber offnum,
+   OffsetNumber maxoff, ScanDirection dir)

I think maxoff is not required to be passed a parameter to this
function as you need it only for forward scan. You can compute it
locally.

6.
+_hash_load_qualified_items(IndexScanDesc scan, Page page, OffsetNumber offnum,
+   OffsetNumber maxoff, ScanDirection dir)
{
..
+ if (ScanDirectionIsForward(dir))
+ {
..
+ while (offnum <= maxoff)
{
..
+ if (so->hashso_sk_hash == _hash_get_indextuple_hashkey(itup) &&
+ _hash_checkqual(scan, itup))
+ {
+ /* tuple is qualified, so remember it */
+ _hash_saveitem(so, itemIndex, offnum, itup);
+ itemIndex++;
+ }
+
+ offnum = OffsetNumberNext(offnum);
..

Why are you traversing the whole page even when there is no match?
There is a similar problem in backward scan. I think this is blunder.

7.
+ if (so->currPos.buf == so->hashso_bucket_buf ||
+ so->currPos.buf == so->hashso_split_bucket_buf)
+ {
+ so->currPos.prevPage = InvalidBlockNumber;
+ LockBuffer(so->currPos.buf, BUFFER_LOCK_UNLOCK);
+ }
+ else
+ {
+ so->currPos.prevPage = (opaque)->hasho_prevblkno;
+ _hash_relbuf(rel, so->currPos.buf);
+ }
+
+ so->currPos.nextPage = (opaque)->hasho_nextblkno;

What makes you think it is safe read opaque after releasing the lock?


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Kyotaro HORIGUCHI
Date:
Subject: Re: asynchronous execution
Next
From: Amit Kapila
Date:
Subject: Re: Page Scan Mode in Hash Index