Re: [HACKERS] Page Scan Mode in Hash Index - Mailing list pgsql-hackers

From Jesper Pedersen
Subject Re: [HACKERS] Page Scan Mode in Hash Index
Date
Msg-id 0aaa9067-1e3c-868e-46fe-611a797ffd88@redhat.com
Whole thread Raw
In response to [HACKERS] Page Scan Mode in Hash Index  (Ashutosh Sharma <ashu.coek88@gmail.com>)
Responses Re: [HACKERS] Page Scan Mode in Hash Index  (Ashutosh Sharma <ashu.coek88@gmail.com>)
List pgsql-hackers
Hi,

On 02/14/2017 12:27 AM, Ashutosh Sharma wrote:
> Currently, Hash Index scan works tuple-at-a-time, i.e. for every
> qualifying tuple in a page, it acquires and releases the lock which
> eventually increases the lock/unlock traffic. For example, if an index
> page contains 100 qualified tuples, the current hash index scan has to
> acquire and release the lock 100 times to read those qualified tuples
> which is not good from performance perspective and it also impacts
> concurency with VACUUM.
>
> Considering above points, I would like to propose a patch that allows
> hash index scan to work in page-at-a-time mode. In page-at-a-time
> mode, once lock is acquired on a target bucket page, the entire page
> is scanned and all the qualified tuples are saved into backend's local
> memory. This reduces the lock/unlock calls for retrieving tuples from
> a page. Moreover, it also eliminates the problem of re-finding the
> position of the last returned index tuple and more importanly it
> allows VACUUM to release lock on current page before moving to the
> next page which eventually improves it's concurrency with scan.
>
> Attached patch modifies hash index scan code for page-at-a-time mode.
> For better readability, I have splitted it into 3 parts,
>

Due to the commits on master these patches applies with hunks.

The README should be updated to mention the use of page scan.

hash.h needs pg_indent.

> 1) 0001-Rewrite-hash-index-scans-to-work-a-page-at-a-time.patch: this
> patch rewrites the hash index scan module to work in page-at-a-time
> mode. It basically introduces two new functions-- _hash_readpage() and
> _hash_saveitem(). The former is used to load all the qualifying tuples
> from a target bucket or overflow page into an items array. The latter
> one is used by _hash_readpage to save all the qualifying tuples found
> in a page into an items array. Apart from that, this patch bascially
> cleans _hash_first(), _hash_next and hashgettuple().
>

For _hash_next I don't see this - can you explain ?

+ *
+ *             On failure exit (no more tuples), we release pin and set
+ *             so->currPos.buf to InvalidBuffer.


+ *     Returns true if any matching items are found else returns false.

s/Returns/Return/g

> 2) 0002-Remove-redundant-function-_hash_step-and-some-of-the.patch:
> this patch basically removes the redundant function _hash_step() and
> some of the unused members of HashScanOpaqueData structure.
>

Looks good.

> 3) 0003-Improve-locking-startegy-during-VACUUM-in-Hash-Index.patch:
> this patch basically improves the locking strategy for VACUUM in hash
> index. As the new hash index scan works in page-at-a-time, vacuum can
> release the lock on previous page before acquiring a lock on the next
> page, hence, improving hash index concurrency.
>

+                * As the new hash index scan work in page at a time mode,

Remove 'new'.

> I have also done the benchmarking of this patch and would like to
> share the results for the same,
>
> Firstly, I have done the benchmarking with non-unique values and i
> could see a performance improvement of 4-7%. For the detailed results
> please find the attached file 'results-non-unique values-70ff', and
> ddl.sql, test.sql are test scripts used in this experimentation. The
> detail of non-default GUC params and pgbench command are mentioned in
> the result sheet. I also did the benchmarking with unique values at
> 300 and 1000 scale factor and its results are provided in
> 'results-unique-values-default-ff'.
>

I'm seeing similar results, and especially with write heavy scenarios.

Best regards, Jesper




pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)
Next
From: Ashutosh Sharma
Date:
Subject: Re: [HACKERS] segfault in hot standby for hash indexes