Re: [HACKERS] Page Scan Mode in Hash Index - Mailing list pgsql-hackers

From Ashutosh Sharma
Subject Re: [HACKERS] Page Scan Mode in Hash Index
Date
Msg-id CAE9k0P=QfrT+ZvLrVXDPiVL61FKjc35H2eQHGHaz687n2vCGVQ@mail.gmail.com
In response to Re: [HACKERS] Page Scan Mode in Hash Index  (Jesper Pedersen <jesper.pedersen@redhat.com>)
Responses Re: [HACKERS] Page Scan Mode in Hash Index  (Jesper Pedersen <jesper.pedersen@redhat.com>)
List pgsql-hackers
Hi,

>> Attached patch modifies hash index scan code for page-at-a-time mode.
>> For better readability, I have split it into 3 parts,
>>
>
> Due to the commits on master, these patches apply with hunks.
>
> The README should be updated to mention the use of page scan.

Done. Please refer to the attached v2 version of the patch.

>
> hash.h needs pgindent.

Fixed.

>
>> 1) 0001-Rewrite-hash-index-scans-to-work-a-page-at-a-time.patch: this
>> patch rewrites the hash index scan module to work in page-at-a-time
>> mode. It basically introduces two new functions-- _hash_readpage() and
>> _hash_saveitem(). The former is used to load all the qualifying tuples
>> from a target bucket or overflow page into an items array. The latter
>> one is used by _hash_readpage to save all the qualifying tuples found
>> in a page into an items array. Apart from that, this patch basically
>> cleans up _hash_first(), _hash_next() and hashgettuple().
>>
>
> For _hash_next I don't see this - can you explain ?

Sorry, it was wrongly copied from the btree code. I have corrected it now.
Please check the attached v2 version of the patch.
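
For readers following along, the page-at-a-time idea from part 1 can be
sketched with a toy model. This is only an illustration, not code from the
patch: the Toy* types and toy_* functions below are hypothetical stand-ins
(toy_read_page and toy_save_item loosely mirror the roles described for
_hash_readpage() and _hash_saveitem()), and buffer pins and locks are not
modelled at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ITEMS_PER_PAGE 8

/* Toy index tuple: a hash key plus an opaque heap pointer (hypothetical). */
typedef struct ToyTuple {
    unsigned hashkey;
    int      heap_tid;
} ToyTuple;

/* Toy bucket or overflow page, chained to the next overflow page. */
typedef struct ToyPage {
    ToyTuple        tuples[MAX_ITEMS_PER_PAGE];
    int             ntuples;
    struct ToyPage *next;
} ToyPage;

/* Scan state: all matches from the current page are kept in items[]. */
typedef struct ToyScan {
    unsigned  key;
    ToyPage  *next_page;                 /* NULL once the chain is exhausted */
    int       items[MAX_ITEMS_PER_PAGE];
    int       nitems;
    int       item_index;
} ToyScan;

/* Save one qualifying tuple into the scan-local items array. */
static void toy_save_item(ToyScan *scan, const ToyTuple *tup)
{
    assert(scan->nitems < MAX_ITEMS_PER_PAGE);
    scan->items[scan->nitems++] = tup->heap_tid;
}

/*
 * Collect every qualifying tuple of one page in a single pass.
 * Return true if any matching items are found, else false.
 */
static bool toy_read_page(ToyScan *scan, const ToyPage *page)
{
    scan->nitems = 0;
    scan->item_index = 0;
    for (int i = 0; i < page->ntuples; i++)
        if (page->tuples[i].hashkey == scan->key)
            toy_save_item(scan, &page->tuples[i]);
    return scan->nitems > 0;
}

static void toy_scan_begin(ToyScan *scan, ToyPage *bucket, unsigned key)
{
    scan->key = key;
    scan->next_page = bucket;
    scan->nitems = 0;
    scan->item_index = 0;
}

/* Return the next match from items[], reading another page only when needed. */
static bool toy_scan_next(ToyScan *scan, int *tid)
{
    while (scan->item_index >= scan->nitems) {
        if (scan->next_page == NULL)
            return false;                /* no more pages, no more tuples */
        ToyPage *page = scan->next_page;
        scan->next_page = page->next;
        toy_read_page(scan, page);
    }
    *tid = scan->items[scan->item_index++];
    return true;
}
```

A caller would run toy_scan_begin() once and then call toy_scan_next() until
it returns false. The point of the sketch is that a page is visited once and
its matches are then served from the local array, so nothing in the page
needs to be revisited between tuples.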

>
> + *
> + *             On failure exit (no more tuples), we release pin and set
> + *             so->currPos.buf to InvalidBuffer.
>
>
> + *     Returns true if any matching items are found else returns false.
>
> s/Returns/Return/g

Done.

>
>> 2) 0002-Remove-redundant-function-_hash_step-and-some-of-the.patch:
>> this patch basically removes the redundant function _hash_step() and
>> some of the unused members of HashScanOpaqueData structure.
>>
>
> Looks good.
>
>> 3) 0003-Improve-locking-startegy-during-VACUUM-in-Hash-Index.patch:
>> this patch basically improves the locking strategy for VACUUM in hash
>> index. As the new hash index scan works in page-at-a-time, vacuum can
>> release the lock on previous page before acquiring a lock on the next
>> page, hence, improving hash index concurrency.
>>
>
> +                * As the new hash index scan work in page at a time mode,
>
> Remove 'new'.

Done.
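
To illustrate the locking change in part 3, here is a toy model that only
compares the order of lock operations while walking a bucket chain. It is a
sketch under simplifying assumptions, not the patch code: ToyVacPage,
ToyLockStats and the toy_* functions are hypothetical, a "lock" is just a
flag, and pins and concurrent backends are not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy page in a bucket chain (hypothetical; no real buffer locking). */
typedef struct ToyVacPage {
    bool               locked;
    int                dead;      /* dead tuples waiting to be removed */
    struct ToyVacPage *next;
} ToyVacPage;

/* Tracks how many page locks are held at once during the walk. */
typedef struct ToyLockStats {
    int held;
    int max_held;
} ToyLockStats;

static void toy_lock(ToyVacPage *page, ToyLockStats *stats)
{
    assert(!page->locked);
    page->locked = true;
    if (++stats->held > stats->max_held)
        stats->max_held = stats->held;
}

static void toy_unlock(ToyVacPage *page, ToyLockStats *stats)
{
    assert(page->locked);
    page->locked = false;
    stats->held--;
}

/*
 * Clean every page of the chain and return the maximum number of page
 * locks held at the same time.  With lock_coupling the next page is
 * locked before the current one is released; without it the current
 * page is released first.
 */
static int toy_vacuum_bucket(ToyVacPage *bucket, bool lock_coupling)
{
    ToyLockStats stats = {0, 0};
    ToyVacPage  *page = bucket;

    if (page == NULL)
        return 0;

    toy_lock(page, &stats);
    while (page != NULL) {
        page->dead = 0;                  /* "remove" dead tuples under lock */

        ToyVacPage *next = page->next;
        if (lock_coupling) {
            if (next != NULL)
                toy_lock(next, &stats);
            toy_unlock(page, &stats);
        } else {
            toy_unlock(page, &stats);
            if (next != NULL)
                toy_lock(next, &stats);
        }
        page = next;
    }
    return stats.max_held;
}
```

In this model the coupled walk briefly holds two page locks at every step,
while the release-first walk never holds more than one, which is the kind of
concurrency gain the third patch is after.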

>
>> I have also done the benchmarking of this patch and would like to
>> share the results for the same,
>>
>> Firstly, I have done the benchmarking with non-unique values and I
>> could see a performance improvement of 4-7%. For the detailed results,
>> please see the attached file 'results-non-unique values-70ff'; ddl.sql
>> and test.sql are the test scripts used in this experiment. The details
>> of the non-default GUC params and the pgbench command are mentioned in
>> the result sheet. I also did the benchmarking with unique values at
>> 300 and 1000 scale factor, and the results are provided in
>> 'results-unique-values-default-ff'.
>>
>
> I'm seeing similar results, and especially with write heavy scenarios.

Great..!!

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com
