Re: [POC] A better way to expand hash indexes. - Mailing list pgsql-hackers

From	Amit Kapila
Subject	Re: [POC] A better way to expand hash indexes.
Date	March 27, 2017 08:51:20
Msg-id	CAA4eK1L=gE+YW1OcZiUbmnboapVvZJu0jJp7Su7oqZE6pjVKvA@mail.gmail.com
In response to	[HACKERS] [POC] A better way to expand hash indexes. (Mithun Cy <mithun.cy@enterprisedb.com>)
Responses	Re: [POC] A better way to expand hash indexes.
List	pgsql-hackers

On Sun, Mar 26, 2017 at 11:26 AM, Mithun Cy <mithun.cy@enterprisedb.com> wrote:
> Thanks, Amit for the review.
> On Sat, Mar 25, 2017 at 7:03 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> I think one-dimensional patch has fewer places to touch, so that looks
>> better to me.  However, I think there is still hard coding and
>> assumptions in code which we should try to improve.
>
> Great!, I will continue with spares 1-dimensional improvement.
>

@@ -563,18 +563,20 @@ _hash_init_metabuffer(Buffer buf, double num_tuples, RegProcedure procid,
{
.. else
- num_buckets = ((uint32) 1) << _hash_log2((uint32) dnumbuckets);
+ num_buckets = _hash_get_totalbuckets(_hash_spareindex(dnumbuckets));
..
..
- metap->hashm_maxbucket = metap->hashm_lowmask = num_buckets - 1;
- metap->hashm_highmask = (num_buckets << 1) - 1;
+ metap->hashm_maxbucket = num_buckets - 1;
+
+ /* set hishmask, which should be sufficient to cover num_buckets. */
+ metap->hashm_highmask = (1 << (_hash_log2(num_buckets))) - 1;
+ metap->hashm_lowmask = (metap->hashm_highmask >> 1);
}

I think we can't change the number of buckets to be created, or the
lowmask and highmask calculation, here without also modifying
_h_spoolinit(), because it sorts the data to be inserted based on the
hashkey, which in turn depends on the number of buckets that we are
going to create during the create index operation.  We either need to
let create index keep creating buckets in power-of-two fashion, or we
need to update _h_spoolinit() according to the new computation.  One
minor drawback of using the power-of-two scheme for bucket creation
during create index is that it can waste space and will be
inconsistent with what the patch does during the split operation.
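
To illustrate the concern, here is a minimal standalone sketch, not
PostgreSQL source.  All names in it (ceil_log2, masks_for,
key_to_bucket, pow2_sort_key) are hypothetical.  It assumes the masks
are computed as in the hunk above, that a key is mapped to a bucket by
masking with highmask and falling back to lowmask when the result
exceeds maxbucket, and that the build-time sort orders tuples by the
hash value masked with a plain power-of-two mask.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest i such that (1 << i) >= num (assumed to play the role of _hash_log2). */
static uint32_t ceil_log2(uint32_t num)
{
    uint32_t i = 0;
    uint32_t limit = 1;

    assert(num <= ((uint32_t) 1 << 31));    /* keep the shift below in range */
    while (limit < num)
    {
        limit <<= 1;
        i++;
    }
    return i;
}

/* Hypothetical container for the three metapage values touched by the hunk. */
typedef struct
{
    uint32_t maxbucket;
    uint32_t highmask;
    uint32_t lowmask;
} MetaMasks;

/* Masks computed the way the "+" lines of the hunk compute them. */
static MetaMasks masks_for(uint32_t num_buckets)
{
    MetaMasks m;

    assert(num_buckets >= 1);
    m.maxbucket = num_buckets - 1;
    m.highmask = ((uint32_t) 1 << ceil_log2(num_buckets)) - 1;
    m.lowmask = m.highmask >> 1;
    return m;
}

/* Assumed key-to-bucket rule: try highmask, fall back to lowmask. */
static uint32_t key_to_bucket(uint32_t hashkey, MetaMasks m)
{
    uint32_t bucket = hashkey & m.highmask;

    if (bucket > m.maxbucket)
        bucket &= m.lowmask;
    return bucket;
}

/* Assumed build-time sort key: hash value under a power-of-two mask. */
static uint32_t pow2_sort_key(uint32_t hashkey, uint32_t num_buckets)
{
    uint32_t mask = ((uint32_t) 1 << ceil_log2(num_buckets)) - 1;

    return hashkey & mask;
}
```

With 8 buckets the sort key and the bucket number agree for every
hash value.  With 6 buckets (maxbucket 5, highmask 7, lowmask 3) they
do not: hash value 6 sorts as 6 but belongs to bucket 2, so it would
be placed after a tuple with hash value 5, which belongs to bucket 5.
That is the ordering mismatch the sort would need to account for.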

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
