Re: [HACKERS] [POC] A better way to expand hash indexes. - Mailing list pgsql-hackers

From Mithun Cy
Subject Re: [HACKERS] [POC] A better way to expand hash indexes.
Date
Msg-id CAD__OugBu=gX5z2hZa+2kZ6VmP==kOrKvfxqCRK0MGuyhzhiHg@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] [POC] A better way to expand hash indexes.  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: [HACKERS] [POC] A better way to expand hash indexes.  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
Hi Amit, Thanks for the review,

On Mon, Mar 20, 2017 at 5:17 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> idea could be to make hashm_spares a two-dimensional array
> hashm_spares[32][4] where the first dimension will indicate the split
> point and second will indicate the sub-split number.  I am not sure
> whether it will be simpler or complex than the method used in the
> proposed patch, but I think we should think a bit more to see if we
> can come up with some simple technique to solve this problem.

I think making it a 2-dimensional array will not be any useful in fact
we really treat the given array 2-dimensional elements now.
The main concern of yours I think is the calculation steps to find the
phase of the splitpoint group the bucket belongs to.
+ tbuckets = (1 << (splitpoint_group + 2));
+ phases_beyond_bucket =
+ (tbuckets - num_bucket) / (1 << (splitpoint_group - 1));
+ return (((splitpoint_group + 1) << 2) - phases_beyond_bucket) - 1;

Quickly thinking further we allocate 2^x of buckets on each phase of
any splitpoint group and we have 4 such phases in each group; At each
splitpoint group again we allocate 2^y buckets, So buckets number have
a pattern in their bitmap representation to find out which phase of
allocation they belong to.


As below
===========
Group 0 -- bit 0, 1 define which phase of group each bucket belong to.
0 -- 00000000
1 -- 00000001
2 -- 00000010
3 -- 00000011
===========
Group 1 -- bit 0, 1 define which phase of group each bucket belong to.
4 -- 00000100
5 -- 00000101
6 -- 00000110
7 -- 00000111
===========
Group 2 -- bit 1, 2 define which phase of group each bucket belong to.
8 -- 00001000
9 -- 00001001
10 -- 00001010
11 -- 00001011
12 -- 00001100
13 -- 00001101
14 -- 00001110
15 -- 00001111
===========
Group 3 -- bit 2, 3 define which phase of group each bucket belong to.
16 -- 00010000
17 -- 00010001
18 -- 00010010
19 -- 00010011
20 -- 00010100
21 -- 00010101
22 -- 00010110
23 -- 00010111
24 -- 00011000
25 -- 00011001
26 -- 00011010
27 -- 00011011
28 -- 00011100
29 -- 00011101
30 -- 00011110
31 -- 00011111
============

So we can say given a bucket x of group n > 0 bits (n-1, n) defines
which phase of group they belong to. I see an opportunity here to
completely simplify the above calculation into a simple bitwise anding
the bucket number with (n-1,n) bitmask to get the phase of allocation
of bucket x.

Formula can be (x >> (splitpoint_group - 1)) & 0x3 = phase of bucket

Does this satisfy your concern?

-- 
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: [HACKERS] WIP: Faster Expression Processing v4
Next
From: Tom Lane
Date:
Subject: Re: [HACKERS] Our feature change policy