Re: Hash Indexes - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Hash Indexes
Date
Msg-id CAA4eK1Kd7JVr4F54u4SCAhnq3fE7wtBX=Jscwn_sp3UFH4SYnA@mail.gmail.com
In response to Re: Hash Indexes  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Fri, Nov 4, 2016 at 6:37 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Nov 3, 2016 at 6:25 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>> +        nblkno = _hash_get_newblk(rel, pageopaque);
>>>
>>> I think this is not a great name for this function.  It's not clear
>>> what "new blocks" refers to, exactly.  I suggest
>>> FIND_SPLIT_BUCKET(metap, bucket) or OLD_BUCKET_TO_NEW_BUCKET(metap,
>>> bucket) returning a new bucket number.  I think that macro can be
>>> defined as something like this: bucket + (1 <<
>>> (fls(metap->hashm_maxbucket) - 1)).
>>>
>>
>> I think such a macro would not work for the usage of incomplete
>> splits.  The reason is that by the time we try to complete the split
>> of the current old bucket, the table half (lowmask, highmask,
>> maxbucket) would have changed and it could give you the bucket in new
>> table half.
>
> Can you provide an example of the scenario you are talking about here?
>

Consider a case as below:

First half of table
0 1 2 3
Second half of table
4 5 6 7

Suppose the system crashes while the split of bucket 2 (whose new
bucket is 6) is in progress.  After restart, it splits bucket 3 (new
bucket 7).  After that, it will start forming a new table half with
buckets 8, 9, ..., 15.  Assume it creates bucket 8 by splitting bucket
0 and then tries to split bucket 2.  It will find the incomplete split
and attempt to finish it.  At that point metap->hashm_maxbucket is 8
(we are in the third table half), so the above macro computes the new
bucket for old bucket 2 as 10, whereas we need 6.  That is why you
will see a check (if (new_bucket > metap->hashm_maxbucket)) in
_hash_get_newblk(), which ensures that it returns the bucket number
from the previous table half.  The basic idea is that while a bucket
has an incomplete split, no new split can be started from that bucket,
so the check in _hash_get_newblk() gives us the correct value.
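
To make the arithmetic concrete, here is a small standalone sketch.  It
is not the code from the patch: sketch_fls(), naive_new_bucket() and
checked_new_bucket() are hypothetical names used only for illustration,
and the fallback (retry with half the offset) is an assumption about
one way the check could pick the bucket from the previous table half.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Portable stand-in for fls(): 1-based position of the highest set
 * bit, or 0 when x is 0.  (Hypothetical helper, not from the patch.)
 */
static int
sketch_fls(uint32_t x)
{
    int pos = 0;

    while (x != 0)
    {
        pos++;
        x >>= 1;
    }
    return pos;
}

/*
 * The macro suggested in the review, written as a function:
 * bucket + (1 << (fls(maxbucket) - 1)).
 */
static uint32_t
naive_new_bucket(uint32_t old_bucket, uint32_t maxbucket)
{
    assert(maxbucket >= 1);     /* otherwise the shift count is negative */
    return old_bucket + (1u << (sketch_fls(maxbucket) - 1));
}

/*
 * Same computation plus the check described above: if the result lies
 * beyond maxbucket, that bucket cannot exist yet, so the pending split
 * must belong to the previous table half.  Retrying with half the
 * offset is an assumption made for this sketch.
 */
static uint32_t
checked_new_bucket(uint32_t old_bucket, uint32_t maxbucket)
{
    uint32_t half;
    uint32_t new_bucket;

    assert(maxbucket >= 1);
    half = 1u << (sketch_fls(maxbucket) - 1);
    new_bucket = old_bucket + half;

    if (new_bucket > maxbucket && half > 1)
        new_bucket = old_bucket + (half >> 1);

    return new_bucket;
}
```

With the values from the example, naive_new_bucket(2, 8) gives 10,
while checked_new_bucket(2, 8) gives 6.  Before the third table half
is started (maxbucket 6 or 7), both return 6 for old bucket 2.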

I can try to explain again if the above is not clear enough.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


