Re: [HACKERS] GSoC 2017: weekly progress reports (week 4) and patch for hash index - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: [HACKERS] GSoC 2017: weekly progress reports (week 4) and patch for hash index
Date
Msg-id CAA4eK1+fesG1K_4dynNCaF5d-X28ytepyRHKODzyqT4jzuK_JA@mail.gmail.com
In response to Re: [HACKERS] GSoC 2017: weekly progress reports (week 4) and patch for hash index  (Thomas Munro <thomas.munro@enterprisedb.com>)
Responses Re: [HACKERS] GSoC 2017: weekly progress reports (week 4) and patch for hash index  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Mon, Mar 5, 2018 at 8:58 AM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> On Sun, Mar 4, 2018 at 12:53 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> Yes, but I think it would be better if we call this once we are sure
>> that at least one tuple from the old bucket has been transferred
>> (consider if all tuples in the old bucket are dead).  Apart from this,
>> I think this patch has missed handling the cases where we scan the
>> buckets when the split is in progress.  In such cases, we scan both
>> old and new bucket, so I think we need to ensure that we take
>> PredicateLock on both the buckets during such scans.
>
> Hmm.  Yeah.
>
> So, in _hash_first(), do you think we might just need this?
>
>       if (H_BUCKET_BEING_POPULATED(opaque))
>       {
>           ...
>           old_blkno = _hash_get_oldblock_from_newbucket(rel, bucket);
>           ...
>           old_buf = _hash_getbuf(rel, old_blkno, HASH_READ, LH_BUCKET_PAGE);
> +         PredicateLockPage(rel, BufferGetBlockNumber(old_buf), scan->xs_snapshot);
>           TestForOldSnapshot(scan->xs_snapshot, rel, BufferGetPage(old_buf));
>
> That is, if you begin scanning a 'new' bucket, we remember the old
> bucket and go and scan that too, so we'd better predicate-lock both up
> front (or I suppose we could do it later when we visit that page, but
> here it can be done in a single place).
>

Yeah, that can work, but I am slightly worried that we might actually
never scan the old bucket (say for queries with a LIMIT clause), in
which case it could give false-positive serialization conflicts for
insertions into the old bucket.

> What if we begin scanning an 'old' bucket that is being split?  I
> think we'll only do that for tuples that actually belong in the old
> bucket after the split, so no need to double-lock?  And I don't think
> a split could begin while we are scanning.  Do I have that right?
>

Right.
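(A minimal sketch of why no double-lock is needed here: during a split, tuples are redistributed by the newly significant hash bit, so a scan that starts in the old bucket only wants tuples that still map to the old bucket. The mask and hash values below are made up for illustration.)

```python
# Toy model: tuples move to the new bucket iff the newly significant
# hash bit is set; a scan of the old bucket only needs the remainder.

def post_split_bucket(hashval, mask):
    """Which bucket a tuple belongs to after the split completes."""
    return "new" if hashval & mask else "old"

old_tuples = [0b00, 0b01, 0b10, 0b11]   # contents before redistribution
mask = 0b10                              # newly significant hash bit
wanted_in_old = [h for h in old_tuples
                 if post_split_bucket(h, mask) == "old"]
assert wanted_in_old == [0b00, 0b01]
```

Since every tuple the old-bucket scan can return stays in the old bucket, predicate-locking just that bucket covers the scan's read set.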

> As for inserting, I'm not sure if any special treatment is needed, as
> long as the scan code path (above) and the split code path are
> correct.  I'm not sure though.
>

I also don't think we need any special handling for insertions.

> I'm wondering how to test all this.  I'm thinking about a program that
> repeatedly creates a hash index and then slowly adds more things to it
> so that buckets split (maybe using distinct keys carefully crafted to
> hit the same bucket?), while concurrently hammering it with a ton of
> scans and then ... somehow checking correctness...
>

Yeah, that will generate the required serialization errors, but I am
not sure how to verify correctness.  One idea could be that when the
split is in progress, we somehow stop it midway (say via a cancel
request) and then run targeted selects and inserts on the bucket being
scanned and the bucket being populated.
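(The shape of that check could look like this toy Python sketch, again not PostgreSQL code: a split frozen midway, after which a scan of the bucket being populated must also conflict with inserts into the bucket it split from. Class and bucket names are invented for the example.)

```python
# Toy sketch of the proposed correctness check: with a split stopped
# midway, a scan of the new bucket must predicate-lock the old bucket
# too, so a targeted insert into either bucket raises a rw-conflict.

class ToyIndex:
    def __init__(self):
        self.split_in_progress = True   # split "cancelled" midway
        self.predicate_locks = set()

    def scan_bucket(self, bucket):
        self.predicate_locks.add(bucket)
        # While the split is unfinished, a scan of the new bucket
        # also covers (and must lock) the old bucket.
        if self.split_in_progress and bucket == "new":
            self.predicate_locks.add("old")

    def insert(self, bucket):
        # An insert into a predicate-locked bucket is a rw-conflict.
        return bucket in self.predicate_locks

idx = ToyIndex()
idx.scan_bucket("new")
assert idx.insert("old")   # targeted insert into the old bucket conflicts
assert idx.insert("new")   # and into the new bucket, as expected
```

A real test would of course drive concurrent serializable transactions against the server and check for the expected serialization failures rather than asserting on an in-memory model.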

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

