Re: Next Steps with Hash Indexes - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Next Steps with Hash Indexes
Date
Msg-id CANbhV-HbtL7VBxLFeZ-hDin__v4Ey5bMES2+UV0mf2GYuFF5Pw@mail.gmail.com
Whole thread Raw
In response to Re: Next Steps with Hash Indexes  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Next Steps with Hash Indexes  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Tue, Jul 20, 2021 at 1:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Jul 15, 2021 at 10:11 PM Simon Riggs
> <simon.riggs@enterprisedb.com> wrote:
> >
> > 2. Unique Hash Indexes have been summarized here:
> > https://www.postgresql.org/message-id/CAA4eK1KATC1TA5bR5eobYQVO3RWsnH6djNpk3P376em4V8MuUA%40mail.gmail.com
> > which also seems to have two parts to it.
> >
> > 2.1 Uniqueness Check
> > Amit: "to ensure that there is no duplicate entry we need to traverse
> > the whole bucket chain"
> > Agreed. That seems straightforward and can also be improved later.
> >
> > 2.2 Locking
> > Amit's idea of holding ExclusiveLock on the bucket page works for me,
> > but there was some doubt about splitting.
> >
>
> I think the main thing to think about for uniqueness check during
> split (where we scan both the old and new buckets) was whether we need
> to lock both the old (bucket_being_split) and new
> (bucket_being_populated) buckets or just holding locks on one of them
> (the current bucket in which we are inserting) is sufficient? During a
> scan of the new bucket, we just retain pins on both the buckets (see
> comments in _hash_first()) but if we need to retain locks on both
> buckets then we need to do something different then we do it for
> scans. But, I think it is sufficient to just hold an exclusive lock on
> the primary bucket page in the bucket we are trying to insert and pin
> on the other bucket (old bucket as we do for scans). Because no
> concurrent inserter should try to insert into the old bucket and new
> bucket the same tuple as before starting the split we always update
> the metapage for hashm_lowmask and hashm_highmask which decides the
> routing of the tuples.

During an incomplete split, we need to scan both old and new. So
during insert, we need to scan both old and new, while holding
exclusive locks on both bucket pages. I've spent a few days looking at
the split behavior and this seems a complete approach. I'm working on
a patch now, still at hacking stage.

(BTW, my opinion of the split mechanism has now changed from bad to
very good. It works really well for unique data, but can be completely
ineffective for badly skewed data).

> Now, I think here the other problem we need to think about is that for
> the hash index after finding the tuple in the index, we need to always
> recheck in the heap as we don't store the actual value in the hash
> index. For that in the scan, we get the tuple(s) from the index
> (release locks) and then match qual after fetching tuple from the
> heap. But we can't do that for uniqueness check because if we release
> the locks on the index bucket page then another inserter could come
> before we match it in heap. I think we need some mechanism that after
> fetching TID from the index, we recheck the actual value in heap
> before releasing the lock on the index bucket page.

I don't think btree does that, so I'm not sure we do need that for
hash. Yes, there is a race condition, but someone will win. Do we care
who? Do we care enough to take the concurrency hit? Duplicate inserts
would be very rare in a declared unique index, so it would be a poor
trade-off.

> The other thing could be that if we have unique support for hash index
> then probably we can allow Insert ... ON Conflict if the user
> specifies unique index column as conflict_target.

Yes, that looks doable.

> I am not sure if multicol index support is mandatory to allow
> uniqueness for hash indexes, sure it would be good but I feel that can
> be done as a separate patch as well.

I have a patch for multicol support, attached.

-- 
Simon Riggs                http://www.EnterpriseDB.com/

Attachment

pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: Next Steps with Hash Indexes
Next
From: Simon Riggs
Date:
Subject: Re: Next Steps with Hash Indexes