Re: [HACKERS] GSoC 2017: weekly progress reports (week 4) and patchfor hash index - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: [HACKERS] GSoC 2017: weekly progress reports (week 4) and patchfor hash index
Date
Msg-id CAEepm=1wa_wwGKAjGXego1KVooumF=m_2DZo=0W-+0JW_uCaGg@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] GSoC 2017: weekly progress reports (week 4) and patchfor hash index  (Andres Freund <andres@anarazel.de>)
Responses Re: [HACKERS] GSoC 2017: weekly progress reports (week 4) and patchfor hash index  (Amit Kapila <amit.kapila16@gmail.com>)
Re: [HACKERS] GSoC 2017: weekly progress reports (week 4) and patchfor hash index  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
List pgsql-hackers
On Fri, Mar 2, 2018 at 3:57 PM, Andres Freund <andres@anarazel.de> wrote:
> Based on this sub-thread this patch's status of 'needs review' doesn't
> quite seem accurate and 'waiting on author' and then 'returned with
> feedback' would be more fitting?

I personally think this patch is really close to RFC.  Shubham has
fulfilled the project requirement, it's a tidy and short patch, it has
tests.  I think we really just need to verify that the split case
works correctly.

Hmm.  I notice that this calls PredicateLockPageSplit() after both
calls to _hash_splitbucket() (the one in _hash_finish_split() and the
one in _hash_expandtable()) instead of doing it inside that function,
and that _hash_splitbucket() unlocks bucket_nbuf before returning.
What if someone else accesses bucket_nbuf between
LockBuffer(bucket_nbuf, BUFFER_LOCK_UNLOCK) and
PredicateLockPageSplit()?  Doesn't that mean that another session can
read a newly created page and miss a predicate lock that is about to
be transferred to it?  If that is indeed a race, could it be fixed by
calling PredicateLockPageSplit() at the start of _hash_splitbucket()
instead?

Could we get a few days to mull over this and Shubham's other patches?
 It'd be really great to get some of these into 11.

My thought experiments about pseudo-pages and avoiding the split stuff
were not intended to get the patch kicked out.  I thought for a while
that hash indexes were a special case and could benefit from
dispensing with those trickier problems.  Upon further reflection, for
interesting size hash indexes pure hash value predicate tags wouldn't
be much better.  Furthermore, if we do decide we want to use using x %
max_predicate_locks_per_relation to avoid having to escalate to
relation predicate locks at the cost of slightly higher collision rate
then we should consider that for the whole system (including heap page
predicate locking), not just hash indexes.  Please consider those
ideas parked for now.

-- 
Thomas Munro
http://www.enterprisedb.com


pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: WIP: BRIN multi-range indexes
Next
From: Mark Kirkwood
Date:
Subject: Re: zheap: a new storage format for PostgreSQL