Re: [HACKERS] Page Scan Mode in Hash Index - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: [HACKERS] Page Scan Mode in Hash Index
Date
Msg-id CAA4eK1J6xiJUOidBaOt0iPsAdS0+p5PoKFf1R2yVjTwrY_4snA@mail.gmail.com
In response to Re: [HACKERS] Page Scan Mode in Hash Index  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Wed, Aug 9, 2017 at 2:58 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Mon, Aug 7, 2017 at 5:50 PM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>
> 7.
> _hash_kill_items(IndexScanDesc scan)
> {
> ..
> + /*
> + * If page LSN differs it means that the page was modified since the
> + * last read. killedItems could be not valid so applying LP_DEAD hints
> + * is not safe.
> + */
> + page = BufferGetPage(buf);
> + if (PageGetLSN(page) != so->currPos.lsn)
> + {
> + _hash_relbuf(rel, buf);
> + return;
> + }
> ..
> }
>
> How does this check cover the case of unlogged tables?
>
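
To make the question above concrete, here is a minimal sketch of why an
LSN comparison may not notice a change to an unlogged relation.  It
assumes that a modification which writes no WAL leaves the page LSN
untouched.  ToyPage, toy_wal_position and the toy_* functions are
hypothetical names for illustration only, not PostgreSQL APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a buffer page; not a PostgreSQL structure. */
typedef struct ToyPage
{
    uint64_t lsn;        /* LSN of the last WAL record that touched the page */
    int      first_tid;  /* stand-in for the page contents */
} ToyPage;

/* Hypothetical WAL insert position; it only grows when WAL is written. */
static uint64_t toy_wal_position = 100;

/*
 * Modify the page.  Assumption: only a WAL-logged relation stamps a new
 * LSN on the page; an unlogged one changes the contents silently.
 */
void toy_modify_page(ToyPage *page, bool needs_wal, int new_tid)
{
    page->first_tid = new_tid;
    if (needs_wal)
        page->lsn = ++toy_wal_position;
}

/* Same shape as the check in the quoted hunk: compare with the saved LSN. */
bool toy_lsn_unchanged(const ToyPage *page, uint64_t remembered_lsn)
{
    return page->lsn == remembered_lsn;
}

/* Returns true if the LSN comparison notices a concurrent modification. */
bool toy_change_detected(bool needs_wal)
{
    ToyPage  page = { .lsn = toy_wal_position, .first_tid = 1 };
    uint64_t remembered = page.lsn;        /* what the scan saved earlier */

    toy_modify_page(&page, needs_wal, 2);  /* e.g. an item was removed */
    return !toy_lsn_unchanged(&page, remembered);
}

/* Small self-check of both cases. */
void toy_lsn_demo(void)
{
    assert(toy_change_detected(true));     /* logged: LSN moved, change seen */
    assert(!toy_change_detected(false));   /* unlogged: change goes unnoticed */
}
```

Calling toy_lsn_demo() exercises both cases: only the WAL-logged one is
caught by the comparison.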

I have thought about it, and I don't think we can use btree's technique
(keeping the pin on the page) to cover unlogged or temporary relations.
It works for btree because btree vacuum takes a cleanup lock on every
page before removing items from it, whereas in hash index we take a
cleanup lock only on the primary bucket page.  One thing we could do is
to start taking a cleanup lock on each bucket page (the primary bucket
page and its overflow pages), but I think that could turn out to be
worse than the current locking strategy.  Another idea is to preserve
the current locking strategy (take the lock on the next bucket page,
then release the lock on the current bucket page) during vacuum of an
unlogged hash index.  That will ensure vacuum can't remove the TIDs
which we are going to mark as dead.
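
The "lock the next page, then release the current one" strategy can be
sketched as a small single-threaded simulation.  It only illustrates the
general ordering property of lock coupling: a walker that enters the
bucket chain second cannot get ahead of the one that entered first.
ToyChain, ToyWalker and the toy_* functions are hypothetical names, the
locks are plain flags rather than buffer locks, and the chain length is
arbitrary.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_CHAIN_LEN 4     /* primary bucket page + 3 overflow pages */
#define TOY_NOBODY    (-1)

/* Hypothetical bucket chain: one exclusive lock flag per page. */
typedef struct ToyChain
{
    int lock_owner[TOY_CHAIN_LEN];  /* walker id holding the lock, or TOY_NOBODY */
} ToyChain;

/* Hypothetical walker (a scan or a vacuum) moving along the chain. */
typedef struct ToyWalker
{
    int id;
    int pos;    /* locked page index; -1 before entry, TOY_CHAIN_LEN when done */
} ToyWalker;

void toy_chain_init(ToyChain *chain)
{
    for (int i = 0; i < TOY_CHAIN_LEN; i++)
        chain->lock_owner[i] = TOY_NOBODY;
}

/*
 * Try to advance one page with lock coupling.  Returns false if the
 * walker has to wait (it keeps holding its current page) or is done.
 */
bool toy_step(ToyChain *chain, ToyWalker *w)
{
    int next = w->pos + 1;

    if (w->pos >= TOY_CHAIN_LEN)
        return false;                        /* already past the last page */
    if (next < TOY_CHAIN_LEN)
    {
        if (chain->lock_owner[next] != TOY_NOBODY)
            return false;                    /* next page is locked: wait */
        chain->lock_owner[next] = w->id;     /* 1. lock the next page */
    }
    if (w->pos >= 0)
        chain->lock_owner[w->pos] = TOY_NOBODY;  /* 2. release the current page */
    w->pos = next;
    return true;
}

/* The leader enters first; the follower tries to advance twice per leader step. */
bool toy_follower_stays_behind(void)
{
    ToyChain  chain;
    ToyWalker leader = { 1, -1 };
    ToyWalker follower = { 2, -1 };
    bool      entered;

    toy_chain_init(&chain);
    entered = toy_step(&chain, &leader);     /* leader locks the primary page */
    assert(entered);

    while (leader.pos < TOY_CHAIN_LEN)
    {
        toy_step(&chain, &follower);
        toy_step(&chain, &follower);
        if (follower.pos >= leader.pos)
            return false;                    /* follower caught up or passed */
        toy_step(&chain, &leader);
    }
    return true;
}
```

In this model the follower is always stopped by the lock the leader
still holds on its current page, so it stays at least one page behind
until the leader leaves the chain.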

Thoughts?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


