Re: Connections hang indefinitely while taking a gin index's LWLockbuffer_content lock - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: Connections hang indefinitely while taking a gin index's LWLockbuffer_content lock
Date
Msg-id CAPpHfds_nVAbF-2uwhLiJ4PWXt_1b4xTosFOjQPi5TscbUaRzw@mail.gmail.com
Whole thread Raw
In response to Re: Connections hang indefinitely while taking a gin index's LWLockbuffer_content lock  (Andrey Borodin <x4mmm@yandex-team.ru>)
Responses Re: Connections hang indefinitely while taking a gin index's LWLockbuffer_content lock  (Andrey Borodin <x4mmm@yandex-team.ru>)
Re: Connections hang indefinitely while taking a gin index's LWLockbuffer_content lock  (Alexander Korotkov <aekorotkov@gmail.com>)
List pgsql-hackers
On Sat, Dec 8, 2018 at 12:48 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote:
> > 8 дек. 2018 г., в 6:54, Alexander Korotkov <aekorotkov@gmail.com> написал(а):
> >
> > Yep, please find attached draft patch.
>
> Patch seems good to me, I'll check it in more detail.
> The patch gets posting item at FirstOffsetNumber instead of btree->getLeftMostChild(). This seem OK, since
dataGetLeftMostPage()is doing just the same, but with few Assert()s. 

I'd like to evade creating GinBtree for just calling
getLeftMostChild().  Also, few more places in ginvacuum.c do the same.
We have the same amount of Assert()s in ginVacuumPostingTreeLeaves().
So, let's keep it uniform.

I would also like Peter Geoghegan to take a look at this patch before
committing it.

> > BTW, it seems that I've another bug in GIN.  README says that
> >
> > "However, posting trees are only
> > fully searched from left to right, starting from the leftmost leaf. (The
> > tree-structure is only needed by insertions, to quickly find the correct
> > insert location). So as long as we don't delete the leftmost page on each
> > level, a search can never follow a downlink to page that's about to be
> > deleted."
> >
> > But that's not really true once we teach GIN to skip parts of posting
> > trees in PostgreSQL 9.4.  So, there might be a risk to visit page to
> > be deleted using downlink.  But in order to get real problem, vacuum
> > should past cleanup stage and someone else should reuse this page,
> > before we traverse downlink.  Thus, the risk of real problem is very
> > low.  But it still should be addressed.
>
> There's a patch above in this thread 0001-Use-correct-locking-protocol-during-GIN-posting-tree.patch where I propose
stampingevery deleted page with GinPageSetDeleteXid(page, ReadNewTransactionId()); and avoid reusing the page before
TransactionIdPrecedes(GinPageGetDeleteXid(page),RecentGlobalDataXmin). 
> Should we leave alone this bug for future fixes to keep current fix noninvasive?

I think since is separate bug introduced in PostgreSQL 9.4, which
should be backpatched with separate commit.  Could you please extract
patch dealing with GinPageSetDeleteXid() and GinPageGetDeleteXid().
The rest of work made in your patch should be considered for master.

BTW, what do you think about locking order in ginRedoDeletePage()?  Do
you have any explanation of current behavior?  I'll try to reach
Teodor tomorrow with this question.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


pgsql-hackers by date:

Previous
From: Marti Raudsepp
Date:
Subject: [PATCH] Minor cleanups of BRIN code
Next
From: Tom Lane
Date:
Subject: Why does TupleDescInitBuiltinEntry lack a "default: error" case?