Home > mailing lists

Re: Parallel Index Scan vs BTP_DELETED and BTP_HALF_DEAD - Mailing list pgsql-hackers

From	Kuntal Ghosh
Subject	Re: Parallel Index Scan vs BTP_DELETED and BTP_HALF_DEAD
Date	December 13, 2017 11:35:46
Msg-id	CAGz5QCKkiMr5PiEomH2k+Q0FK14ro=XN1XWuyCa97Ed382Fpnw@mail.gmail.com Whole thread Raw
In response to	Re: Parallel Index Scan vs BTP_DELETED and BTP_HALF_DEAD (Kuntal Ghosh <kuntalghosh.2007@gmail.com>)
Responses	Re: Parallel Index Scan vs BTP_DELETED and BTP_HALF_DEAD
List	pgsql-hackers

Tree view

On Wed, Dec 13, 2017 at 10:33 AM, Kuntal Ghosh
<kuntalghosh.2007@gmail.com> wrote:
> On Tue, Dec 12, 2017 at 5:20 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> On Tue, Dec 12, 2017 at 4:00 PM, Kuntal Ghosh
>> <kuntalghosh.2007@gmail.com> wrote:
>>> On Mon, Dec 11, 2017 at 2:26 PM, Thomas Munro
>>> <thomas.munro@enterprisedb.com> wrote:
>>>> On Mon, Dec 11, 2017 at 8:14 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>>>
>>>>> Thanks for looking into it.  I will see if we can write some test.  In
>>>>> the meantime if possible, can you please request Patrick Hemmer to
>>>>> verify the attached patch?
>>>>
>>>> Our discussion was on the #postgresql Freenode channel.  I pointed him
>>>> at this thread, but I'm not sure if he'll see my message or be able to
>>>> test.
>>> After discussing with Amit, I'm able to reproduce the scenario in a
>>> master-standby setup. The issue occurs when we perform parallel
>>> index(-only) scan on a BTP_HALF_DEAD -marked page. (If a page is
>>> marked as BTP_DELETED, it's already unlinked from the index).
>>>
>>> When a btree page is deleted during vacuum, it's first marked as
>>> BTP_HALF_DEAD in _bt_mark_page_halfdead and then marked as BTP_DELETED
>>> in _bt_unlink_halfdead_page without releasing cleanup lock on the
>>> buffer. Hence, any scan node cannot lock the same buffer. So, the
>>> issue can't be reproduced on master.
>>>
>>> However, after replaying XLOG_BTREE_MARK_PAGE_HALFDEAD record, standby
>>> releases the lock on the same buffer. If we force parallelism, an
>>> index scan on the same page will cause hang the standby server.
>>> Following is a (unpleasant)way to reproduce the issue:
>>>
>>> In master (with autovacuum = off):
>>> 1. create table t1(a int primary key);
>>> 2. insert into t1 select * from generate_series(1,1000); --generates 3
>>> leaf nodes (block no 1,2,4) and 1 root node (block no 3)
>>> 3. delete from t1 where a>=367 and a<=735; --delete all tuples pointed by leaf 2
>>> 4. analyze t1; --update the stats
>>> 5. explain analyze select * from t1 where a>=1 and a<=1000; --ensures
>>> that the next vacuum will consider leaf 2 for page deletion
>>
>> What do you mean by next vacuum, here autovacuum is off?  Are you
>> missing any step which manually performs the vacuum?
>>
> Yeah, you've to manually vacuum the table.
> 6. vacuum t1.
>
>>> Now, put a break point at _bt_unlink_halfdead_page, so that vacuum
>>> can't unlink the page.
>>>
>>> In standby,
>>> 1. force parallelism.
>>> 2. explain analyze select * from t1 where a>=1 and a<=1000; and the
>>> parallel workers hang at the above-discussed place!
>>>
I've also verified the backward scan case with the query provided by
Thomas. In standby,
2. explain analyze select * from t1 where a+1>a order by a desc; and
the parallel workers hang.
The patch fixes the issue.



-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com

pgsql-hackers by date:

From: Kuntal Ghosh
Date: 13 December 2017, 11:03:17
Subject: Re: Parallel Index Scan vs BTP_DELETED and BTP_HALF_DEAD

From: Amit Langote
Date: 13 December 2017, 12:32:42
Subject: Re: Boolean partitions syntax

Re: Parallel Index Scan vs BTP_DELETED and BTP_HALF_DEAD - Mailing list pgsql-hackers

Previous

Next