Re: Parallel Index Scan vs BTP_DELETED and BTP_HALF_DEAD - Mailing list pgsql-hackers
From | Kuntal Ghosh |
---|---|
Subject | Re: Parallel Index Scan vs BTP_DELETED and BTP_HALF_DEAD |
Date | |
Msg-id | CAGz5QCKkiMr5PiEomH2k+Q0FK14ro=XN1XWuyCa97Ed382Fpnw@mail.gmail.com Whole thread Raw |
In response to | Re: Parallel Index Scan vs BTP_DELETED and BTP_HALF_DEAD (Kuntal Ghosh <kuntalghosh.2007@gmail.com>) |
Responses |
Re: Parallel Index Scan vs BTP_DELETED and BTP_HALF_DEAD
|
List | pgsql-hackers |
On Wed, Dec 13, 2017 at 10:33 AM, Kuntal Ghosh <kuntalghosh.2007@gmail.com> wrote: > On Tue, Dec 12, 2017 at 5:20 PM, Amit Kapila <amit.kapila16@gmail.com> wrote: >> On Tue, Dec 12, 2017 at 4:00 PM, Kuntal Ghosh >> <kuntalghosh.2007@gmail.com> wrote: >>> On Mon, Dec 11, 2017 at 2:26 PM, Thomas Munro >>> <thomas.munro@enterprisedb.com> wrote: >>>> On Mon, Dec 11, 2017 at 8:14 PM, Amit Kapila <amit.kapila16@gmail.com> wrote: >>>> >>>>> Thanks for looking into it. I will see if we can write some test. In >>>>> the meantime if possible, can you please request Patrick Hemmer to >>>>> verify the attached patch? >>>> >>>> Our discussion was on the #postgresql Freenode channel. I pointed him >>>> at this thread, but I'm not sure if he'll see my message or be able to >>>> test. >>> After discussing with Amit, I'm able to reproduce the scenario in a >>> master-standby setup. The issue occurs when we perform parallel >>> index(-only) scan on a BTP_HALF_DEAD -marked page. (If a page is >>> marked as BTP_DELETED, it's already unlinked from the index). >>> >>> When a btree page is deleted during vacuum, it's first marked as >>> BTP_HALF_DEAD in _bt_mark_page_halfdead and then marked as BTP_DELETED >>> in _bt_unlink_halfdead_page without releasing cleanup lock on the >>> buffer. Hence, any scan node cannot lock the same buffer. So, the >>> issue can't be reproduced on master. >>> >>> However, after replaying XLOG_BTREE_MARK_PAGE_HALFDEAD record, standby >>> releases the lock on the same buffer. If we force parallelism, an >>> index scan on the same page will cause hang the standby server. >>> Following is a (unpleasant)way to reproduce the issue: >>> >>> In master (with autovacuum = off): >>> 1. create table t1(a int primary key); >>> 2. insert into t1 select * from generate_series(1,1000); --generates 3 >>> leaf nodes (block no 1,2,4) and 1 root node (block no 3) >>> 3. delete from t1 where a>=367 and a<=735; --delete all tuples pointed by leaf 2 >>> 4. analyze t1; --update the stats >>> 5. explain analyze select * from t1 where a>=1 and a<=1000; --ensures >>> that the next vacuum will consider leaf 2 for page deletion >> >> What do you mean by next vacuum, here autovacuum is off? Are you >> missing any step which manually performs the vacuum? >> > Yeah, you've to manually vacuum the table. > 6. vacuum t1. > >>> Now, put a break point at _bt_unlink_halfdead_page, so that vacuum >>> can't unlink the page. >>> >>> In standby, >>> 1. force parallelism. >>> 2. explain analyze select * from t1 where a>=1 and a<=1000; and the >>> parallel workers hang at the above-discussed place! >>> I've also verified the backward scan case with the query provided by Thomas. In standby, 2. explain analyze select * from t1 where a+1>a order by a desc; and the parallel workers hang. The patch fixes the issue. -- Thanks & Regards, Kuntal Ghosh EnterpriseDB: http://www.enterprisedb.com
pgsql-hackers by date: