Thread: [HACKERS] Problem in Parallel Bitmap Heap Scan?

[HACKERS] Problem in Parallel Bitmap Heap Scan?

From
Thomas Munro
Date:
Hi,

I noticed a failure in the inet.sql test while running the regression
tests with parallelism cranked up, and can reproduce it interactively
as follows.  After an spgist index is created and the plan changes to
the one shown below, the query returns no rows.

regression=# set force_parallel_mode = regress;
SET
regression=# set max_parallel_workers_per_gather = 2;
SET
regression=# set parallel_tuple_cost = 0;
SET
regression=# set parallel_setup_cost = 0;
SET
regression=# set min_parallel_table_scan_size = 0;
SET
regression=# set min_parallel_index_scan_size = 0;
SET
regression=# set enable_seqscan = off;
SET
regression=# SELECT * FROM inet_tbl WHERE i <> '192.168.1.0/24'::cidr
ORDER BY i;

c          |        i
--------------------+------------------10.0.0.0/8         | 9.1.2.3/810.0.0.0/8         | 10.1.2.3/810.0.0.0/32
|10.1.2.3/810.0.0.0/8         | 10.1.2.3/810.1.0.0/16        | 10.1.2.3/1610.1.2.0/24        | 10.1.2.3/2410.1.2.3/32
    | 10.1.2.310.0.0.0/8         | 11.1.2.3/8192.168.1.0/24     | 192.168.1.226/24192.168.1.0/24     |
192.168.1.255/24192.168.1.0/24    | 192.168.1.0/25192.168.1.0/24     | 192.168.1.255/25192.168.1.0/26     |
192.168.1.22610.0.0.0/8        | 10::/8::ffff:1.2.3.4/128 | ::4.3.2.1/2410:23::f1/128      |
10:23::f1/6410:23::8000/113   | 10:23::ffff
 
(17 rows)

regression=# CREATE INDEX inet_idx3 ON inet_tbl using spgist (i);
CREATE INDEX
regression=# SELECT * FROM inet_tbl WHERE i <> '192.168.1.0/24'::cidr
ORDER BY i;c | i
---+---
(0 rows)

regression=# explain SELECT * FROM inet_tbl WHERE i <>
'192.168.1.0/24'::cidr ORDER BY i;                                      QUERY PLAN
-----------------------------------------------------------------------------------------Gather Merge
(cost=16.57..16.67rows=10 width=64)  Workers Planned: 1  ->  Sort  (cost=16.56..16.58 rows=10 width=64)        Sort
Key:i        ->  Parallel Bitmap Heap Scan on inet_tbl  (cost=12.26..16.39
 
rows=10 width=64)              Recheck Cond: (i <> '192.168.1.0/24'::inet)              ->  Bitmap Index Scan on
inet_idx3 (cost=0.00..12.26
 
rows=17 width=0)                    Index Cond: (i <> '192.168.1.0/24'::inet)
(8 rows)


-- 
Thomas Munro
http://www.enterprisedb.com



Re: [HACKERS] Problem in Parallel Bitmap Heap Scan?

From
Dilip Kumar
Date:
On Tue, Mar 21, 2017 at 4:47 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> I noticed a failure in the inet.sql test while running the regression
> tests with parallelism cranked up, and can reproduce it interactively
> as follows.  After an spgist index is created and the plan changes to
> the one shown below, the query returns no rows.

Thanks for reporting.  Seems like we are getting issues related to
TBM_ONE_PAGE and TBM_EMPTY.

I think in this area we need more testing, reason these are not tested
properly because these are not the natural case for parallel bitmap.
I think in next few days I will test more such cases by forcing the
parallel bitmap.

Here is the patch to fix the issue in hand.  I have also run the
regress suit with force_parallel_mode=regress and all the test are
passing.

Results after fix.

postgres=# explain analyze SELECT * FROM inet_tbl WHERE i <>
'192.168.1.0/24'::cidr
ORDER BY i;
                                                            QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------
 Gather Merge  (cost=16.53..16.62 rows=9 width=64) (actual
time=4.467..4.478 rows=16 loops=1)
   Workers Planned: 1
   Workers Launched: 1
   ->  Sort  (cost=16.52..16.54 rows=9 width=64) (actual
time=0.090..0.093 rows=8 loops=2)
         Sort Key: i
         Sort Method: quicksort  Memory: 25kB
         ->  Parallel Bitmap Heap Scan on inet_tbl  (cost=12.26..16.37
rows=9 width=64) (actual time=0.048..0.050 rows=8 loops=2)
               Recheck Cond: (i <> '192.168.1.0/24'::inet)
               Heap Blocks: exact=1
               ->  Bitmap Index Scan on inet_idx3  (cost=0.00..12.25
rows=16 width=0) (actual time=0.016..0.016 rows=16 loops=1)
                     Index Cond: (i <> '192.168.1.0/24'::inet)
 Planning time: 0.149 ms
 Execution time: 5.143 ms
(13 rows)

postgres=# SELECT * FROM inet_tbl WHERE i <> '192.168.1.0/24'::cidr
ORDER BY i;
         c          |        i
--------------------+------------------
 10.0.0.0/8         | 9.1.2.3/8
 10.0.0.0/8         | 10.1.2.3/8
 10.0.0.0/32        | 10.1.2.3/8
 10.0.0.0/8         | 10.1.2.3/8
 10.1.0.0/16        | 10.1.2.3/16
 10.1.2.0/24        | 10.1.2.3/24
 10.1.2.3/32        | 10.1.2.3
 10.0.0.0/8         | 11.1.2.3/8
 192.168.1.0/24     | 192.168.1.226/24
 192.168.1.0/24     | 192.168.1.255/24
 192.168.1.0/24     | 192.168.1.0/25
 192.168.1.0/24     | 192.168.1.255/25
 192.168.1.0/26     | 192.168.1.226
 ::ffff:1.2.3.4/128 | ::4.3.2.1/24
 10:23::f1/128      | 10:23::f1/64
 10:23::8000/113    | 10:23::ffff
(16 rows)


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

Re: [HACKERS] Problem in Parallel Bitmap Heap Scan?

From
Thomas Munro
Date:
On Wed, Mar 22, 2017 at 1:21 AM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> postgres=# SELECT * FROM inet_tbl WHERE i <> '192.168.1.0/24'::cidr
> ORDER BY i;
>          c          |        i
> --------------------+------------------
>  10.0.0.0/8         | 9.1.2.3/8
>  10.0.0.0/8         | 10.1.2.3/8
>  10.0.0.0/32        | 10.1.2.3/8
>  10.0.0.0/8         | 10.1.2.3/8
>  10.1.0.0/16        | 10.1.2.3/16
>  10.1.2.0/24        | 10.1.2.3/24
>  10.1.2.3/32        | 10.1.2.3
>  10.0.0.0/8         | 11.1.2.3/8
>  192.168.1.0/24     | 192.168.1.226/24
>  192.168.1.0/24     | 192.168.1.255/24
>  192.168.1.0/24     | 192.168.1.0/25
>  192.168.1.0/24     | 192.168.1.255/25
>  192.168.1.0/26     | 192.168.1.226
>  ::ffff:1.2.3.4/128 | ::4.3.2.1/24
>  10:23::f1/128      | 10:23::f1/64
>  10:23::8000/113    | 10:23::ffff
> (16 rows)

Isn't that one row short?  What happened to this one?
10.0.0.0/8         | 10::/8

-- 
Thomas Munro
http://www.enterprisedb.com



Re: [HACKERS] Problem in Parallel Bitmap Heap Scan?

From
Dilip Kumar
Date:
On Wed, Mar 22, 2017 at 5:38 AM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> Isn't that one row short?  What happened to this one?
>
>  10.0.0.0/8         | 10::/8

Actually, In my last test I did not connect to regression database, I
have simply taken table and the few rows from inet.sql so it was only
16 rows even with seqscan.

Here are the updated results when I connect to regression database and re-test.

regression=# SELECT * FROM inet_tbl WHERE i <> '192.168.1.0/24'::cidr
ORDER BY i;        c          |        i
--------------------+------------------10.0.0.0/8         | 9.1.2.3/810.0.0.0/8         | 10.1.2.3/810.0.0.0/32
|10.1.2.3/810.0.0.0/8         | 10.1.2.3/810.1.0.0/16        | 10.1.2.3/1610.1.2.0/24        | 10.1.2.3/2410.1.2.3/32
    | 10.1.2.310.0.0.0/8         | 11.1.2.3/8192.168.1.0/24     | 192.168.1.226/24192.168.1.0/24     |
192.168.1.255/24192.168.1.0/24    | 192.168.1.0/25192.168.1.0/24     | 192.168.1.255/25192.168.1.0/26     |
192.168.1.22610.0.0.0/8        | 10::/8::ffff:1.2.3.4/128 | ::4.3.2.1/2410:23::f1/128      |
10:23::f1/6410:23::8000/113   | 10:23::ffff
 
(17 rows)

regression=# explain analyze SELECT * FROM inet_tbl WHERE i <>
'192.168.1.0/24'::cidr
ORDER BY i;                                                           QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------Gather
Merge (cost=16.57..16.67 rows=10 width=64) (actual
 
time=4.972..4.983 rows=17 loops=1)  Workers Planned: 1  Workers Launched: 1  ->  Sort  (cost=16.56..16.58 rows=10
width=64)(actual
 
time=0.107..0.110 rows=8 loops=2)        Sort Key: i        Sort Method: quicksort  Memory: 26kB        ->  Parallel
BitmapHeap Scan on inet_tbl  (cost=12.26..16.39
 
rows=10 width=64) (actual time=0.051..0.053 rows=8 loops=2)              Recheck Cond: (i <> '192.168.1.0/24'::inet)
         Heap Blocks: exact=1              ->  Bitmap Index Scan on inet_idx3  (cost=0.00..12.26
 
rows=17 width=0) (actual time=0.016..0.016 rows=17 loops=1)                    Index Cond: (i <>
'192.168.1.0/24'::inet)Planningtime: 0.113 msExecution time: 5.691 ms
 
(13 rows)


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: Problem in Parallel Bitmap Heap Scan?

From
Amit Kapila
Date:
On Tue, Mar 21, 2017 at 5:51 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> On Tue, Mar 21, 2017 at 4:47 PM, Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
>> I noticed a failure in the inet.sql test while running the regression
>> tests with parallelism cranked up, and can reproduce it interactively
>> as follows.  After an spgist index is created and the plan changes to
>> the one shown below, the query returns no rows.
>
> Thanks for reporting.  Seems like we are getting issues related to
> TBM_ONE_PAGE and TBM_EMPTY.
>
> I think in this area we need more testing, reason these are not tested
> properly because these are not the natural case for parallel bitmap.
> I think in next few days I will test more such cases by forcing the
> parallel bitmap.
>

Okay, is your testing complete?

> Here is the patch to fix the issue in hand.  I have also run the
> regress suit with force_parallel_mode=regress and all the test are
> passing.
>

Thomas, did you get chance to verify Dilip's latest patch?

I have added this issue in PostgreSQL 10 Open Items list so that we
don't loose track of this issue.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Problem in Parallel Bitmap Heap Scan?

From
Thomas Munro
Date:
On Sat, Mar 25, 2017 at 6:04 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Tue, Mar 21, 2017 at 5:51 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>> On Tue, Mar 21, 2017 at 4:47 PM, Thomas Munro
>> <thomas.munro@enterprisedb.com> wrote:
>>> I noticed a failure in the inet.sql test while running the regression
>>> tests with parallelism cranked up, and can reproduce it interactively
>>> as follows.  After an spgist index is created and the plan changes to
>>> the one shown below, the query returns no rows.
>>
>> Thanks for reporting.  Seems like we are getting issues related to
>> TBM_ONE_PAGE and TBM_EMPTY.
>>
>> I think in this area we need more testing, reason these are not tested
>> properly because these are not the natural case for parallel bitmap.
>> I think in next few days I will test more such cases by forcing the
>> parallel bitmap.
>>
>
> Okay, is your testing complete?
>
>> Here is the patch to fix the issue in hand.  I have also run the
>> regress suit with force_parallel_mode=regress and all the test are
>> passing.
>>
>
> Thomas, did you get chance to verify Dilip's latest patch?
>
> I have added this issue in PostgreSQL 10 Open Items list so that we
> don't loose track of this issue.

The result is correct with this patch.  I ran make installcheck then
the same steps as above and the query result was correct after
creating the index.

-- 
Thomas Munro
http://www.enterprisedb.com



Re: Problem in Parallel Bitmap Heap Scan?

From
Dilip Kumar
Date:
On Sat, Mar 25, 2017 at 2:25 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
>>> I think in this area we need more testing, reason these are not tested
>>> properly because these are not the natural case for parallel bitmap.
>>> I think in next few days I will test more such cases by forcing the
>>> parallel bitmap.
>>>
>>
>> Okay, is your testing complete?

Yes, I have done more testing around this area with more cases, like
one page with BitmapOr etc.
Now it looks fine to me.
>>
>>> Here is the patch to fix the issue in hand.  I have also run the
>>> regress suit with force_parallel_mode=regress and all the test are
>>> passing.
>>>
>>
>> Thomas, did you get chance to verify Dilip's latest patch?
>>
>> I have added this issue in PostgreSQL 10 Open Items list so that we
>> don't loose track of this issue.
>
> The result is correct with this patch.  I ran make installcheck then
> the same steps as above and the query result was correct after
> creating the index.

Thanks for confirming.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: [HACKERS] Problem in Parallel Bitmap Heap Scan?

From
Noah Misch
Date:
On Sat, Mar 25, 2017 at 09:55:21PM +1300, Thomas Munro wrote:
> On Sat, Mar 25, 2017 at 6:04 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > On Tue, Mar 21, 2017 at 5:51 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >> On Tue, Mar 21, 2017 at 4:47 PM, Thomas Munro
> >> <thomas.munro@enterprisedb.com> wrote:
> >>> I noticed a failure in the inet.sql test while running the regression
> >>> tests with parallelism cranked up, and can reproduce it interactively
> >>> as follows.  After an spgist index is created and the plan changes to
> >>> the one shown below, the query returns no rows.
> >>
> >> Thanks for reporting.  Seems like we are getting issues related to
> >> TBM_ONE_PAGE and TBM_EMPTY.
> >>
> >> I think in this area we need more testing, reason these are not tested
> >> properly because these are not the natural case for parallel bitmap.
> >> I think in next few days I will test more such cases by forcing the
> >> parallel bitmap.
> >>
> >
> > Okay, is your testing complete?
> >
> >> Here is the patch to fix the issue in hand.  I have also run the
> >> regress suit with force_parallel_mode=regress and all the test are
> >> passing.
> >>
> >
> > Thomas, did you get chance to verify Dilip's latest patch?
> >
> > I have added this issue in PostgreSQL 10 Open Items list so that we
> > don't loose track of this issue.
> 
> The result is correct with this patch.  I ran make installcheck then
> the same steps as above and the query result was correct after
> creating the index.

[Action required within three days.  This is a generic notification.]

The above-described topic is currently a PostgreSQL 10 open item.  Robert,
since you committed the patch believed to have created it, you own this open
item.  If some other commit is more relevant or if this does not belong as a
v10 open item, please let us know.  Otherwise, please observe the policy on
open item ownership[1] and send a status update within three calendar days of
this message.  Include a date for your subsequent status update.  Testers may
discover new open items at any time, and I want to plan to get them all fixed
well in advance of shipping v10.  Consequently, I will appreciate your efforts
toward speedy resolution.  Thanks.

[1] https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com



Re: [HACKERS] Problem in Parallel Bitmap Heap Scan?

From
Robert Haas
Date:
On Sun, Apr 9, 2017 at 11:17 PM, Noah Misch <noah@leadboat.com> wrote:
> The above-described topic is currently a PostgreSQL 10 open item.

I have committed the patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company