Re: Parallel bitmap index scan - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: Parallel bitmap index scan
Date
Msg-id CAFiTN-sH1D2CxCjVUPgQvprPHy0Y9jnFiO2Rg2--Z9fJobUGqw@mail.gmail.com
Whole thread Raw
In response to Parallel bitmap index scan  (Dilip Kumar <dilipbalaut@gmail.com>)
Responses Re: Parallel bitmap index scan
List pgsql-hackers
On Sun, Jul 26, 2020 at 6:42 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> I would like to propose a patch for enabling the parallelism for the
> bitmap index scan path.
>
> Background:
> Currently, we support only a parallel bitmap heap scan path.  Therein,
> the underlying bitmap index scan is done by a single worker called the
> leader.  The leader creates a bitmap in shared memory and once the
> bitmap is ready it creates a shared iterator and after that, all the
> workers process the shared iterator and scan the heap in parallel.
> While analyzing the TPCH plan we have observed that some of the
> queries are spending significant time in preparing the bitmap.  So the
> idea of this patch is to use the parallel index scan for preparing the
> underlying bitmap in parallel.
>
> Design:
> If underlying index AM supports the parallel path (currently only
> BTREE support it), then we will create a parallel bitmap heap scan
> path on top of the parallel bitmap index scan path.  So the idea of
> this patch is that each worker will do the parallel index scan and
> generate their part of the bitmap.  And, we will create a barrier so
> that we can not start preparing the shared iterator until all the
> worker is ready with their bitmap.  The first worker, which is ready
> with the bitmap will keep a copy of its TBM and the page table in the
> shared memory.  And, all the subsequent workers will merge their TBM
> with the shared TBM.  Once all the TBM are merged we will get one
> common shared TBM and after that stage, the worker can continue.  The
> remaining part is the same,  basically, again one worker will scan the
> shared TBM and prepare the shared iterator and once it is ready all
> the workers will jointly scan the heap in parallel using shared
> iterator.
>
> BitmapHeapNext
> {
> ...
> BarrierAttach();
> tbm = MultiExecProcNode();
> tbm_merge(tbm); --Merge with common tbm using tbm_union
> BarrierArriveAndWait();
>
> if (BitmapShouldInitializeSharedState(pstate)). --> only one worker
> come out of this
> {
>    tbm_prepare_shared_iterate();
>    BitmapDoneInitializingSharedState(). -->wakeup others
> }
> tbm_attach_shared_iterate(). --> all worker attach to shared iterator
> ...
> }
>
> Performance:  With scale factor 10, I could see that Q6 is spending
> significant time in a bitmap index scan so I have taken the
> performance with that query and I can see that the bitmap index scan
> node is 3x faster by using 3 workers whereas overall plan got ~40%
> faster.
>
> TPCH: S.F. 10, work_mem=512MB  shared_buffers: 1GB
>
> HEAD:
>   Limit  (cost=1559777.02..1559777.03 rows=1 width=32) (actual
> time=5260.121..5260.122 rows=1 loops=1)
>    ->  Finalize Aggregate  (cost=1559777.02..1559777.03 rows=1
> width=32) (actual time=5260.119..5260.119 rows=1 loops=1)
>          ->  Gather  (cost=1559776.69..1559777.00 rows=3 width=32)
> (actual time=5257.251..5289.595 rows=4 loops=1)
>                Workers Planned: 3
>                Workers Launched: 3
>                ->  Partial Aggregate  (cost=1558776.69..1558776.70
> rows=1 width=32) (actual time=5247.714..5247.714 rows=1 loops=4)
>                      ->  Parallel Bitmap Heap Scan on lineitem
> (cost=300603.01..1556898.89 rows=375560 width=12) (actual
> time=3475.944..50
> 37.484 rows=285808 loops=4)
>                            Recheck Cond: ((l_shipdate >=
> '1997-01-01'::date) AND (l_shipdate < '1998-01-01 00:00:00'::timestamp
> without tim
> e zone) AND (l_discount >= 0.02) AND (l_discount <= 0.04) AND
> (l_quantity < '24'::numeric))
>                            Heap Blocks: exact=205250
>                            ->  Bitmap Index Scan on
> idx_lineitem_shipdate  (cost=0.00..300311.95 rows=1164235 width=0)
> (actual time=3169.85
> 5..3169.855 rows=1143234 loops=1)
>                                  Index Cond: ((l_shipdate >=
> '1997-01-01'::date) AND (l_shipdate < '1998-01-01 00:00:00'::timestamp
> without
>  time zone) AND (l_discount >= 0.02) AND (l_discount <= 0.04) AND
> (l_quantity < '24'::numeric))
>  Planning Time: 0.659 ms
>  Execution Time: 5289.787 ms
> (13 rows)
>
>
> PATCH:
>
>   Limit  (cost=1559579.85..1559579.86 rows=1 width=32) (actual
> time=3333.572..3333.572 rows=1 loops=1)
>    ->  Finalize Aggregate  (cost=1559579.85..1559579.86 rows=1
> width=32) (actual time=3333.569..3333.569 rows=1 loops=1)
>          ->  Gather  (cost=1559579.52..1559579.83 rows=3 width=32)
> (actual time=3328.619..3365.227 rows=4 loops=1)
>                Workers Planned: 3
>                Workers Launched: 3
>                ->  Partial Aggregate  (cost=1558579.52..1558579.53
> rows=1 width=32) (actual time=3307.805..3307.805 rows=1 loops=4)
>                      ->  Parallel Bitmap Heap Scan on lineitem
> (cost=300405.84..1556701.72 rows=375560 width=12) (actual
> time=1585.726..30
> 97.628 rows=285808 loops=4)
>                            Recheck Cond: ((l_shipdate >=
> '1997-01-01'::date) AND (l_shipdate < '1998-01-01 00:00:00'::timestamp
> without tim
> e zone) AND (l_discount >= 0.02) AND (l_discount <= 0.04) AND
> (l_quantity < '24'::numeric))
>                            Heap Blocks: exact=184293
>                            ->  Parallel Bitmap Index Scan on
> idx_lineitem_shipdate  (cost=0.00..300311.95 rows=1164235 width=0)
> (actual tim
> e=1008.361..1008.361 rows=285808 loops=4)
>                                  Index Cond: ((l_shipdate >=
> '1997-01-01'::date) AND (l_shipdate < '1998-01-01 00:00:00'::timestamp
> without
>  time zone) AND (l_discount >= 0.02) AND (l_discount <= 0.04) AND
> (l_quantity < '24'::numeric))
>  Planning Time: 0.690 ms
>  Execution Time: 3365.420 ms
>
> Note:
> - Currently, I have only parallelized then bitmap index path when we
> have a bitmap index scan directly under bitmap heap.  But, if we have
> BitmapAnd or BitmapOr path then I did not parallelize the underlying
> bitmap index scan.  I think for BitmapAnd and BitmapOr we should use a
> completely different design, something similar to what we are doing in
> parallel append so I don't think BitmapAnd and BitmapOr we need to
> cover under this patch.
>
> - POC patch is attached to discuss the idea.  The patch still needs
> cleanup and testing.
>

There was one compilation warning so fixed in this version.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachment

pgsql-hackers by date:

Previous
From: Dilip Kumar
Date:
Subject: Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Next
From: Chris Travers
Date:
Subject: Re: Making CASE error handling less surprising