On Sun, Jul 26, 2020 at 6:43 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> I would like to propose a patch for enabling the parallelism for the
> bitmap index scan path.
>
> Background:
> Currently, we support only a parallel bitmap heap scan path. Therein,
> the underlying bitmap index scan is done by a single worker called the
> leader. The leader creates a bitmap in shared memory and once the
> bitmap is ready it creates a shared iterator and after that, all the
> workers process the shared iterator and scan the heap in parallel.
> While analyzing the TPCH plan we have observed that some of the
> queries are spending significant time in preparing the bitmap. So the
> idea of this patch is to use the parallel index scan for preparing the
> underlying bitmap in parallel.
>
> Design:
> If underlying index AM supports the parallel path (currently only
> BTREE support it), then we will create a parallel bitmap heap scan
> path on top of the parallel bitmap index scan path. So the idea of
> this patch is that each worker will do the parallel index scan and
> generate their part of the bitmap. And, we will create a barrier so
> that we can not start preparing the shared iterator until all the
> worker is ready with their bitmap. The first worker, which is ready
> with the bitmap will keep a copy of its TBM and the page table in the
> shared memory. And, all the subsequent workers will merge their TBM
> with the shared TBM. Once all the TBM are merged we will get one
> common shared TBM and after that stage, the worker can continue. The
> remaining part is the same, basically, again one worker will scan the
> shared TBM and prepare the shared iterator and once it is ready all
> the workers will jointly scan the heap in parallel using shared
> iterator.
>
Though I have not looked at the patch or code for the existing
parallel bitmap heap scan, one point keeps bugging in my mind. I may
be utterly wrong or my question may be so silly, anyways I would like
to ask here:
From the above design: each parallel worker creates partial bitmaps
for the index data that they looked at. Why should they merge these
bitmaps to a single bitmap in shared memory? Why can't each parallel
worker do a bitmap heap scan using the partial bitmaps they built
during it's bitmap index scan and emit qualified tuples/rows so that
the gather node can collect them? There may not be even lock
contention as bitmap heap scan takes read locks for the heap
pages/tuples.
With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com