Re: [HACKERS] Parallel bitmap heap scan - Mailing list pgsql-hackers

| From | Dilip Kumar |
|---|---|
| Subject | Re: [HACKERS] Parallel bitmap heap scan |
| Date | |
| Msg-id | CAFiTN-su5SW6ekgrOYV7SLMNDuZzMT4wqP1Sp1=NLXpFAGnUQA@mail.gmail.com |
| In response to | Re: [HACKERS] Parallel bitmap heap scan (Robert Haas <robertmhaas@gmail.com>) |
| Responses | Re: [HACKERS] Parallel bitmap heap scan; Re: [HACKERS] Parallel bitmap heap scan |
| List | pgsql-hackers |
On Sun, Feb 26, 2017 at 9:26 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> I'm not entirely sure about the calling convention for
> tbm_free_shared_area() but the rest seems OK.
Changed
>
>> 2. Now tbm_free is not freeing any of the shared members which can be
>> accessed by other worker so tbm_free is safe to call from
>> ExecEndBitmapHeapScan without any safety check or ref count.
>
> That also seems fine. We ended up with something very similar in the
> Parallel Index Scan patch.
>
>> 0002:
>> 1. We don't need ExecShutdownBitmapHeapScan anymore because now we are
>> not freeing any shared member in ExecEndBitmapHeapScan.
>> 2. In ExecReScanBitmapHeapScan we will call tbm_free_shared_area to
>> free the shared members of the TBM.
>> 3. After that, we will free TBMSharedIteratorState what we allocated
>> using tbm_prepare_shared_iterate.
>
> Check. But I think tbm_free_shared_area() should also free the object
> itself, instead of making the caller do that separately.
Right, done that way.
>
> + if (DsaPointerIsValid(node->pstate->tbmiterator))
> + {
> + /* First we free the shared TBM members using the shared state */
> + tbm_free_shared_area(dsa, node->pstate->tbmiterator);
> + dsa_free(dsa, node->pstate->tbmiterator);
> + }
> +
> + if (DsaPointerIsValid(node->pstate->prefetch_iterator))
> + dsa_free(dsa, node->pstate->prefetch_iterator);
>
> The fact that these cases aren't symmetric suggests that your
> abstraction is leaky. I'm guessing that you can't call
> tbm_free_shared_area because the two iterators share one copy of the
> underlying iteration arrays, and the TBM code isn't smart enough to
> avoid freeing them twice. You're going to have to come up with a
> better solution to that problem; nodeBitmapHeapScan.c shouldn't know
> about the way the underlying storage details are managed. (Maybe you
> need to reference-count the iterator arrays?)
Converted the iterator arrays into a structure and added a reference
count. I had to do the same for the pagetable as well, because it is
also shared across iterators.
>
> + if (node->inited)
> + goto start_iterate;
>
> My first programming teacher told me not to use goto. I've
> occasionally violated that rule, but I need a better reason than you
> have here. It looks very easy to avoid.
Changed
>
> + pbms_set_parallel(outerPlanState(node));
>
> I think this should be a flag in the plan, and the planner should set
> it correctly, instead of having it be a flag in the executor that the
> executor sets. Also, the flag itself should really be called
> something that involves the word "shared" rather than "parallel",
> because the bitmap will not be created in parallel, but it will be
> shared.
Done
>
> Have you checked whether this patch causes any regression in the
> non-parallel case? It adds a bunch of "if" statements, so it might.
> Hopefully not, but it needs to be carefully tested.
I tested the new patch on my local machine with multiple executions,
and it doesn't show any regression. The attached file perf_result
contains the EXPLAIN ANALYZE output for one of the queries (head,
patch with 0 workers, and patch with 2 workers). I will run the test
with all the TPC-H queries that use a bitmap plan on a bigger machine
and post the results next week.
>
> @@ -48,10 +48,11 @@
> #include "utils/snapmgr.h"
> #include "utils/tqual.h"
>
> -
> static TupleTableSlot *BitmapHeapNext(BitmapHeapScanState *node);
>
> Unnecessary.
Fixed
>
> +static bool pbms_is_leader(ParallelBitmapState *pbminfo);
> +static void pbms_set_parallel(PlanState *node);
>
> I don't think this "pbms" terminology is very good. It's dissimilar
> to the way other functions in this file are named. Maybe
> BitmapShouldInitializeSharedState().
Changed
>
> I think that some of the bits that this function makes conditional on
> pstate should be factored out into inline functions. Like:
>
> - if (node->prefetch_target >= node->prefetch_maximum)
> - /* don't increase any further */ ;
> - else if (node->prefetch_target >= node->prefetch_maximum / 2)
> - node->prefetch_target = node->prefetch_maximum;
> - else if (node->prefetch_target > 0)
> - node->prefetch_target *= 2;
> - else
> - node->prefetch_target++;
> + if (!pstate)
> + {
> + if (node->prefetch_target >= node->prefetch_maximum)
> + /* don't increase any further */ ;
> + else if (node->prefetch_target >= node->prefetch_maximum / 2)
> + node->prefetch_target = node->prefetch_maximum;
> + else if (node->prefetch_target > 0)
> + node->prefetch_target *= 2;
> + else
> + node->prefetch_target++;
> + }
> + else if (pstate->prefetch_target < node->prefetch_maximum)
> + {
> + SpinLockAcquire(&pstate->prefetch_mutex);
> + if (pstate->prefetch_target >= node->prefetch_maximum)
> + /* don't increase any further */ ;
> + else if (pstate->prefetch_target >=
> + node->prefetch_maximum / 2)
> + pstate->prefetch_target = node->prefetch_maximum;
> + else if (pstate->prefetch_target > 0)
> + pstate->prefetch_target *= 2;
> + else
> + pstate->prefetch_target++;
> + SpinLockRelease(&pstate->prefetch_mutex);
> + }
>
> I suggest creating an inline function like BitmapAdjustPrefetch() for
> this logic, and letting the code call that. The function can look
> something like this: if (pstate == NULL) { non-parallel stuff; return;
> } parallel stuff follows...
>
> And similarly for the other cases where you've made the logic
> conditional. This will make it more clear what's happening
> post-patch, I think, and will also help keep the level of indentation
> from getting out-of-control in certain places. In fact, maybe you
> should submit a preliminary refactoring patch that moves these chunks
> of logic into functions and then the main patch can apply over top of
> that.
Done.
>
> + bool inited;
>
> Suggest: initialized
Done
>
> - * ----------------
> + * pscan_len size of the shared memory for parallel bitmap
> + * inited is node is ready to iterate
> + * stbmiterator shared iterator
> + * sprefetch_iterator shared iterator for prefetching
> + * pstate shared state for parallel bitmap scan
> + *----------------
>
> No need to change number of dashes.
Fixed
>
> + * PBMState information : Current status of the TIDBitmap creation during
> + * parallel bitmap heap scan.
>
> If you look for existing places where comments are formatted like
> this, I bet you won't find many. Copy the surrounding style more.
Done, following the surrounding structure. I also renamed it to
SharedBitmapState, which I think is a better name for its purpose.
>
> + dsa_pointer tbmiterator;
> + dsa_pointer prefetch_iterator;
> + slock_t prefetch_mutex;
> + int prefetch_pages;
> + int prefetch_target;
> + bool prefetching;
> + slock_t state_mutex;
> + PBMState state;
> + ConditionVariable cv;
> + char phs_snapshot_data[FLEXIBLE_ARRAY_MEMBER];
>
> I think it is probably not a good idea to have two separate mutexes
> here. They'll be in the same cache line, so it won't be much faster
> than having one mutex, and the state mutex won't be acquired very
> often so you won't really gain anything anyway. I think you can just
> merge the mutexes into one called 'mutex'.
Done.
>
> + /* Allow only one process to prefetch */
>
> If this is a good idea, there should be a comment explaining why.
Done
>
> + TBMSharedIterator *stbmiterator;
> + TBMSharedIterator *sprefetch_iterator;
>
> Maybe shared_iterator and shared_prefetch_iterator.
Done
0001 - same as previous, with some changes for freeing the shared memory.
0002 - nodeBitmapHeapScan refactoring; this applies independently.
0003 - the actual parallel bitmap work; applies on top of 0001 and 0002.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com