Re: [HACKERS] Parallel bitmap heap scan - Mailing list pgsql-hackers
From: Dilip Kumar
Subject: Re: [HACKERS] Parallel bitmap heap scan
Msg-id: CAFiTN-su5SW6ekgrOYV7SLMNDuZzMT4wqP1Sp1=NLXpFAGnUQA@mail.gmail.com
In response to: Re: [HACKERS] Parallel bitmap heap scan (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers
On Sun, Feb 26, 2017 at 9:26 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> I'm not entirely sure about the calling convention for
> tbm_free_shared_area() but the rest seems OK.

Changed.

>> 2. Now tbm_free is not freeing any of the shared members which can be
>> accessed by other workers, so tbm_free is safe to call from
>> ExecEndBitmapHeapScan without any safety check or refcount.
>
> That also seems fine.  We ended up with something very similar in the
> Parallel Index Scan patch.
>
>> 0002:
>> 1. We don't need ExecShutdownBitmapHeapScan anymore because now we are
>> not freeing any shared member in ExecEndBitmapHeapScan.
>> 2. In ExecReScanBitmapHeapScan we will call tbm_free_shared_area to
>> free the shared members of the TBM.
>> 3. After that, we will free the TBMSharedIteratorState that we
>> allocated using tbm_prepare_shared_iterate.
>
> Check.  But I think tbm_free_shared_area() should also free the object
> itself, instead of making the caller do that separately.

Right, done that way.

> +        if (DsaPointerIsValid(node->pstate->tbmiterator))
> +        {
> +            /* First we free the shared TBM members using the shared state */
> +            tbm_free_shared_area(dsa, node->pstate->tbmiterator);
> +            dsa_free(dsa, node->pstate->tbmiterator);
> +        }
> +
> +        if (DsaPointerIsValid(node->pstate->prefetch_iterator))
> +            dsa_free(dsa, node->pstate->prefetch_iterator);
>
> The fact that these cases aren't symmetric suggests that your
> abstraction is leaky.  I'm guessing that you can't call
> tbm_free_shared_area because the two iterators share one copy of the
> underlying iteration arrays, and the TBM code isn't smart enough to
> avoid freeing them twice.  You're going to have to come up with a
> better solution to that problem; nodeBitmapHeapScan.c shouldn't know
> about the way the underlying storage details are managed.  (Maybe you
> need to reference-count the iterator arrays?)

Converted the iterator arrays to a structure and maintained a refcount,
so they are freed only once.
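The refcounting idea can be sketched roughly as follows. The struct and function names here are illustrative stand-ins, not the actual TIDBitmap definitions: each shared iterator takes a reference on the common iteration arrays, and only the release that drops the count to zero reports that the underlying storage should actually be freed.

```c
#include <assert.h>

/* Hypothetical stand-in for the shared iteration arrays; in the real
 * patch these live in DSA memory and hold the page/chunk arrays that
 * both the main and prefetch iterators walk. */
typedef struct SharedIteratorArrays
{
    int refcount;               /* iterators still using these arrays */
} SharedIteratorArrays;

static void
shared_arrays_retain(SharedIteratorArrays *arrays)
{
    arrays->refcount++;
}

/* Drop one reference; return 1 if the caller held the last reference
 * and should free the underlying storage, 0 otherwise.  This is what
 * lets either iterator be released without risking a double-free. */
static int
shared_arrays_release(SharedIteratorArrays *arrays)
{
    assert(arrays->refcount > 0);
    return --arrays->refcount == 0;
}
```

With something like this, nodeBitmapHeapScan.c can release both iterators symmetrically and leave the decision of when to free the shared storage to the TBM code.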
I had to do the same thing for the pagetable as well, because that is
also shared across iterators.

> +    if (node->inited)
> +        goto start_iterate;
>
> My first programming teacher told me not to use goto.  I've
> occasionally violated that rule, but I need a better reason than you
> have here.  It looks very easy to avoid.

Changed.

> +        pbms_set_parallel(outerPlanState(node));
>
> I think this should be a flag in the plan, and the planner should set
> it correctly, instead of having it be a flag in the executor that the
> executor sets.  Also, the flag itself should really be called
> something that involves the word "shared" rather than "parallel",
> because the bitmap will not be created in parallel, but it will be
> shared.

Done.

> Have you checked whether this patch causes any regression in the
> non-parallel case?  It adds a bunch of "if" statements, so it might.
> Hopefully not, but it needs to be carefully tested.

With the new patch I tested on my local machine across multiple
executions, and it doesn't show any regression.  The attached file
perf_result contains the explain analyze output for one of the queries
(head, patch with 0 workers, and patch with 2 workers).  I will run the
test with all the TPC-H queries that use a bitmap plan on a bigger
machine and will post the results next week.

> @@ -48,10 +48,11 @@
>  #include "utils/snapmgr.h"
>  #include "utils/tqual.h"
>
> -
>  static TupleTableSlot *BitmapHeapNext(BitmapHeapScanState *node);
>
> Unnecessary.

Fixed.

> +static bool pbms_is_leader(ParallelBitmapState *pbminfo);
> +static void pbms_set_parallel(PlanState *node);
>
> I don't think this "pbms" terminology is very good.  It's dissimilar
> to the way other functions in this file are named.  Maybe
> BitmapShouldInitializeSharedState().

Changed.

> I think that some of the bits that this function makes conditional on
> pstate should be factored out into inline functions.
> Like:
>
> -        if (node->prefetch_target >= node->prefetch_maximum)
> -            /* don't increase any further */ ;
> -        else if (node->prefetch_target >= node->prefetch_maximum / 2)
> -            node->prefetch_target = node->prefetch_maximum;
> -        else if (node->prefetch_target > 0)
> -            node->prefetch_target *= 2;
> -        else
> -            node->prefetch_target++;
> +        if (!pstate)
> +        {
> +            if (node->prefetch_target >= node->prefetch_maximum)
> +                /* don't increase any further */ ;
> +            else if (node->prefetch_target >= node->prefetch_maximum / 2)
> +                node->prefetch_target = node->prefetch_maximum;
> +            else if (node->prefetch_target > 0)
> +                node->prefetch_target *= 2;
> +            else
> +                node->prefetch_target++;
> +        }
> +        else if (pstate->prefetch_target < node->prefetch_maximum)
> +        {
> +            SpinLockAcquire(&pstate->prefetch_mutex);
> +            if (pstate->prefetch_target >= node->prefetch_maximum)
> +                /* don't increase any further */ ;
> +            else if (pstate->prefetch_target >=
> +                     node->prefetch_maximum / 2)
> +                pstate->prefetch_target = node->prefetch_maximum;
> +            else if (pstate->prefetch_target > 0)
> +                pstate->prefetch_target *= 2;
> +            else
> +                pstate->prefetch_target++;
> +            SpinLockRelease(&pstate->prefetch_mutex);
> +        }
>
> I suggest creating an inline function like BitmapAdjustPrefetch() for
> this logic, and letting the code call that.  The function can look
> something like this:
>
> if (pstate == NULL)
> {
>     non-parallel stuff;
>     return;
> }
> parallel stuff follows...
>
> And similarly for the other cases where you've made the logic
> conditional.  This will make it more clear what's happening
> post-patch, I think, and will also help keep the level of indentation
> from getting out-of-control in certain places.  In fact, maybe you
> should submit a preliminary refactoring patch that moves these chunks
> of logic into functions and then the main patch can apply over top of
> that.

Done.
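The suggested refactoring might look like the sketch below. The control flow mirrors the quoted hunk and the early-return shape Robert describes; the helper name follows his BitmapAdjustPrefetch() suggestion, and the type and spinlock definitions are minimal stand-ins so the sketch is self-contained, not the actual executor declarations.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins so the sketch compiles outside PostgreSQL; in the real tree
 * these come from storage/spin.h and the executor headers. */
typedef int slock_t;
#define SpinLockAcquire(lock) ((void) (lock))
#define SpinLockRelease(lock) ((void) (lock))

typedef struct ParallelBitmapState
{
    slock_t prefetch_mutex;
    int     prefetch_target;    /* shared prefetch window */
} ParallelBitmapState;

typedef struct BitmapHeapScanState
{
    ParallelBitmapState *pstate;    /* NULL when not parallel-aware */
    int prefetch_target;
    int prefetch_maximum;
} BitmapHeapScanState;

static inline void
BitmapAdjustPrefetch(BitmapHeapScanState *node)
{
    ParallelBitmapState *pstate = node->pstate;

    if (pstate == NULL)
    {
        /* non-parallel case first, then an early return */
        if (node->prefetch_target >= node->prefetch_maximum)
            /* don't increase any further */ ;
        else if (node->prefetch_target >= node->prefetch_maximum / 2)
            node->prefetch_target = node->prefetch_maximum;
        else if (node->prefetch_target > 0)
            node->prefetch_target *= 2;
        else
            node->prefetch_target++;
        return;
    }

    /* parallel case: adjust the shared target under the spinlock */
    if (pstate->prefetch_target < node->prefetch_maximum)
    {
        SpinLockAcquire(&pstate->prefetch_mutex);
        if (pstate->prefetch_target >= node->prefetch_maximum)
            /* don't increase any further */ ;
        else if (pstate->prefetch_target >= node->prefetch_maximum / 2)
            pstate->prefetch_target = node->prefetch_maximum;
        else if (pstate->prefetch_target > 0)
            pstate->prefetch_target *= 2;
        else
            pstate->prefetch_target++;
        SpinLockRelease(&pstate->prefetch_mutex);
    }
}
```

The callers in BitmapHeapNext then shrink to a single call, and the parallel branch never pushes the surrounding code another indentation level deeper.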
>
> + bool inited;
>
> Suggest: initialized

Done.

> - * ----------------
> + * pscan_len            size of the shared memory for parallel bitmap
> + * inited               is node is ready to iterate
> + * stbmiterator         shared iterator
> + * sprefetch_iterator   shared iterator for prefetching
> + * pstate               shared state for parallel bitmap scan
> + *----------------
>
> No need to change number of dashes.

Fixed.

> + * PBMState information : Current status of the TIDBitmap creation during
> + * parallel bitmap heap scan.
>
> If you look for existing places where comments are formatted like
> this, I bet you won't find many.  Copy the surrounding style more.

Done as in the surrounding structures; I also changed the name to
SharedBitmapState, which I think is a better name for its purpose.

> +    dsa_pointer tbmiterator;
> +    dsa_pointer prefetch_iterator;
> +    slock_t     prefetch_mutex;
> +    int         prefetch_pages;
> +    int         prefetch_target;
> +    bool        prefetching;
> +    slock_t     state_mutex;
> +    PBMState    state;
> +    ConditionVariable cv;
> +    char        phs_snapshot_data[FLEXIBLE_ARRAY_MEMBER];
>
> I think it is probably not a good idea to have two separate mutexes
> here.  They'll be in the same cache line, so it won't be much faster
> than having one mutex, and the state mutex won't be acquired very
> often so you won't really gain anything anyway.  I think you can just
> merge the mutexes into one called 'mutex'.

Done.

> + /* Allow only one process to prefetch */
>
> If this is a good idea, there should be a comment explaining why.

Done.

> +    TBMSharedIterator *stbmiterator;
> +    TBMSharedIterator *sprefetch_iterator;
>
> Maybe shared_iterator and shared_prefetch_iterator.

Done.

0001- same as the previous version, with some changes for freeing the
shared-memory state.
0002- nodeBitmapHeapScan refactoring; this applies independently.
0003- the actual parallel bitmap work; applies on top of 0001 and 0002.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers