Re: Query is over 2x slower with jit=on - Mailing list pgsql-hackers
From | Amit Khandekar |
---|---|
Subject | Re: Query is over 2x slower with jit=on |
Date | |
Msg-id | CAJ3gD9cWBKnzXkhu30njaY1au3NO0XD=r4N3GcC_nwFbw5GYYw@mail.gmail.com Whole thread Raw |
In response to | Re: Query is over 2x slower with jit=on (Amit Khandekar <amitdkhan.pg@gmail.com>) |
Responses |
Re: Query is over 2x slower with jit=on
Re: Query is over 2x slower with jit=on |
List | pgsql-hackers |
On 11 September 2018 at 14:50, Amit Khandekar <amitdkhan.pg@gmail.com> wrote: > On 10 September 2018 at 21:39, Andres Freund <andres@anarazel.de> wrote: >> Hi, >> >> On 2018-09-10 15:42:55 +0530, Amit Khandekar wrote: >>> Attached is a patch that accumulates the per-worker jit counters into >>> the leader process. >> >> Thanks! >> >> >>> I think we better show per-worker jit info also. The current patch >>> does not show that. I think it would be easy to continue on the patch >>> to show per-worker info also. Under the Gather node, we can show >>> per-worker jit counters. I think this would be useful too, besides the >>> cumulative figures in the leader process. Comments ? >> >> Yes, I think that'd be good. > Ok. Will continue on the patch. > >> I think we either should print the stats at >> the top level as $leader_value, $combined_worker_value, $total_value or >> just have the $combined_worker_value at the level where we print other >> stats from the worker, too. > > Yes, I think we can follow and be consistent with whatever way in > which the other worker stats are printed. Will check. > > Note: Since there can be a multiple separate Gather plans under a plan > tree, I think we can show this info for each Gather plan. The attached patch shows per-worker information, besides the cumulative figure in the end of plan. Attached is the complete output of an aggregate parallel query on tenk1 table (created using the regression tests). You can see that the per-worker figures are under each of the Gather plans. We can show combined values of all the workers under a Gather plan as one single value, but I guess if we just show per-worker values, we don't have to additionally show combined value. Comments ? Also, I have kept the the per-worker info only for verbose=true. The Gather plan's leader value is not shown. We also don't show leader-specific values when we show the usual per-worker instrumentation values. So I think we are consistent. Actually it is a bit tedious to collect only leader-specific values. And I don't think it is worth trying for it. For worker, I am not collecting per-plan-node info, because we want to show total worker figures, and not per-plan-node figures. For the normal instrumentations, we want to show per-node info. For leader-specific instrumentation, we would need to store per-(Gather-)plan info. Need to finalize the indentation and the other explain formats. One option is to keep the per-worker JIT info in a single line, like how we print the normal per-worker instrumentation : Worker 1: actual time=20.971..35.975 rows=39991 loops=1 Below is a snippet : Aggregate (cost=18656.00..18656.01 rows=1 width=8) (actual time=593.917..593.917 rows=1 loops=1) Output: count(*) -> Hash Join (cost=10718.00..18456.00 rows=80000 width=0) (actual time=496.650..593.883 rows=16 loops=1) Inner Unique: true Hash Cond: ((a.unique1 = b.unique1) AND (a.two = (row_number() OVER (?)))) -> Gather (cost=0.00..5918.00 rows=160000 width=8) (actual time=192.319..223.384 rows=160000 loops=1) Output: a.unique1, a.two Workers Planned: 4 Workers Launched: 4 Jit for Worker : 0 Functions: 2 Generation Time: 0.273 ms Inlining: true Inlining Time: 43.686 ms Optimization: true Optimization Time: 10.788 ms Emission Time: 8.438 ms Jit for Worker : 1 Functions: 2 Generation Time: 0.293 ms Inlining: true Inlining Time: 72.587 ms Optimization: true Optimization Time: 10.386 ms Emission Time: 8.115 ms .......... .......... .......... Planning Time: 0.548 ms Jit: Functions: 40 Generation Time: 3.892 ms Inlining: true Inlining Time: 409.397 ms Optimization: true Optimization Time: 174.708 ms Emission Time: 91.785 ms Execution Time: 610.262 ms (98 rows) > >> >> >>> /* >>> + * Add up the workers' JIT instrumentation from dynamic shared memory. >>> + */ >>> +static void >>> +ExecParallelRetrieveJitInstrumentation(PlanState *planstate, >>> + SharedJitInstrumentation *shared_jit) >>> +{ >>> + int n; >>> + JitContext *jit = planstate->state->es_jit; >>> + >>> + /* If the leader hasn't yet created a jit context, allocate one now. */ >>> + if (!jit) >>> + { >>> + planstate->state->es_jit = jit = >>> + jit_create_context(planstate->state->es_jit_flags); >>> + } >> >> Wouldn't it be better to move the jit instrumentation to outside of the >> context, to avoid having to do this? Or just cope with not having >> instrumentation for the leader in this case? We'd kinda need to deal >> with failure to create one anyway? > > Yeah, I think taking out the instrumentation out of the context looks > better. Will work on that. Not yet done this. Will do now. -- Thanks, -Amit Khandekar EnterpriseDB Corporation The Postgres Database Company
Attachment
pgsql-hackers by date: