> On Thu, Oct 10, 2024 at 05:36:10PM GMT, Ashutosh Bapat wrote:
>
> 3. With just patch 0001 applied, planning time usually shows
> degradation (column Q and X in planning time sheets) with or without
> PWJ enabled. I first thought that it might be because of the increased
> size of PlannerInfo. We had seen a similar phenomenon when adding a
> new member to WindowAggState [2]. Hence I introduced patch 0002 which
> moves two fields around to not increase the size of structure. But
> that doesn't fix the regression in the planning time (columns R and
> Y). Apart from increasing the PlannerInfo size and may be object file
> size, 0002 does not have any other impact. But the regression seen
> with just that patch is more than what we saw in [2]. More investigate
> is required to decide whether this regression is real or not and if
> real, the root cause. Looking at the numbers, it seems that this
> regression is causing the planning time regression in rest of the
> patches. If we fix regression by 0001, we should not see much
> regression in rest of the patches. I am looking for some guidance in
> investigating this regression.
Hi,
I've tried to reproduce some subset of those results, in case if I would
be able to notice anything useful. Strangely enough, I wasn't able to
get much boost in planning time e.g. with 4 first patches, 100
partitions and 5 joins -- the results you've posted are showing about
16% in that case, where I'm getting only a couple of percents. Probably
I'm doing something differently, but it's turned out to be hard to
reconstruct (based only on this thread) how did you exactly benchmark
the patch -- could you maybe summarize the benchmark in a reproducible
way?
From what I understand you were testing againt an empty partitioned table. Here
is what I was doing:
create table t1p (c1 int) partition by list(c1);
select format('create table %I partition of t1p for values in (%s)',
't1p' || i, i) from generate_series(1, 100) i; \gexec
do $x$
declare
i record;
plan float[];
plan_line text;
begin
for i in select * from generate_series(1, 1000) i loop
for plan_line in execute format($y$
explain analyze
select * from t1p t1, t1p t2, t1p t3, t1p t4, t1p t5
where t2.c1 =
t1.c1 and t3.c1 = t2.c1 and t4.c1 = t3.c1 and t5.c1 = t4.c1
$y$) loop
if plan_line like '%Planning Time%' then
plan := array_append(plan, substring(plan_line from '\d+.\d+')::float);
end if;
end loop;
end loop;
-- skip the first record as prewarming
raise warning 'avg: %',
(select avg(v) from unnest(plan[2:]) v);
raise warning 'median: %',
(select percentile_cont(0.5) within group(order by v)
from unnest(plan[2:]) v);
end;
$x$;
As a side note, may I ask to attach benchmark results as files, rather than a
link to a google spreadsheet? It feels nice to have something static to work
with.