On Tue, 3 Oct 2023 at 09:11, David Rowley <dgrowleyml@gmail.com> wrote:
> I'm concerned that this patch will be too much overhead when creating
> paths when a PathKey's EquivalenceClass has a large number of members
> from partitioned tables.
I just tried out the patch to see how much it affects the performance
of the planner. I think we need to find a better way to strip off the
pathkeys for the columns that have been aggregated.
Setup:
create table lp (a int, b int) partition by list (a);
select 'create table lp'||x||' partition of lp for values in('||x||')'
from generate_series(0,999)x;
\gexec
\pset pager off
set enable_partitionwise_aggregate=1;
Benchmark query:
explain (summary on) select a,count(*) from lp group by a;
Master:
Planning Time: 23.945 ms
Planning Time: 23.887 ms
Planning Time: 23.927 ms
perf top:
7.39% libc.so.6 [.] __memmove_avx_unaligned_erms
6.98% [kernel] [k] clear_page_rep
5.69% postgres [.] bms_is_subset
5.07% postgres [.] fetch_upper_rel
4.41% postgres [.] bms_equal
Patched:
Planning Time: 41.410 ms
Planning Time: 41.474 ms
Planning Time: 41.488 ms
perf top:
19.02% postgres [.] bms_is_subset
6.91% postgres [.] find_ec_member_matching_expr
5.93% libc.so.6 [.] __memmove_avx_unaligned_erms
5.55% [kernel] [k] clear_page_rep
4.07% postgres [.] fetch_upper_rel
3.46% postgres [.] bms_equal
> I wondered if we should instead just check
> if the subpath's pathkeys match root->group_pathkeys and if they do
> set the AggPath's pathkeys to list_copy_head(subpath->pathkeys,
> root->num_groupby_pathkeys), that'll be much cheaper, but it just
> feels a bit too much like a special case.
I tried this approach (patch attached) and it does perform better than
the other patch:
create_agg_path_fix2.patch:
Planning Time: 24.357 ms
Planning Time: 24.293 ms
Planning Time: 24.259 ms
7.45% libc.so.6 [.] __memmove_avx_unaligned_erms
6.90% [kernel] [k] clear_page_rep
5.56% postgres [.] bms_is_subset
5.38% postgres [.] bms_equal
I wonder if the attached patch is too much of a special case fix. I
guess from the lack of complaints previously that there are no other
cases where we could possibly have pathkeys that belong to columns
that are aggregated. I've not gone to much effort to see if I can
craft a case that hits this without the ORDER BY/DISTINCT aggregate
optimisation, however.
David