Re: [HACKERS] WIP: Aggregation push-down - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: [HACKERS] WIP: Aggregation push-down |
Date | |
Msg-id | 20200111011124.efacqhdsptc36pm2@development Whole thread Raw |
In response to | Re: [HACKERS] WIP: Aggregation push-down (Antonin Houska <ah@cybertec.at>) |
Responses |
Re: [HACKERS] WIP: Aggregation push-down
|
List | pgsql-hackers |
Hi, I've been looking at the last version (v14) of this patch series, submitted way back in July and unfortunately quiet since then. Antonin is probably right one of the reasons for the lack of reviews is that it requires interest/experience with planner. Initially it was also a bit hard to understand what are the benefits (and the patch shifted a bit), which is now mostly addressed by the README in the last patch. The trouble is that's hidden in the patch and so not immediately obvious to people considering reviewing it :-( Tom posted a nice summary in November 2018, but it was perhaps focused on the internals. So my first suggestion it to write a short summary with example how it affects practical queries (plan change, speedup) etc. My second suggestion is to have meaningful commit messages for each part of the patch series, explaining why we need that particular part. It might have been explained somewhere in the thread, but as a reviewer I really don't want to reverse-engineer the whole thread. Now, regarding individual parts of the patch: 1) v14-0001-Introduce-RelInfoList-structure.patch ------------------------------------------------- - I'm not entirely sure why we need this change. We had the list+hash before, so I assume we do this because we need the output functions? - The RelInfoList naming is pretty confusing, because it's actually not a list but a combination of list+hash. And the embedded list is called items, to make it yet a bit more confusing. I suggest we call this just RelInfo or RelInfoLookup or something else not including List. - RelInfoEntry seems severely under-documented, particularly the data part (which is void* making it even harder to understand what's it for without having to read the functions). AFAICS it's just simply a link to the embedded list, but maybe I'm wrong. - I suggest we move find_join_rel/add_rel_info/add_join_rel in relnode.c right before build_join_rel. This makes diff clearer/smaller and visual diffs nicer. 2) v14-0002-Introduce-make_join_rel_common-function.patch --------------------------------------------------------- I see no point in keeping this change in a separate patch, as it prety much just renames make_join_rel to make_join_rel_common and then adds make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2) { return make_join_rel_common(root, rel1, rel2); } which seems rather trivial and useless on it's own. I'd just merge it into 0003 where we use the _common function more extensively. 3) v14-0003-Aggregate-push-down-basic-functionality.patch --------------------------------------------------------- I haven't done much review on this yet, but I've done some testing and benchmarking so let me share at least that. The queries I used were of the type I mentioned earlier in this thread - essentially a star schema, i.e. fact table referencing dimensions, with aggregation per columns in the dimension. So something like SELECT d.c, sum(f) FROM fact JOIN dimension d ON (d.id = f.d_id) GROUP BY d.c; where "fact" table is much much larger than the dimension. Attached is a script I used for testing with a bunch of queries and a number of parameters (parallelism, size of dimension, size of fact, ...) and a spreadsheed summarizing the results. Overall, the results seem pretty good - in many cases the queries get about 2x faster and I haven't seen any regressions. That's pretty nice. One annoying thing is that this only works for non-parallel queries. That is, I saw no improvements with parallel queries. It was still faster than the non-parallel query with aggregate pushdown, but the parallel query did not improve. An example of timing looks like this: non-parallel (pushdown = off): 918 ms non-parallel (pushdown = on): 561 ms parallel (pushdown = off): 313 ms parallel (pushdown = on): 313 ms The non-parallel query gets faster because after enabling push-down the plan changes (and we get partial aggregate below the join). With parallel query that does not happen, the plans stay the same. I'm not claiming this is a bug - we end up picking the fastest execution plan (i.e. we don't make a mistake by e.g. switching to the non-parallel one with pushdown). My understanding is that the aggregate pushdown can't (currently?) be used with queries that use partial aggregate because of parallelism. That is we can either end up with a plan like this: Finalize Aggregate -> Join -> Partial Aggregate -> ... or a parallel plan like this: Finalize Aggregate -> Gather -> Partial Aggregate -> Join -> ... -> ... but we currently don't support a combination of both Finalize Aggregate -> Gather -> Join -> Partial Aggregate -> ... I wonder if that's actually even possible and what would it take to make it work? regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
pgsql-hackers by date:
Next
From: Michael PaquierDate:
Subject: Re: [Logical Replication] TRAP:FailedAssertion("rel->rd_rel->relreplident == REPLICA_IDENTITY_DEFAULT ||rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL ||rel->rd_rel->relreplident == REPLICA_IDENTITY_INDEX"