Re: Eager aggregation, take 3 - Mailing list pgsql-hackers
From | Richard Guo |
---|---|
Subject | Re: Eager aggregation, take 3 |
Date | |
Msg-id | CAMbWs4-Xru_eKBeRHFduigSGihdixFWVTR8A+dtMw7Mao+RkJA@mail.gmail.com Whole thread Raw |
In response to | Re: Eager aggregation, take 3 (Robert Haas <robertmhaas@gmail.com>) |
List | pgsql-hackers |
On Wed, Oct 30, 2024 at 5:06 AM Robert Haas <robertmhaas@gmail.com> wrote: > On Tue, Sep 24, 2024 at 11:20 PM Richard Guo <guofenglinux@gmail.com> wrote: > > The reason is that it is very tricky to set the size estimates for a > > grouped join relation. For a non-grouped join relation, we know that > > all its paths have the same rowcount estimate (well, in theory). But > > this is not true for a grouped join relation. Suppose we have a > > grouped join relation for t1/t2 join. There might be two paths for > > it: > > What exactly do you mean by "well, in theory" here? My understanding > of how things work today is that every relation is supposed to produce > a specific set of rows and every unparameterized path must produce > that set of rows. The order of the rows may vary but the set of rows > may not. With your proposed design here, that's no longer true. > Instead, what's promised is that the row sets will become equivalent > after a later FinalizeAggregate step. In a sense, this is like > parameterization or partial paths. Yeah, you're correct that currently each relation is expected to produce the same specific set of rows. When I say "well, in theory" I mean that for a join relation, all its unparameterized paths should theoretically have the same row count estimate. However, in practice, because there are more than one way to make a joinrel for more than two base relations, and the selectivity estimation routines don’t handle all cases equally well, we might get different row count estimates depending on the pair provided. And yes, with the grouped relations proposed in this patch, the situation changes. For a grouped join relation, different paths can produce very different row sets, depending on where the partial aggregation is placed in the path tree. This is also why we need to recalculate the row count estimate for a grouped join path using its outer and inner paths provided, rather than using path->parent->rows directly. This is very like the parameterized path case. > I think what you're doing here is roughly equivalent to either of > these two cases. It's more like the parameterized path case. Instead > of having a path for a relation which is parameterized by some input > parameter, you have a path for a relation, say bar, which is partially > aggregated by some grouping column. But there's no guarantee of how > much partial aggregation has been done. In your example, PartialAgg(t1 > JOIN t2) is "more aggregated" than t1 JOIN PartialAgg(t2), so the row > counts are different. This makes me quite nervous. You can't compare a > parameterized path to an unparameterized path, but you can compare it > to another parameterized path if the parameterizations are the same. > You can't compare a partial path to a non-partial path, but you can > compare partial paths to each other. But with this work, > unparameterized, non-partial paths in the same RelOptInfo don't seem > like they are truly comparable. Maybe that's OK, but I'm not sure that > it isn't going to break other things. Perhaps we could introduce a GroupPathInfo into the Path structure to store common information for a grouped path, such as the location of the partial aggregation (which could be the relids of the relation on top of which we place the partial aggregation) and the estimated rowcount for this grouped path, similar to how ParamPathInfo functions for parameterized paths. Then we should be able to compare the grouped paths of the same location apples to apples. I haven’t thought this through in detail yet, though. > It's tempting to wonder if there's some way that we can avoid > generating paths for both PartialAgg(t1 JOIN t2) and t1 JOIN > PartialAgg(t2). Either the former has lower cardinality, or the latter > does. It seems likely that the lower-cardinality set is the winning > strategy. Even if the path has higher cost to generate, we save work > at every subsequent join level and at the final aggregation step. Are > there counterexamples where it's better to use a path from the > higher-cardinality set? This is very appealing if we can retain only the lower-cardinality path, as it would greatly simplify the overall design. I haven't looked for counterexamples yet, but I plan to do so later. > By the way, the work of figuring out what target list should be > produced by partial grouping is done by init_grouping_targets(), but > the comments seem to take it for granted that I know what result we're > trying to produce, and I don't. I think some more high-level > explanation of the goals of this code would be useful. It seems to me > that if I'm looking at a path for an ungrouped relation and it > produces a certain target list, then every column of that target list > is needed somewhere. If those columns are group keys, cool: we pass > those through. If they're inputs to the aggregates, cool: we feed them > to the aggregates. But if they are neither, then what? In the patch, > you either group on those columns or add them to the > possibly_dependent list depending on the result of > is_var_needed_by_join(). I can believe that there are some cases where > we can group on such columns and others where we can't, but find it > difficult to believe that this test reliably distinguishes between > those two cases. If it does, I don't understand why it does. Don't I > need to know something about how those columns are used in the upper > joins? Like, if those columns are connected by a chain of > binary-equality operators back to the user's choice of grouping > columns, that sounds good, but this test doesn't distinguish between > that case and an upper join on the < operator. > create_grouping_expr_infos() does reason based on whether there's an > equal-image operator available, but AIUI that's only reasoning about > the group columns the user mentioned, not the sort of implicit > grouping columns that init_grouping_targets() is creating. Yeah, this patch does not get it correct here. Basically the logic is that for the partial aggregation pushed down to a non-aggregated relation, we need to consider all columns of that relation involved in upper join clauses and include them in the grouping keys. Currently, the patch only checks whether a column is involved in upper join clauses but does not verify how the column is used. We need to ensure that the operator used in the join clause is at least compatible with the grouping operator; otherwise, the grouping operator might interpret the values as the same while the join operator sees them as different. Thanks Richard
pgsql-hackers by date: