Thread: Query results vary depending on the plan cache mode used
While working on the grouping sets patches for queries with GROUP BY items that are constants, I noticed $subject on master. As an example, consider prepare q1(int) as select $1 as c1, $1 as c2 from generate_series(1,2) t group by rollup(c1); set plan_cache_mode to force_custom_plan; execute q1(3); c1 | c2 ----+---- 3 | 3 | 3 (2 rows) set plan_cache_mode to force_generic_plan; execute q1(3); c1 | c2 ----+---- 3 | 3 | (2 rows) The reason can be seen in the plans under different modes. -- force_custom_plan explain (verbose, costs off) execute q1(3); QUERY PLAN ----------------------------------------------------- GroupAggregate Output: (3), 3 Group Key: 3 Group Key: () -> Function Scan on pg_catalog.generate_series t Output: 3 Function Call: generate_series(1, 2) (7 rows) -- force_generic_plan explain (verbose, costs off) execute q1(3); QUERY PLAN ----------------------------------------------------- GroupAggregate Output: ($1), ($1) Group Key: $1 Group Key: () -> Function Scan on pg_catalog.generate_series t Output: $1 Function Call: generate_series(1, 2) (7 rows) In custom mode, the target entry 'c2' is a Const expression, and setrefs.c does not replace it with an OUTER_VAR, despite there happens to be an identical Const below. As a result, when this OUTER_VAR goes to NULL due to the grouping sets, 'c2' remains as constant 3. Look at this code in search_indexed_tlist_for_non_var: /* * If it's a simple Const, replacing it with a Var is silly, even if there * happens to be an identical Const below; a Var is more expensive to * execute than a Const. What's more, replacing it could confuse some * places in the executor that expect to see simple Consts for, eg, * dropped columns. */ if (IsA(node, Const)) return NULL; In generic mode, the target entry 'c2' is a Param expression, and is replaced with the OUTER_VAR (indicated by the parentheses around the second '$1'). So it goes to NULL when we're grouping by the set that does not contain this Var. Is this inconsistent behavior in different plan cache modes expected, or does it indicate a bug that needs to be fixed? Thanks Richard
Richard Guo <guofenglinux@gmail.com> writes: > While working on the grouping sets patches for queries with GROUP BY > items that are constants, I noticed $subject on master. As an > example, consider This particular example seems like it's just an illustration of our known bugs with grouping sets using non-Var columns. I see no reason to blame the plan cache for it. Do you have any examples that don't depend on such a bug? (And yes, I apologize for not yet having reviewed your patch to fix the grouping-sets bug.) regards, tom lane
On Thu, Aug 1, 2024 at 10:34 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Richard Guo <guofenglinux@gmail.com> writes: > > While working on the grouping sets patches for queries with GROUP BY > > items that are constants, I noticed $subject on master. As an > > example, consider > > This particular example seems like it's just an illustration of > our known bugs with grouping sets using non-Var columns. I see > no reason to blame the plan cache for it. Do you have any > examples that don't depend on such a bug? Yeah, it's not the fault of the plan cache. I noticed this because in check_ungrouped_columns, both Const and Param are treated as always acceptable. However, in setrefs.c these two expression types are handled differently: Const is never matched to the lower tlist, while Param is always attempted to be matched to the lower tlist. This discrepancy causes the query in the example to behave differently depending on the plan cache mode. I'm unsure which behavior is the expected one. Another related question I have is about the behavior of the following query: select 3 as c1, 3 as c2 from generate_series(1,2) t group by rollup(c1) having 3 = 3; Should the target entry 'c2', as well as the Const expressions in havingQual, be replaced with references to the grouping key? Thanks Richard
Richard Guo <guofenglinux@gmail.com> writes: > Yeah, it's not the fault of the plan cache. I noticed this because in > check_ungrouped_columns, both Const and Param are treated as always > acceptable. However, in setrefs.c these two expression types are > handled differently: Const is never matched to the lower tlist, while > Param is always attempted to be matched to the lower tlist. This > discrepancy causes the query in the example to behave differently > depending on the plan cache mode. > I'm unsure which behavior is the expected one. I'm unsure why you think they need to have the same behavior. This code is correct given the assumption that a Const is as constant as it seems on the surface. The trouble we're hitting is that in the presence of grouping-set keys that reduce to Consts, a reference to such a key is *not* constant. The right answer to that IMO is to not represent those references as plain Const nodes. > Another related question I have is about the behavior of the following > query: > select 3 as c1, 3 as c2 from generate_series(1,2) t group by > rollup(c1) having 3 = 3; > Should the target entry 'c2', as well as the Const expressions in > havingQual, be replaced with references to the grouping key? This seems rather analogous to our occasional debates about whether textually distinct uses of "random()" should refer to the same value or different evaluations. I can't get very excited about it, because it seems to me this is an academic example not corresponding to any real use-case. The current code seems to effectively assume that the constant-instances are distinct even though textually identical, and that's fine with me. regards, tom lane