I complained in [1] that some TPC-DS queries suffer from very poor cardinality estimates on CTE scan filters, to the point that simply disabling nestloop makes some queries run hundreds of times faster. Here's a simple reproduction:
create table t (a int, b int, c int);
insert into t select i%2, i, i from generate_series(1,1000) i;
analyze t;

explain analyze
with cte as (select a, b, avg(c) as avg from t group by a, b)
select * from cte t1, cte t2
where t1.a = 1 and t2.a = 1 and t1.avg = t2.avg;
Column 'a' has only 2 distinct values, so the filter t1.a = 1 on the 1000-row CTE output should estimate ~500 rows (assuming the values are equally common). Instead, the CTE scan estimates 5 rows (1000 * 1/200), because examine_simple_variable returns early when the subquery has GROUP BY, and selectivity estimation falls back on 1/DEFAULT_NUM_DISTINCT (200).
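The arithmetic behind the two estimates, as a small sketch (constants mirror the example above; DEFAULT_NUM_DISTINCT is 200 in PostgreSQL):

```python
CTE_ROWS = 1000
DEFAULT_NUM_DISTINCT = 200  # PostgreSQL's fallback when ndistinct is unknown

# Without stats on the CTE output, "a = 1" falls back on 1/DEFAULT_NUM_DISTINCT.
fallback_estimate = CTE_ROWS * (1.0 / DEFAULT_NUM_DISTINCT)

# If stadistinct were propagated from column a (2 distinct values),
# the same filter could use 1/ndistinct instead.
ndistinct_a = 2
propagated_estimate = CTE_ROWS * (1.0 / ndistinct_a)

print(fallback_estimate, propagated_estimate)  # 5.0 500.0
```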
As a result, this query ends up with a Nested Loop plan, and the Execution Time is 192.907 ms.
For DISTINCT or GROUP BY key columns that are simple Vars, I think we can propagate stadistinct from the base table, because the set of distinct values is preserved after grouping. MCV frequencies, histograms, and correlation data are not valid since GROUP BY and DISTINCT change the frequency distribution, but with stadistinct alone, callers like var_eq_const() can use a 1/ndistinct estimate rather than 1/DEFAULT_NUM_DISTINCT.
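A sketch of how a caller could use the propagated stadistinct (hypothetical helper names; the negative-stadistinct convention is PostgreSQL's, where a negative value encodes ndistinct as a fraction of the row count and must be resolved against ntuples first):

```python
def resolve_ndistinct(stadistinct: float, ntuples: float) -> float:
    """Resolve stadistinct to an absolute distinct count.

    PostgreSQL stores stadistinct either as a positive absolute count
    or as a negative fraction of the row count (e.g. -1.0 means every
    row is distinct).
    """
    if stadistinct >= 0:
        return stadistinct
    return -stadistinct * ntuples

def var_eq_const_selectivity(stadistinct: float, ntuples: float,
                             default_num_distinct: float = 200.0) -> float:
    """1/ndistinct estimate for "var = const"; falls back on
    1/DEFAULT_NUM_DISTINCT when no statistics exist (stadistinct == 0)."""
    nd = resolve_ndistinct(stadistinct, ntuples)
    if nd <= 0:
        nd = default_num_distinct
    return 1.0 / nd
```

With stadistinct = 2 propagated from the base table, var_eq_const_selectivity(2, 1000) gives 0.5 (500 of 1000 rows); without stats, var_eq_const_selectivity(0, 1000) gives 0.005 (the 5-row estimate above).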
Attached is a patch to do this. With the patch, the example above estimates 500 rows ...