Home > mailing lists

Re: Propagate stadistinct through GROUP BY/DISTINCT in subqueries and CTEs - Mailing list pgsql-hackers

From	wenhui qiu
Subject	Re: Propagate stadistinct through GROUP BY/DISTINCT in subqueries and CTEs
Date	April 13 06:27:39
Msg-id	CAGjGUAK7cmV3u6YE3OBQCuhDh6hrhwY=meaEch0VDT3Kn32WDw@mail.gmail.com Whole thread
In response to	Propagate stadistinct through GROUP BY/DISTINCT in subqueries and CTEs (Richard Guo <guofenglinux@gmail.com>)
Responses	Re: Propagate stadistinct through GROUP BY/DISTINCT in subqueries and CTEs
List	pgsql-hackers

Tree view

HI Richard

> + /* Convert negative stadistinct to absolute count */
> + if (stats->stadistinct < 0)
> + {
> + RelOptInfo *baserel = find_base_rel(subroot, var->varno);
> +
> + if (baserel->tuples > 0)
> + {
> + stats->stadistinct = (float4)
> + clamp_row_est(-stats->stadistinct * baserel->tuples);
> + }
> + }

Thanks so much for working on this! While looking at the negative stadistinct conversion, I was wondering if we might run into a potential edge case with multi-level nested subqueries. What do you think?

/* Convert negative stadistinct to absolute count */

if (stats->stadistinct < 0)
{
- RelOptInfo *baserel = find_base_rel(subroot, var->varno);
+ RelOptInfo *baserel = vardata->rel;

- if (baserel->tuples > 0)
+ if (baserel && baserel->tuples > 0)
{
stats->stadistinct = (float4)
clamp_row_est(-stats->stadistinct * baserel->tuples);
}
}

Thanks

On Mon, Apr 13, 2026 at 9:19 AM Richard Guo <guofenglinux@gmail.com> wrote:

I complained in [1] that some TPC-DS queries suffer from very poor
cardinality estimates on CTE scan filters, to the point that simply
disabling nestloop makes some queries run hundreds of times faster.
Here's a simple reproduction:

create table t (a int, b int, c int);
insert into t select i%2, i, i from generate_series(1,1000) i;
analyze t;

explain analyze
with cte as (select a, b, avg(c) as avg from t group by a, b)
select * from cte t1, cte t2
where t1.a = 1 and t2.a = 1 and t1.avg = t2.avg;

Column 'a' has only 2 distinct values, so the filter a=1 on the
1000-row CTE output should estimate ~500 rows (assuming these values
are equally common). Instead, the CTE scan estimates 5 rows (1000 *
1/200) because examine_simple_variable returns early when the subquery
has GROUP BY, and selectivity estimation falls back on
1/DEFAULT_NUM_DISTINCT.

-> CTE Scan on cte t1 (cost=0.00..22.50 rows=5 width=40)
(actual time=4.874..5.053 rows=500.00 loops=1)

As a result, this query ends up with a Nested Loop plan, and the
Execution Time is 192.907 ms.

For DISTINCT or GROUP BY key columns that are simple Vars, I think we
can propagate stadistinct from the base table, because the set of
distinct values is preserved after grouping. MCV frequencies,
histograms, and correlation data are not valid since GROUP BY and
DISTINCT change the frequency distribution, but with stadistinct
alone, callers like var_eq_const() can use a 1/ndistinct estimate
rather than 1/DEFAULT_NUM_DISTINCT.

Attached is a patch to do this. With the patch, the example above
estimates 500 rows ...

-> CTE Scan on cte t1 (cost=0.00..22.50 rows=500 width=40)
(actual time=3.785..4.143 rows=500.00 loops=1)

... and chooses a Hash Join, with an Execution Time of 8.238 ms (~20x
faster).

I tested this patch on TPC-DS query 31:

-- on master:
Planning Time: 5.207 ms
Execution Time: 1536140.258 ms

-- on patched:
Planning Time: 5.140 ms
Execution Time: 1149.482 ms

Over 1300x faster.

Does this approach make sense? Any thoughts?

[1] https://postgr.es/m/CAMbWs4-QU_nkFqFZLdzWRsEsVE8aLWx4qBBVq7g4rXw+cvYDMg@mail.gmail.com

- Richard

pgsql-hackers by date:

From: vignesh C
Date: 13 April, 06:21:06
Subject: Re: EXCEPT TABLE - Case inconsistency for describe \d and \dRp+

From: Michael Paquier
Date: 13 April, 06:52:59
Subject: Re: test_compression, test module for low-level compression APIs (for 2b5ba2a0a141)

Re: Propagate stadistinct through GROUP BY/DISTINCT in subqueries and CTEs - Mailing list pgsql-hackers

Previous

Next