Home > mailing lists

Optimize cardinality estimation when unique keys are fully covered - Mailing list pgsql-hackers

From	feichanghong
Subject	Optimize cardinality estimation when unique keys are fully covered
Date	November 21, 2025 19:54:10
Msg-id	tencent_3018762E7D4C9BC470C821C829C1BF2F650A@qq.com Whole thread Raw
Responses	Re: Optimize cardinality estimation when unique keys are fully covered Re: Optimize cardinality estimation when unique keys are fully covered
List	pgsql-hackers

Tree view

Hi,

Recently, I ran into a cardinality estimation problem where the query

predicates fully cover a primary key or unique index. The issue can be

reproduced with the following case:

```sql

create table t1(a int, b int, primary key(a, b));

DO $$

DECLARE

i integer;

BEGIN

FOR i IN 1..100000 LOOP

INSERT INTO t1 VALUES (i, 0);

END LOOP;

END;

$$ LANGUAGE plpgsql;

DO $$

DECLARE

i integer;

BEGIN

FOR j IN 1..10 LOOP

FOR i IN 1..100000 LOOP

INSERT INTO t1 VALUES (i%10+10*(j-1), i);

END LOOP;

END;

$$ LANGUAGE plpgsql;

create table t2(a int);

insert into t2 select 1;

analyze t1, t2;

postgres=# explain analyze select * from t1 where a = 1 and b = 0;

QUERY PLAN

---------------------------------------------------------------------------------------------------------------------

Bitmap Heap Scan on t1 (cost=4.65..2174.00 rows=833 width=8) (actual time=0.060..0.061 rows=1.00 loops=1)

Recheck Cond: ((a = 1) AND (b = 0))

Heap Blocks: exact=1

Buffers: shared hit=4

-> Bitmap Index Scan on t1_pkey (cost=0.00..4.44 rows=833 width=0) (actual time=0.042..0.043 rows=1.00 loops=1)

Index Cond: ((a = 1) AND (b = 0))

Index Searches: 1

Buffers: shared hit=3

Planning Time: 0.146 ms

Execution Time: 0.105 ms

(10 rows)

postgres=# explain analyze select * from t1, t2 where t1.a = t2.a and t1.b = 0;

QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------

Nested Loop (cost=0.43..4.81 rows=32 width=12) (actual time=0.067..0.069 rows=1.00 loops=1)

Buffers: shared hit=5

-> Seq Scan on t2 (cost=0.00..1.01 rows=1 width=4) (actual time=0.024..0.025 rows=1.00 loops=1)

Buffers: shared hit=1

-> Index Only Scan using t1_pkey on t1 (cost=0.43..108.23 rows=32 width=8) (actual time=0.036..0.036 rows=1.00 loops=1)

Index Cond: ((a = t2.a) AND (b = 0))

Heap Fetches: 0

Index Searches: 1

Buffers: shared hit=4

Planning Time: 0.257 ms

Execution Time: 0.114 ms

(11 rows)

```

It can be observed that the Index Cond fully covers the primary key index.

Naturally, the correct rows estimate should be 1, but the optimizer

estimates the two conditions independently, resulting in an overestimated

row count. This overestimation has a greater impact in scenarios with

partitioned tables and multi-table joins, making the planner more inclined

to choose hash or merge joins.

We may consider checking in cardinality estimation whether the restrictlist

fully covers a unique index. If so, we can directly set the estimated rows

to 1. The attached patch provides a very early demo of this approach.

Best Regards,

Fei Changhong

Alibaba Cloud Computing Ltd.

Attachment

optimize-cardinality-fully-covered-uk.patch

pgsql-hackers by date:

From: Jacob Champion
Date: 21 November 2025, 19:38:37
Subject: Re: RFC 9266: Channel Bindings for TLS 1.3 support

From: feichanghong
Date: 21 November 2025, 20:11:56
Subject: Re: Optimize cardinality estimation when unique keys are fully covered

Optimize cardinality estimation when unique keys are fully covered - Mailing list pgsql-hackers

Attachment

Previous

Next