On Tue, 11 Oct 2022 at 16:13, gzh <gzhcoder@126.com> wrote:
> new=# explain analyze select 2 from analyze_word_reports where (cseid = 94) limit 1;
> Limit (cost=0.00..0.43 rows=1 width=4) (actual time=2156.964..2156.966 rows=1 loops=1)
> -> Seq Scan on analyze_word_reports (cost=0.00..528550.75 rows=1223533 width=4) (actual time=2156.962..2156.964
rows=1loops=
> 1)
> Filter: (cseid = 94)
> Rows Removed by Filter: 18320180 Planning Time: 0.086 ms Execution Time: 2156.985 ms
It's a bit unfortunate that the planner picked this plan. I can
recreate the problem on the master branch with:
create table t1 (a int, b int);
insert into t1 select x,x from generate_Series(1,10000000)x;
insert into t1 select 0,0 from generate_Series(1,10000000)x;
analyze t1;
create index on t1(a);
set synchronize_seqscans=off;
explain analyze select * from t1 where a=0 limit 1;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..0.03 rows=1 width=8) (actual
time=1865.838..1865.840 rows=1 loops=1)
-> Seq Scan on t1 (cost=0.00..338496.00 rows=10076667 width=8)
(actual time=1865.831..1865.831 rows=1 loops=1)
Filter: (a = 0)
Rows Removed by Filter: 10000000
Planning Time: 1.507 ms
Execution Time: 1866.326 ms
(6 rows)
What seems to be going on is that the index path is considered on the
base relation, but it's rejected by add_path() due to the costs being
higher than the seq scan costs.
I see even after dropping random_page_cost right down to 0.0 that we
do start to keep the Index path as a base relation path, but then the
LimitPath with the Seqscan subpath wins out over the LimitPath with
the index scan due to the Index scan having a higher startup cost.
It feels like something is a bit lacking in our cost model here. I'm
just not sure what that is.
David