Re: limit and query planner - Mailing list pgsql-general

From David Rowley
Subject Re: limit and query planner
Date
Msg-id CAKJS1f8PDJRJtJLtA63MaBhsoSg4cNpp=SOc8NAJJX+c-SnOAQ@mail.gmail.com
Whole thread Raw
In response to Re: limit and query planner  (armand pirvu <armand.pirvu@gmail.com>)
List pgsql-general
On 6 June 2018 at 07:17, armand pirvu <armand.pirvu@gmail.com> wrote:
> So since
> select count(*) from sp_i2birst_reg_staging_test;
> count
> -------
>  6860
> and
> select count(*) from sp_i2birst_reg_staging_test where evt_id = 'ACSF17'
> and status=0 ;
> count
> -------
>  4239
>
> That means to me I fetch almost the whole table and then I fall in the case you described seqscan-and-sort wins over
indexscan.
 

The planner simply assumes that 1 in (6860.0 / 4239.0) rows matches
your WHERE clause. Since you want 10 rows from the LIMIT, it thinks
it'll just need to read 17 rows from the heap to answer your query.

> explain analyze select * from sp_i2birst_reg_staging_test where evt_id = 'ACSF17'
> and status=1;
>                                                                  QUERY PLAN
>  Index Scan using fooidx on sp_i2birst_reg_staging_test  (cost=0.28..202.91 rows=500 width=519) (actual
time=0.097..0.527rows=500 loops=1)
 
>    Index Cond: (((evt_id)::text = 'ACSF17'::text) AND (status = 1))
>  Planning time: 1.024 ms
>  Execution time: 0.766 ms
> this gets 500 rows out of 6860
>
> explain analyze select * from sp_i2birst_reg_staging_test where evt_id = 'ACSF17'
> and status=1 limit 10;
>
>                                                                    QUERY PLAN
>  Limit  (cost=0.28..4.33 rows=10 width=519) (actual time=0.073..0.105 rows=10 loops=1)
>    ->  Index Scan using fooidx on sp_i2birst_reg_staging_test  (cost=0.28..202.91 rows=500 width=519) (actual
time=0.072..0.101rows=10 loops=1)
 
>          Index Cond: (((evt_id)::text = 'ACSF17'::text) AND (status = 1))
>  Planning time: 0.280 ms
>  Execution time: 0.173 ms
>
> Back to the original
>
> explain analyze select * from sp_i2birst_reg_staging_test where evt_id = 'ACSF17'
> and status=0 limit 1 ;
>
>                                                             QUERY PLAN
>  Limit  (cost=0.00..0.13 rows=1 width=519) (actual time=0.021..0.021 rows=1 loops=1)
>    ->  Seq Scan on sp_i2birst_reg_staging_test  (cost=0.00..548.40 rows=4239 width=519) (actual time=0.019..0.019
rows=1loops=1)
 
>          Filter: (((evt_id)::text = 'ACSF17'::text) AND (status = 0))
>          Rows Removed by Filter: 1
>  Planning time: 0.286 ms
>  Execution time: 0.110 ms

For the more restrictive status, the planner thinks more rows will
need to be looked at, which increases the cost of the seqscan, and the
planner favours the index scan.  You can see the planner estimates 500
rows will match the status=1 query. So thinks ceil(1 *  (6860 / 500.0)
* 10) = 138 rows will need looked at in the seqscan plan. That's
obviously more costly than 17 rows. So the index scan begins to look
more favourable.

The planner always assumes the rows are evenly distributed, which is
not always the case. If your ten rows were at the end of the heap,
then the seqscan would be a bad plan. In this case "Rows Removed by
Filter" would be high.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


pgsql-general by date:

Previous
From: Tom Lane
Date:
Subject: Re: Code of Conduct plan
Next
From: Chris Travers
Date:
Subject: Re: Code of Conduct plan