Thread: index usage for min() vs. "order by asc limit 1"

index usage for min() vs. "order by asc limit 1"

From: Ben Chobot
Date: 17 November 2011, 21:13:02

I have two queries in PG 9.1. One uses an index like I would like, the other does not. Is this expected behavior? If so, is there any way around it?


postgres=# explain analyze select min(id) from delayed_jobs where strand='sis_batch:account:15' group by strand;
                                                        QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=0.00..8918.59 rows=66 width=29) (actual time=226.759..226.760 rows=1 loops=1)
   ->  Seq Scan on delayed_jobs  (cost=0.00..8553.30 rows=72927 width=29) (actual time=0.014..169.941 rows=72268 loops=1)
         Filter: ((strand)::text = 'sis_batch:account:15'::text)
 Total runtime: 226.817 ms
(4 rows)

postgres=# explain analyze select id from delayed_jobs where strand='sis_batch:account:15' order by id limit 1;
                                                                       QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..0.33 rows=1 width=8) (actual time=0.097..0.098 rows=1 loops=1)
   ->  Index Scan using index_delayed_jobs_on_strand on delayed_jobs  (cost=0.00..24181.74 rows=72927 width=8) (actual time=0.095..0.095 rows=1 loops=1)
         Index Cond: ((strand)::text = 'sis_batch:account:15'::text)
 Total runtime: 0.129 ms
(4 rows)

Re: index usage for min() vs. "order by asc limit 1"

From: Steve Atkins
Date: 17 November 2011, 21:20:52

On Nov 17, 2011, at 5:12 PM, Ben Chobot wrote:

> I have two queries in PG 9.1. One uses an index like I would like, the other does not. Is this expected behavior? If so, is there any way around it?

I don't think you want the group by in that first query.

Cheers,
  Steve
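
For the single-strand case, the suggestion above amounts to dropping the GROUP BY. A minimal sketch, assuming the same table and index as in the original post. Without a GROUP BY the planner can consider answering min() from an index instead of aggregating over every matching row, but nobody in this thread posted a plan for this form, so treat it as something to test rather than a confirmed fix:

```sql
-- Same filter as the original query, but without GROUP BY.
-- Compare the resulting plan against the GroupAggregate plan above.
EXPLAIN ANALYZE
SELECT min(id)
  FROM delayed_jobs
 WHERE strand = 'sis_batch:account:15';
```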


Re: index usage for min() vs. "order by asc limit 1"

From: Ben Chobot
Date: 17 November 2011, 21:24:33

On Nov 17, 2011, at 5:20 PM, Steve Atkins wrote:

> I don't think you want the group by in that first query.

Heh, I tried to simplify the example, but in reality that = becomes an IN clause with multiple values, so the group by is needed.
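
One way to keep the fast "order by id limit 1" behavior for several strands at once is to run that lookup once per strand as a correlated subquery. This is only a sketch: the second strand value is hypothetical (the thread shows only 'sis_batch:account:15'), and it assumes the index can return rows for a single strand in id order, as the Limit plan without a Sort step suggests.

```sql
-- One "ORDER BY id LIMIT 1" lookup per strand instead of min() ... GROUP BY.
-- 'sis_batch:account:16' is a made-up value for illustration.
SELECT s.strand,
       (SELECT dj.id
          FROM delayed_jobs dj
         WHERE dj.strand = s.strand
         ORDER BY dj.id
         LIMIT 1) AS min_id
  FROM (VALUES ('sis_batch:account:15'),
               ('sis_batch:account:16')) AS s(strand);
```

Unlike the GROUP BY form, this returns a row with a NULL min_id for a strand that has no matching jobs, so the two queries are not exact drop-in replacements for each other.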


Re: index usage for min() vs. "order by asc limit 1"

From: MirrorX
Date: 18 November 2011, 10:16:56

Can you run an ANALYZE first and then post the results of this query here?
select * FROM pg_stats WHERE tablename = 'delayed_jobs';
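
The request above could be run roughly as follows. The narrower column list and the attname filter are assumptions about which statistics matter for these two queries; the original request asks for every column of pg_stats for the table.

```sql
-- Refresh planner statistics for the table first.
ANALYZE delayed_jobs;

-- Then look at the statistics the planner holds for the two columns
-- that appear in the queries (strand in the filter, id in the ordering).
SELECT attname, null_frac, n_distinct,
       most_common_vals, most_common_freqs, correlation
  FROM pg_stats
 WHERE tablename = 'delayed_jobs'
   AND attname IN ('strand', 'id');
```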
