Thread: Trimming the cost of ORDER BY in a simple query

Trimming the cost of ORDER BY in a simple query

From
Date:
Hello,

I have a simple query with a pretty high cost (EXPLAIN ...), and I'm
wondering if I can somehow trim it.

Query (shows the last 7 dates):

=> SELECT DISTINCT date_part('year',  uu.add_date),  date_part('month',
uu.add_date),  date_part('day',   uu.add_date)  FROM user_url uu  WHERE
uu.user_id=1 ORDER BY  date_part('year',  uu.add_date) DESC, 
date_part('month', uu.add_date) DESC,  date_part('day',   uu.add_date)
DESC  LIMIT 7;

QUERY PLAN:

---------------------------------------------------------------------------------------------------------------------------------------------Limit
(cost=4510.14..4522.93 rows=2 width=8) (actual
 
time=19.924..20.160 rows=7 loops=1)  ->  Unique  (cost=4510.14..4522.93 rows=2 width=8) (actual
time=19.919..20.139 rows=7 loops=1)        ->  Sort  (cost=4510.14..4513.34 rows=1279 width=8) (actual
time=19.915..20.004 rows=78 loops=1)              Sort Key: date_part('year'::text, add_date),
date_part('month'::text, add_date), date_part('day'::text, add_date)              ->  Index Scan using foo on user_url
uu
 
(cost=0.00..4444.14 rows=1279 width=8) (actual time=0.095..14.761
rows=1225 loops=1)                    Index Cond: (user_id = 1)Total runtime: 20.313 ms
(7 rows)


It looks like the cost is all in ORDER BY, and if I remove ORDER BY the
execution time goes from 20-90 ms to less than 1 ms.

I do need the 7 most recent add_dates.  Is there a more efficient way
of grabbing them?

Thanks,
Otis




Re: Trimming the cost of ORDER BY in a simple query

From
Mischa Sandberg
Date:
Quoting ogjunk-pgjedan@yahoo.com: 
> I have a simple query with a pretty high cost (EXPLAIN ...), and I'm 
> wondering if I can somehow trim it. 
>  
> Query (shows the last 7 dates): 
>  
> => SELECT DISTINCT date_part('year',  uu.add_date),  
> date_part('month', 
> uu.add_date),  date_part('day',   uu.add_date)  FROM user_url uu  
> WHERE 
> uu.user_id=1 ORDER BY  date_part('year',  uu.add_date) DESC,  
> date_part('month', uu.add_date) DESC,  date_part('day',   
> uu.add_date) 
> DESC  LIMIT 7; 
>  
> QUERY PLAN: 
> 

---------------------------------------------------------------------------------------------------------------------------------------------

>  Limit  (cost=4510.14..4522.93 rows=2 width=8) (actual 
> time=19.924..20.160 rows=7 loops=1) 
>    ->  Unique  (cost=4510.14..4522.93 rows=2 width=8) (actual 
> time=19.919..20.139 rows=7 loops=1) 
>          ->  Sort  (cost=4510.14..4513.34 rows=1279 width=8) (actual 
> time=19.915..20.004 rows=78 loops=1) 
>                Sort Key: date_part('year'::text, add_date), 
> date_part('month'::text, add_date), date_part('day'::text, add_date) 
>                ->  Index Scan using foo on user_url uu  
> (cost=0.00..4444.14 rows=1279 width=8) (actual time=0.095..14.761 
> rows=1225 loops=1) 
>                      Index Cond: (user_id = 1) 
>  Total runtime: 20.313 ms 
> (7 rows) 
>  
>  
> It looks like the cost is all in ORDER BY, and if I remove ORDER BY 
> the 
> execution time goes from 20-90 ms to less than 1 ms. 
>  
> I do need the 7 most recent add_dates.  Is there a more efficient 
> way 
> of grabbing them? 
The query analyzer is using the sort to detect and return distinct 
values, as well. So there's not much point in trying to remove it. 



Re: Trimming the cost of ORDER BY in a simple query

From
Ragnar Hafstað
Date:
On Mon, 2005-05-02 at 21:35 -0700, ogjunk-pgjedan@YAHOO.COM wrote:

> Query (shows the last 7 dates):
> 
> => SELECT DISTINCT date_part('year',  uu.add_date),  date_part('month',
> uu.add_date),  date_part('day',   uu.add_date)  FROM user_url uu  WHERE
> uu.user_id=1 ORDER BY  date_part('year',  uu.add_date) DESC, 
> date_part('month', uu.add_date) DESC,  date_part('day',   uu.add_date)
> DESC  LIMIT 7;
> 

i assume add_date is a timestamp, because otherwise you could
ORDER by add_date, and use an index on (desc, add_date):
SELECT DISTINCT .... WHERE uu.user_id=1 
ORDER BY uu.user_id DESC ,uu.add_date DESC LIMIT 7;

this will not work for a timestamp add_date

> QUERY PLAN:
...
>  Total runtime: 20.313 ms

20 ms does not sound like a big problem.

if 20 ms is really too much for you, and add_date IS a timestamp,
you might think about adding a date column , maintained
by triggers of by your application, and add an index on (user_id, x)

another possibility (total guess) is a functional index
create index func_id_date on user_url(user_id,(add_date::date));
SELECT ... ORDER BY uu.user_id DESC ,uu.add_date::date DESC LIMIT 7;
I have no idea if this will work, or if the planner will use such
an index.

gnari




Re: Trimming the cost of ORDER BY in a simple query

From
Harald Fuchs
Date:
In article <1115104938.5853.25.camel@localhost.localdomain>,
Ragnar Hafstað <gnari@simnet.is> writes:

> 20 ms does not sound like a big problem.

> if 20 ms is really too much for you, and add_date IS a timestamp,
> you might think about adding a date column , maintained
> by triggers of by your application, and add an index on (user_id, x)

Maybe a functional index on "date_trunc('day', add_date)" would be helpful?



Re: Trimming the cost of ORDER BY in a simple query

From
Ezequiel Tolnay
Date:
Ragnar Hafstað wrote:
> another possibility (total guess) is a functional index
> create index func_id_date on user_url(user_id,(add_date::date));
> SELECT ... ORDER BY uu.user_id DESC ,uu.add_date::date DESC LIMIT 7;
> I have no idea if this will work, or if the planner will use such
> an index.
To make sure the index is used you could build the query using a 
subquery like this:

SELECT DISTINCT date_part('year',  add_date),  date_part('month', 
add_date),  date_part('day', add_date)
FROM ( SELECT uu.add_date::date as add_date FROM user_url uu GROUP BY uu.user_id, uu.add_date HAVING uu.user_id=1 ORDER
BYuu.add_date::date DESC LIMIT 7
 
) x;

Perhaps a select distinct instead of the group by in the subquery would 
make use the index?

Cheers,

Ezequiel Tolnay