Thread: Trimming the cost of ORDER BY in a simple query
Hello, I have a simple query with a pretty high cost (EXPLAIN ...), and I'm wondering if I can somehow trim it. Query (shows the last 7 dates): => SELECT DISTINCT date_part('year', uu.add_date), date_part('month', uu.add_date), date_part('day', uu.add_date) FROM user_url uu WHERE uu.user_id=1 ORDER BY date_part('year', uu.add_date) DESC, date_part('month', uu.add_date) DESC, date_part('day', uu.add_date) DESC LIMIT 7; QUERY PLAN: ---------------------------------------------------------------------------------------------------------------------------------------------Limit (cost=4510.14..4522.93 rows=2 width=8) (actual time=19.924..20.160 rows=7 loops=1) -> Unique (cost=4510.14..4522.93 rows=2 width=8) (actual time=19.919..20.139 rows=7 loops=1) -> Sort (cost=4510.14..4513.34 rows=1279 width=8) (actual time=19.915..20.004 rows=78 loops=1) Sort Key: date_part('year'::text, add_date), date_part('month'::text, add_date), date_part('day'::text, add_date) -> Index Scan using foo on user_url uu (cost=0.00..4444.14 rows=1279 width=8) (actual time=0.095..14.761 rows=1225 loops=1) Index Cond: (user_id = 1)Total runtime: 20.313 ms (7 rows) It looks like the cost is all in ORDER BY, and if I remove ORDER BY the execution time goes from 20-90 ms to less than 1 ms. I do need the 7 most recent add_dates. Is there a more efficient way of grabbing them? Thanks, Otis
Quoting ogjunk-pgjedan@yahoo.com: > I have a simple query with a pretty high cost (EXPLAIN ...), and I'm > wondering if I can somehow trim it. > > Query (shows the last 7 dates): > > => SELECT DISTINCT date_part('year', uu.add_date), > date_part('month', > uu.add_date), date_part('day', uu.add_date) FROM user_url uu > WHERE > uu.user_id=1 ORDER BY date_part('year', uu.add_date) DESC, > date_part('month', uu.add_date) DESC, date_part('day', > uu.add_date) > DESC LIMIT 7; > > QUERY PLAN: > --------------------------------------------------------------------------------------------------------------------------------------------- > Limit (cost=4510.14..4522.93 rows=2 width=8) (actual > time=19.924..20.160 rows=7 loops=1) > -> Unique (cost=4510.14..4522.93 rows=2 width=8) (actual > time=19.919..20.139 rows=7 loops=1) > -> Sort (cost=4510.14..4513.34 rows=1279 width=8) (actual > time=19.915..20.004 rows=78 loops=1) > Sort Key: date_part('year'::text, add_date), > date_part('month'::text, add_date), date_part('day'::text, add_date) > -> Index Scan using foo on user_url uu > (cost=0.00..4444.14 rows=1279 width=8) (actual time=0.095..14.761 > rows=1225 loops=1) > Index Cond: (user_id = 1) > Total runtime: 20.313 ms > (7 rows) > > > It looks like the cost is all in ORDER BY, and if I remove ORDER BY > the > execution time goes from 20-90 ms to less than 1 ms. > > I do need the 7 most recent add_dates. Is there a more efficient > way > of grabbing them? The query analyzer is using the sort to detect and return distinct values, as well. So there's not much point in trying to remove it.
On Mon, 2005-05-02 at 21:35 -0700, ogjunk-pgjedan@YAHOO.COM wrote: > Query (shows the last 7 dates): > > => SELECT DISTINCT date_part('year', uu.add_date), date_part('month', > uu.add_date), date_part('day', uu.add_date) FROM user_url uu WHERE > uu.user_id=1 ORDER BY date_part('year', uu.add_date) DESC, > date_part('month', uu.add_date) DESC, date_part('day', uu.add_date) > DESC LIMIT 7; > i assume add_date is a timestamp, because otherwise you could ORDER by add_date, and use an index on (desc, add_date): SELECT DISTINCT .... WHERE uu.user_id=1 ORDER BY uu.user_id DESC ,uu.add_date DESC LIMIT 7; this will not work for a timestamp add_date > QUERY PLAN: ... > Total runtime: 20.313 ms 20 ms does not sound like a big problem. if 20 ms is really too much for you, and add_date IS a timestamp, you might think about adding a date column , maintained by triggers of by your application, and add an index on (user_id, x) another possibility (total guess) is a functional index create index func_id_date on user_url(user_id,(add_date::date)); SELECT ... ORDER BY uu.user_id DESC ,uu.add_date::date DESC LIMIT 7; I have no idea if this will work, or if the planner will use such an index. gnari
In article <1115104938.5853.25.camel@localhost.localdomain>, Ragnar Hafstað <gnari@simnet.is> writes: > 20 ms does not sound like a big problem. > if 20 ms is really too much for you, and add_date IS a timestamp, > you might think about adding a date column , maintained > by triggers of by your application, and add an index on (user_id, x) Maybe a functional index on "date_trunc('day', add_date)" would be helpful?
Ragnar Hafstað wrote: > another possibility (total guess) is a functional index > create index func_id_date on user_url(user_id,(add_date::date)); > SELECT ... ORDER BY uu.user_id DESC ,uu.add_date::date DESC LIMIT 7; > I have no idea if this will work, or if the planner will use such > an index. To make sure the index is used you could build the query using a subquery like this: SELECT DISTINCT date_part('year', add_date), date_part('month', add_date), date_part('day', add_date) FROM ( SELECT uu.add_date::date as add_date FROM user_url uu GROUP BY uu.user_id, uu.add_date HAVING uu.user_id=1 ORDER BYuu.add_date::date DESC LIMIT 7 ) x; Perhaps a select distinct instead of the group by in the subquery would make use the index? Cheers, Ezequiel Tolnay