Re: When are index scans used over seq scans? - Mailing list pgsql-performance
From | Richard van den Berg |
---|---|
Subject | Re: When are index scans used over seq scans? |
Date | |
Msg-id | 42679922.3040300@trust-factory.com Whole thread Raw |
In response to | Re: When are index scans used over seq scans? (John A Meinel <john@arbash-meinel.com>) |
List | pgsql-performance |
Thanks a lot John for the correct search terms. :-) The suggestion in http://archives.postgresql.org/pgsql-performance/2005-04/msg00029.php to add a constraint that checks (finishtime >= starttime) does not make a difference for me. Still seq scans are used. The width solution explained in http://archives.postgresql.org/pgsql-performance/2005-04/msg00027.php and http://archives.postgresql.org/pgsql-performance/2005-04/msg00116.php does make a huge difference when selecting 1 timestamp using a BETWEEN (2ms vs 2sec), but as soon as I put 2 timestamps in a table and try a join, everything goes south (7.7sec). I have 10k timestamps in the duration table. :-( I'm getting more confused on how the planner decides to use indexes. For example, if I try: explain analyze select us.oid from sessions us where '2005-04-10 23:11:00' between us.starttimetrunc and us.finishtimetrunc; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------------------- Index Scan using sessions_st_ft_idx2 on sessions us (cost=0.00..18320.73 rows=4765 width=4) (actual time=0.063..2.455 rows=279 loops=1) Index Cond: (('2005-04-10 23:11:00'::timestamp without time zone <= finishtimetrunc) AND ('2005-04-10 23:11:00'::timestamp without time zone >= starttimetrunc)) Total runtime: 2.616 ms is uses the index! However, if I change the date it does not: explain analyze select us.oid from sessions us where '2005-04-09 23:11:00' between us.starttimetrunc and us.finishtimetrunc; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------------- Seq Scan on sessions us (cost=0.00..68173.04 rows=41575 width=4) (actual time=553.424..1981.695 rows=64 loops=1) Filter: (('2005-04-09 23:11:00'::timestamp without time zone >= starttimetrunc) AND ('2005-04-09 23:11:00'::timestamp without time zone <= finishtimetrunc)) Total runtime: 1981.802 ms The times in sessions go from '2005-04-04 00:00:00' to '2005-04-10 23:59:00' so both are valid times to query for, but April 10th is more towards the end. A little experimenting shows that if I go earlier than '2005-04-10 13:26:15' seq scans are being used. I was thinking this timestamp would have something to do with the histogram_bounds in pg_stats, but I cannot find a match: starttimetrunc | {"2005-04-04 00:05:00","2005-04-04 11:49:00","2005-04-04 22:03:00","2005-04-05 10:54:00","2005-04-05 21:08:00","2005-04-06 10:28:00","2005-04-07 01:57:00","2005-04-07 15:55:00","2005-04-08 10:18:00","2005-04-08 17:12:00","2005-04-10 23:57:00"} finishtimetrunc | {"2005-04-04 00:05:00.93","2005-04-04 11:53:00.989999","2005-04-04 22:35:00.38","2005-04-05 11:13:00.029999","2005-04-05 21:31:00.989999","2005-04-06 10:45:01","2005-04-07 02:08:08.25","2005-04-07 16:20:00.93","2005-04-08 10:25:00.409999","2005-04-08 17:15:00.949999","2005-04-11 02:08:19"} -- Richard van den Berg, CISSP ------------------------------------------- Trust Factory B.V. | www.dna-portal.net Bazarstraat 44a | www.trust-factory.com 2518AK The Hague | Phone: +31 70 3620684 The Netherlands | Fax : +31 70 3603009 ------------------------------------------- Have you visited our new DNA Portal? -------------------------------------------
pgsql-performance by date: