Re: Slow query and indexes... - Mailing list pgsql-general
From | Jim Nasby
---|---
Subject | Re: Slow query and indexes...
Date |
Msg-id | D21FC8AF-1C6F-47FB-B44C-76978B612C26@decibel.org
In response to | Re: Slow query and indexes... ("Jonas Henriksen" <jonas.f.henriksen@gmail.com>)
List | pgsql-general
There are other ways to influence the selection of a seqscan, notably effective_cache_size and random_page_cost. First, you need to find out at what point a seqscan is actually faster than an index scan. That's going to be a trial-and-error search, but eventually, if you're going back far enough in time, the seqscan will be faster.

EXPLAIN ANALYZE has its own overhead, so a better way to test this is with psql's \timing command. Wrap the query in a count so you're not shoving a bunch of data across to psql:

SELECT count(*) FROM (... your query goes here ...) a;

(SELECT 1 might work too and would be more accurate.)

Once you've found the break-even point, you can tweak the cost estimates. Start by making sure that effective_cache_size is set to approximately how much memory you have. Increasing that will favor an index scan. Decreasing random_page_cost will also favor an index scan, though I'd try not to go below 2 and definitely not below 1. You can also tweak the CPU cost estimates (lower numbers will favor indexes). But keep in mind that doing that at a system level will impact every query running on the system.

You may have no choice but to explicitly set custom parameters for just this statement. SET LOCAL, with the SELECT wrapped in a transaction, is a less painful way to do that.

On May 7, 2007, at 10:47 AM, Jonas Henriksen wrote:

> Well thanks, but that don't help me much.
>
> I've tried setting an extra condition using date_time > (now() - '14
> weeks'::interval)
>
> explain analyze
> SELECT max(date_time) FROM data_values
> where date_time > (now() - '14 weeks'::interval)
> GROUP BY data_logger_id;
>
>  HashAggregate  (cost=23264.52..23264.55 rows=2 width=12) (actual time=1691.447..1691.454 rows=3 loops=1)
>    ->  Bitmap Heap Scan on data_values  (cost=7922.08..21787.31 rows=295442 width=12) (actual time=320.643..951.043 rows=298589 loops=1)
>          Recheck Cond: (date_time > (now() - '98 days'::interval))
>          ->  Bitmap Index Scan on data_values_data_date_time_index  (cost=0.00..7848.22 rows=295442 width=0) (actual time=319.708..319.708 rows=298589 loops=1)
>                Index Cond: (date_time > (now() - '98 days'::interval))
>  Total runtime: 1691.598 ms
>
> However, when I switch to using date_time > (now() - '15
> weeks'::interval) I get:
>
> explain analyze
> SELECT max(date_time) FROM data_values
> where date_time > (now() - '15 weeks'::interval)
> GROUP BY data_logger_id;
>
>  HashAggregate  (cost=23798.26..23798.28 rows=2 width=12) (actual time=3237.816..3237.823 rows=3 loops=1)
>    ->  Seq Scan on data_values  (cost=0.00..22084.62 rows=342728 width=12) (actual time=0.037..2409.234 rows=344111 loops=1)
>          Filter: (date_time > (now() - '105 days'::interval))
>  Total runtime: 3237.944 ms
>
> Doing "SET enable_seqscan=off" speeds up the query and forces the use
> of the index, but I dont really love that solution...
>
> regards Jonas:))
>
> On 5/7/07, Peter Eisentraut <peter_e@gmx.net> wrote:
>> On Monday, 7 May 2007 15:53, Jonas Henriksen wrote:
>> > while if I add a GROUP BY data_logger the query uses a seq scan and a
>> > lot of time:
>> > explain analyze SELECT max(date_time) FROM data_values GROUP BY
>> > data_logger_id;
>>
>> I don't think there is anything you can do about this.
>>
>> --
>> Peter Eisentraut
>> http://developer.postgresql.org/~petere/
>>

--
Jim Nasby                                            jim@nasby.net
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)
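For reference, the two techniques Jim describes (timing the break-even point with \timing, and a statement-local planner override via SET LOCAL in a transaction) could be sketched as follows. This uses the table and column names from Jonas's query; the setting values are purely illustrative, not recommendations:

```sql
-- In psql, turn on client-side timing so EXPLAIN ANALYZE's own
-- overhead doesn't skew the comparison:
\timing on

-- Wrap the candidate query in a count so result rows aren't shipped
-- to the client; vary the interval to find the break-even point:
SELECT count(*) FROM (
    SELECT max(date_time) FROM data_values
    WHERE date_time > (now() - '14 weeks'::interval)
    GROUP BY data_logger_id
) a;

-- Once you know where the planner misjudges, override cost settings
-- for just this statement. SET LOCAL only lasts until the end of the
-- enclosing transaction, so other queries are unaffected:
BEGIN;
SET LOCAL random_page_cost = 2;          -- illustrative value
SET LOCAL effective_cache_size = '1GB';  -- illustrative value
SELECT max(date_time) FROM data_values
WHERE date_time > (now() - '15 weeks'::interval)
GROUP BY data_logger_id;
COMMIT;
```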