Re: Timestamp-based indexing - Mailing list pgsql-performance
From | Harmon S. Nine |
---|---|
Subject | Re: Timestamp-based indexing |
Date | |
Msg-id | 41052762.50801@netarx.com Whole thread Raw |
In response to | Re: Timestamp-based indexing ("Matthew T. O'Connor" <matthew@zeut.net>) |
List | pgsql-performance |
We were getting a little desperate, so we engaged in overkill to rule out lack-of-analyze as a cause for the slow queries. Thanks for your advice :) -- Harmon Matthew T. O'Connor wrote: > VACUUM FULL ANALYZE every 3 hours seems a little severe. You will > probably be be served just as well by VACUUM ANALYZE. But you > probably don't need the VACUUM part most of the time. You might try > doing an ANALYZE on the specific tables you are having issues with. > Since ANALYZE should be much quicker and not have the performance > impact of a VACUUM, you could do it every hour, or even every 15 minutes. > > Good luck... > > Harmon S. Nine wrote: > >> Hello -- >> >> To increase query (i.e. select) performance, we're trying to get >> postgres to use an index based on a timestamp column in a given table. >> >> Event-based data is put into this table several times a minute, with >> the timestamp indicating when a particular row was placed in the table. >> >> The table is purged daily, retaining only the rows that are less than >> 7 days old. That is, any row within the table is less than 1 week >> old (+ 1 day, since the purge is daily). >> >> A typical number of rows in the table is around 400,000. >> >> A "VACUUM FULL ANALYZE" is performed every 3 hours. >> >> >> The problem: >> We often query the table to extract those rows that are, say, 10 >> minutes old or less. >> >> Given there are 10080 minutes per week, the planner could, properly >> configured, estimate the number of rows returned by such a query to be: >> >> 10 min/ 10080 min * 400,000 = 0.001 * 400,000 = 400. >> >> Making an index scan, with the timestamp field the index, far faster >> then a sequential scan. >> >> >> However, we can't get the planner to do an timestamp-based index scan. >> >> Anyone know what to do? >> >> >> Here's the table specs: >> >> monitor=# \d "eventtable" >> Table "public.eventtable" >> Column | Type | >> Modifiers >> -----------+-----------------------------+-------------------------------------------------------------- >> >> timestamp | timestamp without time zone | not null default >> ('now'::text)::timestamp(6) with time zone >> key | bigint | not null default >> nextval('public."eventtable_key_seq"'::text) >> propagate | boolean | >> facility | character(10) | >> priority | character(10) | >> host | character varying(128) | not null >> message | text | not null >> Indexes: >> "eventtable_pkey" primary key, btree ("timestamp", "key") >> "eventtable_host" btree (host) >> "eventtable_timestamp" btree ("timestamp") >> >> >> Here's a query (with "explain analyze"): >> >> monitor=# explain analyze select * from "eventtable" where timestamp >> > CURRENT_TIMESTAMP - INTERVAL '10 minutes'; >> QUERY PLAN >> ---------------------------------------------------------------------------------------------------------------------------- >> >> Seq Scan on "eventtable" (cost=0.00..19009.97 rows=136444 width=155) >> (actual time=11071.073..11432.522 rows=821 loops=1) >> Filter: (("timestamp")::timestamp with time zone > >> (('now'::text)::timestamp(6) with time zone - '@ 10 mins'::interval)) >> Total runtime: 11433.384 ms >> (3 rows) >> >> >> Here's something strange. We try to disable sequential scans, but to >> no avail. The estimated cost skyrockets, though: >> >> monitor=# set enable_seqscan = false; >> SET >> monitor=# explain analyze select * from "eventtable" where timestamp >> > CURRENT_TIMESTAMP - INTERVAL '10 minutes'; >> QUERY PLAN >> ------------------------------------------------------------------------------------------------------------------------------------- >> >> Seq Scan on "eventtable" (cost=100000000.00..100019009.97 >> rows=136444 width=155) (actual time=9909.847..9932.438 rows=1763 >> loops=1) >> Filter: (("timestamp")::timestamp with time zone > >> (('now'::text)::timestamp(6) with time zone - '@ 10 mins'::interval)) >> Total runtime: 9934.353 ms >> (3 rows) >> >> monitor=# set enable_seqscan = true; >> SET >> monitor=# >> >> >> >> Any help is greatly appreciated :) >> >> -- Harmon >> >> >> >> >> ---------------------------(end of broadcast)--------------------------- >> TIP 3: if posting/reading through Usenet, please send an appropriate >> subscribe-nomail command to majordomo@postgresql.org so that your >> message can get through to the mailing list cleanly >> >
pgsql-performance by date: