slow queries on large syslog table - Mailing list pgsql-general
From | colm ennis |
---|---|
Subject | slow queries on large syslog table |
Date | |
Msg-id | DKEIIKIBPBFHLPDHMIMGEEPBCHAA.colm.ennis@eircom.net Whole thread Raw |
List | pgsql-general |
hi all, my selects using the config below are in some cases extremely slow, i would love if someone would point out ways to speed this up bit. note ive only recently started using postgresql, so feel free to point out the stupid mistakes ive made. im using a postgresql db to store cisco syslogs which are pumped in by msyslog. my postgresql is version 7.1.3, is running on freebsd 4.3 and is the standard ports install. my tables are : create table syslog_table (stimestamp timestamp,shostid integer,sciscomsgid integer, smsg text); create table ciscomsg_table (sciscomsgid serial,sciscomsg varchar(128)); create table host_table (shostid serial,shost varchar(128)); ive created a trigger on insertion to syslog_table which extracts ciscomsg, and updates the other two tables. as you can appreciate my syslog table can be quite large and is constantly growing, it currently has about 1.7 million rows. both ciscomsg_table and host_table have ~ 80 rows. ive also created a web interface for selecting syslogs based on optional combinations of timestamp, hostname and ciscomsg. to speed queries i created some indexes on syslog_table : create index syslog_table_stimestamp_index on syslog_table (stimestamp); create index syslog_table_shostid_index on syslog_table (shostid); create index syslog_table_sciscomsgid_index on syslog_table (sciscomsgid); create index syslog_table_shostid_sciscomsgid_index on syslog_table (shostid,sciscomsgid); ive just performed a vacuum analyse for this mail, although i guess syslog insertions will have been blocked for some of the time this running, so im trying to avoid doing this. ok now for queries, row counts, performance and explains. query - SELECT stimestamp,shostid,smsg FROM syslog_table WHERE (stimestamp >= '2001-12-13 00:00'::timestamp) AND (shostid IN (23,3)) ORDER BY stimestamp DESC LIMIT 1000 matching messages - 7212 query time(s) - 39 explain - Limit (cost=0.00..34032.89 rows=1000 width=24) -> Index Scan Backward using syslog_table_stimestamp_index on syslog_table (cost=0.00..111846.03 rows=3286 width=24) query - SELECT stimestamp,shostid,smsg FROM syslog_table WHERE (stimestamp >= '2001-12-13 00:00'::timestamp) AND (shostid IN (23,3)) AND (sciscomsgid IN (41,32,70)) ORDER BY stimestamp DESC LIMIT 1000 matching messages - 5 query time(s) - 79 explain - Limit (cost=75828.24..75828.24 rows=437 width=24) -> Sort (cost=75828.24..75828.24 rows=438 width=24) -> Seq Scan on syslog_table (cost=0.00..75809.05 rows=438 width=24) query - SELECT stimestamp,shostid,smsg FROM syslog_table WHERE (stimestamp >= '2001-12-06 00:00'::timestamp) AND (shostid IN (23,3)) AND (sciscomsgid IN (41,32,70)) ORDER BY stimestamp DESC LIMIT 1000 matching messages - 9 query time(s) - 73 explain - Limit (cost=0.00..66287.05 rows=1000 width=24) -> Index Scan Backward using syslog_table_stimestamp_index on syslog_table (cost=0.00..421735.44 rows=6362 width=24) query - SELECT stimestamp,shostid,smsg FROM syslog_table WHERE (stimestamp >= '2001-11-13 00:00'::timestamp) AND (shostid IN (23,3)) AND (sciscomsgid IN (41,32,70)) ORDER BY stimestamp DESC LIMIT 1000 matching messages - 19 query time(s) - 224 explain - Limit (cost=0.00..34831.93 rows=1000 width=24) -> Index Scan Backward using syslog_table_stimestamp_index on syslog_table (cost=0.00..581118.60 rows=16684 width=24) query - SELECT stimestamp,shostid,smsg FROM syslog_table WHERE (shostid IN (23,3)) AND (sciscomsgid IN (41,32,70)) ORDER BY stimestamp DESC LIMIT 1000 total messages/matching messages - /19 query time(s) - 225 explain - Limit (cost=0.00..34559.46 rows=1000 width=24) -> Index Scan Backward using syslog_table_stimestamp_index on syslog_table (cost=0.00..577149.86 rows=16700 width=24) i dont really know enough to be able to interpret the explain output but clearly only the syslog_table_stimestamp_index index is being used. as you can see the query times are terrible and approach browser timeouts. please, does anyone have any ideas on how to speed this up? thanks in advance for any help i get, colm ennis
pgsql-general by date: