Surprising no use of indexes - low performance - Mailing list pgsql-performance
From | Nicolas Charles |
---|---|
Subject | Surprising no use of indexes - low performance |
Date | |
Msg-id | 511D0430.1090603@normation.com Whole thread Raw |
Responses |
Re: Surprising no use of indexes - low performance
Re: Surprising no use of indexes - low performance Re: Surprising no use of indexes - low performance |
List | pgsql-performance |
Hello, I've been struggling to understand what's happening on my databases/query for several days, and I'm turning to higher mind for a logical answer. I'm dealing with a fairly large database, containing logs informations, that I crunch to get data out of it, with several indexes on them that I hoped were logical \d ruddersysevents Table « public.ruddersysevents » Colonne | Type | Modificateurs --------------------+--------------------------+-------------------------------------------------- id | integer | non NULL Par défaut, nextval('serial'::regclass) executiondate | timestamp with time zone | non NULL nodeid | text | non NULL directiveid | text | non NULL ruleid | text | non NULL serial | integer | non NULL component | text | non NULL keyvalue | text | executiontimestamp | timestamp with time zone | non NULL eventtype | character varying(64) | policy | text | msg | text | Index : "ruddersysevents_pkey" PRIMARY KEY, btree (id) "component_idx" btree (component) "configurationruleid_idx" btree (ruleid) "executiontimestamp_idx" btree (executiontimestamp) "keyvalue_idx" btree (keyvalue) "nodeid_idx" btree (nodeid) Contraintes de vérification : "ruddersysevents_component_check" CHECK (component <> ''::text) "ruddersysevents_configurationruleid_check" CHECK (ruleid <> ''::text) "ruddersysevents_nodeid_check" CHECK (nodeid <> ''::text) "ruddersysevents_policyinstanceid_check" CHECK (directiveid <> ''::text) It contains 11018592 entries, with the followinf patterns : 108492 distinct executiontimestamp 14 distinct nodeid 59 distinct directiveid 26 distinct ruleid 35 distinct serial Related table/index size are relation | size ----------------------------------------+--------- public.ruddersysevents | 3190 MB public.nodeid_idx | 614 MB public.configurationruleid_idx | 592 MB public.ruddersysevents_pkey | 236 MB public.executiontimestamp_idx | 236 MB I'm crunching the data by looking for each nodeid/ruleid/directiveid/serial with an executiontimestamp in an interval: explain analyze select executiondate, nodeid, ruleid, directiveid, serial, component, keyValue, executionTimeStamp, eventtype, policy, msg from RudderSysEvents where 1=1 and nodeId = '31264061-5ecb-4891-9aa4-83824178f43d' and ruleId = '61713ff1-aa6f-4c86-b3cb-7012bee707dd' and serial = 10 and executiontimestamp between to_timestamp('2012-11-22 16:00:16.005', 'YYYY-MM-DD HH24:MI:SS.MS') and to_timestamp('2013-01-25 18:53:52.467', 'YYYY-MM-DD HH24:MI:SS.MS') ORDER BY executionTimeStamp asc; Sort (cost=293125.41..293135.03 rows=3848 width=252) (actual time=28628.922..28647.952 rows=62403 loops=1) Sort Key: executiontimestamp Sort Method: external merge Disk: 17480kB -> Bitmap Heap Scan on ruddersysevents (cost=74359.66..292896.27 rows=3848 width=252) (actual time=1243.150..28338.927 rows=62403 loops=1) Recheck Cond: ((nodeid = '31264061-5ecb-4891-9aa4-83824178f43d'::text) AND (ruleid = '61713ff1-aa6f-4c86-b3cb-7012bee707dd'::text)) Filter: ((serial = 10) AND (executiontimestamp >= to_timestamp('2012-11-22 16:00:16.005'::text, 'YYYY-MM-DD HH24:MI:SS.MS'::text)) AND (executiontimestamp <= to_timestamp('2013-01-25 18:53:52.467'::text, 'YYYY-MM-DD HH24:MI:SS.MS'::text))) -> BitmapAnd (cost=74359.66..74359.66 rows=90079 width=0) (actual time=1228.610..1228.610 rows=0 loops=1) -> Bitmap Index Scan on nodeid_idx (cost=0.00..25795.17 rows=716237 width=0) (actual time=421.365..421.365 rows=690503 loops=1) Index Cond: (nodeid = '31264061-5ecb-4891-9aa4-83824178f43d'::text) -> Bitmap Index Scan on configurationruleid_idx (cost=0.00..48562.32 rows=1386538 width=0) (actual time=794.490..794.490 rows=1381391 loops=1) Index Cond: (ruleid = '61713ff1-aa6f-4c86-b3cb-7012bee707dd'::text) Total runtime: 28657.352 ms I'm surprised that the executiontimestamp index is not used, since it seems to be where most of the query time is spent. For all my tests, I removed all the incoming logs, so that this table has only selects and no writes I'm using Postgres 8.4, on a quite smallish VM, with some process runnings, with the following non default configuration shared_buffers = 112MB work_mem = 8MB maintenance_work_mem = 48MB max_stack_depth = 3MB wal_buffers = 1MB effective_cache_size = 128MB checkpoint_segments = 6 Increasing the shared_buffers to 384, 1GB or 1500MB didn't improve the performances (less than 10%). I would have expected it to improve, since the indexes would all fit in RAM Every explain analyze made after modification of configuration and indexes where done after a complete batch of crunch of logs by our apps, to be sure the stats were corrects I tested further with the indexes, and reached these results : 1/ dropping the unique index "configurationruleid_idx" btree (ruleid), "executiontimestamp_idx" btree (executiontimestamp), "keyvalue_idx" btree (keyvalue), "nodeid_idx" btree (nodeid) to replace them by a unique index did lower the perfs : create index composite_idx on ruddersysevents (executiontimestamp, ruleid, serial, nodeid); Index Scan using composite_idx on ruddersysevents (cost=0.01..497350.22 rows=3729 width=252) (actual time=7.989..83717.901 rows=62403 loops=1) Index Cond: ((executiontimestamp >= to_timestamp('2012-11-22 16:00:16.005'::text, 'YYYY-MM-DD HH24:MI:SS.MS'::text)) AND (executiontimestamp <= to_timestamp('2013-01-25 18:53:52.467'::text, 'YYYY-MM-DD HH24:MI:SS.MS'::text)) AND (ruleid = '61713ff1-aa6f-4c86-b3cb-7012bee707dd'::text) AND (serial = 10) AND (nodeid = '31264061-5ecb-4891-9aa4-83824178f43d'::text)) 2/ Removing nodeid from the index did lower again the perf create index composite2_idx on ruddersysevents (executiontimestamp, ruleid, serial); Index Scan using composite2_idx on ruddersysevents (cost=0.01..449948.98 rows=3729 width=252) (actual time=23.507..84888.349 rows=62403 loops=1) Index Cond: ((executiontimestamp >= to_timestamp('2012-11-22 16:00:16.005'::text, 'YYYY-MM-DD HH24:MI:SS.MS'::text)) AND (executiontimestamp <= to_timestamp('2013-01-25 18:53:52.467'::text, 'YYYY-MM-DD HH24:MI:SS.MS'::text)) AND (ruleid = '61713ff1-aa6f-4c86-b3cb-7012bee707dd'::text) AND (serial = 10)) Filter: (nodeid = '31264061-5ecb-4891-9aa4-83824178f43d'::text) 3/ Removing executiontimestamp from the composite index makes the query performs better at the begining of its uses (around 17 secondes), but over time it degrades (I'm logging query longer than 20 secondes, and there are very rare in the first half of the batch, and getting more and more common at the end) to what is below create index composite3_idx on ruddersysevents (ruleid, serial, nodeid); Sort (cost=24683.44..24693.07 rows=3849 width=252) (actual time=60454.558..60474.013 rows=62403 loops=1) Sort Key: executiontimestamp Sort Method: external merge Disk: 17480kB -> Bitmap Heap Scan on ruddersysevents (cost=450.12..24454.23 rows=3849 width=252) (actual time=146.065..60249.143 rows=62403 loops=1) Recheck Cond: ((ruleid = '61713ff1-aa6f-4c86-b3cb-7012bee707dd'::text) AND (serial = 10) AND (nodeid = '31264061-5ecb-4891-9aa4-83824178f43d'::text)) Filter: ((executiontimestamp >= to_timestamp('2012-11-22 16:00:16.005'::text, 'YYYY-MM-DD HH24:MI:SS.MS'::text)) AND (executiontimestamp <= to_timestamp('2013-01-25 18:53:52.467'::text, 'YYYY-MM-DD HH24:MI:SS.MS'::text))) -> Bitmap Index Scan on composite3_idx (cost=0.00..449.15 rows=6635 width=0) (actual time=129.102..129.102 rows=62403 loops=1) Index Cond: ((ruleid = '61713ff1-aa6f-4c86-b3cb-7012bee707dd'::text) AND (serial = 10) AND (nodeid = '31264061-5ecb-4891-9aa4-83824178f43d'::text)) Total runtime: 60484.022 ms And with all these tests, the difference of the whole database processing with each index time is within 10% error margin (2h10 to 2h25), even if most of the time is spent on the SQL query. So my question is : "Why *not* indexing the column which is not used makes the query slower over time, while not slowing the application?" If you have some clues, I'm really interested. Best regards Nicolas
pgsql-performance by date: