We have a slow performing query that we are trying to improve, and it appears to be performing a sequential scan at a point where it should be utilizing an index. Can anyone tell me why postgres is opting to do it this way?
AND collection_data.collection_context_key = a2.context_key
AND a2.ancestor_key = ?
The key relationships should all using indexed columns, but the query plan that postgres comes up with ends up performing a sequential scan on the collection_data table (in this case about 602k rows) where we would have expected it to utilize the index:
"collection_data_collection_context_key_fkey" FOREIGN KEY (collection_context_key) REFERENCES contexts(context_key) ON DELETE CASCADE
"collection_data_context_key_fkey" FOREIGN KEY (context_key) REFERENCES contexts(context_key) ON DELETE CASCADE
Can anyone suggest a way that we can get postgres to use the collection_data_context_key_index properly? I thought that it might be related to the fact that collection_data_context_key_index is a CLUSTERED index, but we did some basic experimentation that seems to indicate otherwise, i.e. the bad plan persists despite re-clustering the index.
We are using PostgreSQL 9.2.5 on x86_64-unknown-linux-gnu, compiled by gcc (Debian 4.4.5-8) 4.4.5, 64-bit
Interestingly, on an instance running PostgreSQL 9.2.4 on x86_64-unknown-linux-gnu, compiled by gcc (Debian 4.4.5-8) 4.4.5, 64-bit where I copied the 2 tables over to a temporary database, the plan comes out differently:
-> Index Only Scan using virtual_ancestors_pkey on virtual_ancestors a2 (cost=0.00..1220.85 rows=396 width=4) (actual time=0.025..2.732 rows=1954 loops=1)
Index Cond: (ancestor_key = 1072173)
Heap Fetches: 1954
Buffers: shared hit=1397
######## Note the index scan here - this is what it SHOULD be doing ##############
-> Index Scan using collection_data_context_key_index on collection_data (cost=0.00..79.24 rows=56 width=8) (actual time=0.004..0.005 rows=1 loops=1954)
Index Cond: (collection_context_key = a2.context_key)
Buffers: shared hit=6132
-> Index Only Scan using virtual_ancestors_pkey on virtual_ancestors a1 (cost=0.00..35.40 rows=641 width=8) (actual time=0.007..0.015 rows=6 loops=2084)
Index Cond: (ancestor_key = collection_data.context_key)
Heap Fetches: 13007
Buffers: shared hit=14929
Total runtime: 76.431 ms
Why can't I get the Postgres 9.2.5 instance to use the optimal plan?
Thanks in advance!
-- - Stefan Amshey
The first plan expects to process 600000 rows in the sequential scan, and the second plan expects to process only one, so it looks like the statistics in the first database are out of date, did you run vacuum analyse?