Re: Incorrect index being used - Mailing list pgsql-general
From | Jesse Long |
---|---|
Subject | Re: Incorrect index being used |
Date | |
Msg-id | 525536AF.6000909@unknown.za.net Whole thread Raw |
In response to | Re: Incorrect index being used (Albe Laurenz <laurenz.albe@wien.gv.at>) |
Responses |
Re: Incorrect index being used
Re: Incorrect index being used |
List | pgsql-general |
On 09/10/2013 12:10, Albe Laurenz wrote: > Jesse Long wrote: >> I have the following query, run immediately after executing VACUUM in >> the database. There is only one connection to the database. > You should run ANALYZE, not VACUUM. > >> The query runs for much longer than I expect it to run for, and I think >> this is due to it using the incorrect subplan. As you can see, subplans >> 1 and 3 make use of and index, but these subplans are not used. >> Subplans and 4 are seqscan, and they are used. >> >> How can I get PostgreSQL to use subplan 1 and 3? > They are only possible if an "Index Only Scan" is possible, which > can only be used if the respective table entries are visible for > all transactions. > >> testdb=> explain analyse SELECT * FROM ARCHIVE_DOCUMENT AS r0 WHERE r0.NODE_ID = 29 AND >> r0.ARCHIVE_DATE >= '2013-07-08 18:28:00' AND (EXISTS (SELECT r1.* FROM ARCHIVE_DOCUMENT_INDEX AS r1 >> WHERE r1.ARCHIVE_ID = r0.ID AND r1.NODE_ID = r0.NODE_ID AND r1.VALUE = 'BSH70002152' ) OR EXISTS ( >> SELECT r2.* FROM ARCHIVE_DOCUMENT_INDEX AS r2 WHERE r2.ARCHIVE_ID = r0.ID AND r2.NODE_ID = r0.NODE_ID >> AND r2.VALUE = 'TC212592' ) ) ORDER BY r0.ARCHIVE_DATE DESC LIMIT 10; > [...] > >> Filter: ((alternatives: SubPlan 1 or hashed SubPlan 2) OR (alternatives: SubPlan 3 or hashed >> SubPlan 4)) >> >> Rows Removed by Filter: 710851 >> >> SubPlan 1 >> >> -> Index Only Scan using archive_document_index_x_archive_id_node_id_value on >> archive_document_index r1 (cost=0.57..4.59 rows=1 width=0) (never executed) >> >> Index Cond: ((archive_id = r0.id) AND (node_id = r0.node_id) AND (value = >> 'BSH70002152'::text)) >> >> Heap Fetches: 0 >> >> SubPlan 2 >> >> -> Seq Scan on archive_document_index r1_1 (cost=0.00..1958104.00 rows=1520 width=16) >> (actual time=44418.383..44558.293 rows=4 loops=1) >> >> Filter: ((value)::text = 'BSH70002152'::text) >> >> Rows Removed by Filter: 95009919 >> >> SubPlan 3 >> >> -> Index Only Scan using archive_document_index_x_archive_id_node_id_value on >> archive_document_index r2 (cost=0.57..4.59 rows=1 width=0) (never executed) >> >> Index Cond: ((archive_id = r0.id) AND (node_id = r0.node_id) AND (value = >> 'TC212592'::text)) >> >> Heap Fetches: 0 >> >> SubPlan 4 >> >> -> Seq Scan on archive_document_index r2_1 (cost=0.00..1958104.00 rows=1520 width=16) >> (actual time=41659.464..41663.342 rows=1 loops=1) >> >> Filter: ((value)::text = 'TC212592'::text) >> >> Rows Removed by Filter: 95009922 > The estimates are quite off. > Does "ANALYZE archive_document", possibly after increasing > default_statistics_target, make a difference? > > Yours, > Laurenz Albe > Hi Laurenz, Thank you for the feedback. There is no problem with row visibility, there is only one connection to the database - the connection I am using to do these selects. Thanks you for the advise regarding ANALYZE. I ran ANALYZE on both tables concerned, but not much changed: QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ Limit (cost=0.56..151.85 rows=10 width=122) (actual time=40841.984..85668.213 rows=2 loops=1) -> Index Scan Backward using idx_archive_document_x_node_id_archive_date on archive_document r0 (cost=0.56..7627640.20rows=504186 width=122) (actual time=40841.98 Index Cond: ((node_id = 29) AND (archive_date >= '2013-07-08 18:28:00'::timestamp without time zone)) Filter: ((alternatives: SubPlan 1 or hashed SubPlan 2) OR (alternatives: SubPlan 3 or hashed SubPlan 4)) Rows Removed by Filter: 710851 SubPlan 1 -> Index Only Scan using archive_document_index_x_archive_id_node_id_value on archive_document_index r1 (cost=0.57..4.59rows=1 width=0) (never executed) Index Cond: ((archive_id = r0.id) AND (node_id = r0.node_id) AND (value = 'BSH70002152'::text)) Heap Fetches: 0 SubPlan 2 -> Seq Scan on archive_document_index r1_1 (cost=0.00..1958101.80 rows=1568 width=16) (actual time=36633.365..40841.909rows=4 loops=1) Filter: ((value)::text = 'BSH70002152'::text) Rows Removed by Filter: 95009919 SubPlan 3 -> Index Only Scan using archive_document_index_x_archive_id_node_id_value on archive_document_index r2 (cost=0.57..4.59rows=1 width=0) (never executed) Index Cond: ((archive_id = r0.id) AND (node_id = r0.node_id) AND (value = 'TC212592'::text)) Heap Fetches: 0 SubPlan 4 -> Seq Scan on archive_document_index r2_1 (cost=0.00..1958101.80 rows=1568 width=16) (actual time=40241.599..44462.485rows=1 loops=1) Filter: ((value)::text = 'TC212592'::text) Rows Removed by Filter: 95009922 Total runtime: 85676.734 ms (22 rows) Thanks, Jesse
pgsql-general by date: