Re: Incorrect index being used - Mailing list pgsql-general
From | Jesse Long |
---|---|
Subject | Re: Incorrect index being used |
Date | |
Msg-id | 525537AD.50106@iso-8859-1.za.net Whole thread Raw |
In response to | Re: Incorrect index being used (Jesse Long <jpl@iso-8859-1.za.net>) |
List | pgsql-general |
On 09/10/2013 12:57, Jesse Long wrote: > On 09/10/2013 12:10, Albe Laurenz wrote: >> Jesse Long wrote: >>> I have the following query, run immediately after executing VACUUM in >>> the database. There is only one connection to the database. >> You should run ANALYZE, not VACUUM. >> >>> The query runs for much longer than I expect it to run for, and I think >>> this is due to it using the incorrect subplan. As you can see, subplans >>> 1 and 3 make use of and index, but these subplans are not used. >>> Subplans and 4 are seqscan, and they are used. >>> >>> How can I get PostgreSQL to use subplan 1 and 3? >> They are only possible if an "Index Only Scan" is possible, which >> can only be used if the respective table entries are visible for >> all transactions. >> >>> testdb=> explain analyse SELECT * FROM ARCHIVE_DOCUMENT AS r0 WHERE >>> r0.NODE_ID = 29 AND >>> r0.ARCHIVE_DATE >= '2013-07-08 18:28:00' AND (EXISTS (SELECT r1.* >>> FROM ARCHIVE_DOCUMENT_INDEX AS r1 >>> WHERE r1.ARCHIVE_ID = r0.ID AND r1.NODE_ID = r0.NODE_ID AND r1.VALUE >>> = 'BSH70002152' ) OR EXISTS ( >>> SELECT r2.* FROM ARCHIVE_DOCUMENT_INDEX AS r2 WHERE r2.ARCHIVE_ID = >>> r0.ID AND r2.NODE_ID = r0.NODE_ID >>> AND r2.VALUE = 'TC212592' ) ) ORDER BY r0.ARCHIVE_DATE DESC LIMIT 10; >> [...] >> >>> Filter: ((alternatives: SubPlan 1 or hashed SubPlan 2) OR >>> (alternatives: SubPlan 3 or hashed >>> SubPlan 4)) >>> >>> Rows Removed by Filter: 710851 >>> >>> SubPlan 1 >>> >>> -> Index Only Scan using >>> archive_document_index_x_archive_id_node_id_value on >>> archive_document_index r1 (cost=0.57..4.59 rows=1 width=0) (never >>> executed) >>> >>> Index Cond: ((archive_id = r0.id) AND (node_id = >>> r0.node_id) AND (value = >>> 'BSH70002152'::text)) >>> >>> Heap Fetches: 0 >>> >>> SubPlan 2 >>> >>> -> Seq Scan on archive_document_index r1_1 >>> (cost=0.00..1958104.00 rows=1520 width=16) >>> (actual time=44418.383..44558.293 rows=4 loops=1) >>> >>> Filter: ((value)::text = 'BSH70002152'::text) >>> >>> Rows Removed by Filter: 95009919 >>> >>> SubPlan 3 >>> >>> -> Index Only Scan using >>> archive_document_index_x_archive_id_node_id_value on >>> archive_document_index r2 (cost=0.57..4.59 rows=1 width=0) (never >>> executed) >>> >>> Index Cond: ((archive_id = r0.id) AND (node_id = >>> r0.node_id) AND (value = >>> 'TC212592'::text)) >>> >>> Heap Fetches: 0 >>> >>> SubPlan 4 >>> >>> -> Seq Scan on archive_document_index r2_1 >>> (cost=0.00..1958104.00 rows=1520 width=16) >>> (actual time=41659.464..41663.342 rows=1 loops=1) >>> >>> Filter: ((value)::text = 'TC212592'::text) >>> >>> Rows Removed by Filter: 95009922 >> The estimates are quite off. >> Does "ANALYZE archive_document", possibly after increasing >> default_statistics_target, make a difference? >> >> Yours, >> Laurenz Albe >> > > Hi Laurenz, > > Thank you for the feedback. > > There is no problem with row visibility, there is only one connection > to the database - the connection I am using to do these selects. > > Thanks you for the advise regarding ANALYZE. I ran ANALYZE on both > tables concerned, but not much changed: > > QUERY PLAN > > ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ > > > Limit (cost=0.56..151.85 rows=10 width=122) (actual > time=40841.984..85668.213 rows=2 loops=1) > > -> Index Scan Backward using > idx_archive_document_x_node_id_archive_date on archive_document r0 > (cost=0.56..7627640.20 rows=504186 width=122) (actual time=40841.98 > > Index Cond: ((node_id = 29) AND (archive_date >= '2013-07-08 > 18:28:00'::timestamp without time zone)) > > Filter: ((alternatives: SubPlan 1 or hashed SubPlan 2) OR > (alternatives: SubPlan 3 or hashed SubPlan 4)) > > Rows Removed by Filter: 710851 > > SubPlan 1 > > -> Index Only Scan using > archive_document_index_x_archive_id_node_id_value on > archive_document_index r1 (cost=0.57..4.59 rows=1 width=0) (never > executed) > > Index Cond: ((archive_id = r0.id) AND (node_id = > r0.node_id) AND (value = 'BSH70002152'::text)) > > Heap Fetches: 0 > > SubPlan 2 > > -> Seq Scan on archive_document_index r1_1 > (cost=0.00..1958101.80 rows=1568 width=16) (actual > time=36633.365..40841.909 rows=4 loops=1) > > Filter: ((value)::text = 'BSH70002152'::text) > > Rows Removed by Filter: 95009919 > > SubPlan 3 > > -> Index Only Scan using > archive_document_index_x_archive_id_node_id_value on > archive_document_index r2 (cost=0.57..4.59 rows=1 width=0) (never > executed) > > Index Cond: ((archive_id = r0.id) AND (node_id = > r0.node_id) AND (value = 'TC212592'::text)) > > Heap Fetches: 0 > > SubPlan 4 > > -> Seq Scan on archive_document_index r2_1 > (cost=0.00..1958101.80 rows=1568 width=16) (actual > time=40241.599..44462.485 rows=1 loops=1) > > Filter: ((value)::text = 'TC212592'::text) > > Rows Removed by Filter: 95009922 > > Total runtime: 85676.734 ms > > (22 rows) Sorry, I neglected to mention that this is on PostgreSQL 9.3.0. Thanks, Jesse
pgsql-general by date: