Thread: planner used functional index in 7.3.6, now does a seq scan in 7.4.7 after upgrade
planner used functional index in 7.3.6, now does a seq scan in 7.4.7 after upgrade
From
Chris Tennant
Date:
Greetings, I've just upgraded from 7.3.6 to 7.4.7 (running on Debian Linux). I dumped and reloaded my db as part of the upgrade. Everything is working great, except that one query that executed in < 1 ms on 7.3.6 now takes > 500 ms on 7.4.7. When I look at the query plan,the planner is no longer taking advantage of a functional index (the index is correctly defined as immutable), and is doing a sequence scan instead. I'm stumped -- I can't see any reason why this would be broken. Out of desperation, I tried dropping and recreating the index, and dropping and recreating the function. I've also run "ANALYZE" on the tables, in addition to the regular analysis and vacuums that I run frequently as part of regular maintenance. No luck. Has anyone else run into something like this? Any hints would be much appreciated. FWIW, I've provided all the gory details below. Thank you!! Chris ---------------- This is the query: SELECT data_stored.id as data_stored_id, patient_data.id as patient_data_id, stereo_image_attributes.id as stereo_id, stereo_image_attributes.x_offsetas stereo_x_offset, stereo_image_attributes.y_offsetas stereo_y_offset, stereo_image_attributes.stereo_swapas stereo_swap, stereo_image_attributes.analysis_dateas stereo_analysis_date, stereo_image_attributes.analysis_detailas stereo_analysis_detail, stereo_image_attributes.analysis_storage_idas stereo_analysis_storage_id FROM opt_stereo_pair_image_attributes stereo_image_attributes, opt_patient_data patient_data, opt_patient_data_entries patient_data_entry, opt_patient_data_stored data_stored WHERE patient_data.version_id = ? AND patient_data_entry.patient_data_id = patient_data.idAND patient_data_entry.id = data_stored.patient_data_entry_id AND data_stored.id = stereo_id ( stereo_image_attributes.left_patient_data_stored_id, stereo_image_attributes.right_patient_data_id stereo_image_attributes.left_patient_data_id ) I then run EXPLAIN ANALYZE on 7.3.6 and 7.4.7, with the same query parameter. here's what EXPLAIN ANALYZE yields on 7.3.6. note that it *is* using the functional index "stereo_pair_image_attributes_stereo_id" QUERY PLAN -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------Nested Loop (cost=0.00..5331.34 rows=2 width=64) (actual time=0.13..0.52 rows=2 loops=1) -> Nested Loop (cost=0.00..153.23 rows=4 width=20) (actual time=0.10..0.45 rows=7 loops=1) -> Nested Loop (cost=0.00..8.52 rows=16 width=12) (actual time=0.05..0.18 rows=37 loops=1) -> Index Scan using patient_data_version_id on opt_patient_data patient_data (cost=0.00..4.75 rows=1 width=4) (actual time=0.03..0.03 rows=1 loops=1) Index Cond: (version_id = 323268) -> Index Scan using opt_patient_data_id_keyon opt_patient_data_entries patient_data_entry (cost=0.00..3.40 rows=30 width=8) (actual time=0.01..0.09 rows=37 loops=1) Index Cond: (patient_data_entry.patient_data_id = "outer".id) -> Index Scan using opt_patient_data_stored_entry_count on opt_patient_data_stored data_stored (cost=0.00..8.85 rows=2 width=8) (actual time=0.01..0.01 rows=0 loops=37) Index Cond: ("outer".id = data_stored.patient_data_entry_id) -> IndexScan using stereo_pair_image_attributes_stereo_id on opt_stereo_pair_image_attributes stereo_image_attributes (cost=0.00..1454.62 rows=451 width=44) (actual time=0.01..0.01 rows=0 loops=7) Index Cond: ("outer".id = stereo_id(stereo_image_attributes.left_patient_data_stored_id, stereo_image_attributes.right_patient_data_id, stereo_image_attributes.left_patient_data_id))Total runtime: 0.62 msec (12 rows) And then on 7.4.7. note that it is *not* using the functional index, but is instead doing a seq scan on opt_stereo_pair_image_attributes. QUERY PLAN -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------Hash Join (cost=173.16..2628.80 rows=2 width=40) (actual time=245.433..581.288 rows=2 loops=1) Hash Cond: (stereo_id("outer".left_patient_data_stored_id, "outer".right_patient_data_id, "outer".left_patient_data_id) = "inner".id) -> Seq Scan on opt_stereo_pair_image_attributes stereo_image_attributes (cost=0.00..1793.21 rows=88321 width=44) (actual time=0.004..162.218 rows=88321 loops=1) -> Hash (cost=173.15..173.15 rows=4 width=8) (actual time=0.709..0.709 rows=0 loops=1) -> Nested Loop (cost=0.00..173.15 rows=4 width=8) (actual time=0.114..0.691 rows=7 loops=1) -> Nested Loop (cost=0.00..8.44 rows=17 width=8) (actual time=0.042..0.280 rows=37 loops=1) -> Index Scan using patient_data_version_id on opt_patient_data patient_data (cost=0.00..4.76 rows=1 width=4) (actual time=0.019..0.021 rows=1 loops=1) Index Cond: (version_id = 323268) -> IndexScan using opt_patient_data_id_key on opt_patient_data_entries patient_data_entry (cost=0.00..3.35 rows=26 width=8) (actual time=0.013..0.110 rows=37 loops=1) Index Cond: (patient_data_entry.patient_data_id = "outer".id) -> Index Scan using opt_patient_data_stored_entry_count on opt_patient_data_stored data_stored (cost=0.00..9.66 rows=2 width=8) (actual time=0.006..0.007 rows=0 loops=37) Index Cond: ("outer".id = data_stored.patient_data_entry_id)Total runtime: 581.390 ms (13 rows) Here are the schema details for the table with the functional index: Table "public.opt_stereo_pair_image_attributes" Column | Type | Modifiers ------------------------------+-----------------------------+--------------------id | integer | not nullleft_patient_data_stored_id | integer | not nullright_patient_data_stored_id| integer | not nullx_offset | integer | default 0y_offset | integer | default 0analysis_date |timestamp without time zone |analysis_detail | integer |analysis_storage_id |integer |stereo_swap | integer | default 0left_patient_data_id | integer | not nullright_patient_data_id | integer | not nullanalysis_type_id | integer | not null default 0 Indexes: "opt_stereo_pair_image_attributes_pkey" primary key, btree (id) "opt_stereo_pair_image_attributes_stereo_key"unique, btree (left_patient_data_stored_id, right_patient_data_stored_id) "stereo_pair_image_attributes_stereo_id" btree (stereo_id(right_patient_data_id, left_patient_data_id)) Foreign-key constraints: "analysis_type_id" FOREIGN KEY (analysis_type_id) REFERENCES opt_analysis_types(id) "left_patient_data_id" FOREIGN KEY (left_patient_data_id) REFERENCES opt_patient_data(id) "right_patient_data_id" FOREIGN KEY (right_patient_data_id) REFERENCES opt_patient_data(id) "opt_stereo_pair_image_attributes_right_patient_data_stored" FOREIGN KEY (right_patient_data_stored_id) REFERENCES opt_patient_data_stored(id) "opt_stereo_pair_image_attributes_left_patient_data_stored"FOREIGN KEY (left_patient_data_stored_id) REFERENCES opt_patient_data_stored(id) here's the definition of the function (as immutable): CREATE OR REPLACE FUNCTION stereo_id (INTEGER, INTEGER, INTEGER) RETURNS INTEGER AS 'BEGIN RETURN CASE WHEN $2 = $3 THEN $1 ELSE -1 END; END;' LANGUAGE 'plpgsql' IMMUTABLE; and here's the original definition of the functional index: CREATE INDEX stereo_pair_image_attributes_stereo_id ON opt_stereo_pair_image_attributes (stereo_id(right_patient_data_id,left_patient_data_id));
Re: planner used functional index in 7.3.6, now does a seq scan in 7.4.7 after upgrade
From
Tom Lane
Date:
Chris Tennant <postgresql-ctennant@elirious.com> writes: > here's the definition of the function (as immutable): > CREATE OR REPLACE FUNCTION stereo_id (INTEGER, INTEGER, INTEGER) RETURNS > INTEGER AS > 'BEGIN RETURN CASE WHEN $2 = $3 THEN $1 ELSE -1 END; END;' > LANGUAGE 'plpgsql' IMMUTABLE; > and here's the original definition of the functional index: > CREATE INDEX stereo_pair_image_attributes_stereo_id > ON opt_stereo_pair_image_attributes > (stereo_id(right_patient_data_id,left_patient_data_id)); Um, that index seems to be on some other function that may have the same name, but only takes two arguments? regards, tom lane
doh. I copied an earlier definition of the function into the email. Thanks for catching my (moronic) error, and my apologies for distracting everyone on the list. However, the underlying problem remains: even with the correct function definition, the query executes thousands of times slower on 7.4 than on 7.3 Here's the correct definition: CREATE OR REPLACE FUNCTION stereo_id (INTEGER, INTEGER, INTEGER) RETURNS INTEGER AS 'BEGIN RETURN CASE WHEN $2 = $3 THEN $1 ELSE -1 END; END;' LANGUAGE 'plpgsql' IMMUTABLE; DROP INDEX stereo_pair_image_attributes_stereo_id; CREATE INDEX stereo_pair_image_attributes_stereo_id ON opt_stereo_pair_image_attributes (stereo_id(left_patient_data_stored_id, right_patient_data_id,left_patient_data_id)); To double-check, I re-applied the correct function and the index on both the 7.3 and the 7.4 databases (I have them running on different machines). Same problem, the query executes on the 7.3 database in 0.13 ms, and on the 7.4 database in 571 ms. Tom Lane wrote: > Chris Tennant <postgresql-ctennant@elirious.com> writes: > >> here's the definition of the function (as immutable): >> > > >> CREATE OR REPLACE FUNCTION stereo_id (INTEGER, INTEGER, INTEGER) RETURNS >> INTEGER AS >> 'BEGIN RETURN CASE WHEN $2 = $3 THEN $1 ELSE -1 END; END;' >> LANGUAGE 'plpgsql' IMMUTABLE; >> > > >> and here's the original definition of the functional index: >> > > >> CREATE INDEX stereo_pair_image_attributes_stereo_id >> ON opt_stereo_pair_image_attributes >> (stereo_id(right_patient_data_id,left_patient_data_id)); >> > > Um, that index seems to be on some other function that may have the same > name, but only takes two arguments? > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 1: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly > >
Chris Tennant <postgresql-ctennant@elirious.com> writes: > ... the underlying problem remains: even > with the correct function definition, the query executes thousands of > times slower on 7.4 than on 7.3 Well, note that 7.4 thinks it's finding a *better* plan --- the estimated cost is about half what it was in 7.3. (I believe the reason is that 7.4 can handle hash and merge joins on equalities of two expressions, where 7.3 and before only considered them for trivial "Var = Var" clauses.) The fact that the plan is in reality worse means that there's an estimation error involved; and it's easy to see in the 7.3 output: > -> Index Scan using stereo_pair_image_attributes_stereo_id on opt_stereo_pair_image_attributes stereo_image_attributes(cost=0.00..1454.62 rows=451 width=44) (actual time=0.01..0.01 rows=0 loops=7) > Index Cond: ("outer".id = stereo_id(stereo_image_attributes.left_patient_data_stored_id, stereo_image_attributes.right_patient_data_id,stereo_image_attributes.left_patient_data_id)) 451 estimated vs less-than-1 actual is pretty bad. The real question I have for you is why you are "upgrading" to a three-year-old PG release? The 7.x releases have no chance of estimating this query well because they don't keep any statistics about the contents of functional indexes. 8.0 and up do, so they'd probably do a lot better with this. If I were you I'd be trying to migrate to 8.1.5, not anything older. regards, tom lane
Tom, Thank you so much for your help. Upgrading to 8.1.5 did the trick, the query now has a better plan, and executes quickly: QUERY PLAN -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------Nested Loop (cost=0.00..215.81 rows=2 width=40) (actual time=0.134..0.508 rows=4 loops=1) -> Nested Loop (cost=0.00..106.86 rows=3 width=8) (actual time=0.050..0.324 rows=21 loops=1) -> Nested Loop (cost=0.00..9.07 rows=16 width=8) (actual time=0.035..0.098 rows=11 loops=1) -> Index Scan using patient_data_version_id on opt_patient_data patient_data (cost=0.00..4.82 rows=1 width=4) (actual time=0.016..0.018 rows=1 loops=1) Index Cond: (version_id = 123) -> Index Scan using opt_patient_data_id_keyon opt_patient_data_entries patient_data_entry (cost=0.00..3.65 rows=48 width=8) (actual time=0.011..0.035 rows=11 loops=1) Index Cond: (patient_data_entry.patient_data_id = "outer".id) -> Index Scan using opt_patient_data_stored_entry_count on opt_patient_data_stored data_stored (cost=0.00..6.09 rows=2 width=8) (actual time=0.006..0.011 rows=2 loops=11) Index Cond: ("outer".id = data_stored.patient_data_entry_id) -> Index Scan using stereo_pair_image_attributes_stereo_id on opt_stereo_pair_image_attributes stereo_image_attributes (cost=0.00..36.08 rows=16 width=44) (actual time=0.005..0.005 rows=0 loops=21) Index Cond: ("outer".id = stereo_id(stereo_image_attributes.left_patient_data_stored_id, stereo_image_attributes.right_patient_data_id, stereo_image_attributes.left_patient_data_id))Total runtime: 0.595 ms (12 rows) I had "upgraded" to 7.4 because that is the default version for Debian sarge. I wanted to get away from building postgresql from source, as I had always done previously. But I'm now a fresh convert to building from source ;-) Thanks again for your help. - Chris Tom Lane wrote: > Chris Tennant <postgresql-ctennant@elirious.com> writes: > >> ... the underlying problem remains: even >> with the correct function definition, the query executes thousands of >> times slower on 7.4 than on 7.3 >> > > Well, note that 7.4 thinks it's finding a *better* plan --- the > estimated cost is about half what it was in 7.3. (I believe the reason > is that 7.4 can handle hash and merge joins on equalities of two > expressions, where 7.3 and before only considered them for trivial > "Var = Var" clauses.) The fact that the plan is in reality worse > means that there's an estimation error involved; and it's easy to > see in the 7.3 output: > > >> -> Index Scan using stereo_pair_image_attributes_stereo_id on opt_stereo_pair_image_attributes stereo_image_attributes(cost=0.00..1454.62 rows=451 width=44) (actual time=0.01..0.01 rows=0 loops=7) >> Index Cond: ("outer".id = stereo_id(stereo_image_attributes.left_patient_data_stored_id, stereo_image_attributes.right_patient_data_id,stereo_image_attributes.left_patient_data_id)) >> > > 451 estimated vs less-than-1 actual is pretty bad. The real question > I have for you is why you are "upgrading" to a three-year-old PG > release? The 7.x releases have no chance of estimating this query well > because they don't keep any statistics about the contents of functional > indexes. 8.0 and up do, so they'd probably do a lot better with this. > If I were you I'd be trying to migrate to 8.1.5, not anything older. > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 3: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faq > >
Le lundi 20 novembre 2006 00:20, Chris Tennant a écrit : > I had "upgraded" to 7.4 because that is the default version for Debian > sarge. I wanted to get away from building postgresql from source, as I > had always done previously. But I'm now a fresh convert to building > from source ;-) You could have used debian backports: deb http://people.debian.org/~mpitt/packages/sarge-backports/ / http://backports.org/debian/pool/main/p/postgresql-8.1/ -- Dimitri Fontaine http://www.dalibo.com/