Thread: planner used functional index in 7.3.6, now does a seq scan in 7.4.7 after upgrade

planner used functional index in 7.3.6, now does a seq scan in 7.4.7 after upgrade

From

Chris Tennant

Date:

19 November 2006, 02:41:30

Greetings,

I've just upgraded from 7.3.6 to 7.4.7 (running on Debian Linux).  I 
dumped and reloaded my db as part of the upgrade.  Everything is working 
great, except that one query that executed in < 1 ms on 7.3.6 now takes > 500 ms on 7.4.7.  When I look at the query
plan,the planner is no 
 
longer taking advantage of a functional index (the index is correctly 
defined as immutable), and is doing a sequence scan instead.  I'm 
stumped -- I can't see any reason why this would be broken.  Out of 
desperation, I tried dropping and recreating the index, and dropping and 
recreating the function.  I've also run "ANALYZE" on the tables, in 
addition to the regular analysis and vacuums that I run frequently as 
part of regular maintenance.  No luck.

Has anyone else run into something like this?  Any hints would be much 
appreciated.  FWIW, I've provided all the gory details below.

Thank you!!

Chris

----------------
This is the query:
         SELECT                 data_stored.id as data_stored_id,                 patient_data.id as patient_data_id,
             stereo_image_attributes.id as                       stereo_id,
stereo_image_attributes.x_offsetas                       stereo_x_offset,
stereo_image_attributes.y_offsetas                       stereo_y_offset,
stereo_image_attributes.stereo_swapas                       stereo_swap,
stereo_image_attributes.analysis_dateas                       stereo_analysis_date,
stereo_image_attributes.analysis_detailas                       stereo_analysis_detail,
stereo_image_attributes.analysis_storage_idas                       stereo_analysis_storage_id                   FROM
            opt_stereo_pair_image_attributes stereo_image_attributes,                opt_patient_data patient_data,
          opt_patient_data_entries patient_data_entry,                opt_patient_data_stored data_stored
         WHERE                patient_data.version_id = ? AND                patient_data_entry.patient_data_id =
patient_data.idAND                patient_data_entry.id = 
 
data_stored.patient_data_entry_id AND                data_stored.id = stereo_id (
stereo_image_attributes.left_patient_data_stored_id,                 stereo_image_attributes.right_patient_data_id
           stereo_image_attributes.left_patient_data_id  )
 


I then run EXPLAIN ANALYZE on 7.3.6 and 7.4.7, with the same query 
parameter.  here's what EXPLAIN ANALYZE yields on 7.3.6.  note that it 
*is* using the functional index "stereo_pair_image_attributes_stereo_id"
                                                                                              
QUERY PLAN                                                   

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------Nested
Loop (cost=0.00..5331.34 rows=2 width=64) (actual 
 
time=0.13..0.52 rows=2 loops=1)  ->  Nested Loop  (cost=0.00..153.23 rows=4 width=20) (actual 
time=0.10..0.45 rows=7 loops=1)        ->  Nested Loop  (cost=0.00..8.52 rows=16 width=12) (actual 
time=0.05..0.18 rows=37 loops=1)              ->  Index Scan using patient_data_version_id on 
opt_patient_data patient_data  (cost=0.00..4.75 rows=1 width=4) (actual 
time=0.03..0.03 rows=1 loops=1)                    Index Cond: (version_id = 323268)              ->  Index Scan using
opt_patient_data_id_keyon 
 
opt_patient_data_entries patient_data_entry  (cost=0.00..3.40 rows=30 
width=8) (actual time=0.01..0.09 rows=37 loops=1)                    Index Cond: (patient_data_entry.patient_data_id =

"outer".id)        ->  Index Scan using opt_patient_data_stored_entry_count on 
opt_patient_data_stored data_stored  (cost=0.00..8.85 rows=2 width=8) 
(actual time=0.01..0.01 rows=0 loops=37)              Index Cond: ("outer".id = data_stored.patient_data_entry_id)  ->
IndexScan using stereo_pair_image_attributes_stereo_id on 
 
opt_stereo_pair_image_attributes stereo_image_attributes  
(cost=0.00..1454.62 rows=451 width=44) (actual time=0.01..0.01 rows=0 
loops=7)        Index Cond: ("outer".id = 
stereo_id(stereo_image_attributes.left_patient_data_stored_id, 
stereo_image_attributes.right_patient_data_id, 
stereo_image_attributes.left_patient_data_id))Total runtime: 0.62 msec
(12 rows)


And then on 7.4.7.  note that it is *not* using the functional index, 
but is instead doing a seq scan on opt_stereo_pair_image_attributes.
                                                                                        
QUERY PLAN                             

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------Hash
Join (cost=173.16..2628.80 rows=2 width=40) (actual 
 
time=245.433..581.288 rows=2 loops=1)  Hash Cond: (stereo_id("outer".left_patient_data_stored_id, 
"outer".right_patient_data_id, "outer".left_patient_data_id) = "inner".id)  ->  Seq Scan on
opt_stereo_pair_image_attributes
 
stereo_image_attributes  (cost=0.00..1793.21 rows=88321 width=44) 
(actual time=0.004..162.218 rows=88321 loops=1)  ->  Hash  (cost=173.15..173.15 rows=4 width=8) (actual 
time=0.709..0.709 rows=0 loops=1)        ->  Nested Loop  (cost=0.00..173.15 rows=4 width=8) (actual 
time=0.114..0.691 rows=7 loops=1)              ->  Nested Loop  (cost=0.00..8.44 rows=17 width=8) 
(actual time=0.042..0.280 rows=37 loops=1)                    ->  Index Scan using patient_data_version_id on 
opt_patient_data patient_data  (cost=0.00..4.76 rows=1 width=4) (actual 
time=0.019..0.021 rows=1 loops=1)                          Index Cond: (version_id = 323268)                    ->
IndexScan using opt_patient_data_id_key on 
 
opt_patient_data_entries patient_data_entry  (cost=0.00..3.35 rows=26 
width=8) (actual time=0.013..0.110 rows=37 loops=1)                          Index Cond: 
(patient_data_entry.patient_data_id = "outer".id)              ->  Index Scan using opt_patient_data_stored_entry_count

on opt_patient_data_stored data_stored  (cost=0.00..9.66 rows=2 width=8) 
(actual time=0.006..0.007 rows=0 loops=37)                    Index Cond: ("outer".id = 
data_stored.patient_data_entry_id)Total runtime: 581.390 ms
(13 rows)


Here are the schema details for the table with the functional index:
               Table "public.opt_stereo_pair_image_attributes"           Column            |            Type
|     Modifiers
 
------------------------------+-----------------------------+--------------------id                           | integer
                   | not nullleft_patient_data_stored_id  | integer                     | not
nullright_patient_data_stored_id| integer                     | not nullx_offset                     | integer
          | default 0y_offset                     | integer                     | default 0analysis_date
|timestamp without time zone |analysis_detail              | integer                     |analysis_storage_id
|integer                     |stereo_swap                  | integer                     | default
0left_patient_data_id        | integer                     | not nullright_patient_data_id        | integer
       | not nullanalysis_type_id             | integer                     | not null 
 
default 0
Indexes:   "opt_stereo_pair_image_attributes_pkey" primary key, btree (id)
"opt_stereo_pair_image_attributes_stereo_key"unique, btree 
 
(left_patient_data_stored_id, right_patient_data_stored_id)   "stereo_pair_image_attributes_stereo_id" btree 
(stereo_id(right_patient_data_id, left_patient_data_id))
Foreign-key constraints:   "analysis_type_id" FOREIGN KEY (analysis_type_id) REFERENCES 
opt_analysis_types(id)   "left_patient_data_id" FOREIGN KEY (left_patient_data_id) REFERENCES 
opt_patient_data(id)   "right_patient_data_id" FOREIGN KEY (right_patient_data_id) 
REFERENCES opt_patient_data(id)   "opt_stereo_pair_image_attributes_right_patient_data_stored" FOREIGN 
KEY (right_patient_data_stored_id) REFERENCES opt_patient_data_stored(id)
"opt_stereo_pair_image_attributes_left_patient_data_stored"FOREIGN 
 
KEY (left_patient_data_stored_id) REFERENCES opt_patient_data_stored(id)

here's the definition of the function (as immutable):

CREATE OR REPLACE FUNCTION stereo_id (INTEGER, INTEGER, INTEGER) RETURNS 
INTEGER AS    'BEGIN  RETURN CASE WHEN $2 = $3 THEN $1 ELSE -1 END;    END;'  LANGUAGE 'plpgsql' IMMUTABLE;

and here's the original definition of the functional index:

CREATE INDEX stereo_pair_image_attributes_stereo_id   ON opt_stereo_pair_image_attributes
(stereo_id(right_patient_data_id,left_patient_data_id));

Re: planner used functional index in 7.3.6, now does a seq scan in 7.4.7 after upgrade

From

Tom Lane

Date:

19 November 2006, 03:28:30

Chris Tennant <postgresql-ctennant@elirious.com> writes:
> here's the definition of the function (as immutable):

> CREATE OR REPLACE FUNCTION stereo_id (INTEGER, INTEGER, INTEGER) RETURNS 
> INTEGER AS
>      'BEGIN  RETURN CASE WHEN $2 = $3 THEN $1 ELSE -1 END;    END;'
>    LANGUAGE 'plpgsql' IMMUTABLE;

> and here's the original definition of the functional index:

> CREATE INDEX stereo_pair_image_attributes_stereo_id
>     ON opt_stereo_pair_image_attributes
>         (stereo_id(right_patient_data_id,left_patient_data_id));

Um, that index seems to be on some other function that may have the same
name, but only takes two arguments?
        regards, tom lane

Re: planner used functional index in 7.3.6, now does a seq

From

Chris Tennant

Date:

19 November 2006, 12:45:54

doh.  I copied an earlier definition of the function into the email.  
Thanks for catching my (moronic) error, and my apologies for distracting 
everyone on the list.  However, the underlying problem remains:  even 
with the correct function definition, the query executes thousands of 
times slower on 7.4 than on 7.3

Here's the correct definition:

CREATE OR REPLACE FUNCTION stereo_id (INTEGER, INTEGER, INTEGER) RETURNS 
INTEGER AS    'BEGIN  RETURN CASE WHEN $2 = $3 THEN $1 ELSE -1 END;    END;'  LANGUAGE 'plpgsql' IMMUTABLE;

DROP INDEX stereo_pair_image_attributes_stereo_id;
CREATE INDEX stereo_pair_image_attributes_stereo_id   ON opt_stereo_pair_image_attributes
(stereo_id(left_patient_data_stored_id,          right_patient_data_id,left_patient_data_id));
 

To double-check, I re-applied the correct function and the index on both 
the 7.3 and the 7.4 databases (I have them running on different 
machines).  Same problem, the query executes on the 7.3 database in 0.13 
ms, and on the 7.4 database in 571 ms.



Tom Lane wrote:
> Chris Tennant <postgresql-ctennant@elirious.com> writes:
>   
>> here's the definition of the function (as immutable):
>>     
>
>   
>> CREATE OR REPLACE FUNCTION stereo_id (INTEGER, INTEGER, INTEGER) RETURNS 
>> INTEGER AS
>>      'BEGIN  RETURN CASE WHEN $2 = $3 THEN $1 ELSE -1 END;    END;'
>>    LANGUAGE 'plpgsql' IMMUTABLE;
>>     
>
>   
>> and here's the original definition of the functional index:
>>     
>
>   
>> CREATE INDEX stereo_pair_image_attributes_stereo_id
>>     ON opt_stereo_pair_image_attributes
>>         (stereo_id(right_patient_data_id,left_patient_data_id));
>>     
>
> Um, that index seems to be on some other function that may have the same
> name, but only takes two arguments?
>
>             regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
>        subscribe-nomail command to majordomo@postgresql.org so that your
>        message can get through to the mailing list cleanly
>
>

Re: planner used functional index in 7.3.6, now does a seq

From

Tom Lane

Date:

19 November 2006, 15:00:49

Chris Tennant <postgresql-ctennant@elirious.com> writes:
> ... the underlying problem remains:  even 
> with the correct function definition, the query executes thousands of 
> times slower on 7.4 than on 7.3

Well, note that 7.4 thinks it's finding a *better* plan --- the
estimated cost is about half what it was in 7.3.  (I believe the reason
is that 7.4 can handle hash and merge joins on equalities of two
expressions, where 7.3 and before only considered them for trivial
"Var = Var" clauses.)  The fact that the plan is in reality worse
means that there's an estimation error involved; and it's easy to
see in the 7.3 output:

>    -> Index Scan using stereo_pair_image_attributes_stereo_id on opt_stereo_pair_image_attributes
stereo_image_attributes(cost=0.00..1454.62 rows=451 width=44) (actual time=0.01..0.01 rows=0 loops=7)
 
>         Index Cond: ("outer".id = stereo_id(stereo_image_attributes.left_patient_data_stored_id,
stereo_image_attributes.right_patient_data_id,stereo_image_attributes.left_patient_data_id))
 

451 estimated vs less-than-1 actual is pretty bad.  The real question
I have for you is why you are "upgrading" to a three-year-old PG
release?  The 7.x releases have no chance of estimating this query well
because they don't keep any statistics about the contents of functional
indexes.  8.0 and up do, so they'd probably do a lot better with this.
If I were you I'd be trying to migrate to 8.1.5, not anything older.
        regards, tom lane

Re: planner used functional index in 7.3.6, now does a seq

From

Chris Tennant

Date:

19 November 2006, 19:21:08

Tom,

Thank you so much for your help.  Upgrading to 8.1.5 did the trick, the 
query now has a better plan, and executes quickly:
                                                                                               
QUERY PLAN                       

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------Nested
Loop (cost=0.00..215.81 rows=2 width=40) (actual 
 
time=0.134..0.508 rows=4 loops=1)  ->  Nested Loop  (cost=0.00..106.86 rows=3 width=8) (actual 
time=0.050..0.324 rows=21 loops=1)        ->  Nested Loop  (cost=0.00..9.07 rows=16 width=8) (actual 
time=0.035..0.098 rows=11 loops=1)              ->  Index Scan using patient_data_version_id on 
opt_patient_data patient_data  (cost=0.00..4.82 rows=1 width=4) (actual 
time=0.016..0.018 rows=1 loops=1)                    Index Cond: (version_id = 123)              ->  Index Scan using
opt_patient_data_id_keyon 
 
opt_patient_data_entries patient_data_entry  (cost=0.00..3.65 rows=48 
width=8) (actual time=0.011..0.035 rows=11 loops=1)                    Index Cond: (patient_data_entry.patient_data_id
=
 
"outer".id)        ->  Index Scan using opt_patient_data_stored_entry_count on 
opt_patient_data_stored data_stored  (cost=0.00..6.09 rows=2 width=8) 
(actual time=0.006..0.011 rows=2 loops=11)              Index Cond: ("outer".id = data_stored.patient_data_entry_id)
-> Index Scan using stereo_pair_image_attributes_stereo_id on 
 
opt_stereo_pair_image_attributes stereo_image_attributes  
(cost=0.00..36.08 rows=16 width=44) (actual time=0.005..0.005 rows=0 
loops=21)        Index Cond: ("outer".id = 
stereo_id(stereo_image_attributes.left_patient_data_stored_id, 
stereo_image_attributes.right_patient_data_id, 
stereo_image_attributes.left_patient_data_id))Total runtime: 0.595 ms
(12 rows)

I had "upgraded" to 7.4 because that is the default version for Debian 
sarge.  I wanted to get away from building postgresql from source, as I 
had always done previously.  But I'm now a fresh convert to building 
from source ;-)

Thanks again for your help.

- Chris

Tom Lane wrote:
> Chris Tennant <postgresql-ctennant@elirious.com> writes:
>   
>> ... the underlying problem remains:  even 
>> with the correct function definition, the query executes thousands of 
>> times slower on 7.4 than on 7.3
>>     
>
> Well, note that 7.4 thinks it's finding a *better* plan --- the
> estimated cost is about half what it was in 7.3.  (I believe the reason
> is that 7.4 can handle hash and merge joins on equalities of two
> expressions, where 7.3 and before only considered them for trivial
> "Var = Var" clauses.)  The fact that the plan is in reality worse
> means that there's an estimation error involved; and it's easy to
> see in the 7.3 output:
>
>   
>>    -> Index Scan using stereo_pair_image_attributes_stereo_id on opt_stereo_pair_image_attributes
stereo_image_attributes(cost=0.00..1454.62 rows=451 width=44) (actual time=0.01..0.01 rows=0 loops=7)
 
>>         Index Cond: ("outer".id = stereo_id(stereo_image_attributes.left_patient_data_stored_id,
stereo_image_attributes.right_patient_data_id,stereo_image_attributes.left_patient_data_id))
 
>>     
>
> 451 estimated vs less-than-1 actual is pretty bad.  The real question
> I have for you is why you are "upgrading" to a three-year-old PG
> release?  The 7.x releases have no chance of estimating this query well
> because they don't keep any statistics about the contents of functional
> indexes.  8.0 and up do, so they'd probably do a lot better with this.
> If I were you I'd be trying to migrate to 8.1.5, not anything older.
>
>             regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
>                http://www.postgresql.org/docs/faq
>
>

Re: planner used functional index in 7.3.6, now does a seq

From

Dimitri Fontaine

Date:

20 November 2006, 05:15:59

Le lundi 20 novembre 2006 00:20, Chris Tennant a écrit :
> I had "upgraded" to 7.4 because that is the default version for Debian
> sarge.  I wanted to get away from building postgresql from source, as I
> had always done previously.  But I'm now a fresh convert to building
> from source ;-)

You could have used debian backports: deb http://people.debian.org/~mpitt/packages/sarge-backports/ /
http://backports.org/debian/pool/main/p/postgresql-8.1/

--
Dimitri Fontaine
http://www.dalibo.com/