Thread: Sequence vs. Index Scan

Sequence vs. Index Scan

From
"Aaron Bono"
Date:
Sat, May 05, 2007 16:05:52 -0500
I have two schemas, both with the exact same DDL.  One performs great and the other has problems with a specific query:

SELECT *
FROM
    branch_active_vw
WHERE branch_id = get_branch_for_zip ( '22151' )
ORDER BY branch_name
;

I am not defining the view here because the explain plans show the real problem.  I can post the view as well if it will help.

The explain plans are as follows:

Fast Schema:
1. Sort  (cost=17.50..17.51 rows=1 width=680) (actual time=2838.583..2838.586 rows=1 loops=1)
2.   Sort Key: branch.branch_name
3.   ->  Nested Loop Left Join  (cost=0.00..17.49 rows=1 width=680) (actual time=2838.060..2838.093 rows=1 loops=1)
4.         Join Filter: ("inner".branch_group_id = "outer".branch_group_id)
5.        ->  Nested Loop  (cost=0.00..11.45 rows=1 width=647) (actual time=2837.776..2837.804 rows=1 loops=1)
6.              ->  Nested Loop  (cost=0.00..7.88 rows=1 width=618) (actual time=2837.697..2837.716 rows=1 loops=1)
7.                     Join Filter: ("inner".locale_id = "outer".locale_id)
8.                     ->  Nested Loop  (cost=0.00..6.86 rows=1 width=598) (actual time=2837.666..2837.676 rows=1 loops=1)
9.                           Join Filter: ("inner".corp_id = "outer".corp_id)
10.                          ->  Index Scan using branch_pkey on branch  (cost=0.00..5.84 rows=1 width=560) (actual time=2837.621..2837.624 rows=1 loops=1)
11.                                Index Cond: (branch_id = get_branch_for_zip('22151'::character varying))
12.                                Filter: ((start_day <= now()) AND ((end_day IS NULL) OR (end_day >= now())))
13.                          ->  Seq Scan on corp  (cost=0.00..1.01 rows=1 width=46) (actual time=0.015..0.017 rows=1 loops=1)
14.                    ->  Seq Scan on locale  (cost=0.00..1.01 rows=1 width=28) (actual time=0.014..0.016 rows=1 loops=1)
15.              ->  Index Scan using zip_cd_pkey on zip_cd branch_address_zip_cd  (cost=0.00..3.55 rows=1 width=37) (actual time=0.066..0.069 rows=1 loops=1)
16.                    Index Cond: (branch_address_zip_cd.zip_cd_id = "outer".branch_address_zip_id)
17.        ->  Seq Scan on branch_group  (cost=0.00..1.07 rows=7 width=41) (actual time=0.013..0.029 rows=7 loops=1)
18.        SubPlan
19.          ->  Seq Scan on branch_area  (cost=0.00..4.89 rows=1 width=6) (actual time=0.132..0.137 rows=2 loops=1)
20.                Filter: (branch_id = $0)
21. Total runtime: 2839.044 ms

Slow Schema:
Sort  (cost=12.77..12.78 rows=1 width=1380) (actual time=157492.513..157492.515 rows=1 loops=1)
1.   Sort Key: branch.branch_name
2.   ->  Nested Loop Left Join  (cost=0.00..12.76 rows=1 width=1380) (actual time=130130.384..157492.484 rows=1 loops=1)
3.         Join Filter: ("inner".branch_group_id = "outer".branch_group_id)
4.         ->  Nested Loop  (cost=0.00..10.34 rows=1 width=1360) (actual time=130130.157..157492.253 rows=1 loops=1)
5.               Join Filter: ("inner".locale_id = "outer".locale_id)
6.               ->  Nested Loop  (cost=0.00..9.31 rows=1 width=1340) (actual time=130130.127..157492.213 rows=1 loops=1)
7.                     ->  Nested Loop  (cost=0.00..5.75 rows=1 width=1311) (actual time=130130.042..157492.119 rows=1 loops=1)
8.                           Join Filter: ("inner".corp_id = "outer".corp_id)
9.                           ->  Seq Scan on branch  (cost=0.00..4.72 rows=1 width=1281) (actual time=130129.988..157492.057 rows=1 loops=1)
10.                                Filter: ((start_day <= now()) AND ((end_day IS NULL) OR (end_day >= now())) AND (branch_id = get_branch_for_zip('22151'::character varying)))
11.                          ->  Seq Scan on corp  (cost=0.00..1.01 rows=1 width=38) (actual time=0.022..0.024 rows=1 loops=1)
12.                    ->  Index Scan using zip_cd_pkey on zip_cd branch_address_zip_cd  (cost=0.00..3.55 rows=1 width=37) (actual time=0.070..0.073 rows=1 loops=1)
13.                          Index Cond: (branch_address_zip_cd.zip_cd_id = "outer".branch_address_zip_id)
14.              ->  Seq Scan on locale  (cost=0.00..1.01 rows=1 width=28) (actual time=0.013..0.015 rows=1 loops=1)
15.        ->  Seq Scan on branch_group  (cost=0.00..1.01 rows=1 width=28) (actual time=0.013..0.015 rows=1 loops=1)
16.        SubPlan
17.          ->  Seq Scan on branch_area  (cost=0.00..1.40 rows=1 width=5) (actual time=0.077..0.084 rows=1 loops=1)
18.                Filter: (branch_id = $0)
19. Total runtime: 157492.890 ms

The problem is that lines 10-12 of the fast schema show an index scan while lines 9-10 of the slow schema show a sequence scan.  The sequence scan on branch_id, combined with the rest of the filter, takes forever.  I have checked and there IS an index, specifically a primary key index, on branch_id in both schemas, so I cannot figure out why the optimizer is looking at these differently.  In fact, the table the branch_id comes from has the exact same indices and foreign keys in both schemas.

Any direction would be deeply appreciated.

Thanks!
Aaron

--
==================================================================
   Aaron Bono
   Aranya Software Technologies, Inc.
   http://www.aranya.com
   http://codeelixir.com
==================================================================
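
For reference, the "actual time" figures mean these plans were gathered with EXPLAIN ANALYZE, so the runs above should be reproducible with:

EXPLAIN ANALYZE
SELECT *
FROM branch_active_vw
WHERE branch_id = get_branch_for_zip('22151')
ORDER BY branch_name;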

Re: Sequence vs. Index Scan

From
Andrew Sullivan
Date:
On Sat, May 05, 2007 at 04:05:52PM -0500, Aaron Bono wrote:
> I have two schemas, both with the exact same DDL.  One performs great and

Do they have the same data?

>  I have
> checked and there IS an index, specifically a primary key index, on the
> branch_id in both schemas so I cannot figure out why the optimizer is
> looking at these differently.  In fact, the table the branch_id comes from
> has the exact same indices and foreign keys on both schemas.

Different data?  Different statistics?  Different numbers of
(possibly dead) rows?

A

-- 
Andrew Sullivan  | ajs@crankycanuck.ca
Everything that happens in the world happens at some place.    --Jane Jacobs 


Re: Sequence vs. Index Scan

From
"Aaron Bono"
Date:
Sat, May 05, 2007 17:00:53 -0500
On 5/5/07, Andrew Sullivan <ajs@crankycanuck.ca> wrote:
On Sat, May 05, 2007 at 04:05:52PM -0500, Aaron Bono wrote:
> I have two schemas, both with the exact same DDL.  One performs great and

Do they have the same data?

They have different data.  The fast one has about 150 rows and the slow one has about 40 rows.  The field in question here, the branch_id, is a BIGSERIAL in both.

>  I have
> checked and there IS an index, specifically a primary key index, on the
> branch_id in both schemas so I cannot figure out why the optimizer is
> looking at these differently.  In fact, the table the branch_id comes from
> has the exact same indices and foreign keys on both schemas.

Different data?  Different statistics?  Different numbers of
(possibly dead) rows?


We don't allow deletes, and updates are fairly infrequent.  I also did a vacuum analyze to no effect.

What can I do to get more statistics that would help?

Thanks for the help,
Aaron

--
==================================================================
   Aaron Bono
   Aranya Software Technologies, Inc.
   http://www.aranya.com
   http://codeelixir.com
==================================================================

Re: Sequence vs. Index Scan

From
Tom Lane
Date:
"Aaron Bono" <postgresql@aranya.com> writes:
> 9.                           ->  Seq Scan on branch  (cost=0.00..4.72 rows=1
> width=1281) (actual time=130129.988..157492.057 rows=1 loops=1)
> 10.                                Filter: ((start_day <= now()) AND
> ((end_day IS NULL) OR (end_day >= now())) AND (branch_id =
> get_branch_for_zip('22151'::character varying)))

There is something *awfully* wacko about that entry --- the fact that
the cost estimate is less than 5 units means that the planner thinks
there's 4 or fewer pages; either that's way wrong or the
get_branch_for_zip function is taking enormous amounts of time per row.
Have you tried timing that function on its own?

One possible reason for the performance difference is if you have 
get_branch_for_zip marked as stable in one database and volatile in the
other --- volatile would prevent it from being used in an indexqual as
you'd like.
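
As a minimal sketch, re-declaring the volatility looks like this (the body below is a hypothetical stand-in; only the trailing STABLE keyword is the point, and the return type is guessed from branch_id being a BIGSERIAL):

CREATE OR REPLACE FUNCTION get_branch_for_zip(character varying)
RETURNS bigint AS $$
    -- hypothetical body: look up the branch whose area covers this zip code
    SELECT branch_id FROM branch_area WHERE zip_cd = $1;
$$ LANGUAGE sql STABLE;
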
        regards, tom lane


Re: Sequence vs. Index Scan

From
Andrew Sullivan
Date:
On Sat, May 05, 2007 at 05:00:53PM -0500, Aaron Bono wrote:
> 
> They have different data.  The fast one has about 150 rows and the slow one
> has about 40 rows.  The field in question here, the branch_id, is a
> BIGSERIAL in both.

I'd be astonished if a table of 40 rows ever got index scanned.  It's
probably more efficient to read the whole table.  But it seems your
case may be strange.  

> We don't allow deletes, and updates are fairly infrequent.  I also did a
> vacuum analyze to no effect.

How do you "not allow" deletes?  Does this happen because of a failed
foreign key?  If so, you can end up with dead tuples.  I'd look at
the output of VACUUM VERBOSE to make sure you don't have a lot of
dead tuples.  That said, I wonder if fiddling with the statistics on
your tables might help.

A

-- 
Andrew Sullivan  | ajs@crankycanuck.ca
In the future this spectacle of the middle classes shocking the avant-garde
will probably become the textbook definition of Postmodernism.  --Brad Holland


Re: Sequence vs. Index Scan

From
"Aaron Bono"
Date:
On 5/5/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
"Aaron Bono" <postgresql@aranya.com> writes:
> 9.                           ->  Seq Scan on branch  (cost=0.00..4.72 rows=1
> width=1281) (actual time=130129.988..157492.057 rows=1 loops=1)
> 10.                                Filter: ((start_day <= now()) AND
> ((end_day IS NULL) OR (end_day >= now())) AND (branch_id =
> get_branch_for_zip('22151'::character varying)))

There is something *awfully* wacko about that entry --- the fact that
the cost estimate is less than 5 units means that the planner thinks
there's 4 or fewer pages; either that's way wrong or the
get_branch_for_zip function is taking enormous amounts of time per row.
Have you tried timing that function on its own?

Yes, the first thing I did was run it on its own.  It takes about 2-6 seconds to run, which is not great but acceptable, and the functions are marked as stable.  The timing is the same, give or take a second or two, on each of the two schemas.  The function is a plpgsql function that executes a dynamic SQL string.  Would it be better to change it to a straight SQL function?  Would that help the planner?  I may be able to change it if that will help.
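
To make the comparison concrete, here is a hypothetical sketch of the two shapes in question (the real function's body and the table it queries are not shown in this thread, so the lookup below is illustrative only):

-- roughly the current shape: plpgsql assembling a dynamic SQL string
CREATE OR REPLACE FUNCTION get_branch_for_zip(p_zip character varying)
RETURNS bigint AS $$
DECLARE
    v_branch_id bigint;
BEGIN
    EXECUTE 'SELECT branch_id FROM branch_area WHERE zip_cd = '
            || quote_literal(p_zip) INTO v_branch_id;
    RETURN v_branch_id;
END;
$$ LANGUAGE plpgsql STABLE;

-- the straight SQL alternative being considered
CREATE OR REPLACE FUNCTION get_branch_for_zip(character varying)
RETURNS bigint AS $$
    SELECT branch_id FROM branch_area WHERE zip_cd = $1;
$$ LANGUAGE sql STABLE;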

One possible reason for the performance difference is if you have
get_branch_for_zip marked as stable in one database and volatile in the
other --- volatile would prevent it from being used in an indexqual as
you'd like.


I just switched the functions in both schemas to be stable (before I posted the original question); they were volatile.  After doing that, the fast schema worked great but the slow one is still underperforming.  I even went back to the slow schema and dropped and recreated the function, with no discernible difference.

This is for PostgreSQL 8.1.3 and the results are the same on the Windows and Linux versions.  The Windows version is a restore from the Linux version's backup.

--
==================================================================
   Aaron Bono
   Aranya Software Technologies, Inc.
   http://www.aranya.com
   http://codeelixir.com
==================================================================

Re: Sequence vs. Index Scan

From
"Aaron Bono"
Date:
On 5/5/07, Andrew Sullivan <ajs@crankycanuck.ca> wrote:
On Sat, May 05, 2007 at 05:00:53PM -0500, Aaron Bono wrote:

> We don't allow deletes, and updates are fairly infrequent.  I also did a
> vacuum analyze to no effect.

How do you "not allow" deletes?

Permissions are set on the tables to only allow the application to do select/insert/update.  We use start/end dates to indicate that records are no longer active, and the application runs against views that filter out inactive records.  This also allows "removal" of records at some future time, gives us the ability to "undelete" records, and keeps us from having to write application code that cleans up after foreign key constraints.  There are other advantages as well.
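
For illustration, the active-records filter visible in both explain plans corresponds to a view shaped roughly like this (simplified; the real branch_active_vw evidently also joins corp, locale, zip_cd, and branch_group):

CREATE VIEW branch_active_vw AS
SELECT *
FROM branch
WHERE start_day <= now()
  AND (end_day IS NULL OR end_day >= now());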

I'd look at
the output of VACUUM VERBOSE to make sure you don't have a lot of
dead tuples.  

Fast Schema:
psql:vacuumverbose.sql:1: INFO:  vacuuming "fast_schema.branch"
psql:vacuumverbose.sql:1: INFO:  index "branch_pkey" now contains 150 row versions in 2 pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.

Slow Schema:
psql:vacuumverbose.sql:1: INFO:  vacuuming "slow_schema.branch"
psql:vacuumverbose.sql:1: INFO:  index "branch_pkey" now contains 29 row versions in 2 pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.

Like Tom said earlier, I really think it has something to do with the volatile/stable indicator on the function, but I changed that.  The function in the slow schema acts like it is still volatile, while the fast schema is obviously treating it as stable.

Is there another way to analyze how the database is using the function in these selects, or a way to look directly at how the optimizer is treating the function?

I do appreciate the feedback.

--
==================================================================
   Aaron Bono
   Aranya Software Technologies, Inc.
   http://www.aranya.com
   http://codeelixir.com
==================================================================

Re: Sequence vs. Index Scan

From
"Aaron Bono"
Date:
On 5/5/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
"Aaron Bono" <postgresql@aranya.com> writes:
> 9.                           ->  Seq Scan on branch  (cost=0.00..4.72 rows=1
> width=1281) (actual time=130129.988..157492.057 rows=1 loops=1)
> 10.                                Filter: ((start_day <= now()) AND
> ((end_day IS NULL) OR (end_day >= now())) AND (branch_id =
> get_branch_for_zip('22151'::character varying)))

There is something *awfully* wacko about that entry --- the fact that
the cost estimate is less than 5 units means that the planner thinks
there's 4 or fewer pages; either that's way wrong or the
get_branch_for_zip function is taking enormous amounts of time per row.
Have you tried timing that function on its own?

One possible reason for the performance difference is if you have
get_branch_for_zip marked as stable in one database and volatile in the
other --- volatile would prevent it from being used in an indexqual as
you'd like.

I verified it by putting a RAISE NOTICE in the function.  The fast schema runs the function twice (odd, I would think it would run only once).  The slow schema runs it 30 times (the number of records returned + 1).  I know I put the functions into both schemas as stable and even dropped and recreated the function.  Then I verified with EMS Manager and it tells me the DDL for the function in the database is set to stable.  Is there something I can do to tell PostgreSQL that I really did mean stable?
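
For anyone following along, the counting trick can be reproduced with a throwaway plpgsql function (notice_probe is a made-up name; mark it with the same volatility as the function under test):

CREATE FUNCTION notice_probe(p_val character varying)
RETURNS character varying AS $$
BEGIN
    RAISE NOTICE 'notice_probe(%) called', p_val;  -- one NOTICE per actual call
    RETURN p_val;
END;
$$ LANGUAGE plpgsql STABLE;

-- count the NOTICEs this emits against a table of N rows:
SELECT * FROM branch WHERE branch_name = notice_probe('test');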

Thanks for all the help,
Aaron

--
==================================================================
   Aaron Bono
   Aranya Software Technologies, Inc.
   http://www.aranya.com
   http://codeelixir.com
==================================================================

Re: Sequence vs. Index Scan

From
"Jaime Casanova"
Date:
On 5/5/07, Aaron Bono <postgresql@aranya.com> wrote:
> On 5/5/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > "Aaron Bono" <postgresql@aranya.com> writes:
> > > 9.                           ->  Seq Scan on branch  (cost=0.00..4.72 rows=1
> > > width=1281) (actual time=130129.988..157492.057 rows=1 loops=1)
> > > 10.                                Filter: ((start_day <= now()) AND
> > > ((end_day IS NULL) OR (end_day >= now())) AND (branch_id =
> > > get_branch_for_zip('22151'::character varying)))
> >
> > There is something *awfully* wacko about that entry --- the fact that
> > the cost estimate is less than 5 units means that the planner thinks
> > there's 4 or fewer pages; either that's way wrong or the
> > get_branch_for_zip function is taking enormous amounts of time per row.
> > Have you tried timing that function on its own?
> >
> > One possible reason for the performance difference is if you have
> > get_branch_for_zip marked as stable in one database and volatile in the
> > other --- volatile would prevent it from being used in an indexqual as
> > you'd like.
> >
>
> I verified it by putting a RAISE NOTICE in the function.  The fast schema
> runs the function twice (odd, I would think it would run only once).  The
> slow schema runs it 30 times (the number of records returned + 1).  I know I
> put the functions into both schemas as stable and even dropped and recreated
> the function.  Then I verified with EMS Manager and it tells me the DDL for
> the function in the database is set to stable.  Is there something I can do
> to tell PostgreSQL that I really did mean stable?
>

Maybe this is silly, but you can verify what the database thinks of the
function by selecting from pg_proc:

select pronamespace, provolatile
from pg_proc where proname = 'get_branch_for_zip'
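
And since pronamespace is just an OID, joining pg_namespace shows which schema each row belongs to:

select n.nspname, p.proname, p.provolatile
from pg_proc p
join pg_namespace n on n.oid = p.pronamespace
where p.proname = 'get_branch_for_zip';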

-- 
regards,
Jaime Casanova

"Programming today is a race between software engineers striving to
build bigger and better idiot-proof programs and the universe trying
to produce bigger and better idiots.
So far, the universe is winning."                                      Richard Cook


Re: Sequence vs. Index Scan

From
"Aaron Bono"
Date:
Sun, May 06, 2007 13:45:54 -0500
On 5/5/07, Jaime Casanova <systemguards@gmail.com> wrote:
On 5/5/07, Aaron Bono <postgresql@aranya.com> wrote:
> On 5/5/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > "Aaron Bono" <postgresql@aranya.com> writes:
> > > 9.                           ->  Seq Scan on branch  (cost=0.00..4.72 rows=1
> > > width=1281) (actual time=130129.988..157492.057 rows=1 loops=1)
> > > 10.                                Filter: ((start_day <= now()) AND
> > > ((end_day IS NULL) OR (end_day >= now())) AND (branch_id =
> > > get_branch_for_zip('22151'::character varying)))
> >
> > There is something *awfully* wacko about that entry --- the fact that
> > the cost estimate is less than 5 units means that the planner thinks
> > there's 4 or fewer pages; either that's way wrong or the
> > get_branch_for_zip function is taking enormous amounts of time per row.
> > Have you tried timing that function on its own?
> >
> > One possible reason for the performance difference is if you have
> > get_branch_for_zip marked as stable in one database and volatile in the
> > other --- volatile would prevent it from being used in an indexqual as
> > you'd like.
> >
>
> I verified it by putting a RAISE NOTICE in the function.  The fast schema
> runs the function twice (odd, I would think it would run only once).  The
> slow schema runs it 30 times (the number of records returned + 1).  I know I
> put the functions into both schemas as stable and even dropped and recreated
> the function.  Then I verified with EMS Manager and it tells me the DDL for
> the function in the database is set to stable.  Is there something I can do
> to tell PostgreSQL that I really did mean stable?
>

Maybe this is silly, but you can verify what the database thinks of the
function by selecting from pg_proc:

select pronamespace, provolatile
from pg_proc where proname = 'get_branch_for_zip'

select pronamespace, provolatile, proname
from pg_proc where proname = 'get_branch_for_zip';
 pronamespace | provolatile |      proname
--------------+-------------+--------------------
     26644852 | s           | get_branch_for_zip
     26644856 | s           | get_branch_for_zip

The select is using the function as if it were volatile on the slow schema, but as stable on the fast schema.

I did a restart of the service and that didn't help.

Then I inserted 150 more records in the slow schema and pow - it started working like the fast schema.

So my conclusion is that the function is being treated as volatile even though it is stable because the number of records is small.  Is there any way to tell PostgreSQL that when I say stable I really mean stable?

I am getting close to a solution.  Thanks again for the help!

-Aaron

--
==================================================================
   Aaron Bono
   Aranya Software Technologies, Inc.
   http://www.aranya.com
   http://codeelixir.com
==================================================================

Re: Sequence vs. Index Scan

From
Andrew Sullivan
Date:
On Sun, May 06, 2007 at 01:45:54PM -0500, Aaron Bono wrote:
> Then I inserted 150 more records in the slow schema and pow - it started
> working like the fast schema.
> 
> So my conclusion is that the function is being treated as volatile even
> though it is stable because the number of records is small. 

I don't think that's the issue.  If this is dependent on the
number of records, then for some reason the way the data is
structured means that the planner thinks a seqscan's a better bet. 
This is probably due to distribution of the values.  You could try
increasing the stats sample, and see if that helps.
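
For example (the column choice is illustrative; 8.1's default statistics target is 10):

ALTER TABLE branch ALTER COLUMN branch_id SET STATISTICS 100;
ANALYZE branch;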

A

-- 
Andrew Sullivan  | ajs@crankycanuck.ca
This work was visionary and imaginative, and goes to show that visionary
and imaginative work need not end up well.     --Dennis Ritchie


Re: Sequence vs. Index Scan

From
Tom Lane
Date:
Andrew Sullivan <ajs@crankycanuck.ca> writes:
> On Sun, May 06, 2007 at 01:45:54PM -0500, Aaron Bono wrote:
>> So my conclusion is that the function is being treated as volatile even
>> though it is stable because the number of records is small. 

> I don't think that's the issue.  If this is dependent on the
> number of records, then for some reason the way the data is
> structured means that the planner thinks a seqscan's a better bet. 
> This is probably due to distribution of the values.  You could try
> increasing the stats sample, and see if that helps.

It's got nothing to do with distribution, just with numbers of pages to
fetch.  You'll nearly always get a seqscan plan if there are only a
couple of pages in the table, simply because it would take more I/O to
read the index too.

The reason this is a problem in this example is that the function is so
expensive to execute.  The planner should be avoiding the seqscan on the
basis of CPU cost not I/O cost, but it doesn't know that the function is
expensive enough to drive the decision that way.

In CVS HEAD (8.3-to-be) we've added a "cost" property to functions,
which provides a clean way to fix this issue, but there's no good way to
deal with it in existing releases :-(
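
In 8.3 that will look something like this (the number is illustrative; it is measured in units of cpu_operator_cost):

ALTER FUNCTION get_branch_for_zip(character varying) COST 10000;
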
        regards, tom lane


Re: Sequence vs. Index Scan

From
"Aaron Bono"
Date:
On 5/7/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andrew Sullivan <ajs@crankycanuck.ca> writes:
> On Sun, May 06, 2007 at 01:45:54PM -0500, Aaron Bono wrote:
>> So my conclusion is that the function is being treated as volatile even
>> though it is stable because the number of records is small.

> I don't think that's the issue.  If this is dependent on the
> number of records, then for some reason the way the data is
> structured means that the planner thinks a seqscan's a better bet.
> This is probably due to distribution of the values.  You could try
> increasing the stats sample, and see if that helps.

It's got nothing to do with distribution, just with numbers of pages to
fetch.  You'll nearly always get a seqscan plan if there are only a
couple of pages in the table, simply because it would take more I/O to
read the index too.

The reason this is a problem in this example is that the function is so
expensive to execute.  The planner should be avoiding the seqscan on the
basis of CPU cost not I/O cost, but it doesn't know that the function is
expensive enough to drive the decision that way.

In CVS HEAD (8.3-to-be) we've added a "cost" property to functions,
which provides a clean way to fix this issue, but there's no good way to
deal with it in existing releases :-(

                        regards, tom lane

Since we don't delete records (we just deactivate them), I added 100 dummy records that are not active.  This fixed the problem.  As the number of real records in that table grows, I will delete the dummy records.
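
A sketch of that padding trick (it assumes branch's other columns are nullable or defaulted; the dates are set in the past so the dummy rows fail the active filter):

INSERT INTO branch (branch_name, start_day, end_day)
SELECT 'dummy-' || n::text,
       now() - interval '2 days',
       now() - interval '1 day'
FROM generate_series(1, 100) AS n;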

Thanks for all the help!
Aaron

--
==================================================================
   Aaron Bono
   Aranya Software Technologies, Inc.
   http://www.aranya.com
   http://codeelixir.com
==================================================================