Thread: query plan different for "SELECT ..." and "DECLARE CURSOR ..."?
I've been noticing that query planning differs between a cursor-based select and a normal select. For example, my query looks like this:

=# SELECT <select clause>

The query takes about 1/4 of a second. But for:

=# BEGIN;
=# DECLARE mycursor BINARY CURSOR FOR SELECT <select clause>;
=# FETCH ALL IN mycursor;

the same SQL query is being used, but this takes about 6 seconds (about 20x longer). Using EXPLAIN, I see that the planner chose a different plan. Why does this sort of thing happen? How do I stop it? I've included the query plans below in case a specific example is important, but I'm mostly looking for a generic answer. Sorry for the complexity.

NOTE: these are PostGIS queries (&& is GiST-indexed). The reason a binary cursor is being used is that I use the WKB (well-known binary) geometry representation as the transit 'language'.

thanks for your help,
dave

The "SELECT" plan:

                                                  QUERY PLAN
---------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=1993.11..2462.50 rows=9 width=40) (actual time=77.69..169.96 rows=67 loops=1)
   Hash Cond: ("outer".lha_id = "inner".lha_id)
   ->  Subquery Scan b  (cost=1983.00..2443.08 rows=1839 width=16) (actual time=60.48..127.20 rows=67 loops=1)
         ->  Aggregate  (cost=1983.00..2443.08 rows=1839 width=16) (actual time=60.47..127.08 rows=67 loops=1)
               ->  Group  (cost=1983.00..2351.14 rows=18387 width=16) (actual time=59.68..115.57 rows=18387 loops=1)
                     ->  Merge Join  (cost=1983.00..2305.17 rows=18387 width=16) (actual time=59.67..93.81 rows=18387 loops=1)
                           Merge Cond: ("outer".lha_id = "inner".lha_from_id)
                           ->  Sort  (cost=8.77..8.99 rows=89 width=8) (actual time=0.44..0.48 rows=82 loops=1)
                                 Sort Key: p.lha_id
                                 ->  Seq Scan on lha_pop p  (cost=0.00..5.89 rows=89 width=8) (actual time=0.03..0.15 rows=89 loops=1)
                           ->  Sort  (cost=1974.23..2020.19 rows=18387 width=8) (actual time=59.19..64.80 rows=18387 loops=1)
                                 Sort Key: s.lha_from_id
                                 ->  Seq Scan on msp_trip_summary s  (cost=0.00..671.84 rows=18387 width=8) (actual time=1.70..31.31 rows=18387 loops=1)
                                       Filter: (distance > 200)
   ->  Hash  (cost=10.11..10.11 rows=1 width=36) (actual time=15.71..15.71 rows=0 loops=1)
         ->  Seq Scan on lha_albers a  (cost=0.00..10.11 rows=1 width=36) (actual time=1.06..15.54 rows=89 loops=1)
               Filter: (the_geom && 'SRID=-1;BOX3D(250000 250000 0,1900000 1900000 0)'::geometry)
 Total runtime: 173.97 msec
(18 rows)

and the "DECLARE" plan:

                                          QUERY PLAN
------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=1983.00..2476.17 rows=9 width=40)
   Join Filter: ("outer".lha_id = "inner".lha_id)
   ->  Seq Scan on lha_albers a  (cost=0.00..10.11 rows=1 width=36)
         Filter: (the_geom && 'SRID=-1;BOX3D(250000 250000 0,1900000 1900000 0)'::geometry)
   ->  Subquery Scan b  (cost=1983.00..2443.08 rows=1839 width=16)
         ->  Aggregate  (cost=1983.00..2443.08 rows=1839 width=16)
               ->  Group  (cost=1983.00..2351.14 rows=18387 width=16)
                     ->  Merge Join  (cost=1983.00..2305.17 rows=18387 width=16)
                           Merge Cond: ("outer".lha_id = "inner".lha_from_id)
                           ->  Sort  (cost=8.77..8.99 rows=89 width=8)
                                 Sort Key: p.lha_id
                                 ->  Seq Scan on lha_pop p  (cost=0.00..5.89 rows=89 width=8)
                           ->  Sort  (cost=1974.23..2020.19 rows=18387 width=8)
                                 Sort Key: s.lha_from_id
                                 ->  Seq Scan on msp_trip_summary s  (cost=0.00..671.84 rows=18387 width=8)
                                       Filter: (distance > 200)
 Total runtime: 0.41 msec
(17 rows)

tap=# explain DECLARE mycursor BINARY CURSOR FOR SELECT asbinary(force_collection(force_2d(the_geom)),'NDR'), LHA_ID::text from (
tap(# Select a.the_geom, a.lha_id, trips from lha_albers a,
tap(# (SELECT (s.lha_from_id) as lha_id, (sum(s.count)::float / max(p.population)::float * 100) as trips
tap(# from lha_pop p, msp_trip_summary s
tap(# where p.lha_id = s.lha_from_id AND s.distance > 200 Group by s.lha_from_id) b
tap(# where a.lha_id = b.lha_id
tap(# ) AS TBL WHERE the_geom && setSRID('BOX3D(250000 250000,1900000 1900000)'::BOX3D, -1 )
tap-# ;
David Blasby <dblasby@refractions.net> writes:
> I've been noticing query planning to be different for a cursor-based
> select and normal select.

IIRC, in a DECLARE context the planner puts more weight on the startup
cost than the total cost, on the theory that you might not be planning
to fetch the whole result, and even if you are, you may prefer to
overlap some frontend and backend processing by fetching the results
incrementally rather than all at once.  There was some talk of
introducing a control variable to affect this weighting, but it's not
there yet.

In any case, I'd think the real issue here is that the planner thinks
these two plans are nearly the same cost, when in reality there's an
order-of-magnitude difference.  As far as I can see, the problem is
with the estimation of this scan result:

>   ->  Seq Scan on lha_albers a  (cost=0.00..10.11 rows=1 width=36) (actual time=1.06..15.54 rows=89 loops=1)
>         Filter: (the_geom && 'SRID=-1;BOX3D(250000 250000 0,1900000 1900000 0)'::geometry)

The factor-of-89 error in row count here translates directly to a
factor-of-89 underestimation of the cost of the nestloop plan.

You may need to bite the bullet and try to devise some real selectivity
estimation techniques for your geometric operators.  The stuff in
src/backend/utils/adt/geo_selfuncs.c at the moment is all just stubs :-(

			regards, tom lane
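To put a number on that factor-of-89 claim, here is a back-of-the-envelope
version of the nestloop arithmetic, with cost figures read off the plans
above. This is only a sketch: the real costing lives in cost_nestloop() in
costsize.c and also discounts repeated rescans of the inner side, but the
shape of the error is the same.

#include <stdio.h>

/* Back-of-the-envelope nestloop costing. Illustrative only, not the
 * real planner logic; the numbers below are taken from the plans
 * quoted earlier in the thread. */
int main(void)
{
    double outer_cost = 10.11;        /* Seq Scan on lha_albers a         */
    double inner_cost = 2443.08;      /* Subquery Scan b, cost per rescan */

    double est_outer_rows    = 1.0;   /* what the planner believed */
    double actual_outer_rows = 89.0;  /* what the scan returned    */

    /* total ~= outer scan + (outer rows) x (inner cost per rescan) */
    printf("planner's view: %.2f\n", outer_cost + est_outer_rows * inner_cost);
    printf("reality:        %.2f\n", outer_cost + actual_outer_rows * inner_cost);
    return 0;
}

With rows=1 the nestloop prices out around 2453, right next to the hash
join's 2462.50, so the cursor's startup-cost weighting is enough to tip the
choice; with the true 89 outer rows the same plan would cost on the order
of 217,000, and the planner would never have picked it.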
Tom Lane wrote:
> You may need to bite the bullet and try to devise some real selectivity
> estimation techniques for your geometric operators.  The stuff in
> src/backend/utils/adt/geo_selfuncs.c at the moment is all just stubs :-(

I've already done this - it actually gives pretty accurate estimates.

Basically, I do a two-stage pass through the geometry table.  The first
pass just calculates the bounding box of all the geometries.  I make a
2D histogram structure (much like a really, really simple quad tree)
that fits this bounding box.  The second pass looks at each individual
geometry in the table and updates one (or more) cells in the 2D
histogram.

When the planner asks for the selectivity of a '&&' query, I pull in
the 2D histogram (it's stored in the geometry_columns metadata table)
and look at the query's bounding box to see which cells of the 2D
histogram it overlaps.  I estimate the number of rows the query will
return from the percentage overlap between the query window and each
cell, weighted by the number of geometries in that cell.  You'd think
such a crude test wouldn't give good results, but in fact it's
amazingly accurate.  If for some reason I cannot calculate an estimate,
I fall back to the geo_selfuncs.c method and return some really small
number.  The PostGIS mailing list (postgis.refractions.net) has more
technical details.

The only real problem is that the user has to manually keep these stats
up-to-date.  Is there any way to attach something to VACUUM ANALYSE?

As an aside, PostGIS is now passing the Open GIS Consortium's "Simple
Features For SQL" conformance test.  We're hoping to submit it for the
OGC stamp of approval "very soon."

Thanks for the info on the start-up cost being more heavily weighted.
The user who reported this problem didn't have the
index-selectivity-stats package turned on in their database.  Once they
turned it on, everything worked correctly.  I was just confused as to
why DECLARE and SELECT were making different plans.

dave
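For the curious, the estimate described above amounts to something like the
following sketch. The struct layout and names here are illustrative rather
than the actual PostGIS code, and it assumes geometries are spread evenly
within each cell (and ignores double-counting of features that were tallied
into more than one cell).

/* A 2D histogram over the extent of a geometry column: count[i*ny+j]
 * holds the number of features whose bounding boxes touch cell (i,j).
 * Hypothetical layout; in PostGIS the real structure is kept in the
 * geometry_columns metadata table. */
typedef struct
{
    double xmin, ymin, xmax, ymax;   /* extent covered by the grid */
    int    nx, ny;                   /* grid dimensions            */
    double *count;                   /* nx*ny cell counts          */
} Histogram2D;

/* Length of the overlap between intervals [lo1,hi1] and [lo2,hi2]. */
static double overlap(double lo1, double hi1, double lo2, double hi2)
{
    double lo = lo1 > lo2 ? lo1 : lo2;
    double hi = hi1 < hi2 ? hi1 : hi2;
    return hi > lo ? hi - lo : 0.0;
}

/* Estimate rows returned by: the_geom && 'BOX(qx1 qy1, qx2 qy2)'.
 * For each cell the query box touches, add that cell's count scaled
 * by the fraction of the cell the query box covers. */
double estimate_rows(const Histogram2D *h,
                     double qx1, double qy1, double qx2, double qy2)
{
    double cw = (h->xmax - h->xmin) / h->nx;   /* cell width  */
    double ch = (h->ymax - h->ymin) / h->ny;   /* cell height */
    double est = 0.0;

    for (int i = 0; i < h->nx; i++)
    {
        double cx = h->xmin + i * cw;
        double fx = overlap(cx, cx + cw, qx1, qx2) / cw;
        if (fx == 0.0)
            continue;
        for (int j = 0; j < h->ny; j++)
        {
            double cy = h->ymin + j * ch;
            double fy = overlap(cy, cy + ch, qy1, qy2) / ch;
            est += h->count[i * h->ny + j] * fx * fy;
        }
    }
    return est;
}

The crudeness is in the uniform-within-a-cell assumption; with a reasonably
fine grid that error washes out, which is presumably why such a simple
scheme gives surprisingly accurate estimates in practice.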
David Blasby <dblasby@refractions.net> writes:
> The only real problem is that the user has to manually keep these stats
> up-to-date.  Is there any way to attach something to VACUUM ANALYSE?

The ANALYZE code is set up with the idea that there could be multiple
analysis methods and various kinds of stuff stored in pg_statistic.
Right now there isn't any way to physically plug in a different
analysis routine :-( but it seems like it'd be reasonable to add some
hook of that kind.  Perhaps CREATE TYPE could be extended to specify an
analysis routine for ANALYZE to call for columns of that type.

Please take a look at pg_statistic.h and commands/analyze.c and see if
this would do the job for you.  Obviously it's too late for 7.4, but we
could think about a solution in 7.5.

			regards, tom lane
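To sketch the shape of the hook being proposed: none of the names below
exist in PostgreSQL (any real design would be worked out against
pg_statistic.h and commands/analyze.c, as noted above); this is just the
idea of ANALYZE handing its column sample to a per-type routine.

/* Hypothetical per-type ANALYZE hook; nothing like this exists yet. */
typedef struct TypeAnalyzeSample
{
    void **values;      /* sampled column values (e.g. geometries) */
    int    nvalues;     /* sample size chosen by ANALYZE           */
    /* ...plus some way to write pg_statistic slots
     * (stakind / staop / stanumbers / stavalues)... */
} TypeAnalyzeSample;

/* A routine the type could register via an extended CREATE TYPE, e.g.
 *   CREATE TYPE geometry ( ..., analyze = geometry_analyze );  (hypothetical)
 * For PostGIS, the two-pass 2D histogram build from the previous
 * message would run here, eliminating the manual stats-update step. */
static void
geometry_analyze(TypeAnalyzeSample *sample)
{
    /* 1. pass over sample->values to find the overall extent        */
    /* 2. second pass to fill the 2D histogram cells                 */
    /* 3. store the histogram where the && selectivity code reads it */
}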
Tom Lane wrote:
> David Blasby <dblasby@refractions.net> writes:
>> The only real problem is that the user has to manually keep these stats
>> up-to-date.  Is there any way to attach something to VACUUM ANALYSE?
>
> The ANALYZE code is set up with the idea that there could be multiple
> analysis methods and various kinds of stuff stored in pg_statistic.
> Right now there isn't any way to physically plug in a different
> analysis routine :-( but it seems like it'd be reasonable to add some
> hook of that kind.  Perhaps CREATE TYPE could be extended to specify
> an analysis routine for ANALYZE to call for columns of that type.
>
> Please take a look at pg_statistic.h and commands/analyze.c and see if
> this would do the job for you.  Obviously it's too late for 7.4, but
> we could think about a solution in 7.5.

It would be interesting if a PL/R function could be plugged in for both
the ANALYZE function and the selectivity function.  There are quite a
few spatial-data-related packages available for R; one of them might
fit nicely for this application.

Joe