Thread: query plan different for "SELECT ..." and "DECLARE CURSOR ..."?
I've been noticing that query planning differs between a cursor-based select and a normal select. For example, my query looks like this:

=# SELECT <select clause>

The query takes about 1/4 of a second. But for:

=# BEGIN;
=# DECLARE mycursor BINARY CURSOR FOR SELECT <select clause>;
=# FETCH ALL IN mycursor;

the same SQL query is being used, but this takes about 6 seconds (about 20x longer). Using EXPLAIN, I see that the planner chose a different plan. Why does this sort of thing happen? How do I stop it? I've included the query plans below in case a specific example is important, but I'm mostly looking for a generic answer. Sorry for the complexity.

NOTE: these are PostGIS queries (&& is GiST-indexed). The reason a binary cursor is being used is that I use the WKB (well-known binary) geometry representation as the transit 'language'.

thanks for your help,
dave

The "SELECT" plan:

                                                  QUERY PLAN
---------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=1993.11..2462.50 rows=9 width=40) (actual time=77.69..169.96 rows=67 loops=1)
   Hash Cond: ("outer".lha_id = "inner".lha_id)
   ->  Subquery Scan b  (cost=1983.00..2443.08 rows=1839 width=16) (actual time=60.48..127.20 rows=67 loops=1)
         ->  Aggregate  (cost=1983.00..2443.08 rows=1839 width=16) (actual time=60.47..127.08 rows=67 loops=1)
               ->  Group  (cost=1983.00..2351.14 rows=18387 width=16) (actual time=59.68..115.57 rows=18387 loops=1)
                     ->  Merge Join  (cost=1983.00..2305.17 rows=18387 width=16) (actual time=59.67..93.81 rows=18387 loops=1)
                           Merge Cond: ("outer".lha_id = "inner".lha_from_id)
                           ->  Sort  (cost=8.77..8.99 rows=89 width=8) (actual time=0.44..0.48 rows=82 loops=1)
                                 Sort Key: p.lha_id
                                 ->  Seq Scan on lha_pop p  (cost=0.00..5.89 rows=89 width=8) (actual time=0.03..0.15 rows=89 loops=1)
                           ->  Sort  (cost=1974.23..2020.19 rows=18387 width=8) (actual time=59.19..64.80 rows=18387 loops=1)
                                 Sort Key: s.lha_from_id
                                 ->  Seq Scan on msp_trip_summary s  (cost=0.00..671.84 rows=18387 width=8) (actual time=1.70..31.31 rows=18387 loops=1)
                                       Filter: (distance > 200)
   ->  Hash  (cost=10.11..10.11 rows=1 width=36) (actual time=15.71..15.71 rows=0 loops=1)
         ->  Seq Scan on lha_albers a  (cost=0.00..10.11 rows=1 width=36) (actual time=1.06..15.54 rows=89 loops=1)
               Filter: (the_geom && 'SRID=-1;BOX3D(250000 250000 0,1900000 1900000 0)'::geometry)
 Total runtime: 173.97 msec
(18 rows)

and the "DECLARE" plan:

                                          QUERY PLAN
------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=1983.00..2476.17 rows=9 width=40)
   Join Filter: ("outer".lha_id = "inner".lha_id)
   ->  Seq Scan on lha_albers a  (cost=0.00..10.11 rows=1 width=36)
         Filter: (the_geom && 'SRID=-1;BOX3D(250000 250000 0,1900000 1900000 0)'::geometry)
   ->  Subquery Scan b  (cost=1983.00..2443.08 rows=1839 width=16)
         ->  Aggregate  (cost=1983.00..2443.08 rows=1839 width=16)
               ->  Group  (cost=1983.00..2351.14 rows=18387 width=16)
                     ->  Merge Join  (cost=1983.00..2305.17 rows=18387 width=16)
                           Merge Cond: ("outer".lha_id = "inner".lha_from_id)
                           ->  Sort  (cost=8.77..8.99 rows=89 width=8)
                                 Sort Key: p.lha_id
                                 ->  Seq Scan on lha_pop p  (cost=0.00..5.89 rows=89 width=8)
                           ->  Sort  (cost=1974.23..2020.19 rows=18387 width=8)
                                 Sort Key: s.lha_from_id
                                 ->  Seq Scan on msp_trip_summary s  (cost=0.00..671.84 rows=18387 width=8)
                                       Filter: (distance > 200)
 Total runtime: 0.41 msec
(17 rows)

tap=# explain DECLARE mycursor BINARY CURSOR FOR SELECT asbinary(force_collection(force_2d(the_geom)),'NDR'), LHA_ID::text from (
tap(# Select a.the_geom, a.lha_id, trips from lha_albers a,
tap(# (SELECT (s.lha_from_id) as lha_id, (sum(s.count)::float / max(p.population)::float * 100) as trips
tap(# from lha_pop p, msp_trip_summary s
tap(# where p.lha_id = s.lha_from_id AND s.distance > 200 Group by s.lha_from_id) b
tap(# where a.lha_id = b.lha_id
tap(# ) AS TBL WHERE the_geom && setSRID('BOX3D(250000 250000,1900000 1900000)'::BOX3D, -1 )
tap-# ;
David Blasby <dblasby@refractions.net> writes:
> I've been noticing query planning to be different for a cursor-based
> select and normal select.

IIRC, in a DECLARE context the planner puts more weight on the startup
cost than the total cost, on the theory that you might not be planning
to fetch the whole result, and even if you are, you may prefer to
overlap some frontend and backend processing by fetching the results
incrementally rather than all at once.  There was some talk of
introducing a control variable to affect this weighting, but it's not
there yet.

In any case, I'd think the real issue here is that the planner thinks
these two plans are nearly the same cost, when in reality there's an
order-of-magnitude difference.  As far as I can see, the problem is
with the estimation of this scan result:

>   ->  Seq Scan on lha_albers a  (cost=0.00..10.11 rows=1 width=36) (actual time=1.06..15.54 rows=89 loops=1)
>         Filter: (the_geom && 'SRID=-1;BOX3D(250000 250000 0,1900000 1900000 0)'::geometry)

The factor-of-89 error in row count here translates directly to a
factor-of-89 underestimation of the cost of the nestloop plan.

You may need to bite the bullet and try to devise some real selectivity
estimation techniques for your geometric operators.  The stuff in
src/backend/utils/adt/geo_selfuncs.c at the moment is all just stubs :-(

			regards, tom lane
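To put a number on that factor-of-89 claim, here is a back-of-the-envelope
version of the nestloop arithmetic, with cost figures read off the plans
above. This is only a sketch: the real costing lives in cost_nestloop() in
costsize.c and also discounts repeated rescans of the inner side, but the
shape of the error is the same.

#include <stdio.h>

/* Back-of-the-envelope nestloop costing. Illustrative only, not the
 * real planner logic; the numbers below are taken from the plans
 * quoted earlier in the thread. */
int main(void)
{
    double outer_cost = 10.11;        /* Seq Scan on lha_albers a         */
    double inner_cost = 2443.08;      /* Subquery Scan b, cost per rescan */

    double est_outer_rows    = 1.0;   /* what the planner believed */
    double actual_outer_rows = 89.0;  /* what the scan returned    */

    /* total ~= outer scan + (outer rows) x (inner cost per rescan) */
    printf("planner's view: %.2f\n", outer_cost + est_outer_rows * inner_cost);
    printf("reality:        %.2f\n", outer_cost + actual_outer_rows * inner_cost);
    return 0;
}

With rows=1 the nestloop prices out around 2453, right next to the hash
join's 2462.50, so the cursor's startup-cost weighting is enough to tip the
choice; with the true 89 outer rows the same plan would cost on the order
of 217,000, and the planner would never have picked it.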
Tom Lane wrote:
> You may need to bite the bullet and try to devise some real selectivity
> estimation techniques for your geometric operators.  The stuff in
> src/backend/utils/adt/geo_selfuncs.c at the moment is all just stubs :-(

I've already done this - it actually gives pretty accurate estimates.

Basically, I do a two-stage pass through the geometry table.  The first
pass just calculates the bounding box of all the geometries.  I make a
2D histogram structure (much like a really, really simple quad tree)
that fits this bounding box.  The second pass looks at each individual
geometry in the table and updates one (or more) cells in the 2D
histogram.

When the planner asks for the selectivity of a '&&' query, I pull in
the 2D histogram (it's stored in the geometry_columns metadata table)
and look at the query's bounding box to see which cells of the 2D
histogram it overlaps.  I estimate the number of rows the query will
return from the percentage overlap between the query window and each
cell, weighted by the number of geometries in that cell.  You'd think
such a crude test wouldn't give good results, but in fact it's
amazingly accurate.  If for some reason I cannot calculate an estimate,
I fall back to the geo_selfuncs.c method and return some really small
number.  The PostGIS mailing list (postgis.refractions.net) has more
technical details.

The only real problem is that the user has to manually keep these stats
up-to-date.  Is there any way to attach something to VACUUM ANALYSE?

As an aside, PostGIS is now passing the Open GIS Consortium's "Simple
Features For SQL" conformance test.  We're hoping to submit it for the
OGC stamp of approval "very soon."

Thanks for the info on the start-up cost being more heavily weighted.
The user who reported this problem didn't have the
index-selectivity-stats package turned on in their database.  Once they
turned it on, everything worked correctly.  I was just confused as to
why DECLARE and SELECT were making different plans.

dave
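For the curious, the estimate described above amounts to something like the
following sketch. The struct layout and names here are illustrative rather
than the actual PostGIS code, and it assumes geometries are spread evenly
within each cell (and ignores double-counting of features that were tallied
into more than one cell).

/* A 2D histogram over the extent of a geometry column: count[i*ny+j]
 * holds the number of features whose bounding boxes touch cell (i,j).
 * Hypothetical layout; in PostGIS the real structure is kept in the
 * geometry_columns metadata table. */
typedef struct
{
    double xmin, ymin, xmax, ymax;   /* extent covered by the grid */
    int    nx, ny;                   /* grid dimensions            */
    double *count;                   /* nx*ny cell counts          */
} Histogram2D;

/* Length of the overlap between intervals [lo1,hi1] and [lo2,hi2]. */
static double overlap(double lo1, double hi1, double lo2, double hi2)
{
    double lo = lo1 > lo2 ? lo1 : lo2;
    double hi = hi1 < hi2 ? hi1 : hi2;
    return hi > lo ? hi - lo : 0.0;
}

/* Estimate rows returned by: the_geom && 'BOX(qx1 qy1, qx2 qy2)'.
 * For each cell the query box touches, add that cell's count scaled
 * by the fraction of the cell the query box covers. */
double estimate_rows(const Histogram2D *h,
                     double qx1, double qy1, double qx2, double qy2)
{
    double cw = (h->xmax - h->xmin) / h->nx;   /* cell width  */
    double ch = (h->ymax - h->ymin) / h->ny;   /* cell height */
    double est = 0.0;

    for (int i = 0; i < h->nx; i++)
    {
        double cx = h->xmin + i * cw;
        double fx = overlap(cx, cx + cw, qx1, qx2) / cw;
        if (fx == 0.0)
            continue;
        for (int j = 0; j < h->ny; j++)
        {
            double cy = h->ymin + j * ch;
            double fy = overlap(cy, cy + ch, qy1, qy2) / ch;
            est += h->count[i * h->ny + j] * fx * fy;
        }
    }
    return est;
}

The crudeness is in the uniform-within-a-cell assumption; with a reasonably
fine grid that error washes out, which is presumably why such a simple
scheme gives surprisingly accurate estimates in practice.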
David Blasby <dblasby@refractions.net> writes:
> The only real problem is that the user has to manually keep these stats
> up-to-date.  Is there any way to attach something to VACUUM ANALYSE?

The ANALYZE code is set up with the idea that there could be multiple
analysis methods and various kinds of stuff stored in pg_statistic.
Right now there isn't any way to physically plug in a different
analysis routine :-( but it seems like it'd be reasonable to add some
hook of that kind.  Perhaps CREATE TYPE could be extended to specify an
analysis routine for ANALYZE to call for columns of that type.

Please take a look at pg_statistic.h and commands/analyze.c and see if
this would do the job for you.  Obviously it's too late for 7.4, but we
could think about a solution in 7.5.

			regards, tom lane
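To sketch the shape of the hook being proposed: none of the names below
exist in PostgreSQL (any real design would be worked out against
pg_statistic.h and commands/analyze.c, as noted above); this is just the
idea of ANALYZE handing its column sample to a per-type routine.

/* Hypothetical per-type ANALYZE hook; nothing like this exists yet. */
typedef struct TypeAnalyzeSample
{
    void **values;      /* sampled column values (e.g. geometries) */
    int    nvalues;     /* sample size chosen by ANALYZE           */
    /* ...plus some way to write pg_statistic slots
     * (stakind / staop / stanumbers / stavalues)... */
} TypeAnalyzeSample;

/* A routine the type could register via an extended CREATE TYPE, e.g.
 *   CREATE TYPE geometry ( ..., analyze = geometry_analyze );  (hypothetical)
 * For PostGIS, the two-pass 2D histogram build from the previous
 * message would run here, eliminating the manual stats-update step. */
static void
geometry_analyze(TypeAnalyzeSample *sample)
{
    /* 1. pass over sample->values to find the overall extent        */
    /* 2. second pass to fill the 2D histogram cells                 */
    /* 3. store the histogram where the && selectivity code reads it */
}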
Tom Lane wrote:
> David Blasby <dblasby@refractions.net> writes:
>> The only real problem is that the user has to manually keep these stats
>> up-to-date.  Is there any way to attach something to VACUUM ANALYSE?
>
> The ANALYZE code is set up with the idea that there could be multiple
> analysis methods and various kinds of stuff stored in pg_statistic.
> Right now there isn't any way to physically plug in a different
> analysis routine :-( but it seems like it'd be reasonable to add some
> hook of that kind.  Perhaps CREATE TYPE could be extended to specify
> an analysis routine for ANALYZE to call for columns of that type.
>
> Please take a look at pg_statistic.h and commands/analyze.c and see if
> this would do the job for you.  Obviously it's too late for 7.4, but
> we could think about a solution in 7.5.

It would be interesting if a PL/R function could be plugged in for both
the ANALYZE function and the selectivity function.  There are quite a
few spatial-data-related packages available for R; one of them might
fit nicely for this application.

Joe