Thread: declared cursor uses slow plan

declared cursor uses slow plan

From
Kevin Neufeld
Date:
Why would postgres use a different query plan for declared cursors than
without?

I have a relatively simple query that takes about 150ms using explain
analyze. However, when I wrap the same query in a declared cursor
statement, the subsequent fetch statement takes almost 30seconds. For
some reason, the planner decided to do a nested loop left join instead
of a hash left join. Does anyone know why the planner would choose this
course?

For those interested, the results of the planner are:

EXPLAIN ANALYZE SELECT a.wb_id, a.group_code, a.area, a.type, a.source,
a.fcode, asbinary((a.the_geom), 'XDR'), c.name, b.gnis_id FROM
csn_waterbodies a LEFT JOIN (csn_named_waterbodies as b JOIN
all_gnis_info as c ON b.gnis_id = c.gnis_id) on a.wb_id = b.wb_id WHERE
the_geom && GeometryFromText('POLYGON ((998061.4211119856
820217.228917891, 1018729.3748344192 820217.228917891,
1018729.3748344192 827989.3006519538, 998061.4211119856
827989.3006519538, 998061.4211119856 820217.228917891))', 42102);


QUERY
PLAN


-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------


Hash Left Join  (cost=1554.46..1611.26 rows=5 width=1048) (actual
time=144.620..150.277 rows=208 loops=1)
  Hash Cond: ("outer".wb_id = "inner".wb_id)
  ->  Index Scan using csn_waterbodies_the_geom_idx on csn_waterbodies
a  (cost=0.00..6.40 rows=5 width=1026) (actual time=0.192..2.838
rows=208 loops=1)
        Index Cond: (the_geom && 'SRID=42102;POLYGON((998061.421111986
820217.228917891,1018729.37483442 820217.228917891,1018729.37483442
827989.300651954,998061.421111986 827989.300651954,998061.421111986
820217.228917891))'::geometry)
        Filter: (the_geom && 'SRID=42102;POLYGON((998061.421111986
820217.228917891,1018729.37483442 820217.228917891,1018729.37483442
827989.300651954,998061.421111986 827989.300651954,998061.421111986
820217.228917891))'::geometry)
  ->  Hash  (cost=1535.13..1535.13 rows=7734 width=26) (actual
time=143.717..143.717 rows=0 loops=1)
        ->  Merge Join  (cost=0.00..1535.13 rows=7734 width=26) (actual
time=6.546..134.906 rows=7203 loops=1)
              Merge Cond: ("outer".gnis_id = "inner".gnis_id)
              ->  Index Scan using csn_named_waterbodies_gnis_id_idx on
csn_named_waterbodies b  (cost=0.00..140.37 rows=7215 width=8) (actual
time=0.035..10.796 rows=7204 loops=1)
              ->  Index Scan using all_gnis_info_gnis_id_idx on
all_gnis_info c  (cost=0.00..1210.19 rows=41745 width=22) (actual
time=0.014..60.387 rows=42757 loops=1)
Total runtime: 150.713 ms
(11 rows)


DECLARE thread_33000912 CURSOR FOR SELECT ...


QUERY
PLAN


-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------


Nested Loop Left Join  (cost=0.00..8165.43 rows=5 width=1048)
  Join Filter: ("outer".wb_id = "inner".wb_id)
  ->  Index Scan using csn_waterbodies_the_geom_idx on csn_waterbodies
a  (cost=0.00..6.40 rows=5 width=1026)
        Index Cond: (the_geom && 'SRID=42102;POLYGON((998061.421111986
820217.228917891,1018729.37483442 820217.228917891,1018729.37483442
827989.300651954,998061.421111986 827989.300651954,998061.421111986
820217.228917891))'::geometry)
        Filter: (the_geom && 'SRID=42102;POLYGON((998061.421111986
820217.228917891,1018729.37483442 820217.228917891,1018729.37483442
827989.300651954,998061.421111986 827989.300651954,998061.421111986
820217.228917891))'::geometry)
  ->  Merge Join  (cost=0.00..1535.13 rows=7734 width=26)
        Merge Cond: ("outer".gnis_id = "inner".gnis_id)
        ->  Index Scan using csn_named_waterbodies_gnis_id_idx on
csn_named_waterbodies b  (cost=0.00..140.37 rows=7215 width=8)
        ->  Index Scan using all_gnis_info_gnis_id_idx on all_gnis_info
c  (cost=0.00..1210.19 rows=41745 width=22)
(9 rows)



Cheers,
Kevin

--
Kevin Neufeld,
Refractions Research Inc.,
kneufeld@refractions.net
Phone: (250) 383-3022
Fax:   (250) 383-2140


Re: declared cursor uses slow plan

From
Tom Lane
Date:
Kevin Neufeld <kneufeld@refractions.net> writes:
> I have a relatively simple query that takes about 150ms using explain
> analyze. However, when I wrap the same query in a declared cursor
> statement, the subsequent fetch statement takes almost 30seconds. For
> some reason, the planner decided to do a nested loop left join instead
> of a hash left join. Does anyone know why the planner would choose this
> course?

Plans for cursors are optimized partly for startup speed as opposed to
total time, on the assumption that you'd rather get some of the rows
sooner so you can crunch on them.

Probably there should be a knob you can fool with to adjust the strength
of the effect, but at present I think it's hard-wired.

The real problem here of course is that the total cost of the nestloop
is being underestimated so badly (the estimate is only 5x more than the
hash join where reality is 200x more).  It looks like this is mainly
because the number of matching rows from csn_waterbodies is badly
underestimated, which comes from the fact that we have no useful
statistics for geometric operators :-(.  I think that the PostGIS crew
is working that problem but I have no idea how far along they are...

            regards, tom lane