"Julian Scarfe" <julian@avbrief.com> writes:
> 6) Now I combine the filters in 4 & 5 (as I did from 1 & 2 to get 3, which
> performed in a similar time to 1)
> explain analyze
> select n.ref, n.code, a.ident, a.name
> from n left outer join a on (a.ident = n.code)
> where bbox && box (point (-0.032, 0.873), point (0.017, 0.908))
> and box (q_node, q_node)
> @ box (point (-0.032, 0.873), point (0.017, 0.908))
>                                                       QUERY PLAN
> ----------------------------------------------------------------------------------------------------------------------
> Nested Loop Left Join  (cost=0.00..851.06 rows=8 width=45) (actual time=11.662..7919.946 rows=150 loops=1)
>   Join Filter: (("inner".ident)::text = "outer".code)
>   ->  Index Scan using n_bbox on n  (cost=0.00..88.44 rows=1 width=20) (actual time=0.107..10.256 rows=150 loops=1)
>         Index Cond: (bbox && '(0.017,0.908),(-0.032,0.873)'::box)
>         Filter: (box(q_node, q_node) @ '(0.017,0.908),(-0.032,0.873)'::box)
>   ->  Seq Scan on a  (cost=0.00..611.05 rows=10105 width=25) (actual time=0.006..18.044 rows=10105 loops=150)
> Total runtime: 7920.684 ms
> Whoa! Instead of performing similarly to query 4, it chooses a different
> strategy and takes 40 times as long. (Both tables were just analyzed.)
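A quick way to check whether the join strategy itself is what hurts is to
re-run the same query with nested loops discouraged, so the planner falls
back to a hash or merge join.  enable_nestloop is a standard planner
setting; this is only a diagnostic sketch, not a recommended permanent fix:

    -- Diagnostic sketch: see what plan (and runtime) you get when the
    -- nestloop strategy is taken off the table for this query.
    set enable_nestloop = off;
    explain analyze
    select n.ref, n.code, a.ident, a.name
    from n left outer join a on (a.ident = n.code)
    where bbox && box (point (-0.032, 0.873), point (0.017, 0.908))
      and box (q_node, q_node)
          @ box (point (-0.032, 0.873), point (0.017, 0.908));
    reset enable_nestloop;
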
The problem is that the planner is underestimating the number of rows pulled
from the n table (estimated 1 vs. actual 150), which makes a simple nestloop
join look like the way to go; with 150 outer rows instead of 1, the seqscan
on a gets repeated 150 times, and that is where the time goes.  The
misestimate comes from the fact that we don't really have any statistical
estimation for geometric conditions :-(.  Some of the PostGIS hackers have
been working on such, I believe, but I'm not sure how far they've gotten.
regards, tom lane