Re: Slow query: Select all buildings that have >1 pharmacies and >1 schools within 1000m - Mailing list pgsql-performance

From Craig James
Subject Re: Slow query: Select all buildings that have >1 pharmacies and >1 schools within 1000m
Date
Msg-id CAFwQ8rd0eMcX45HmMTGqwbW==nqMovN=cs2MZYbu7SGGqppBHA@mail.gmail.com
Whole thread Raw
In response to Slow query: Select all buildings that have >1 pharmacies and >1 schools within 1000m  (Stefan Keller <sfkeller@gmail.com>)
Responses Re: Slow query: Select all buildings that have >1 pharmacies and >1 schools within 1000m  (Stefan Keller <sfkeller@gmail.com>)
List pgsql-performance
On Tue, Aug 7, 2012 at 5:01 AM, Stefan Keller <sfkeller@gmail.com> wrote:
Hi

I have an interesting query to be optimized related to this one [1].

The query definition is: Select all buildings that have more than 1
pharmacies and more than 1 schools within a radius of 1000m.

The problem is that I think that this query is inherently O(n^2). In
fact the solution I propose below takes forever...

Maybe you could get rid of the O(n^2) aspect like this:

   Select all buildings that have more than 1
   pharmacies and more than 1 schools within a radius of 1000m
   from
      (Select all buildings that have more than four (pharmacy or school)
        within a radius of 1000m)

The inner select should be fast -- you could make it fast by creating a new property like "building of interest" that was "pharmacy or school" and build an index on the "building of interest" property.

The inner query would reduce your sample set to a much smaller set of buildings, and presumably the outer query could handle that pretty quickly.

Craig James
 

My questions:

1. Any comments about the nature of this problem?

2. ... on how to speed it up ?

3. In the original query [1] there's a count which contains a
subquery. According to my tests PostgreSQL does not allow this despite
the documentation which says "count(expression)".

Remarks: I know that "count(*)" could be faster on PostgreSQL but
"count(osm_id)" does not change the query plan and this does not seem
to be the bottleneck here anyway.

Yours, S.

[1] http://gis.stackexchange.com/questions/11445/selecting-pois-around-specific-buildings-using-postgis


Here's my query:

-- Select all buildings that have >1 pharmacies and >1 schools within 1000m:
SELECT osm_id AS building_id
FROM
  (SELECT osm_id, way
   FROM osm_polygon
   WHERE tags @> hstore('building','yes')
  ) AS b
WHERE
 (SELECT count(*) > 1
  FROM osm_poi AS p
  WHERE p.tags @> hstore('amenity','pharmacy')
  AND ST_DWithin(b.way,p.way,1000)
 )
 AND
 (SELECT count(*) > 1
  FROM osm_poi AS p
  WHERE p.tags @> hstore('amenity','school')
  AND ST_DWithin(b.way,p.way,1000)
 )
-- Total query runtime: 4308488 ms. 66345 rows retrieved.

Here's the query plan (from EXPLAIN):
"Index Scan using osm_polygon_tags_idx on osm_polygon
(cost=0.00..406812.81 rows=188 width=901)"
"  Index Cond: (tags @> '"building"=>"yes"'::hstore)"
"  Filter: ((SubPlan 1) AND (SubPlan 2))"
"  SubPlan 1"
"    ->  Aggregate  (cost=269.19..269.20 rows=1 width=0)"
"          ->  Bitmap Heap Scan on osm_poi p  (cost=7.76..269.19
rows=1 width=0)"
"                Recheck Cond: (way && st_expand(osm_polygon.way,
1000::double precision))"
"                Filter: ((tags @> '"amenity"=>"pharmacy"'::hstore)
AND (osm_polygon.way && st_expand(way, 1000::double precision)) AND
_st_dwithin(osm_polygon.way, way, 1000::double precision))"
"                ->  Bitmap Index Scan on osm_poi_way_idx
(cost=0.00..7.76 rows=62 width=0)"
"                      Index Cond: (way && st_expand(osm_polygon.way,
1000::double precision))"
"  SubPlan 2"
"    ->  Aggregate  (cost=269.19..269.20 rows=1 width=0)"
"          ->  Bitmap Heap Scan on osm_poi p  (cost=7.76..269.19
rows=1 width=0)"
"                Recheck Cond: (way && st_expand(osm_polygon.way,
1000::double precision))"
"                Filter: ((tags @> '"amenity"=>"school"'::hstore) AND
(osm_polygon.way && st_expand(way, 1000::double precision)) AND
_st_dwithin(osm_polygon.way, way, 1000::double precision))"
"                ->  Bitmap Index Scan on osm_poi_way_idx
(cost=0.00..7.76 rows=62 width=0)"
"                      Index Cond: (way && st_expand(osm_polygon.way,
1000::double precision))"

***

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

pgsql-performance by date:

Previous
From: "Tomas Vondra"
Date:
Subject: Re: Slow query: Select all buildings that have >1 pharmacies and >1 schools within 1000m
Next
From: Jeff Janes
Date:
Subject: Re: Sequential scan instead of index scan