Re: Postgresql 9.1 - select statement with multiple "with-clauses" becomes very slow - Mailing list pgsql-general

From Alban Hertroys
Subject Re: Postgresql 9.1 - select statement with multiple "with-clauses" becomes very slow
Date
Msg-id CAF-3MvM8t7QO33HbgTpSz1E4aZRkKoRHqOtSvKqEf0sj_ZXR6w@mail.gmail.com
Whole thread Raw
In response to Postgresql 9.1 - select statement with multiple "with-clauses" becomes very slow  (Opel Fahrer <opelfahrer79@yahoo.de>)
Responses Re: Postgresql 9.1 - select statement with multiple "with-clauses" becomes very slow  (Opel Fahrer <opelfahrer79@yahoo.de>)
List pgsql-general
You're ending up with something that's basically a carthesian product of closebuildings and closebuildingdescriptions.

Your query looks like a simple join would serve just fine, something like:

prepare getmydata(real,real,real) AS (
select image, data
from info
inner join buildings on (buildings.id = info.building_id)
inner join buildingdescriptions on (buildingdescriptions.id = buildings.description_id)
where ST_DWithin(position, 'POINT($1 $2)', $3)
)


On 3 January 2013 12:43, Opel Fahrer <opelfahrer79@yahoo.de> wrote:
I'm a noob in writing efficient Postgres queries, so I wrote a first function to query multiple linked tables using the PostGIS extension. The query should fetch data from multiple tables and finally give me a table with two columns. Here's the code:

[code]
    prepare getmydata(real,real,real) AS (
     with
     closeby(id) AS (
      select buildingid from info where ST_DWithin(position, 'POINT($1 $2)', $3)   
     ),
     closebuildings(descriptionid,image) AS (
      select descriptionid,image from buildings where id IN (select * from closeby)
     ),
     closebuildingdescriptions(data) AS (
      select data from buildingdescriptions where id IN (select descriptionid from closebuildings)
     )
     select image,data from closebuildings,closebuildingdescriptions;
    );
    execute getmydata(0.0,0.0,10.0);
[/code]

Actually the problem is that this query is VERY slow, even if the database content is small (taking around 15 minutes or so). The problem seems to be that postgres has to make sure that for the select statement both columns have equal length. If I only do "select image from closebuildings", the results are delivered in 0.1 secs, a "select data from closebuildingdescriptions" is delivered in 7.8 secs.

I ran an "explain analyze" call, but I can't make any sense from the output:

[code]
"Nested Loop  (cost=7816.51..2636821.06 rows=131352326 width=36) (actual time=117.125..6723.014 rows=12845056 loops=1)"
"  CTE closeby"
"    ->  Seq Scan on info  (cost=0.00..1753.11 rows=186 width=4) (actual time=0.022..5.821 rows=1579 loops=1)"
"          Filter: (("position" && '0103000020797F000001000000050000007D3F35DEAC512041E3A59BBC9BE153417D3F35DEAC512041E3A59BBCA0E153417D3F35DED4512041E3A59BBCA0E153417D3F35DED4512041E3A59BBC9BE153417D3F35DEAC512041E3A59BBC9BE15341'::geometry) AND ('0101000020797F00007D3F35DEC0512041E3A59B3C9EE15341'::geometry && st_expand("position", 10::double precision)) AND _st_dwithin("position", '0101000020797F00007D3F35DEC0512041E3A59B3C9EE15341'::geometry, 10::double precision))"
"  CTE closebuildings"
"    ->  Hash Semi Join  (cost=6.04..4890.03 rows=1351 width=8) (actual time=54.743..61.025 rows=3584 loops=1)"
"          Hash Cond: (closebuildings.id = closeby.buildingid)"
"          ->  Seq Scan on closebuildings  (cost=0.00..4358.52 rows=194452 width=12) (actual time=0.042..31.646 rows=194452 loops=1)"
"          ->  Hash  (cost=3.72..3.72 rows=186 width=4) (actual time=7.073..7.073 rows=1579 loops=1)"
"                Buckets: 1024  Batches: 1  Memory Usage: 56kB"
"                ->  CTE Scan on closeby  (cost=0.00..3.72 rows=186 width=4) (actual time=0.023..6.591 rows=1579 loops=1)"
"  CTE closebuildingdescriptions"
"    ->  Nested Loop  (cost=30.40..1173.37 rows=97226 width=516) (actual time=117.103..1890.902 rows=3584 loops=1)"
"          ->  HashAggregate  (cost=30.40..32.40 rows=200 width=4) (actual time=63.529..66.176 rows=3584 loops=1)"
"                ->  CTE Scan on closebuildings  (cost=0.00..27.02 rows=1351 width=4) (actual time=54.746..62.316 rows=3584 loops=1)"
"          ->  Index Scan using buildingdescriptions_pkey on buildingdescriptions  (cost=0.00..5.69 rows=1 width=520) (actual time=0.506..0.507 rows=1 loops=3584)"
"                Index Cond: (id = closebuildings.descriptionid)"
"  ->  CTE Scan on closebuildingdescriptions  (cost=0.00..1944.52 rows=97226 width=32) (actual time=117.115..1901.993 rows=3584 loops=1)"
"  ->  CTE Scan on closebuildings  (cost=0.00..27.02 rows=1351 width=4) (actual time=0.000..0.536 rows=3584 loops=3584)"
"Total runtime: 7870.567 ms"
[/code]


If anyone can come up with a solution or a suggestion how to solve this, I would highly appreciate it.

Cheers



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



--
If you can't see the forest for the trees,
Cut the trees and you'll see there is no forest.

pgsql-general by date:

Previous
From: Opel Fahrer
Date:
Subject: Postgresql 9.1 - select statement with multiple "with-clauses" becomes very slow
Next
From: Opel Fahrer
Date:
Subject: Re: Postgresql 9.1 - select statement with multiple "with-clauses" becomes very slow