Re: Query plan question, and a memory leak - Mailing list pgsql-general
From | Greg Stark |
---|---|
Subject | Re: Query plan question, and a memory leak |
Date | |
Msg-id | 87lm0y9hxz.fsf@stark.dyndns.tv Whole thread Raw |
In response to | Re: Query plan question, and a memory leak (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Query plan question, and a memory leak
|
List | pgsql-general |
Tom Lane <tgl@sss.pgh.pa.us> writes: > Greg Stark <gsstark@mit.edu> writes: > > 1) The following query has an odd plan that I can't figure out how to read. It > > seems to include the subplan twice, does that mean it's executing it twice? > > If you had done EXPLAIN ANALYZE, you would know ;-) Well, not till later this week I wouldn't :) > > Even twice doesn't explain the cost which is much higher than similar plans > > that don't trigger the duplicate subplan. What am I doing wrong to trigger > > this behaviour? > > The inner nested loop's cost looks in line to me for one execution of > the subplan and one execution of the inner indexscan for each tuple of > the outer table (ad). Given that the subplan depends on ad.ad_id > there's really not any way to avoid re-executing it for each row of ad. > > The reason the outer nested loop looks so bad is you've got an > unconstrained join to gg... That's intentional, this is the same query I was trying to run earlier this weekend and ran out of disk space for the results. Even so the numbers don't seem right. Something really quite strange is going on. Look at these plans. Adding the clause to actually exclude records where the subquery don't find any records causes the cost to go through the roof. I wouldn't expect it to actually take any more time. In fact I would expect it to take a lot less time since it takes time to handle the resulting data too. slo=> explain select * from (select a, (select b from b where b=a) as x from a) as z ; QUERY PLAN ---------------------------------------------------------------- Subquery Scan z (cost=0.00..20.00 rows=1000 width=4) -> Seq Scan on a (cost=0.00..20.00 rows=1000 width=4) SubPlan -> Seq Scan on b (cost=0.00..22.50 rows=5 width=4) Filter: (b = $0) (5 rows) Time: 1.34 ms slo=> explain select * from (select a, (select b from b where b=a) as x from a) as z where x is not null; QUERY PLAN ---------------------------------------------------------------- Subquery Scan z (cost=0.00..22520.00 rows=995 width=4) -> Seq Scan on a (cost=0.00..22520.00 rows=995 width=4) Filter: ((subplan) IS NOT NULL) SubPlan -> Seq Scan on b (cost=0.00..22.50 rows=5 width=4) Filter: (b = $0) -> Seq Scan on b (cost=0.00..22.50 rows=5 width=4) Filter: (b = $0) (8 rows) Time: 1.68 ms For this test I created these tables, there are no records in them: slo=> \d a Table "public.a" Column | Type | Modifiers --------+---------+----------- a | integer | slo=> \d b Table "public.b" Column | Type | Modifiers --------+---------+----------- b | integer | slo=> \d c Table "public.c" Column | Type | Modifiers --------+---------+----------- c | integer | > > 2) The version of the query at the bottom appears to trigger a big memory > > leak. The only difference is the addition of a "WHERE geom2 @ make_box()" > > clause. (make_box returns a box, the definition is below). > > What datatype is geom2? It was a box as well. I've eliminated that column from the queries at least for now though. -- greg
pgsql-general by date: