Thread: Query Join Performance
I have a database with the following tables:

    zip {
        zip_id bigserial,
        zip_cd varchar(6) -- this is indexed
    }

    city {
        city_id bigserial,
        city_name varchar(50)
    }

    zip_city {
        city_id bigint (FK to city table),
        zip_id bigint (FK to zip table),
        longitude numeric,
        latitude numeric
    }

We want to find the zip_cd values for all zip codes within 50 miles of a specified zip code and have this query so far (assume our given zip code is 64131):

    select
        zip_city.zip_id,
        gps_distance(zip_city.longitude::numeric, zip_city.latitude::numeric, my_location.longitude::numeric, my_location.latitude::numeric) AS distance
    from zip_city
    inner join (
        select
            longitude,
            latitude
        from zip
        inner join zip_city on (
            zip.zip_id = zip_city.zip_id
        )
        where zip_cd = '64131'
    ) my_location on (
        -- We use this as an approximation to filter out as many records as possible
        -- before doing the gps_distance() function call. It works well as an outer
        -- bound to weed out the most obviously distant zip_city values.
        abs(zip_city.longitude - my_location.longitude) <= (50.0 / 60.0)
        AND abs(zip_city.latitude - my_location.latitude) <= (50.0 / 60.0)
    )

This runs in about 2 seconds on Windows and gives me 650 records - not bad.

When I add just a little bit:

    select
        zip_city.zip_id,
        gps_distance(zip_city.longitude::numeric, zip_city.latitude::numeric, my_location.longitude::numeric, my_location.latitude::numeric) AS distance
    from zip_city
    -->>>> THE ONLY DIFFERENCE IS THE NEXT 3 LINES <<<<--
    inner join zip on (
        zip_city.zip_id = zip.zip_id
    )
    -->>>> End of added code <<<<--
    inner join (
        select
            longitude,
            latitude
        from zip
        inner join zip_city on (
            zip.zip_id = zip_city.zip_id
        )
        where zip_cd = '64131'
    ) my_location on (
        abs(zip_city.longitude - my_location.longitude) <= (50.0 / 60.0)
        AND abs(zip_city.latitude - my_location.latitude) <= (50.0 / 60.0)
    )

Performance tanks with this query - it takes over 120 seconds (that is where I set the timeout). I read that a foreign key doesn't help performance, and my guess is that PostgreSQL isn't joining on just the 650 records but rather on the over 800,000 records in the zip table (yes, we have that many - we include Canada postal codes, which are 90% of the data - Blame Canada! ;).

Is there something I can do to boost performance? I tried putting the first query above in an inner join before joining with the zip table, with the same result.

BTW, on our Linux box the full query we run (which adds 3 more tables to the whole operation along with more filtering on the zip table) finishes in under 10 seconds. Problem is our development is on Windows and this is a real pain for developers to test.

Any assistance is appreciated. I did look at the explain plans but I don't see anything other than what I mentioned above. I can include those if necessary.

Thanks!
Aaron

--
==================================================================
   Aaron Bono
   Aranya Software Technologies, Inc.
   http://www.aranya.com
   http://codeelixir.com
==================================================================
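[For anyone reading along: gps_distance() is never shown in this thread. A PL/pgSQL great-circle (haversine) function along these lines is one plausible shape for it - this is a hypothetical sketch, not the thread's actual function, and the (longitude, latitude) argument order and mile-based units are assumptions:]

```sql
-- Hypothetical sketch only: gps_distance() is not defined anywhere in this
-- thread, so the argument order and units here are assumptions.
-- Computes the great-circle (haversine) distance, in miles, between two
-- points given as (longitude, latitude) in decimal degrees.
CREATE OR REPLACE FUNCTION gps_distance(
    lon1 numeric, lat1 numeric,
    lon2 numeric, lat2 numeric
) RETURNS double precision AS $$
DECLARE
    r_miles constant double precision := 3959.0;  -- mean Earth radius in miles
    dlat double precision := radians(lat2 - lat1);
    dlon double precision := radians(lon2 - lon1);
    a double precision;
BEGIN
    a := sin(dlat / 2) ^ 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ^ 2;
    RETURN r_miles * 2 * asin(sqrt(a));
END;
$$ LANGUAGE plpgsql IMMUTABLE;
```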
Aaron Bono wrote:
> Performance tanks with this query - it takes over 120 seconds (that is
> where I set the timeout).
>
> BTW, on our Linux box the full query we run (which adds 3 more tables on
> the whole operation along with more filtering on the zip table) finishes in
> under 10 seconds. Problem is our development is on Windows and this is a
> real pain for developers to test.

So what's different between the systems? Obvious things to look at:

1. EXPLAIN ANALYSE outputs to show the plans (these presumably are
   different, but in what details, and why?)
2. Configuration (particularly memory/cost settings).
3. Hardware.
4. Locale/encoding - these can affect index usage and sorting.

--
Richard Huxton
Archonet Ltd
On 4/25/07, Richard Huxton <dev@archonet.com> wrote:
Now I am having the same problem on the Linux box so I doubt it is the platform.
The biggest problem I notice is when I add a join from a child table (zip_city) to a parent table (zip). I have filtered the child table down to about 650 records but when I add the join to the parent which has over 800,000 records, performance tanks. I was able to benchmark two queries last night on my Windows machine:
-- This runs in just over 2 seconds
select
nearby_zip_city.zip_id,
gps_distance(zip_city.longitude::numeric, zip_city.latitude::numeric, nearby_zip_city.longitude::numeric, nearby_zip_city.latitude::numeric) AS distance
from zip_city
inner join zip on (
zip.zip_id = zip_city.zip_id
)
inner join zip_city as nearby_zip_city on (
abs(zip_city.longitude - nearby_zip_city.longitude) <= (50.0 / 60.0)
AND abs(zip_city.latitude - nearby_zip_city.latitude) <= (50.0 / 60.0)
)
where zip.zip_cd = '66105'
-- This takes over 48 seconds and I just added a join from the zip_city child to the zip parent table
select
nearby_zip.zip_cd,
gps_distance(zip_city.longitude::numeric, zip_city.latitude::numeric, nearby_zip_city.longitude::numeric, nearby_zip_city.latitude::numeric) AS distance
from zip_city
inner join zip on (
zip.zip_id = zip_city.zip_id
)
inner join zip_city as nearby_zip_city on (
abs(zip_city.longitude - nearby_zip_city.longitude) <= (50.0 / 60.0)
AND abs(zip_city.latitude - nearby_zip_city.latitude) <= (50.0 / 60.0)
)
-->>> The next 3 lines are the main difference <<<--
inner join zip as nearby_zip on (
nearby_zip_city.zip_id = nearby_zip.zip_id
)
-->>> End of difference <<<--
where zip.zip_cd = '66105'
-- Explain plan for faster/first query:
Nested Loop  (cost=45779.82..147990502.45 rows=451678770 width=40) (actual time=5404.943..20151.684 rows=653 loops=1)
  Join Filter: ((abs(("inner".longitude - "outer".longitude)) <= 0.833333333333333::double precision) AND (abs(("inner".latitude - "outer".latitude)) <= 0.833333333333333::double precision))
  ->  Seq Scan on zip_city nearby_zip_city  (cost=0.00..25514.19 rows=901719 width=24) (actual time=0.053..2311.547 rows=901719 loops=1)
  ->  Materialize  (cost=45779.82..45824.90 rows=4508 width=16) (actual time=0.003..0.009 rows=1 loops=901719)
        ->  Hash Join  (cost=8944.55..45775.31 rows=4508 width=16) (actual time=444.657..4490.901 rows=1 loops=1)
              Hash Cond: ("outer".zip_id = "inner".zip_id)
              ->  Seq Scan on zip_city  (cost=0.00..25514.19 rows=901719 width=24) (actual time=0.004..2334.548 rows=901719 loops=1)
              ->  Hash  (cost=8933.90..8933.90 rows=4261 width=8) (actual time=0.296..0.296 rows=1 loops=1)
                    ->  Bitmap Heap Scan on zip  (cost=32.91..8933.90 rows=4261 width=8) (actual time=0.272..0.275 rows=1 loops=1)
                          Recheck Cond: ((zip_cd)::text = '66105'::text)
                          ->  Bitmap Index Scan on zip_zip_cd_key  (cost=0.00..32.91 rows=4261 width=0) (actual time=0.250..0.250 rows=1 loops=1)
                                Index Cond: ((zip_cd)::text = '66105'::text)
-- Explain plan for slower/second query:
Nested Loop  (cost=75372.31..148056286.32 rows=451678770 width=43) (actual time=62688.188..69916.943 rows=653 loops=1)
  Join Filter: ((abs(("inner".longitude - "outer".longitude)) <= 0.833333333333333::double precision) AND (abs(("inner".latitude - "outer".latitude)) <= 0.833333333333333::double precision))
  ->  Hash Join  (cost=29592.49..91298.06 rows=901719 width=27) (actual time=17905.224..52279.151 rows=901719 loops=1)
        Hash Cond: ("outer".zip_id = "inner".zip_id)
        ->  Seq Scan on zip_city nearby_zip_city  (cost=0.00..25514.19 rows=901719 width=24) (actual time=0.044..2888.993 rows=901719 loops=1)
        ->  Hash  (cost=21634.79..21634.79 rows=852279 width=19) (actual time=13925.502..13925.502 rows=852279 loops=1)
              ->  Seq Scan on zip nearby_zip  (cost=0.00..21634.79 rows=852279 width=19) (actual time=0.042..2535.742 rows=852279 loops=1)
  ->  Materialize  (cost=45779.82..45824.90 rows=4508 width=16) (actual time=0.002..0.009 rows=1 loops=901719)
        ->  Hash Join  (cost=8944.55..45775.31 rows=4508 width=16) (actual time=421.374..4453.224 rows=1 loops=1)
              Hash Cond: ("outer".zip_id = "inner".zip_id)
              ->  Seq Scan on zip_city  (cost=0.00..25514.19 rows=901719 width=24) (actual time=0.028..2333.941 rows=901719 loops=1)
              ->  Hash  (cost=8933.90..8933.90 rows=4261 width=8) (actual time=0.604..0.604 rows=1 loops=1)
                    ->  Bitmap Heap Scan on zip  (cost=32.91..8933.90 rows=4261 width=8) (actual time=0.588..0.591 rows=1 loops=1)
                          Recheck Cond: ((zip_cd)::text = '66105'::text)
                          ->  Bitmap Index Scan on zip_zip_cd_key  (cost=0.00..32.91 rows=4261 width=0) (actual time=0.559..0.559 rows=1 loops=1)
                                Index Cond: ((zip_cd)::text = '66105'::text)
It is the join that is killing the query, but I am at a loss as to the best approach to fix it. I have some workarounds in mind that flatten out the tables, but I would rather not have to do that.
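[For what it's worth, one untested way to sketch that workaround without flattening any tables is to resolve the restrictive distance filter in a subquery first, and only then look up zip_cd for the ~650 surviving rows. Table and column names are taken from the queries above; note the planner is free to flatten such a subquery back into the same plan, which is why the old OFFSET 0 fence is shown:]

```sql
-- Untested sketch: compute the small nearby set first, then join only that
-- result to zip to fetch zip_cd, instead of joining all 800k+ zip rows.
select
    nearby_zip.zip_cd,
    nearby.distance
from (
    select
        nearby_zip_city.zip_id,
        gps_distance(zip_city.longitude::numeric, zip_city.latitude::numeric,
                     nearby_zip_city.longitude::numeric, nearby_zip_city.latitude::numeric) AS distance
    from zip_city
    inner join zip on (
        zip.zip_id = zip_city.zip_id
    )
    inner join zip_city as nearby_zip_city on (
        abs(zip_city.longitude - nearby_zip_city.longitude) <= (50.0 / 60.0)
        AND abs(zip_city.latitude - nearby_zip_city.latitude) <= (50.0 / 60.0)
    )
    where zip.zip_cd = '66105'
    offset 0  -- optimization fence: discourages the planner from flattening the subquery
) nearby
inner join zip as nearby_zip on (
    nearby_zip.zip_id = nearby.zip_id
);
```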
Thanks!
Aaron
"Aaron Bono" <postgresql@aranya.com> writes:
> The biggest problem I notice is when I add a join from a child table
> (zip_city) to a parent table (zip). I have filtered the child table down to
> about 650 records but when I add the join to the parent which has over
> 800,000 records, performance tanks.

It shouldn't surprise you that joining a 900k row table to an 800k row table isn't cheap.

It would certainly be better if the thing delayed the join to nearby_zip until after it had done the restrictive join. Your problem is it doesn't realize that that join condition is restrictive:

> Nested Loop  (cost=45779.82..147990502.45 rows=451678770 width=40) (actual
>                                           ^^^^^^^^^
> time=5404.943..20151.684 rows=653 loops=1)
> Join Filter: ((abs(("inner".longitude - "outer".longitude)) <=
> 0.833333333333333::double precision) AND (abs(("inner".latitude -
> "outer".latitude)) <= 0.833333333333333::double precision))

which is hardly surprising since the condition is phrased in a way that isn't amenable to statistical analysis. You might want to look into using PostGIS for this sort of thing --- it provides operators that are better suited to the problem domain, and also allows some modicum of intelligence in the rowcount estimates.

Another bad misestimation is here:

> ->  Bitmap Heap Scan on zip  (cost=32.91..8933.90 rows=4261 width=8)
>       (actual time=0.272..0.275 rows=1 loops=1)
>       Recheck Cond: ((zip_cd)::text = '66105'::text)
>       ->  Bitmap Index Scan on zip_zip_cd_key  (cost=0.00..32.91 rows=4261
>             width=0) (actual time=0.250..0.250 rows=1 loops=1)
>             Index Cond: ((zip_cd)::text = '66105'::text)

The error of 4000x here contributes directly to the error in the top-level row estimate; but this one is a simple scalar condition and I'd expect our stats code to be able to deal with it. Are the stats on zip up-to-date? Maybe you need to increase the stats target for it.

			regards, tom lane
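[To make the stats-target suggestion concrete: assuming the schema from earlier in the thread, something along these lines raises the sample size for zip_cd and refreshes the statistics. The value 100 is only illustrative; the per-column default in 8.1 is 10:]

```sql
-- Raise the statistics target for the column the planner is misestimating,
-- then re-collect stats so it can see how selective zip_cd = '66105' really is.
ALTER TABLE zip ALTER COLUMN zip_cd SET STATISTICS 100;
ANALYZE zip;
```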
On 4/25/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Looks like a vacuum analyze did the trick. Performance is beautiful now. I should have tried that earlier.
I thought I had the auto vacuum turned on (PostgreSQL 8.1) but I guess it doesn't do analyze?
Anyway, I will schedule a vacuum analyze nightly - it is low usage and won't cause any problems.
Thanks!
Aaron
Aaron Bono wrote:
> Looks like a vacuum analyze did the trick. Performance is beautiful now. I
> should have tried that earlier.
>
> I thought I had the auto vacuum turned on (PostgreSQL 8.1) but I guess it
> doesn't do analyze?
>
> Anyway, I will schedule a vacuum analyze nightly - it is low usage and won't
> cause any problems.

It should have done an analyze at some point. Unless this is Windows, in which case there's a bug that precludes autovacuum from running at all.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
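[For reference: autovacuum in 8.1 is off by default and depends on the row-level statistics collector, so "turned on" requires all of the settings below. This fragment is recalled from the 8.1 documentation - verify the names and defaults against your version:]

```ini
# postgresql.conf (8.1): autovacuum is disabled by default and requires the
# statistics collector to gather row-level stats.
stats_start_collector = on
stats_row_level = on
autovacuum = on
autovacuum_naptime = 60        # seconds between autovacuum runs
```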