cube operations slower than geo_distance() on production server - Mailing list pgsql-performance

From Mark Stosberg
Subject cube operations slower than geo_distance() on production server
Msg-id eqiht3$2fvu$1@news.hub.org
List pgsql-performance
With the help of some folks on this list, I was able to successfully set up
and benchmark a cube-based replacement for geo_distance() calculations.
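For reference, the cube-based query looks roughly like this (a sketch reconstructed from the EXPLAIN output below; the exact select list and column names such as pets.pet_id are my guesses, not the production query text):

```sql
-- Find available pets of one species near zipcode 90210, ordered by
-- distance in miles. earth_coords is a cube column with a GiST index;
-- 16093.44 m is a 10-mile search radius, 1609.344 m = 1 mile.
SELECT pets.pet_id,
       cube_distance((SELECT earth_coords FROM zipcodes WHERE zipcode = '90210'),
                     zipcodes.earth_coords) / 1609.344 AS distance_mi
FROM pets
JOIN shelters ON pets.shelter_id = shelters.shelter_id
JOIN zipcodes ON shelters.postal_code_for_joining = zipcodes.zipcode
WHERE pets.pet_state = 'available'
  AND pets.species_id = 1
  AND cube_enlarge((SELECT earth_coords FROM zipcodes WHERE zipcode = '90210'),
                   16093.4357308298, 3) @ zipcodes.earth_coords
ORDER BY distance_mi;
```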

On a development box, the cube-based variations consistently benchmarked
at about 1/3 of the time of the geo_distance() equivalents.

After setting up the same columns and indexes on a production
database, it's a different story. All the cube operations show
themselves to be about the same as, or noticeably slower than, the same
operations done with geo_distance().

I've stared at the EXPLAIN ANALYZE output as much as I can to figure out
what's gone wrong. Could you help?

Here's the plan on the production server, which seems too slow. Below that
is the plan I get on the development server, which is much faster.

I tried "set enable_nestloop = off", which did change the plan, but not the performance.
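That experiment had this shape (a sketch; `<query>` stands in for the actual search query, which isn't reproduced here):

```sql
SET enable_nestloop = off;  -- discourage nested-loop joins for this session only
EXPLAIN ANALYZE <query>;    -- the plan changed, the runtime did not
RESET enable_nestloop;      -- restore the planner default
```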

The production DB has much more data in it, but I still expected comparable results relative
to using geo_distance() calculations.

The production db gets a "VACUUM ANALYZE"  every couple of hours now.

Thanks!

  Mark

######## Slower plan in production:

 Sort  (cost=6617.03..6617.10 rows=27 width=32) (actual time=2482.915..2487.008 rows=1375 loops=1)
   Sort Key: (cube_distance($0, zipcodes.earth_coords) / 1609.344::double precision)
   InitPlan
     ->  Index Scan using zipcodes_pkey on zipcodes  (cost=0.00..3.01 rows=1 width=32) (actual time=0.034..0.038 rows=1 loops=1)
           Index Cond: ((zipcode)::text = '90210'::text)
     ->  Index Scan using zipcodes_pkey on zipcodes  (cost=0.00..3.01 rows=1 width=32) (actual time=0.435..0.438 rows=1 loops=1)
           Index Cond: ((zipcode)::text = '90210'::text)
   ->  Nested Loop  (cost=538.82..6610.36 rows=27 width=32) (actual time=44.660..2476.919 rows=1375 loops=1)
         ->  Nested Loop  (cost=2.15..572.14 rows=9 width=36) (actual time=4.877..39.037 rows=136 loops=1)
               ->  Bitmap Heap Scan on zipcodes  (cost=2.15..150.05 rows=42 width=41) (actual time=3.749..4.951 rows=240 loops=1)
                     Recheck Cond: (cube_enlarge(($1)::cube, 16093.4357308298::double precision, 3) @ earth_coords)
                     ->  Bitmap Index Scan on zip_earth_coords_idx  (cost=0.00..2.15 rows=42 width=0) (actual time=3.658..3.658 rows=240 loops=1)
                           Index Cond: (cube_enlarge(($1)::cube, 16093.4357308298::double precision, 3) @ earth_coords)
               ->  Index Scan using shelters_postal_code_for_joining_idx on shelters  (cost=0.00..10.02 rows=2 width=12) (actual time=0.079..0.133 rows=1 loops=240)
                     Index Cond: ((shelters.postal_code_for_joining)::text = ("outer".zipcode)::text)
         ->  Bitmap Heap Scan on pets  (cost=536.67..670.47 rows=34 width=4) (actual time=16.844..17.830 rows=10 loops=136)
               Recheck Cond: ((pets.shelter_id = "outer".shelter_id) AND ((pets.pet_state)::text = 'available'::text))
               Filter: (species_id = 1)
               ->  BitmapAnd  (cost=536.67..536.67 rows=34 width=0) (actual time=16.621..16.621 rows=0 loops=136)
                     ->  Bitmap Index Scan on pets_shelter_id_idx  (cost=0.00..3.92 rows=263 width=0) (actual time=0.184..0.184 rows=132 loops=136)
                           Index Cond: (pets.shelter_id = "outer".shelter_id)
                     ->  Bitmap Index Scan on pets_pet_state_idx  (cost=0.00..532.50 rows=39571 width=0) (actual time=26.922..26.922 rows=40390 loops=82)
                           Index Cond: ((pet_state)::text = 'available'::text)
 Total runtime: 2492.852 ms


########### Faster plan in development:

 Sort  (cost=7004.53..7004.62 rows=39 width=32) (actual time=54.635..55.450 rows=475 loops=1)
   Sort Key: (cube_distance($0, earth_distance.earth_coords) / 1609.344::double precision)
   InitPlan
     ->  Bitmap Heap Scan on earth_distance  (cost=4.74..624.60 rows=212 width=32) (actual time=0.113..0.115 rows=1 loops=1)
           Recheck Cond: ((zipcode)::text = '90210'::text)
           ->  Bitmap Index Scan on earth_distance_zipcode_idx  (cost=0.00..4.74 rows=212 width=0) (actual time=0.101..0.101 rows=2 loops=1)
                 Index Cond: ((zipcode)::text = '90210'::text)
     ->  Bitmap Heap Scan on earth_distance  (cost=4.74..624.60 rows=212 width=32) (actual time=0.205..0.208 rows=1 loops=1)
           Recheck Cond: ((zipcode)::text = '90210'::text)
           ->  Bitmap Index Scan on earth_distance_zipcode_idx  (cost=0.00..4.74 rows=212 width=0) (actual time=0.160..0.160 rows=2 loops=1)
                 Index Cond: ((zipcode)::text = '90210'::text)
   ->  Hash Join  (cost=618.67..5754.30 rows=39 width=32) (actual time=13.499..52.924 rows=475 loops=1)
         Hash Cond: ("outer".shelter_id = "inner".shelter_id)
         ->  Bitmap Heap Scan on pets  (cost=44.85..5158.42 rows=4298 width=4) (actual time=4.278..34.192 rows=3843 loops=1)
               Recheck Cond: ((pet_state)::text = 'available'::text)
               Filter: (species_id = 1)
               ->  Bitmap Index Scan on pets_pet_state_idx  (cost=0.00..44.85 rows=6244 width=0) (actual time=3.623..3.623 rows=7257 loops=1)
                     Index Cond: ((pet_state)::text = 'available'::text)
         ->  Hash  (cost=573.65..573.65 rows=66 width=36) (actual time=8.916..8.916 rows=102 loops=1)
               ->  Nested Loop  (cost=3.15..573.65 rows=66 width=36) (actual time=3.004..8.513 rows=102 loops=1)
                     ->  Bitmap Heap Scan on earth_distance  (cost=3.15..152.36 rows=42 width=41) (actual time=2.751..3.432 rows=240 loops=1)
                           Recheck Cond: (cube_enlarge(($1)::cube, 16093.4357308298::double precision, 3) @ earth_coords)
                           ->  Bitmap Index Scan on earth_coords_idx  (cost=0.00..3.15 rows=42 width=0) (actual time=2.520..2.520 rows=480 loops=1)
                                 Index Cond: (cube_enlarge(($1)::cube, 16093.4357308298::double precision, 3) @ earth_coords)
                     ->  Index Scan using shelters_postal_code_for_joining_idx on shelters  (cost=0.00..10.01 rows=2 width=12) (actual time=0.011..0.015 rows=0 loops=240)
                           Index Cond: ((shelters.postal_code_for_joining)::text = ("outer".zipcode)::text)
 Total runtime: 58.038 ms
