Thread: cube operations slower than geo_distance() on production server

From: Mark Stosberg
With the help of some on this list, I was able to successfully set up
and benchmark a cube-based replacement for geo_distance() calculations.

On a development box, the cube-based variations consistently benchmarked
at about 1/3 of the time of the geo_distance() equivalents.

After setting up the same columns and indexes on a production
database, it's a different story. All the cube operations show
themselves to be about the same as, or noticeably slower than, the same
operations done with geo_distance().

I've stared at the EXPLAIN ANALYZE output as much as I can to figure out
what's gone wrong. Could you help?

Here's the plan on the production server, which seems too slow. Below is
the plan I get on the development server, which is much faster.

I tried "set enable_nestloop = off", which did change the plan, but the performance.

The production DB has much more data in it, but I still expected comparable results relative
to using geo_distance() calculations.

The production db gets a "VACUUM ANALYZE" every couple of hours now.

Thanks!

  Mark

########

 Sort  (cost=6617.03..6617.10 rows=27 width=32) (actual time=2482.915..2487.008 rows=1375 loops=1)
   Sort Key: (cube_distance($0, zipcodes.earth_coords) / 1609.344::double precision)
   InitPlan
     ->  Index Scan using zipcodes_pkey on zipcodes  (cost=0.00..3.01 rows=1 width=32) (actual time=0.034..0.038 rows=1 loops=1)
           Index Cond: ((zipcode)::text = '90210'::text)
     ->  Index Scan using zipcodes_pkey on zipcodes  (cost=0.00..3.01 rows=1 width=32) (actual time=0.435..0.438 rows=1 loops=1)
           Index Cond: ((zipcode)::text = '90210'::text)
   ->  Nested Loop  (cost=538.82..6610.36 rows=27 width=32) (actual time=44.660..2476.919 rows=1375 loops=1)
         ->  Nested Loop  (cost=2.15..572.14 rows=9 width=36) (actual time=4.877..39.037 rows=136 loops=1)
               ->  Bitmap Heap Scan on zipcodes  (cost=2.15..150.05 rows=42 width=41) (actual time=3.749..4.951 rows=240 loops=1)
                     Recheck Cond: (cube_enlarge(($1)::cube, 16093.4357308298::double precision, 3) @ earth_coords)
                     ->  Bitmap Index Scan on zip_earth_coords_idx  (cost=0.00..2.15 rows=42 width=0) (actual time=3.658..3.658 rows=240 loops=1)
                           Index Cond: (cube_enlarge(($1)::cube, 16093.4357308298::double precision, 3) @ earth_coords)
               ->  Index Scan using shelters_postal_code_for_joining_idx on shelters  (cost=0.00..10.02 rows=2 width=12) (actual time=0.079..0.133 rows=1 loops=240)
                     Index Cond: ((shelters.postal_code_for_joining)::text = ("outer".zipcode)::text)
         ->  Bitmap Heap Scan on pets  (cost=536.67..670.47 rows=34 width=4) (actual time=16.844..17.830 rows=10 loops=136)
               Recheck Cond: ((pets.shelter_id = "outer".shelter_id) AND ((pets.pet_state)::text = 'available'::text))
               Filter: (species_id = 1)
               ->  BitmapAnd  (cost=536.67..536.67 rows=34 width=0) (actual time=16.621..16.621 rows=0 loops=136)
                     ->  Bitmap Index Scan on pets_shelter_id_idx  (cost=0.00..3.92 rows=263 width=0) (actual time=0.184..0.184 rows=132 loops=136)
                           Index Cond: (pets.shelter_id = "outer".shelter_id)
                     ->  Bitmap Index Scan on pets_pet_state_idx  (cost=0.00..532.50 rows=39571 width=0) (actual time=26.922..26.922 rows=40390 loops=82)
                           Index Cond: ((pet_state)::text = 'available'::text)
 Total runtime: 2492.852 ms


########### Faster plan in development:

 Sort  (cost=7004.53..7004.62 rows=39 width=32) (actual time=54.635..55.450 rows=475 loops=1)
   Sort Key: (cube_distance($0, earth_distance.earth_coords) / 1609.344::double precision)
   InitPlan
     ->  Bitmap Heap Scan on earth_distance  (cost=4.74..624.60 rows=212 width=32) (actual time=0.113..0.115 rows=1 loops=1)
           Recheck Cond: ((zipcode)::text = '90210'::text)
           ->  Bitmap Index Scan on earth_distance_zipcode_idx  (cost=0.00..4.74 rows=212 width=0) (actual time=0.101..0.101 rows=2 loops=1)
                 Index Cond: ((zipcode)::text = '90210'::text)
     ->  Bitmap Heap Scan on earth_distance  (cost=4.74..624.60 rows=212 width=32) (actual time=0.205..0.208 rows=1 loops=1)
           Recheck Cond: ((zipcode)::text = '90210'::text)
           ->  Bitmap Index Scan on earth_distance_zipcode_idx  (cost=0.00..4.74 rows=212 width=0) (actual time=0.160..0.160 rows=2 loops=1)
                 Index Cond: ((zipcode)::text = '90210'::text)
   ->  Hash Join  (cost=618.67..5754.30 rows=39 width=32) (actual time=13.499..52.924 rows=475 loops=1)
         Hash Cond: ("outer".shelter_id = "inner".shelter_id)
         ->  Bitmap Heap Scan on pets  (cost=44.85..5158.42 rows=4298 width=4) (actual time=4.278..34.192 rows=3843 loops=1)
               Recheck Cond: ((pet_state)::text = 'available'::text)
               Filter: (species_id = 1)
               ->  Bitmap Index Scan on pets_pet_state_idx  (cost=0.00..44.85 rows=6244 width=0) (actual time=3.623..3.623 rows=7257 loops=1)
                     Index Cond: ((pet_state)::text = 'available'::text)
         ->  Hash  (cost=573.65..573.65 rows=66 width=36) (actual time=8.916..8.916 rows=102 loops=1)
               ->  Nested Loop  (cost=3.15..573.65 rows=66 width=36) (actual time=3.004..8.513 rows=102 loops=1)
                     ->  Bitmap Heap Scan on earth_distance  (cost=3.15..152.36 rows=42 width=41) (actual time=2.751..3.432 rows=240 loops=1)
                           Recheck Cond: (cube_enlarge(($1)::cube, 16093.4357308298::double precision, 3) @ earth_coords)
                           ->  Bitmap Index Scan on earth_coords_idx  (cost=0.00..3.15 rows=42 width=0) (actual time=2.520..2.520 rows=480 loops=1)
                                 Index Cond: (cube_enlarge(($1)::cube, 16093.4357308298::double precision, 3) @ earth_coords)
                     ->  Index Scan using shelters_postal_code_for_joining_idx on shelters  (cost=0.00..10.01 rows=2 width=12) (actual time=0.011..0.015 rows=0 loops=240)
                           Index Cond: ((shelters.postal_code_for_joining)::text = ("outer".zipcode)::text)
 Total runtime: 58.038 ms

Re: cube operations slower than geo_distance() on production server

From: "Merlin Moncure"
On 2/10/07, Mark Stosberg <mark@summersault.com> wrote:
>
> With the help of some on this list, I was able to successfully set up
> and benchmark a cube-based replacement for geo_distance() calculations.
>
> On a development box, the cube-based variations consistently benchmarked
> at about 1/3 of the time of the geo_distance() equivalents.
>
> After setting up the same columns and indexes on a production
> database, it's a different story. All the cube operations show
> themselves to be about the same as, or noticeably slower than, the same
> operations done with geo_distance().
>
> I've stared at the EXPLAIN ANALYZE output as much as I can to figure out
> what's gone wrong. Could you help?
>
> Here's the plan on the production server, which seems too slow. Below is
> the plan I get on the development server, which is much faster.
>
> I tried "set enable_nestloop = off", which did change the plan, but not the performance.
>
> The production DB has much more data in it, but I still expected comparable results relative
> to using geo_distance() calculations.

any objection to posting the query (and maybe tables, keys, indexes, etc)?

merlin

Re: cube operations slower than geo_distance() on production server

From: Mark Stosberg
Merlin Moncure wrote:
> On 2/10/07, Mark Stosberg <mark@summersault.com> wrote:
>>
>> With the help of some on this list, I was able to successfully set up
>> and benchmark a cube-based replacement for geo_distance() calculations.
>>
>> On a development box, the cube-based variations consistently benchmarked
>> at about 1/3 of the time of the geo_distance() equivalents.
>>
>> After setting up the same columns and indexes on a production
>> database, it's a different story. All the cube operations show
>> themselves to be about the same as, or noticeably slower than, the same
>> operations done with geo_distance().
>>
>> I've stared at the EXPLAIN ANALYZE output as much as I can to figure out
>> what's gone wrong. Could you help?
>>
>> Here's the plan on the production server, which seems too slow. Below is
>> the plan I get on the development server, which is much faster.
>>
>> I tried "set enable_nestloop = off", which did change the plan, but
>> not the performance.
>>
>> The production DB has much more data in it, but I still expected
>> comparable results relative
>> to using geo_distance() calculations.
>
> any objection to posting the query (and maybe tables, keys, indexes, etc)?

Here's the basic query I'm using:
SELECT
 -- 1609.344 is a constant for "meters per mile"
 cube_distance( (SELECT earth_coords from zipcodes WHERE zipcode =
'90210') , earth_coords)/1609.344
   AS RADIUS
   FROM pets
   -- "shelters_active" is a view where "shelter_state = 'active'"
   JOIN shelters_active as shelters USING (shelter_id)
   -- The zipcode fields here are varchars
   JOIN zipcodes ON (
        shelters.postal_code_for_joining = zipcodes.zipcode )
   -- search for just 'dogs'
   WHERE species_id = 1
       AND pet_state='available'
      AND earth_box(
        (SELECT earth_coords from zipcodes WHERE zipcode = '90210') ,
10*1609.344
      ) @ earth_coords
   ORDER BY RADIUS;

All the related columns are indexed:
   pets.species_id
   pets.shelter_id
   pets.pet_state

   shelters.shelter_id (pk)
   shelters.postal_code_for_joining
   shelters.active

   zipcodes.zipcode (pk)
   zipcodes.earth_coords
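
For reference, the DDL for these would look roughly like this (a sketch
reconstructed from the plan output in this thread; the species_id index
name is a guess, and earth_coords uses a GiST index, which the cube "@"
containment operator requires):

CREATE INDEX pets_species_id_idx ON pets (species_id);   -- name assumed
CREATE INDEX pets_shelter_id_idx ON pets (shelter_id);
CREATE INDEX pets_pet_state_idx ON pets (pet_state);
CREATE INDEX shelters_postal_code_for_joining_idx
    ON shelters (postal_code_for_joining);
CREATE INDEX zip_earth_coords_idx ON zipcodes USING gist (earth_coords);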

The pets table has about 300,000 rows, but only about 10% are
"available". It sees regular updates and is "vacuum analyzed" every
couple of hours now. The rest of the tables get "vacuum analyzed"
nightly. The shelters table is about 99% "shelter_state = active".
It's updated infrequently.

The zipcodes table has about 40,000 rows in it and doesn't change.

I tried a partial index on the pets table ("WHERE pet_state =
'available'"). I could see the index was used, but the performance was
unaffected.
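
For reference, that partial index was declared roughly like this (a
sketch; the indexed column and the index name here are guesses, since the
exact definition wasn't posted):

CREATE INDEX pets_available_idx ON pets (shelter_id)
    WHERE pet_state = 'available';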

The "EXPLAIN ANALYZE" output is attached, to try to avoid mail-client
wrapping. The query is running 10 times slower today than on Friday,
perhaps because of server load, or because we are at the end of a VACUUM
cycle.

Thanks for any help!

    Mark
Sort  (cost=6887.03..6887.10 rows=27 width=32) (actual time=17925.098..17927.979 rows=1324 loops=1)
   Sort Key: (cube_distance($0, zipcodes.earth_coords) / 1609.344::double precision)
   InitPlan
     ->  Index Scan using zipcodes_pkey on zipcodes  (cost=0.00..3.01 rows=1 width=32) (actual time=0.028..0.031 rows=1 loops=1)
           Index Cond: ((zipcode)::text = '90210'::text)
     ->  Index Scan using zipcodes_pkey on zipcodes  (cost=0.00..3.01 rows=1 width=32) (actual time=0.687..0.692 rows=1 loops=1)
           Index Cond: ((zipcode)::text = '90210'::text)
   ->  Nested Loop  (cost=568.82..6880.36 rows=27 width=32) (actual time=346.932..17919.697 rows=1324 loops=1)
         ->  Nested Loop  (cost=2.15..572.14 rows=9 width=36) (actual time=8.321..43.378 rows=136 loops=1)
               ->  Bitmap Heap Scan on zipcodes  (cost=2.15..150.05 rows=42 width=41) (actual time=3.442..4.402 rows=240 loops=1)
                     Recheck Cond: (cube_enlarge(($1)::cube, 16093.4357308298::double precision, 3) @ earth_coords)
                     ->  Bitmap Index Scan on zip_earth_coords_idx  (cost=0.00..2.15 rows=42 width=0) (actual time=3.426..3.426 rows=240 loops=1)
                           Index Cond: (cube_enlarge(($1)::cube, 16093.4357308298::double precision, 3) @ earth_coords)
               ->  Index Scan using shelters_postal_code_for_joining_idx on shelters  (cost=0.00..10.02 rows=2 width=12) (actual time=0.093..0.155 rows=1 loops=240)
                     Index Cond: ((shelters.postal_code_for_joining)::text = ("outer".zipcode)::text)
         ->  Bitmap Heap Scan on pets  (cost=566.67..700.47 rows=34 width=4) (actual time=130.363..131.367 rows=10 loops=136)
               Recheck Cond: ((pets.shelter_id = "outer".shelter_id) AND ((pets.pet_state)::text = 'available'::text))
               Filter: (species_id = 1)
               ->  BitmapAnd  (cost=566.67..566.67 rows=34 width=0) (actual time=129.333..129.333 rows=0 loops=136)
                     ->  Bitmap Index Scan on pets_shelter_id_idx  (cost=0.00..3.92 rows=263 width=0) (actual time=0.164..0.164 rows=178 loops=136)
                           Index Cond: (pets.shelter_id = "outer".shelter_id)
                     ->  Bitmap Index Scan on pets_pet_state_idx  (cost=0.00..562.50 rows=39571 width=0) (actual time=213.620..213.620 rows=195599 loops=82)
                           Index Cond: ((pet_state)::text = 'available'::text)
 Total runtime: 17933.675 ms

Re: cube operations slower than geo_distance() on production server

From: "Merlin Moncure"
On 2/12/07, Mark Stosberg <mark@summersault.com> wrote:
> Merlin Moncure wrote:
> > On 2/10/07, Mark Stosberg <mark@summersault.com> wrote:
> >>
> >> With the help of some on this list, I was able to successfully set up
> >> and benchmark a cube-based replacement for geo_distance() calculations.
> >>
> >> On a development box, the cube-based variations consistently benchmarked
> >> at about 1/3 of the time of the geo_distance() equivalents.
> >>
> >> After setting up the same columns and indexes on a production
> >> database, it's a different story. All the cube operations show
> >> themselves to be about the same as, or noticeably slower than, the same
> >> operations done with geo_distance().
> >>
> >> I've stared at the EXPLAIN ANALYZE output as much as I can to figure out
> >> what's gone wrong. Could you help?
> >>
> >> Here's the plan on the production server, which seems too slow. Below
> >> is the plan I get on the development server, which is much faster.
> >>
> >> I tried "set enable_nestloop = off", which did change the plan, but
> >> not the performance.
> >>
> >> The production DB has much more data in it, but I still expected
> >> comparable results relative
> >> to using geo_distance() calculations.
> >
> > any objection to posting the query (and maybe tables, keys, indexes, etc)?
>
> Here's the basic query I'm using:
> SELECT
>  -- 1609.344 is a constant for "meters per mile"
>  cube_distance( (SELECT earth_coords from zipcodes WHERE zipcode =
> '90210') , earth_coords)/1609.344
>    AS RADIUS
>    FROM pets
>    -- "shelters_active" is a view where "shelter_state = 'active'"
>    JOIN shelters_active as shelters USING (shelter_id)
>    -- The zipcode fields here are varchars
>    JOIN zipcodes ON (
>         shelters.postal_code_for_joining = zipcodes.zipcode )
>    -- search for just 'dogs'
>    WHERE species_id = 1
>        AND pet_state='available'
>       AND earth_box(
>         (SELECT earth_coords from zipcodes WHERE zipcode = '90210') ,
> 10*1609.344
>       ) @ earth_coords
>    ORDER BY RADIUS;

your query looks a bit funky. here are the problems I see.

* in your field list, you don't need to re-query the zipcode table.
>  cube_distance( (SELECT earth_coords from zipcodes WHERE zipcode =
> '90210') , earth_coords)/1609.344 AS RADIUS

becomes

 cube_distance(pets.earth_coords, earth_coords ) / 1609.344 AS RADIUS

also, don't re-refer to the zipcodes table in the join clause. you are
already joining to it:
>       AND earth_box(
>         (SELECT earth_coords from zipcodes WHERE zipcode = '90210') ,
> 10*1609.344) @ earth_coords

becomes

  AND earth_box(zipcodes.earth_coords, 10*1609.344) @ pets.earth_coords

* also, does pet_state have any other states than 'available' and
'not available'? if not, you should be using a boolean. if so, you can
consider a functional index to convert it to a boolean.

* if you always look up pets by species, we can explore composite
index columns on species, available (especially using the above
functional suggestion), etc.  composite > partial (imo)
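
for example, something like this (a sketch; the index name is made up):

CREATE INDEX pets_state_species_idx ON pets (pet_state, species_id);

a query filtering on pet_state = 'available' AND species_id = 1 can then
be answered from that one index, instead of BitmapAnd-ing two indexes.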

that's just to start. play with it and see what comes up.

merlin

Re: cube operations slower than geo_distance() on production server

From: "Merlin Moncure"
On 2/12/07, Merlin Moncure <mmoncure@gmail.com> wrote:
>  cube_distance(pets.earth_coords, earth_coords ) / 1609.344 AS RADIUS

this should read:
cube_distance(pets.earth_coords, zipcodes.earth_coords ) / 1609.344 AS RADIUS

merlin

Re: cube operations slower than geo_distance() on production server

From: Mark Stosberg
Merlin--

Thanks so much for your help. Some follow-ups are below.

Merlin Moncure wrote:
>
>> Here's the basic query I'm using:
>> SELECT
>>  -- 1609.344 is a constant for "meters per mile"
>>  cube_distance( (SELECT earth_coords from zipcodes WHERE zipcode =
>> '90210') , earth_coords)/1609.344
>>    AS RADIUS
>>    FROM pets
>>    -- "shelters_active" is a view where "shelter_state = 'active'"
>>    JOIN shelters_active as shelters USING (shelter_id)
>>    -- The zipcode fields here are varchars
>>    JOIN zipcodes ON (
>>         shelters.postal_code_for_joining = zipcodes.zipcode )
>>    -- search for just 'dogs'
>>    WHERE species_id = 1
>>        AND pet_state='available'
>>       AND earth_box(
>>         (SELECT earth_coords from zipcodes WHERE zipcode = '90210') ,
>> 10*1609.344
>>       ) @ earth_coords
>>    ORDER BY RADIUS;
>
> your query looks a bit funky. here are the problems I see.
>
> * in your field list, you don't need to re-query the zipcode table.
>>  cube_distance( (SELECT earth_coords from zipcodes WHERE zipcode =
>> '90210') , earth_coords)/1609.344 AS RADIUS
>
> becomes
>
> cube_distance(pets.earth_coords, earth_coords ) / 1609.344 AS RADIUS

It may not have been clear from the query, but only the 'zipcodes' table
has an 'earth_coords' column. Also, I think your refactoring means
something different. My query expresses "number of miles this pet is
from 90210", while I think the refactor expresses a distance between a
pet and another calculated value.

> also, don't re-refer to the zipcodes table in the join clause. you are
> already joining to it:
>>       AND earth_box(
>>         (SELECT earth_coords from zipcodes WHERE zipcode = '90210') ,
>> 10*1609.344) @ earth_coords
>
> becomes
>
>  AND earth_box(zipcodes.earth_coords, 10*1609.344) @ pets.earth_coords

I have the same question here as above-- I don't see how the new syntax
includes the logic of "distance from the 90210 zipcode".

> * also, does pet_state have any other states than 'available' and
> 'not available'? if not, you should be using a boolean. if so, you can
> consider a functional index to convert it to a boolean.

Yes, it has three states.

> * if you always look up pets by species, we can explore composite
> index columns on species, available (especially using the above
> functional suggestion), etc.  composite > partial (imo)

We nearly always search by species. Right now it's mostly dogs and some
cats. I searched for references to composite index columns, and didn't
find much. Could you provide a direct reference to what you have in
mind?

Any other ideas appreciated!

    Mark

Re: cube operations slower than geo_distance() on production server

From: "Merlin Moncure"
On 2/12/07, Mark Stosberg <mark@summersault.com> wrote:
> Merlin Moncure wrote:
> >
> >> Here's the basic query I'm using:
> >> SELECT
> >>  -- 1609.344 is a constant for "meters per mile"
> >>  cube_distance( (SELECT earth_coords from zipcodes WHERE zipcode =
> >> '90210') , earth_coords)/1609.344
> >>    AS RADIUS
> >>    FROM pets
> >>    -- "shelters_active" is a view where "shelter_state = 'active'"
> >>    JOIN shelters_active as shelters USING (shelter_id)
> >>    -- The zipcode fields here are varchars
> >>    JOIN zipcodes ON (
> >>         shelters.postal_code_for_joining = zipcodes.zipcode )
> >>    -- search for just 'dogs'
> >>    WHERE species_id = 1
> >>        AND pet_state='available'
> >>       AND earth_box(
> >>         (SELECT earth_coords from zipcodes WHERE zipcode = '90210') ,
> >> 10*1609.344
> >>       ) @ earth_coords
> >>    ORDER BY RADIUS;
> >
> It may not have been clear from the query, but only the 'zipcodes' table
> has an 'earth_coords' column. Also, I think your refactoring means
> something different. My query expresses "number of miles this pet is
> from 90210", while I think the refactor expresses a distance between a
> pet and another calculated value.

my mistake, I misunderstood what you were trying to do... can you try
removing the 'order by radius' and see if it helps? if not, we can try
working on this query some more. There is a better, faster way to do
this, I'm sure of it.

merlin

Re: cube operations slower than geo_distance() on production server

From: "Merlin Moncure"
On 2/13/07, Merlin Moncure <mmoncure@gmail.com> wrote:
> On 2/12/07, Mark Stosberg <mark@summersault.com> wrote:
> > Merlin Moncure wrote:
> > >
> > >> Here's the basic query I'm using:
> > >> SELECT
> > >>  -- 1609.344 is a constant for "meters per mile"
> > >>  cube_distance( (SELECT earth_coords from zipcodes WHERE zipcode =
> > >> '90210') , earth_coords)/1609.344
> > >>    AS RADIUS
> > >>    FROM pets
> > >>    -- "shelters_active" is a view where "shelter_state = 'active'"
> > >>    JOIN shelters_active as shelters USING (shelter_id)
> > >>    -- The zipcode fields here are varchars
> > >>    JOIN zipcodes ON (
> > >>         shelters.postal_code_for_joining = zipcodes.zipcode )
> > >>    -- search for just 'dogs'
> > >>    WHERE species_id = 1
> > >>        AND pet_state='available'
> > >>       AND earth_box(
> > >>         (SELECT earth_coords from zipcodes WHERE zipcode = '90210') ,
> > >> 10*1609.344
> > >>       ) @ earth_coords
> > >>    ORDER BY RADIUS;
> > >
> > It may not have been clear from the query, but only the 'zipcodes' table
> > has an 'earth_coords' column. Also, I think your refactoring means
> > something different. My query expresses "number of miles this pet is
> > from 90210", while I think the refactor expresses a distance between a
> > pet and another calculated value.
>
> my mistake, I misunderstood what you were trying to do... can you try
> removing the 'order by radius' and see if it helps? if not, we can try
> working on this query some more. There is a better, faster way to do
> this, I'm sure of it.

try this:

SELECT * FROM
(
SELECT
  cube_distance(q.earth_coords, s.earth_coords)/1609.344 as radius
  FROM pets
  JOIN shelters_active as shelters USING (shelter_id)
  JOIN zipcodes s ON shelters.postal_code_for_joining = s.zipcode
  JOIN zipcodes q ON q.zipcode = '90210'
  WHERE species_id = 1
    AND pet_state='available'
    AND earth_box(q.earth_coords, 10*1609.344) @ s.earth_coords
) p order by radius

merlin

Re: cube operations slower than geo_distance() on production server

From: Mark Stosberg
On Tue, Feb 13, 2007 at 09:31:18AM -0500, Merlin Moncure wrote:
>
> >my mistake, I misunderstood what you were trying to do... can you try
> >removing the 'order by radius' and see if it helps? if not, we can try
> >working on this query some more. There is a better, faster way to do
> >this, I'm sure of it.

Merlin,

Thanks again for your help. I did try without the "order by", and it
didn't make much difference.

> try this:

Based on your example, I was able to further refine the query to remove
the duplicate sub-selects that I had. However, this didn't seem to
improve performance.

I'm still stuck with the same essential problem: on the development
server, where there is less data (400 results returned vs. 1300), the
cube search is at least twice as fast, but on the production server, it
is consistently slower.

So, either the difference is one of scale, or I have some different
configuration detail in production that is causing the issue.

For reference, here's two versions of the query. The first uses
the old geo_distance(), and the second one is the new cube query I'm
trying, inspired by your suggested refactoring.

It's not surprising to me that the queries run at different speeds
on different servers, but it /is/ surprising that their relative speeds
reverse!

    Mark

-- Searching for all dogs within 10 miles of 90210 zipcode
EXPLAIN ANALYZE
SELECT
    zipcodes.lon_lat <@> center.lon_lat AS radius
        FROM (SELECT lon_lat FROM zipcodes WHERE zipcode = '90210') as center,
            pets
        JOIN shelters_active as shelters USING (shelter_id)
        JOIN zipcodes on (shelters.postal_code_for_joining = zipcodes.zipcode)
        WHERE species_id = 1
            AND pet_state='available'
            AND (zipcodes.lon_lat <@> center.lon_lat) < 10
        ORDER BY RADIUS;


EXPLAIN ANALYZE
SELECT
   cube_distance( center.earth_coords , zipcodes.earth_coords)/1609.344
   AS RADIUS
   FROM (SELECT
            earth_coords,
            earth_box( earth_coords , 10*1609.344 ) as center_box
            from zipcodes WHERE zipcode = '90210'
        ) AS center,
        pets
   JOIN shelters_active AS shelters USING (shelter_id)
   JOIN zipcodes ON ( shelters.postal_code_for_joining = zipcodes.zipcode )
   WHERE species_id = 1
       AND pet_state='available'
       AND center_box @ zipcodes.earth_coords
          ORDER BY RADIUS;


Re: cube operations slower than geo_distance() on production server

From: Tom Lane
Mark Stosberg <mark@summersault.com> writes:
> For reference, here's two versions of the query. The first uses
> the old geo_distance(), and the second one is the new cube query I'm
> trying, inspired by your suggested refactoring.

You didn't show EXPLAIN ANALYZE output :-(

Looking back in the thread, the last E.A. output I see is in your
message of 2/12 11:11, and the striking thing there is that it seems all
the time is going into one indexscan:

    ->  Bitmap Index Scan on pets_pet_state_idx  (cost=0.00..562.50 rows=39571 width=0) (actual time=213.620..213.620 rows=195599 loops=82)
              Index Cond: ((pet_state)::text = 'available'::text)
 Total runtime: 17933.675 ms

213.620 * 82 = 17516.840, so this step is all but 400msec of the run.

There are two things wrong here: first, that the estimated row count is
only 20% of actual; it should certainly not be that far off for such a
simple condition.  I wonder if your vacuum/analyze procedures are
actually working.  Second, you mentioned somewhere along the line that
'available' pets are about 10% of all the entries, which means that this
indexscan is more than likely entirely counterproductive: it would be
cheaper to ignore this index altogether.

Suggestions:

1. ANALYZE the table by hand, try the explain again and see if this
rowcount estimate gets better.  If so, you need to look into why your
existing procedures aren't keeping the stats up to date.

2. If, with a more accurate rowcount estimate, the planner still wants
to use this index, try discouraging it.  Brute force would be to drop
the index.  If there are less-common pet_states that are actually worth
searching for, maybe keep the index but change it to a partial index
WHERE pet_state != 'available'.
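
For example (a sketch of the above; the partial-index name here is made
up):

ANALYZE pets;

-- and, if the planner still insists on using the index:
DROP INDEX pets_pet_state_idx;
CREATE INDEX pets_pet_state_partial_idx ON pets (pet_state)
    WHERE pet_state != 'available';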

Also, I don't see that you mentioned anywhere what PG version you are
running, but if it's not the latest then an update might help.  I recall
having fixed a bug that made the planner too eager to AND on an index
that wouldn't actually help much ... which seems to fit this problem
description pretty well.

            regards, tom lane

Re: cube operations slower than geo_distance() on production server

From: "Merlin Moncure"
On 2/14/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> There are two things wrong here: first, that the estimated row count is
> only 20% of actual; it should certainly not be that far off for such a
> simple condition.  I wonder if your vacuum/analyze procedures are
> actually working.  Second, you mentioned somewhere along the line that
> 'available' pets are about 10% of all the entries, which means that this
> indexscan is more than likely entirely counterproductive: it would be
> cheaper to ignore this index altogether.

I think switching the index on pet_state to a composite on (pet_state,
species_id) might help too.

or even better:

create function is_pet_available(text) returns bool as
$$
  select $1='available';
$$ language sql immutable;

create index pets_available_species_idx on
pets(is_pet_available(pet_state), species_id);

refactor your query to something similar to:

SELECT * FROM
(
SELECT
 cube_distance(q.earth_coords, s.earth_coords)/1609.344 as radius
 FROM pets
 JOIN shelters_active as shelters USING (shelter_id)
 JOIN zipcodes s ON shelters.postal_code_for_joining = s.zipcode
 JOIN zipcodes q ON q.zipcode = '90210'
 WHERE
   is_pet_available(pet_state)
   AND species_id = 1
   AND earth_box(q.earth_coords, 10*1609.344) @ s.earth_coords
) p order by radius

merlin

Merlin Moncure wrote:
> On 2/14/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> There are two things wrong here: first, that the estimated row
>> count is only 20% of actual; it should certainly not be that far
>> off for such a simple condition. I wonder if your vacuum/analyze
>> procedures are actually working. Second, you mentioned somewhere
>> along the line that 'available' pets are about 10% of all the
>> entries, which means that this indexscan is more than likely
>> entirely counterproductive: it would be cheaper to ignore this
>> index altogether.

Tom,

Thanks for the generosity of your time. We are using 8.1.3 currently. I
have read there are some performance improvements in 8.2, but we have
not started evaluating that yet.

Your suggestion about the pet_state index was right on. I tried
"Analyze" on it, but still got the same bad estimate. However, I then
used "reindex" on that index, and that fixed the estimate accuracy,
which made the query run faster! The cube search now benchmarks faster
than the old search in production, taking about 2/3 of the time of the
old one.

Any ideas why the manual REINDEX did something that "analyze" didn't? It
makes me wonder if there is other tuning like this to do.

Attached is the EA output from the most recent run, after the "re-index".

> I think switching the index on pet_state to a composite on (pet_state,
> species_id) might help too.
>
> or even better:
>
> create function is_pet_available(text) returns bool as
> $$
>  select $1='available';
> $$ language sql immutable;
>
> create index pets_available_species_idx on
> pets(is_pet_available(pet_state), species_id);

Merlin,

Thanks for this suggestion. It is not an approach I had used before, and
I was interested to try it. However, the new index didn't get chosen.
(Perhaps I would need to drop the old one?) Tom's suggestions did help,
though. I'll follow up on that in just a moment.

>
> refactor your query something similar to:
>
> SELECT * FROM
> (
> SELECT
> cube_distance(q.earth_coords, s.earth_coords)/1609.344 as radius
> FROM pets
> JOIN shelters_active as shelters USING (shelter_id)
> JOIN zipcodes s ON shelters.postal_code_for_joining = s.zipcode
> JOIN zipcodes q ON q.zipcode = '90210'
> WHERE
>   is_pet_available(pet_state)
>   AND species_id = 1
>   AND earth_box(q.earth_coords, 10*1609.344) @ s.earth_coords
> ) p order by radius
>
> merlin

Sort  (cost=5276.93..5277.00 rows=28 width=64) (actual time=1981.830..1984.415 rows=1344 loops=1)
   Sort Key: (cube_distance(public.zipcodes.earth_coords, public.zipcodes.earth_coords) / 1609.344::double precision)
   ->  Nested Loop  (cost=291.32..5276.26 rows=28 width=64) (actual time=24.080..1976.479 rows=1344 loops=1)
         ->  Nested Loop  (cost=2.15..575.79 rows=11 width=68) (actual time=2.637..34.067 rows=131 loops=1)
               ->  Nested Loop  (cost=2.15..153.48 rows=42 width=73) (actual time=1.939..3.972 rows=240 loops=1)
                     ->  Index Scan using zipcodes_pkey on zipcodes  (cost=0.00..3.01 rows=1 width=32) (actual time=0.283..0.287 rows=1 loops=1)
                           Index Cond: ((zipcode)::text = '90210'::text)
                     ->  Bitmap Heap Scan on zipcodes  (cost=2.15..149.84 rows=42 width=41) (actual time=1.403..2.323 rows=240 loops=1)
                           Recheck Cond: (cube_enlarge(("outer".earth_coords)::cube, 16093.4357308298::double precision, 3) @ zipcodes.earth_coords)
                           ->  Bitmap Index Scan on zip_earth_coords_idx  (cost=0.00..2.15 rows=42 width=0) (actual time=1.377..1.377 rows=240 loops=1)
                                 Index Cond: (cube_enlarge(("outer".earth_coords)::cube, 16093.4357308298::double precision, 3) @ zipcodes.earth_coords)
               ->  Index Scan using shelters_postal_code_for_joining_idx on shelters  (cost=0.00..10.03 rows=2 width=12) (actual time=0.064..0.118 rows=1 loops=240)
                     Index Cond: ((shelters.postal_code_for_joining)::text = ("outer".zipcode)::text)
                     Filter: ((shelter_state)::text = 'active'::text)
         ->  Bitmap Heap Scan on pets  (cost=289.17..426.86 rows=35 width=4) (actual time=14.362..14.746 rows=10 loops=131)
               Recheck Cond: ((pets.shelter_id = "outer".shelter_id) AND ((pets.pet_state)::text = 'available'::text))
               Filter: (species_id = 1)
               ->  BitmapAnd  (cost=289.17..289.17 rows=35 width=0) (actual time=14.219..14.219 rows=0 loops=131)
                     ->  Bitmap Index Scan on pets_shelter_id_idx  (cost=0.00..3.89 rows=256 width=0) (actual time=0.188..0.188 rows=168 loops=131)
                           Index Cond: (pets.shelter_id = "outer".shelter_id)
                     ->  Bitmap Index Scan on pets_pet_state_idx  (cost=0.00..285.02 rows=41149 width=0) (actual time=22.043..22.043 rows=40397 loops=82)
                           Index Cond: ((pet_state)::text = 'available'::text)
 Total runtime: 1988.962 ms

Mark Stosberg <mark@summersault.com> writes:
> Your suggestion about the pet_state index was right on. I tried
> "Analyze" on it, but still got the same bad estimate. However, I then
> used "reindex" on that index, and that fixed the estimate accuracy,
> which made the query run faster!

No, the estimate is about the same, and so is the plan.  The data seems
to have changed though --- on Monday you had

    ->  Bitmap Index Scan on pets_pet_state_idx  (cost=0.00..562.50 rows=39571 width=0) (actual time=213.620..213.620 rows=195599 loops=82)
           Index Cond: ((pet_state)::text = 'available'::text)

and now it's

     ->  Bitmap Index Scan on pets_pet_state_idx  (cost=0.00..285.02 rows=41149 width=0) (actual time=22.043..22.043 rows=40397 loops=82)
           Index Cond: ((pet_state)::text = 'available'::text)

Don't tell me you got 155000 pets adopted out yesterday ... what
happened here?

[ thinks... ] One possibility is that those were dead but
not-yet-vacuumed rows.  What's your vacuuming policy on this table?
(A bitmap-index-scan plan node will count dead rows as returned,
unlike all other plan node types, since we haven't actually visited
the heap yet...)
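
(One way to check that: a manual VACUUM VERBOSE on the table reports how
many dead row versions it removes, e.g.

VACUUM VERBOSE pets;

and a large removed-rows count there would fit this explanation.)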

            regards, tom lane

Re: reindex vs 'analyze'

From: Mark Stosberg
On Wed, Feb 14, 2007 at 01:07:23PM -0500, Tom Lane wrote:
> Mark Stosberg <mark@summersault.com> writes:
> > Your suggestion about the pet_state index was right on. I tried
> > "Analyze" on it, but still got the same bad estimate. However, I then
> > used "reindex" on that index, and that fixed the estimate accuracy,
> > which made the query run faster!
>
> No, the estimate is about the same, and so is the plan.  The data seems
> to have changed though --- on Monday you had
>
>     ->  Bitmap Index Scan on pets_pet_state_idx  (cost=0.00..562.50 rows=39571 width=0) (actual time=213.620..213.620 rows=195599 loops=82)
>            Index Cond: ((pet_state)::text = 'available'::text)
>
> and now it's
>
>      ->  Bitmap Index Scan on pets_pet_state_idx  (cost=0.00..285.02 rows=41149 width=0) (actual time=22.043..22.043 rows=40397 loops=82)
>            Index Cond: ((pet_state)::text = 'available'::text)
>
> Don't tell me you got 155000 pets adopted out yesterday ... what
> happened here?

That seemed to be the difference that the "reindex" made. The number of
rows in the table and the number marked "available" are roughly
unchanged.

select count(*) from pets;
 count
--------
 304951
(1 row)

select count(*) from pets where pet_state = 'available';
 count
-------
 39857
(1 row)

It appears just about 400 were marked as "adopted" yesterday.
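
For the full breakdown, a quick count of live rows by state:

select pet_state, count(*) from pets group by pet_state;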

> [ thinks... ] One possibility is that those were dead but
> not-yet-vacuumed rows.  What's your vacuuming policy on this table?

It gets vacuum analyzed every two hours throughout most of the day.
Nightly we vacuum analyze everything, but most of the time we just do
this table.

> (A bitmap-index-scan plan node will count dead rows as returned,
> unlike all other plan node types, since we haven't actually visited
> the heap yet...)

Thanks again for your help, Tom.

    Mark

--
 . . . . . . . . . . . . . . . . . . . . . . . . . . .
   Mark Stosberg            Principal Developer
   mark@summersault.com     Summersault, LLC
   765-939-9301 ext 202     database driven websites
 . . . . . http://www.summersault.com/ . . . . . . . .

Re: reindex vs 'analyze'

From: Mark Stosberg
Tom Lane wrote:
> Mark Stosberg <mark@summersault.com> writes:
>> Your suggestion about the pet_state index was right on. I tried
>> "Analyze" on it, but still got the same bad estimate. However, I then
>> used "reindex" on that index, and that fixed the estimate accuracy,
>> which made the query run faster!
>
> No, the estimate is about the same, and so is the plan.  The data seems
> to have changed though --- on Monday you had
>
>     ->  Bitmap Index Scan on pets_pet_state_idx  (cost=0.00..562.50 rows=39571 width=0) (actual time=213.620..213.620 rows=195599 loops=82)
>            Index Cond: ((pet_state)::text = 'available'::text)
>
> and now it's
>
>      ->  Bitmap Index Scan on pets_pet_state_idx  (cost=0.00..285.02 rows=41149 width=0) (actual time=22.043..22.043 rows=40397 loops=82)
>            Index Cond: ((pet_state)::text = 'available'::text)
>
> Don't tell me you got 155000 pets adopted out yesterday ... what
> happened here?
>
> [ thinks... ] One possibility is that those were dead but
> not-yet-vacuumed rows.  What's your vacuuming policy on this table?
> (A bitmap-index-scan plan node will count dead rows as returned,
> unlike all other plan node types, since we haven't actually visited
> the heap yet...)

Today I noticed a combination of related mistakes here.

1. The Vacuum commands were being logged to a file that didn't exist.
I'm not sure if this prevented them from being run. I had copied the cron
entry from another machine, but neglected to create /var/log/pgsql:

vacuumdb -z --table pets -d saveapet  >> /var/log/pgsql/vacuum.log 2>&1

###

However, I again noticed that the row counts were horribly off on the
'pet_state' index, and again used REINDEX to fix it (examples below).
If the "VACUUM ANALYZE" wasn't actually being run, that does seem
like it could have been related.

I'll have to see how things are tomorrow after a full round of database
vacuuming.

   Mark


->  Bitmap Index Scan on pets_pet_state_idx  (cost=0.00..337.29 rows=39226 width=0) (actual time=77.158..77.158 rows=144956 loops=81)
                           Index Cond: ((pet_state)::text = 'available'::text)
 Total runtime: 8327.261 ms


->  Bitmap Index Scan on pets_pet_state_idx  (cost=0.00..271.71 rows=39347 width=0) (actual time=15.466..15.466 rows=40109 loops=81)
                           Index Cond: ((pet_state)::text = 'available'::text)
 Total runtime: 1404.124 ms