Re: cube operations slower than geo_distance() on production server - Mailing list pgsql-performance

From: Tom Lane
Subject: Re: cube operations slower than geo_distance() on production server
Msg-id: 18714.1171437344@sss.pgh.pa.us
In response to: Re: cube operations slower than geo_distance() on production server (Mark Stosberg <mark@summersault.com>)
Responses: Re: cube operations slower than geo_distance() on production server ("Merlin Moncure" <mmoncure@gmail.com>)
List: pgsql-performance

Mark Stosberg <mark@summersault.com> writes:
> For reference, here's two versions of the query. The first uses
> the old geo_distance(), and the second one is the new cube query I'm
> trying, inspired by your suggested refactoring.

You didn't show EXPLAIN ANALYZE output :-(

Looking back in the thread, the last E.A. output I see is in your
message of 2/12 11:11, and the striking thing there is that it seems all
the time is going into one indexscan:

    ->  Bitmap Index Scan on pets_pet_state_idx  (cost=0.00..562.50 rows=39571 width=0) (actual time=213.620..213.620 rows=195599 loops=82)
          Index Cond: ((pet_state)::text = 'available'::text)
 Total runtime: 17933.675 ms

213.620 * 82 = 17516.840, so this step is all but 400msec of the run.

There are two things wrong here: first, that the estimated row count is
only 20% of actual; it should certainly not be that far off for such a
simple condition.  I wonder if your vacuum/analyze procedures are
actually working.  Second, you mentioned somewhere along the line that
'available' pets are about 10% of all the entries, which means that this
indexscan is more than likely entirely counterproductive: it would be
cheaper to ignore this index altogether.
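
A quick cross-check, guessing from the index name that the table is
called "pets" (adjust if not):

    -- the planner's notion of total table size vs. reality
    SELECT reltuples FROM pg_class WHERE relname = 'pets';
    SELECT count(*) FROM pets;

    -- what the stored stats think the pet_state distribution is
    SELECT most_common_vals, most_common_freqs
      FROM pg_stats
     WHERE tablename = 'pets' AND attname = 'pet_state';

If reltuples or the frequency shown for 'available' is nowhere near
reality, the stats are stale.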

Suggestions:

1. ANALYZE the table by hand, try the explain again and see if this
rowcount estimate gets better.  If so, you need to look into why your
existing procedures aren't keeping the stats up to date.
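
That is, something like (same table-name guess as before):

    ANALYZE pets;
    EXPLAIN ANALYZE <your query here>;

and see whether the bitmap scan's estimated rows moves closer to the
actual rows.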

2. If, with a more accurate rowcount estimate, the planner still wants
to use this index, try discouraging it.  Brute force would be to drop
the index.  If there are less-common pet_states that are actually worth
searching for, maybe keep the index but change it to a partial index
WHERE pet_state != 'available'.
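
Roughly like this (the replacement index name is only illustrative):

    DROP INDEX pets_pet_state_idx;
    CREATE INDEX pets_pet_state_notavail_idx ON pets (pet_state)
        WHERE pet_state <> 'available';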

Also, I don't see that you mentioned anywhere what PG version you are
running, but if it's not the latest then an update might help.  I recall
having fixed a bug that made the planner too eager to AND on an index
that wouldn't actually help much ... which seems to fit this problem
description pretty well.
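
("SELECT version();" will tell you what you're running, if you're not
sure.)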

            regards, tom lane
