Re: Avoiding Recheck Cond when using Select Distinct - Mailing list pgsql-performance

From Jeff Janes
Subject Re: Avoiding Recheck Cond when using Select Distinct
Date
Msg-id CAMkU=1xDhzmkEJZ4xQNg9nqJCi6sfaxKATbZMNTLPTu=UQ7-kA@mail.gmail.com
In response to Avoiding Recheck Cond when using Select Distinct  (jackrg <jack@groundbreakingsoftware.com>)
Responses Re: Avoiding Recheck Cond when using Select Distinct
List pgsql-performance
On Fri, Feb 22, 2013 at 9:45 AM, Jack Royal-Gordon <jack@groundbreakingsoftware.com> wrote:

On Feb 22, 2013, at 9:19 AM, Jeff Janes <jeff.janes@gmail.com> wrote:

On Fri, Feb 22, 2013 at 8:36 AM, jackrg <jack@groundbreakingsoftware.com> wrote:
The following query produces a Recheck Cond and a costly Bitmap Heap Scan
even though I have a composite index that covers both columns being filtered
and selected.  

Can you show us the definition of that index?

CREATE INDEX test ON sales_points USING btree (book_id, store_id);

When I made up some random data that I thought was plausible, the planner did choose to use this index.  You might have some skew in your data that we would need to know more about in order to reproduce this (but that is probably not worthwhile; see below).

I believe this is because the initial bitmap scan produces
2912 rows, which is too many for the available bitmap space.  I've tried
rewriting the command as "Select ... group by" but it still uses the BHS. Is
there a way to rewrite this command that would improve performance by
avoiding the costly Bitmap Heap Scan?


How do you know that the bitmap heap scan is costly, since you haven't gotten the planner to use an alternative to compare it to?  As a temporary experimental measure, you could set enable_bitmapscan TO off, to see what happens.
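A minimal sketch of that experiment.  The SELECT here is my reconstruction from the index condition shown in the plan below; substitute the actual query:

```sql
-- Session-local experiment: disable bitmap scans for this session only,
-- then compare the plan and timings against the bitmap version.
SET enable_bitmapscan TO off;

EXPLAIN ANALYZE
SELECT DISTINCT book_id, store_id
FROM sales_points
WHERE book_id IN (421, 422, 419, 420)
  AND store_id IN (1, 2, 3, 4, 5, 6, 199, 201, 202);

-- Restore the default so later queries in the session plan normally.
RESET enable_bitmapscan;
```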

Here's the "EXPLAIN ANALYZE" output after setting enable_bitmapscan off:
"HashAggregate  (cost=8202.02..8203.33 rows=131 width=8) (actual time=4.275..4.280 rows=32 loops=1)"
"  ->  Index Only Scan using test on sales_points  (cost=0.01..8187.46 rows=2912 width=8) (actual time=0.083..3.268 rows=4233 loops=1)"
"        Index Cond: ((book_id = ANY ('{421,422,419,420}'::integer[])) AND (store_id = ANY ('{1,2,3,4,5,6,199,201,202}'::integer[])))"
"        Heap Fetches: 4233"
"Total runtime: 4.331 ms"

While the total runtime reported is actually less, the costs reported are 50% higher.


I suspect this is because the planner assumes that IO is rather expensive, and the bitmap plan saves on IO slightly while taking more CPU.  But since all your data is in memory in this case, the IO savings are not real, while the extra CPU time is real.  However, this is probably not relevant.  Shaving 5ms off of a 10ms execution time looks impressive, but shaving 5ms off of your problematic case of 10s isn't going to do much for you. 
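If the working set really is fully cached in production too, one common knob for telling the planner that random IO is cheap is random_page_cost.  This is a general sketch, not a recommendation for this specific case:

```sql
-- When data is resident in memory, random and sequential page fetches
-- cost about the same, so the default random_page_cost of 4.0 overstates
-- the IO that a bitmap plan saves.  Try a value close to seq_page_cost:
SET random_page_cost = 1.1;
```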

It could be that the time savings, extrapolated to the problem case, will be multiplicative rather than additive.  But my gut tells me it won't be.  In fact it will probably go the other way: if the bitmap truly is more IO-efficient, and your problematic case is due to poor IO performance, then forcing it off the bitmap scan could easily make things worse.
 

 Under load in production, the
command takes 10,158 ms.  

Do you mean 10.158 ms rather than 10,158 ms?  If the production environment really takes 1000 times longer than the environment in which you gathered the EXPLAIN, then I would seriously doubt how useful that EXPLAIN could possibly be.

That's right, 10K ms.  That's why I started trying to tune the performance of this query: I had a warning in my log of a long-running query.  Two thoughts on why these numbers might be so different: 1) in production, but not in the standalone test, many jobs are hitting the database simultaneously, so perhaps a great deal of I/O wait time was involved;

If the server is so overwhelmed that it takes 1000 times longer, I think it would be setting off alarm bells all over the place, rather than just with this one query.  Unless all that other activity is driving your data out of the cache, so this query needs to read it back from disk when on the dev system it doesn't.  But that means you need to find a way to simulate the actual cache-hit behavior on your dev system, or you won't be able to do realistic testing.

It would really help to have "EXPLAIN (ANALYZE, BUFFERS)".  Especially if you turn on track_io_timing (although that part probably can't be done on Heroku, as it requires superuser access).
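A sketch of that, again with the query reconstructed from the plan's index condition:

```sql
-- track_io_timing (9.2+) makes EXPLAIN (ANALYZE, BUFFERS) report the
-- time actually spent reading and writing blocks, not just block counts.
-- Setting it needs superuser, which is the part Heroku likely blocks.
SET track_io_timing = on;

EXPLAIN (ANALYZE, BUFFERS)
SELECT DISTINCT book_id, store_id
FROM sales_points
WHERE book_id IN (421, 422, 419, 420)
  AND store_id IN (1, 2, 3, 4, 5, 6, 199, 201, 202);
```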

It would help even more if you can get that information while the server is misbehaving, rather than when it is not.  Does Heroku support auto_explain?
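For reference, a postgresql.conf sketch of what an auto_explain setup looks like (the threshold value is an assumption; tune it to sit just under your problem query's runtime):

```ini
# Load auto_explain at server start and log the actual plan of any
# statement slower than 1 second.
shared_preload_libraries = 'auto_explain'
auto_explain.log_min_duration = '1s'
auto_explain.log_analyze = on    # include actual row counts and timings
auto_explain.log_buffers = on    # include buffer usage in the logged plan
```

That way you capture the plan that ran during the bad 10-second executions, rather than the plan the server happens to pick when you test by hand later.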

 
2) production and the standalone test were separated by about 8 hours, and it's possible that some sort of auto-vacuum operation had taken place to make the execution more efficient.

More likely, some operation caused it to switch to a different plan that is inherently inefficient, rather than changing the efficiency of a given plan.  That is where auto_explain would help.  Also, if the bad execution comes through your app while the execution through psql is fine, then maybe your app is secretly changing some of the planner settings.
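One way to check for that is to query pg_settings from the same connection (or pool) the app uses:

```sql
-- Any planner GUC whose source is 'session' was changed after connect,
-- e.g. by an ORM, framework, or connection-pool setup hook.
SELECT name, setting, source
FROM pg_settings
WHERE name LIKE 'enable%'
   OR name IN ('random_page_cost', 'work_mem');
```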

Cheers,

Jeff
