Re: Possible infinite loop in query using bitmap scans - Mailing list pgsql-general

From Casey Duncan
Subject Re: Possible infinite loop in query using bitmap scans
Date
Msg-id cb63857e8cb866fca0b5daa088322c12@pandora.com
Whole thread Raw
In response to Re: Possible infinite loop in query using bitmap scans  (Michael Fuhr <mike@fuhr.org>)
Responses Re: Possible infinite loop in query using bitmap scans  (Casey Duncan <casey@pandora.com>)
List pgsql-general
On Mar 13, 2006, at 9:50 AM, Michael Fuhr wrote:

> On Sun, Mar 12, 2006 at 11:36:23PM -0800, Casey Duncan wrote:
>>  SELECT count(*) FROM webhits
>>        WHERE path LIKE '/radio/tuner_%.swf' AND status = 200
>>            AND date_recorded >= '3/10/2006'::TIMESTAMP
>>            AND date_recorded < '3/11/2006'::TIMESTAMP;
> [...]
>>  Aggregate  (cost=794775.08..794775.09 rows=1 width=0)
> [...]
>> According to the planner it should take <15 minutes which is typical
>> in
>> practice.
>
> The planner's cost estimate is in units of disk page fetches, not
> time.  The above estimate isn't 794775.09 ms (~13.25 min) but rather
> 794775.09 times the cost of a single page fetch, however much that
> is.  See "Using EXPLAIN" in the "Performance Tips" chapter of the
> documentation.
>
> http://www.postgresql.org/docs/8.1/interactive/performance-
> tips.html#USING-EXPLAIN

Doh! I should've known that.

>> About half the times it runs, however, it never terminates
>> (even after days) and just spins consuming 99+% of CPU with no disk
>> activity. This query was never a problem in postgres versions < 8.1.2,
>> however the data has grown substantially since that time. I notice it
>> uses the recent in-memory bitmap feature, so I wondered if it was
>> exposing a bug.
>
> If the problem happens half the time then you have a somewhat
> repeatable test case.  Do you get more consistent performance if
> you set enable_bitmapscan to off?  What's the query plan if you do
> that?

Here's the plan with bitmap scans off:


       QUERY PLAN
------------------------------------------------------------------------
------------------------------------------------------------------------
----------------------
  Aggregate  (cost=891363.71..891363.72 rows=1 width=0)
    ->  Index Scan using webhits_date_idx1 on webhits
(cost=0.00..891360.30 rows=1362 width=0)
          Index Cond: ((date_recorded >= '2006-03-10
00:00:00'::timestamp without time zone) AND (date_recorded <
'2006-03-11 00:00:00'::timestamp without time zone))
          Filter: (((path)::text ~~ '/radio/tuner_%.swf'::text) AND
(status = 200))
(4 rows)

The query runs to completion this way in about 40 minutes. I turned
bitmap scans back on and it hangs again (I ran it for about 5 hours).

> If you narrow the search criteria so the query returns fewer rows,
> do you still see the problem?  Can you identify a "sour spot" where
> the problem starts to happen?

I'll do that tomorrow, and let you know.

>> If I restart the postmaster, the query will complete in the expected
>> time.
>
> Does the problem eventually start happening again?  If so, after
> how long?  How did you determine that the restart is relevant?  Do
> you consistently see different (presumably better) performance after
> a restart than if you don't restart?

This is a production box, so I can't restart it whenever I want and I
haven't yet reproduced it elsewhere -- the data base size makes that
cumbersome at best -- but once after it hung up, I restarted postgres
and the report ran to completion for a couple of days then started
hanging again. Today it seems to pretty consistently hang, I'll see if
I can restart it overnight and test it again.

-Casey


pgsql-general by date:

Previous
From: Brandon Keepers
Date:
Subject: Relation 'pg_largeobject' does not exist
Next
From: Tom Lane
Date:
Subject: Re: Relation 'pg_largeobject' does not exist