Re: intermittant performance problem - Mailing list pgsql-general

From Mike Charnoky
Subject Re: intermittant performance problem
Date
Msg-id 49CA5743.7090704@nextbus.com
Whole thread Raw
In response to Re: intermittant performance problem  (Mike Charnoky <noky@nextbus.com>)
Responses Re: intermittant performance problem
Re: intermittant performance problem
List pgsql-general
Mike Charnoky wrote:
> Scott Marlowe wrote:
>> On Mon, Mar 9, 2009 at 8:21 PM, Mike Charnoky <noky@nextbus.com> wrote:
>>> The random sampling query is normally pretty snappy.  It usually takes on
>>> the order of 1 second to sample a few thousand rows of data out of a few
>>> million.  The sampling is consistently quick, too.  However, on some days,
>>> the sampling starts off quick, then when the process starts sampling from a
>>> different subset of data (different range of times for the same day), the
>>> sampling query takes a couple minutes.
>>
>> Then definitely look at saving explain plans before execution to
>> compare fast to slow runs.  This definitely sounds like ocassionally
>> bad query plans to me so far.
>>>
>>> Tom Lane wrote:
>>>> Mike Charnoky <noky@nextbus.com> writes:
>>>>> The sampling query which runs really slow on some days looks something
>>>>> like this:
>>>>> INSERT INTO sampled_data
>>>>>   (item_name, timestmp, ... )
>>>>>   SELECT item_name, timestmp, ... )
>>>>>   FROM raw_data
>>>>>   WHERE timestmp >= ? and timestmp < ?
>>>>>   AND item_name=?
>>>>>   AND some_data_field NOTNULL
>>>>>   ORDER BY random()
>>>>>   LIMIT ?;
>>>> Hmph, I'd expect that that would run pretty slowly *all* the time :-(.
>>>> There's no good way to optimize "ORDER BY random()".  However, it seems
>>>> like the first thing you should do is modify the program so that it
>>>> issues an EXPLAIN for that right before actually doing the query, and
>>>> then you could see if the plan is different on the slow days.

I'm at the end of my rope here.  Tried the following:

* Manually run ANALYZE on the table before running the sampling query.
This does not help, there are still times when the sampling runs slower
by two orders of magnitude and disk IO is through the roof.
* Bump work_mem for the query to 128MB (up from 32MB).  This did not
help.  Also, no temp files were created in $PG/data/base/pgsql_tmp/, so
work_mem does not seem to be an issue.
* EXPLAINs look nearly identical whether the query runs quickly or slowly

The thing that gets me is, why does this query totally hose the entire
database?  Other clients have a very hard time writing to the db when
this sampling query is running slow, disk IO is maxxed out.  This just
doesn't seem right.  Why would a single pg backend strangle db
performance to such an extent?  Aren't there ways to throttle this back?

Due to the nature of the sampling (need to limit using several
parameters with a WHERE clause), I can't just generate random numbers to
select data that I need.  Looks like I am stuck using ORDER BY RANDOM().
  The only other option at this point seems to be to implement
TABLESAMPLE, probably starting with the academic work that Neil Conway
published (http://neilconway.org/talks/hacking/)


Mike

pgsql-general by date:

Previous
From: Merlin Moncure
Date:
Subject: Re: Proper entry of polygon type data
Next
From: Tk421
Date:
Subject: Re: Problem with Windows XP Pro SP3 asn PostgreSQL 8.3.5