Re: intermittant performance problem - Mailing list pgsql-general

From Mike Charnoky
Subject Re: intermittant performance problem
Date
Msg-id 49B5CEA5.9010306@nextbus.com
Whole thread Raw
In response to Re: intermittant performance problem  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: intermittant performance problem  (Scott Marlowe <scott.marlowe@gmail.com>)
List pgsql-general
The random sampling query is normally pretty snappy.  It usually takes
on the order of 1 second to sample a few thousand rows of data out of a
few million.  The sampling is consistently quick, too.  However, on some
days, the sampling starts off quick, then when the process starts
sampling from a different subset of data (different range of times for
the same day), the sampling query takes a couple minutes.

Regarding the concurrent vacuuming, this is definitely not happening.  I
always check pg_stat_activity whenever the sampling process starts to
lag behind.  I have never seen a vacuum running during this time.

Interesting idea to issue the EXPLAIN first... I will see if I can
instrument the sampling program to do this.

Thanks for your help Tom.


Mike

Tom Lane wrote:
> Mike Charnoky <noky@nextbus.com> writes:
>> The sampling query which runs really slow on some days looks something
>> like this:
>
>> INSERT INTO sampled_data
>>    (item_name, timestmp, ... )
>>    SELECT item_name, timestmp, ... )
>>    FROM raw_data
>>    WHERE timestmp >= ? and timestmp < ?
>>    AND item_name=?
>>    AND some_data_field NOTNULL
>>    ORDER BY random()
>>    LIMIT ?;
>
> Hmph, I'd expect that that would run pretty slowly *all* the time :-(.
> There's no good way to optimize "ORDER BY random()".  However, it seems
> like the first thing you should do is modify the program so that it
> issues an EXPLAIN for that right before actually doing the query, and
> then you could see if the plan is different on the slow days.
>
>> We have done a great deal of PG tuning, including the autovacuum for the
>> "raw_data" table.  Autovacuum kicks like clockwork every day on that
>> table after the sampling process finishes (after one day's worth of data
>> is deleted from "raw_data" table, a roughly 7% change in size).
>
> Also, are you sure you have ruled out the possibility that the problem
> comes from autovac kicking in *while* the update is running?
>
>             regards, tom lane
>

pgsql-general by date:

Previous
From: Mike Charnoky
Date:
Subject: Re: intermittant performance problem
Next
From: Craig Ringer
Date:
Subject: Re: C++ User-defined functions