The random sampling query is normally pretty snappy. It usually takes
on the order of 1 second to sample a few thousand rows of data out of a
few million. The sampling is consistently quick, too. However, on some
days, the sampling starts off quick, then when the process starts
sampling from a different subset of data (different range of times for
the same day), the sampling query takes a couple minutes.
Regarding the concurrent vacuuming, this is definitely not happening. I
always check pg_stat_activity whenever the sampling process starts to
lag behind. I have never seen a vacuum running during this time.
Interesting idea to issue the EXPLAIN first... I will see if I can
instrument the sampling program to do this.
Thanks for your help Tom.
Mike
Tom Lane wrote:
> Mike Charnoky <noky@nextbus.com> writes:
>> The sampling query which runs really slow on some days looks something
>> like this:
>
>> INSERT INTO sampled_data
>> (item_name, timestmp, ... )
>> SELECT item_name, timestmp, ... )
>> FROM raw_data
>> WHERE timestmp >= ? and timestmp < ?
>> AND item_name=?
>> AND some_data_field NOTNULL
>> ORDER BY random()
>> LIMIT ?;
>
> Hmph, I'd expect that that would run pretty slowly *all* the time :-(.
> There's no good way to optimize "ORDER BY random()". However, it seems
> like the first thing you should do is modify the program so that it
> issues an EXPLAIN for that right before actually doing the query, and
> then you could see if the plan is different on the slow days.
>
>> We have done a great deal of PG tuning, including the autovacuum for the
>> "raw_data" table. Autovacuum kicks like clockwork every day on that
>> table after the sampling process finishes (after one day's worth of data
>> is deleted from "raw_data" table, a roughly 7% change in size).
>
> Also, are you sure you have ruled out the possibility that the problem
> comes from autovac kicking in *while* the update is running?
>
> regards, tom lane
>