Thread: Gigantic load average spikes

Gigantic load average spikes

From
rihad
Date:
Postgres 10.3 On freebsd 10.3 is almost idle, disk i/o about 5-10%, 
number running processes about 1-3, cpu about 90% idle, then we run a 
i/o heavy job like "refresh materialized view", cpu & disk i/o still not 
maxed out, all of a sudden the number of running processes increases to 
about 250-300 for a second, which increases load averages that we 
periodically poll to see if everything's alright. Why is that? Disk i/o 
is far from being maxed out, the actual number of running processes is 
really small.




Re: Gigantic load average spikes

From
Adrian Klaver
Date:
On 3/31/19 8:23 AM, rihad wrote:
> Postgres 10.3 On freebsd 10.3 is almost idle, disk i/o about 5-10%, 
> number running processes about 1-3, cpu about 90% idle, then we run a 
> i/o heavy job like "refresh materialized view", cpu & disk i/o still not 
> maxed out, all of a sudden the number of running processes increases to 
> about 250-300 for a second, which increases load averages that we 

Do you know what the processes are?

Are there any messages in the Postgres log from that time period that 
might shed light?

> periodically poll to see if everything's alright. Why is that? Disk i/o 
> is far from being maxed out, the actual number of running processes is 
> really small.

You are referring to the between spikes process count?

> 
> 
> 
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: Gigantic load average spikes

From
"Peter J. Holzer"
Date:
On 2019-03-31 19:23:11 +0400, rihad wrote:
> Postgres 10.3 On freebsd 10.3 is almost idle, disk i/o about 5-10%, number
> running processes about 1-3, cpu about 90% idle, then we run a i/o heavy job
> like "refresh materialized view", cpu & disk i/o still not maxed out, all of
> a sudden the number of running processes increases to about 250-300 for a

What exactly do you mean by "running processes"? I don't think I've ever
seen a Unix with only 1 to 3 running processes in total, so you are
probably referring to processes in a certain state. Runnable (R)?
Uninterruptible sleep (D)? Both? Something else?

        hp

--
   _  | Peter J. Holzer    | we build much bigger, better disasters now
|_|_) |                    | because we have much more sophisticated
| |   | hjp@hjp.at         | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>

Attachment

Re: Gigantic load average spikes

From
rihad
Date:
> What exactly do you mean by "running processes"? I don't think I've ever
> seen a Unix with only 1 to 3 running processes in total, so you are
> probably referring to processes in a certain state. Runnable (R)?
> Uninterruptible sleep (D)? Both? Something else?

Just that, 250-300 running processes in top shown for a second. 250-300 
is the number of postgres worker processes used, but normally only 1-3 
of them are running according to top. At times of load FreeBSD (?) 
schedules all of them to run. This doesn't really put the machine on its 
knees, it just impacts load avg.




Re: Gigantic load average spikes

From
David Rowley
Date:
On Mon, 1 Apr 2019 at 18:08, rihad <rihad@mail.ru> wrote:
>
> > What exactly do you mean by "running processes"? I don't think I've ever
> > seen a Unix with only 1 to 3 running processes in total, so you are
> > probably referring to processes in a certain state. Runnable (R)?
> > Uninterruptible sleep (D)? Both? Something else?
>
> Just that, 250-300 running processes in top shown for a second. 250-300
> is the number of postgres worker processes used, but normally only 1-3
> of them are running according to top. At times of load FreeBSD (?)
> schedules all of them to run. This doesn't really put the machine on its
> knees, it just impacts load avg.

Perhaps a bunch of processes waiting on the access exclusive lock on
the materialized view being released?

log_lock_waits might help you if the MV takes more than a second to
refresh, otherwise, you might need to have a look at ungranted locks
in pg_locks and see if the number of locks spikes during the refresh.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Gigantic load average spikes

From
Adrian Klaver
Date:
On 3/31/19 10:08 PM, rihad wrote:
>> What exactly do you mean by "running processes"? I don't think I've ever
>> seen a Unix with only 1 to 3 running processes in total, so you are
>> probably referring to processes in a certain state. Runnable (R)?
>> Uninterruptible sleep (D)? Both? Something else?
> 
> Just that, 250-300 running processes in top shown for a second. 250-300 
> is the number of postgres worker processes used, but normally only 1-3 
> of them are running according to top. At times of load FreeBSD (?) 

So what are the process titles?

> schedules all of them to run. This doesn't really put the machine on its 
> knees, it just impacts load avg.
> 
> 
> 
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: Gigantic load average spikes

From
rihad
Date:
On 04/01/2019 06:17 PM, Adrian Klaver wrote:
> On 3/31/19 10:08 PM, rihad wrote:
>>> What exactly do you mean by "running processes"? I don't think I've 
>>> ever
>>> seen a Unix with only 1 to 3 running processes in total, so you are
>>> probably referring to processes in a certain state. Runnable (R)?
>>> Uninterruptible sleep (D)? Both? Something else?
>>
>> Just that, 250-300 running processes in top shown for a second. 
>> 250-300 is the number of postgres worker processes used, but normally 
>> only 1-3 of them are running according to top. At times of load 
>> FreeBSD (?) 
>
> So what are the process titles?

postgres




Re: Gigantic load average spikes

From
Adrian Klaver
Date:
On 4/1/19 8:06 AM, rihad wrote:
> On 04/01/2019 06:17 PM, Adrian Klaver wrote:
>> On 3/31/19 10:08 PM, rihad wrote:
>>>> What exactly do you mean by "running processes"? I don't think I've 
>>>> ever
>>>> seen a Unix with only 1 to 3 running processes in total, so you are
>>>> probably referring to processes in a certain state. Runnable (R)?
>>>> Uninterruptible sleep (D)? Both? Something else?
>>>
>>> Just that, 250-300 running processes in top shown for a second. 
>>> 250-300 is the number of postgres worker processes used, but normally 
>>> only 1-3 of them are running according to top. At times of load 
>>> FreeBSD (?) 
>>
>> So what are the process titles?
> 
> postgres

Have you tried hitting the 'c' key in top to get a fuller description?

> 
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: Gigantic load average spikes

From
Michel Pelletier
Date:


On Sun, Mar 31, 2019 at 10:49 PM David Rowley <david.rowley@2ndquadrant.com> wrote:

Perhaps a bunch of processes waiting on the access exclusive lock on
the materialized view being released?

log_lock_waits might help you if the MV takes more than a second to
refresh, otherwise, you might need to have a look at ungranted locks
in pg_locks and see if the number of locks spikes during the refresh.

I think David's got the right idea here.  Like he said, investigate pg_locks, if it is the refresh materialized view, you can avoid the problem by doing 'REFRESH MATERIALIZED VIEW CONCURRENTLY'.  You will need at least one unique index on the table.


--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Gigantic load average spikes

From
rihad
Date:
On 04/01/2019 08:30 PM, Michel Pelletier wrote:


On Sun, Mar 31, 2019 at 10:49 PM David Rowley <david.rowley@2ndquadrant.com> wrote:

Perhaps a bunch of processes waiting on the access exclusive lock on
the materialized view being released?

log_lock_waits might help you if the MV takes more than a second to
refresh, otherwise, you might need to have a look at ungranted locks
in pg_locks and see if the number of locks spikes during the refresh.

I think David's got the right idea here.  Like he said, investigate pg_locks, if it is the refresh materialized view, you can avoid the problem by doing 'REFRESH MATERIALIZED VIEW CONCURRENTLY'.  You will need at least one unique index on the table.

It is actually refreshed concurrently.



--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Gigantic load average spikes

From
Rene Romero Benavides
Date:
On Mon, Apr 1, 2019 at 10:35 AM rihad <rihad@mail.ru> wrote:
On 04/01/2019 08:30 PM, Michel Pelletier wrote:


On Sun, Mar 31, 2019 at 10:49 PM David Rowley <david.rowley@2ndquadrant.com> wrote:

Perhaps a bunch of processes waiting on the access exclusive lock on
the materialized view being released?

log_lock_waits might help you if the MV takes more than a second to
refresh, otherwise, you might need to have a look at ungranted locks
in pg_locks and see if the number of locks spikes during the refresh.

I think David's got the right idea here.  Like he said, investigate pg_locks, if it is the refresh materialized view, you can avoid the problem by doing 'REFRESH MATERIALIZED VIEW CONCURRENTLY'.  You will need at least one unique index on the table.

It is actually refreshed concurrently.



--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Hi. How many vcores does the server have? what's the load average we're talking about? do you mind sharing your postgresql configuration? 

--