Thread: What could lock up pg_stat_activity

What could lock up pg_stat_activity

From

Marcus Engene

Date:

14 October 2011, 15:35:58

Hi,

I have a site with reasonable # of page loads. I log slow selects so I
can get panic later. Today I had a 12s period where things stood still.

As an overload precaution, for each page I...
select count(*) as nbr from pg_stat_activity where current_query <> '<IDLE>'
...to see that not too much is torturing the db. But (roughly) 1/3 of
the entries was this test above. What could possibly make a count() on
pg_stat_activity take 12s? Please speculate wildly.

Setup:
Contemporary Debian
72GB
4x1TB RAID6
2x100GB RAID1 SSD
pgbouncer
private local network
pg 9.0.4

Thanks,
Marcus

Re: What could lock up pg_stat_activity

From

Steve Crawford

Date:

14 October 2011, 16:19:33

On 10/14/2011 08:07 AM, Marcus Engene wrote:
> Hi,
>
> I have a site with reasonable # of page loads. I log slow selects so I
> can get panic later. Today I had a 12s period where things stood still.
>
> As an overload precaution, for each page I...
> select count(*) as nbr from pg_stat_activity where current_query <>
> '<IDLE>'
> ...to see that not too much is torturing the db. But (roughly) 1/3 of
> the entries was this test above. What could possibly make a count() on
> pg_stat_activity take 12s? Please speculate wildly.
>
> Setup:
> Contemporary Debian
> 72GB
> 4x1TB RAID6
> 2x100GB RAID1 SSD
> pgbouncer
> private local network
> pg 9.0.4
>
> Thanks,
> Marcus
>
>

A bit more info is in order. Is the database handling lots of writes or
mostly simple reads? How much of the database is typically touched? How
big is the database? What data is on the spinning media vs. the SSD? Do
you have battery-backed write-cache? How big is it?

There are lots of situations with large RAM sizes where the OS allows so
much data to be cached that when a flush to disk occurs, everything
halts till the data is written.

If you don't have Greg Smith's "PostgreSQL 9.0 High Performance", go buy it.

You may find this recent thread interesting, not for the solution but
for the discussion:
http://postgresql.1045698.n5.nabble.com/Adding-more-memory-hugh-cpu-load-td4888181.html#a4888530

Cheers,
Steve

Re: What could lock up pg_stat_activity

From

Marcus Engene

Date:

25 October 2011, 14:23:09

On 10/14/11 6:19 , Steve Crawford wrote:
> On 10/14/2011 08:07 AM, Marcus Engene wrote:
>> Hi,
>>
>> I have a site with reasonable # of page loads. I log slow selects so
>> I can get panic later. Today I had a 12s period where things stood
>> still.
>>
>> As an overload precaution, for each page I...
>> select count(*) as nbr from pg_stat_activity where current_query <>
>> '<IDLE>'
>> ...to see that not too much is torturing the db. But (roughly) 1/3 of
>> the entries was this test above. What could possibly make a count()
>> on pg_stat_activity take 12s? Please speculate wildly.
>>
>> Setup:
>> Contemporary Debian
>> 72GB
>> 4x1TB RAID6
>> 2x100GB RAID1 SSD
>> pgbouncer
>> private local network
>> pg 9.0.4
>>
>> Thanks,
>> Marcus
>>
>>
>
> A bit more info is in order. Is the database handling lots of writes
> or mostly simple reads? How much of the database is typically touched?
> How big is the database? What data is on the spinning media vs. the
> SSD? Do you have battery-backed write-cache? How big is it?
>
> There are lots of situations with large RAM sizes where the OS allows
> so much data to be cached that when a flush to disk occurs, everything
> halts till the data is written.
>
> If you don't have Greg Smith's "PostgreSQL 9.0 High Performance", go
> buy it.
>
> You may find this recent thread interesting, not for the solution but
> for the discussion:
> http://postgresql.1045698.n5.nabble.com/Adding-more-memory-hugh-cpu-load-td4888181.html#a4888530
>
>
> Cheers,
> Steve
>
>

Hi Steve,

It indeed seems to be /proc/sys/vm/dirty_bytes and
dirty_background_bytes that were the culput.

I think it's when I dump or gzip large files that I provoke this
locking. I am however confused as to why pg_stat_activity found it
necessary to lock itself just because postgres had a pending fsync.

I have both Greg Smiths excellent High Performance book and the other
admin book from Packt.

Thanks for the answer!

Best regards,
Marcus