On 06/30/2015 07:52 AM, eudald_v wrote:
> Two days from now, I've been experiencing that, randomly, the connections
> rise up till they reach max connections, and the load average of the server
> goes arround 300~400, making every command issued on the server take
> forever. When this happens, ram is relatively low (70Gb used), cores
> activity is lower than usual and sometimes swap happens (I've swappiness
> configured to 10%)
As Tom said, the most likely reason for this is application behavior and
blocking locks. Try some of these queries on our scripts page:
https://github.com/pgexperts/pgx_scripts/tree/master/locks
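
If it helps to see the idea behind those lock queries, here is a minimal
Python sketch. It is not taken from the pgx_scripts repository: the SQL
follows the common pg_locks self-join pattern, and the names
BLOCKING_PAIRS_SQL and root_blockers are made up for illustration. The
helper takes (blocked_pid, blocking_pid) rows, however you fetch them, and
reports which sessions sit at the head of each queue of waiters.

```python
# Hypothetical sketch: pairs each waiting lock request with a conflicting
# lock held or requested by another backend. Run it with psql or any driver.
BLOCKING_PAIRS_SQL = """
SELECT blocked.pid AS blocked_pid, blocking.pid AS blocking_pid
FROM pg_catalog.pg_locks blocked
JOIN pg_catalog.pg_locks blocking
  ON blocking.locktype = blocked.locktype
 AND blocking.database IS NOT DISTINCT FROM blocked.database
 AND blocking.relation IS NOT DISTINCT FROM blocked.relation
 AND blocking.page IS NOT DISTINCT FROM blocked.page
 AND blocking.tuple IS NOT DISTINCT FROM blocked.tuple
 AND blocking.virtualxid IS NOT DISTINCT FROM blocked.virtualxid
 AND blocking.transactionid IS NOT DISTINCT FROM blocked.transactionid
 AND blocking.classid IS NOT DISTINCT FROM blocked.classid
 AND blocking.objid IS NOT DISTINCT FROM blocked.objid
 AND blocking.objsubid IS NOT DISTINCT FROM blocked.objsubid
 AND blocking.pid <> blocked.pid
WHERE NOT blocked.granted
"""


def root_blockers(pairs):
    """Map each root blocker pid to the sorted pids waiting behind it.

    pairs: iterable of (blocked_pid, blocking_pid) rows.
    A root blocker blocks others but is not itself waiting on a lock.
    """
    blocked_by = {}
    for blocked, blocking in pairs:
        blocked_by.setdefault(blocked, set()).add(blocking)
    waiting = set(blocked_by)

    victims = {}
    for pid in waiting:
        seen = set()
        stack = [pid]
        # Walk up the chain of blockers until we reach non-waiting sessions.
        while stack:
            current = stack.pop()
            for blocker in blocked_by.get(current, ()):
                if blocker in seen:
                    continue
                seen.add(blocker)
                if blocker in waiting:
                    stack.append(blocker)
                else:
                    victims.setdefault(blocker, set()).add(pid)
    return {root: sorted(pids) for root, pids in victims.items()}
```

A root blocker with a long list of waiters behind it is usually the
session to look at first, for example one left idle in a transaction.
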
However, I have seen some other things which cause these kinds of stalls:
* runaway connection generation by the application, due to either a
programming bug or an irresponsible web crawler (see
https://www.pgexperts.com/blog/quinn_weaver/)
* issues evicting blocks from shared_buffers: what is your
shared_buffers set to? How large is your database?
* Checkpoint stalls: which filesystem are you on? What are your
transaction log (WAL) settings for PostgreSQL?
* Issues with the software/hardware stack around your storage, causing
total IO stalls periodically. What does IO throughput look like
before/during/after the stalls?
Storage was the cause the last time I dealt with a situation like
yours: the culprit turned out to be bad RAID card firmware, which made
the card lock up whenever the write-through buffer came under too much
pressure.
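
For the IO throughput question, a rough way to get numbers is to sample
/proc/diskstats twice and take the difference. The Python sketch below
assumes a Linux host; the function names are made up for illustration,
and it reads only the documented counters (sectors read and written,
IOs in flight, milliseconds spent doing IO).

```python
import time

SECTOR_BYTES = 512  # /proc/diskstats counts sectors in 512-byte units


def parse_diskstats(text):
    """Parse the contents of /proc/diskstats into per-device counters."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 14:
            continue  # skip blank or short lines
        stats[parts[2]] = {
            "read_bytes": int(parts[5]) * SECTOR_BYTES,
            "write_bytes": int(parts[9]) * SECTOR_BYTES,
            "in_flight": int(parts[11]),
            "busy_ms": int(parts[12]),
        }
    return stats


def throughput(before, after, seconds):
    """Turn two parsed samples taken `seconds` apart into rates."""
    rates = {}
    for dev, now in after.items():
        prev = before.get(dev)
        if prev is None:
            continue  # device appeared between samples
        rates[dev] = {
            "read_mb_s": (now["read_bytes"] - prev["read_bytes"]) / seconds / 1e6,
            "write_mb_s": (now["write_bytes"] - prev["write_bytes"]) / seconds / 1e6,
            "util_pct": (now["busy_ms"] - prev["busy_ms"]) / (seconds * 1000) * 100,
            "in_flight": now["in_flight"],
        }
    return rates


def sample(interval=5.0, path="/proc/diskstats"):
    """Read the counters, wait `interval` seconds, and return the rates."""
    with open(path) as f:
        before = parse_diskstats(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_diskstats(f.read())
    return throughput(before, after, interval)
```

Calling sample() in a loop and logging the result with a timestamp gives
a before/during/after picture. Throughput dropping to near zero while
in_flight stays above zero during a stall would point at the storage
stack rather than at PostgreSQL. iostat -x reports the same counters if
sysstat is installed.
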
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com