Re: CPU spikes and transactions - Mailing list pgsql-performance

From	Tony Kay
Subject	Re: CPU spikes and transactions
Date	October 14, 2013 23:27:00
Msg-id	CAB=fRcr1Chzz9p4B0mRnc10XEnTAm-B4T8bctW-=Bgw5a0P=cQ@mail.gmail.com
In response to	CPU spikes and transactions (Tony Kay <tony@teamunify.com>)
Responses	Re: CPU spikes and transactions
List	pgsql-performance

Hi Calvin,

Yes, I have sar data on all systems going back for years.

Since others will probably want assurance that I am really "reading the data" right:

- This is 92% user CPU time, 5% sys, and 1% soft

- During some of the incidents, I _do_ see a short spike of pswpout/s (memory pressure), but again, not enough to end up using much system time

- The database disks are idle (all data being used is in RAM) and are SSDs; average service times are barely measurable in ms.
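For anyone who wants to cross-check a breakdown like the one above without sar, here is a minimal sketch that computes the same categories from two samples of the aggregate `cpu` line in Linux `/proc/stat`. The function names are made up for illustration, and it assumes the standard field order documented in proc(5); it is not how sar itself is implemented.

```python
# Field order of the aggregate "cpu" line in Linux /proc/stat (see proc(5)).
# Guest time is already counted inside "user", so the guest fields are ignored.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Return a dict of cumulative tick counters from the aggregate 'cpu' line."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    # Very old kernels report fewer fields; treat the missing ones as zero.
    values += [0] * (len(FIELDS) - len(values))
    return dict(zip(FIELDS, values))


def cpu_breakdown(before, after):
    """Percentage of elapsed CPU time per category between two samples."""
    deltas = {name: after[name] - before[name] for name in FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        raise ValueError("samples must be taken at different times")
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


def read_cpu_sample(path="/proc/stat"):
    """Read the current aggregate counters (Linux only)."""
    with open(path) as handle:
        return parse_cpu_line(handle.readline())
```

Calling `read_cpu_sample()` twice, a second or so apart, and passing both results to `cpu_breakdown()` gives user, system, softirq and iowait percentages comparable to the figures quoted above.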

If I had to guess, I'd say it was spinlock misbehavior. I cannot understand why else a transaction blocking other things would drive the CPUs so hard into the ground with user time.

Tony

Tony Kay

TeamUnify, LLC

On Mon, Oct 14, 2013 at 4:05 PM, Calvin Dodge <caldodge@gmail.com> wrote:

Have you tried running "vmstat 1" during these times? If so, what is
the percentage of WAIT time? Given that IIRC shared buffers should be
no more than 25% of installed memory, I wonder if too little is
available for system caching of disk reads. A high WAIT percentage
would indicate excessive I/O (especially random seeks).
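A rough sketch of both checks, assuming the usual `vmstat` layout in which a header row names the columns and `wa` is the I/O wait percentage. The function names and the sample output are invented for illustration; the 22GB and 32GB figures come from the original message quoted below.

```python
# Illustrative vmstat output only; the numbers are made up, not measured.
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
48  0      0 812340  10240 903112    0    0     0    12 9000 15000 92  5  2  1  0
51  0      0 810112  10240 903140    0    0     0     8 9100 15200 93  5  2  0  0
"""


def wait_percentages(vmstat_output):
    """Return the 'wa' (I/O wait) value from every data row of vmstat output."""
    columns = None
    waits = []
    for line in vmstat_output.splitlines():
        tokens = line.split()
        if "wa" in tokens and "us" in tokens:
            # Column-name header; vmstat repeats it periodically.
            columns = tokens
        elif (columns and len(tokens) == len(columns)
              and all(token.isdigit() for token in tokens)):
            waits.append(int(tokens[columns.index("wa")]))
    return waits


def shared_buffers_fraction(shared_buffers_gb, total_ram_gb):
    """Fraction of installed RAM given to PostgreSQL shared buffers."""
    return shared_buffers_gb / total_ram_gb


print(wait_percentages(SAMPLE))
print(shared_buffers_fraction(22, 32))
```

With 22GB of shared buffers on a 32GB machine the fraction is well above the 25% rule of thumb, leaving about 10GB for the OS cache and everything else.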

Calvin Dodge

On Mon, Oct 14, 2013 at 6:00 PM, Tony Kay <tony@teamunify.com> wrote:
> Hi,
>
> I'm running 9.1.6 w/22GB shared buffers, and 32GB overall RAM on a 16
> Opteron 6276 CPU box. We limit connections to roughly 120, but our webapp is
> configured to allocate a thread-local connection, so those connections are
> rarely doing anything more than half the time.
>
> We have been running smoothly for over a year on this configuration, and
> recently started having huge CPU spikes that bring the system to its knees.
> Given that it is a multiuser system, it has been quite hard to pinpoint the
> exact cause, but I think we've narrowed it down to two data import jobs that
> were running in semi-long transactions (clusters of row inserts).
>
> The tables affected by these inserts are used in common queries.
>
> The imports will bring in a row count of perhaps 10k on average covering 4
> tables.
>
> The insert transactions are at isolation level read committed (the default
> for the JDBC driver).
>
> When the import would run (again, theory...we have not been able to
> reproduce), we would end up maxed out on CPU, with a load average of 50 for
> 16 CPUs (our normal busy usage is a load average of 5 out of 16 CPUs).
>
> When looking at the active queries, most of them are against the tables that
> are affected by these imports.
>
> Our workaround (that is holding at present) was to drop the transactions on
> those imports (which is not optimal, but fortunately is acceptable for this
> particular data). This workaround has prevented any further incidents, but
> is of course inconclusive.
>
> Does this sound familiar to anyone? If so, please advise.
>
> Thanks in advance,
>
> Tony Kay
>
