Thread: Ubuntu 12.04 / 3.2 Kernel Bad for PostgreSQL Performance

From
Shaun Thomas
Date:
Hey guys,

This isn't a question, but a kind of summary over a ton of investigation
I've been doing since a recent "upgrade". Anyone else out there with
"big iron" might want to confirm this, but it seems pretty reproducible.
This seems to affect the latest 3.2 mainline and by extension, any
platform using it. My tests are restricted to Ubuntu 12.04, but it may
apply elsewhere.

Comparing the latest official 3.2 kernel to the latest official 3.4
kernel (both Ubuntu), there are some rather striking differences. I'll
start with some pgbench tests.

* Read-only test: 800 clients with 2 controlling threads on a 55GB
  database (scaling factor of 3600) for 3 minutes.
   * With 3.4:
     * Max TPS was 68933.
     * CPU was between 50 and 55% idle.
     * Load average was between 10 and 15.
   * With 3.2:
     * Max TPS was 17583, a loss of roughly 75% of throughput.
     * CPU was between 12 and 25% idle.
     * Load average was between 10 and 60, effectively random.
* Minimal write test: this time with only two clients. All other
  settings are the same.
   * With 3.4:
     * Max TPS was 4548.
     * CPU was between 88 and 92% idle.
     * Load average was between 1.7 and 2.5.
   * With 3.2:
     * Max TPS was 4639.
     * CPU was between 88 and 92% idle.
     * Load average was between 3 and 4.
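
For anyone wanting to reproduce these runs, here is a minimal sketch of pgbench command lines that match the parameters listed above. The database name "pgbench" is an assumption, and so is the use of pgbench's default read/write script for the write test, since the post does not say exactly how it was invoked. The functions only print the commands so they can be reviewed before running.

```shell
#!/bin/bash
# Sketch only: builds pgbench command lines matching the tests described
# in the post. The default database name "pgbench" is an assumption.

# One-time initialization at scaling factor 3600 (about 55GB per the post).
init_cmd() {
    echo "pgbench -i -s 3600 ${1:-pgbench}"
}

# Read-only test: SELECT-only script (-S), 800 clients (-c),
# 2 worker threads (-j), 180 seconds (-T).
readonly_cmd() {
    echo "pgbench -S -c 800 -j 2 -T 180 ${1:-pgbench}"
}

# Minimal write test: two clients, everything else unchanged.
# Assumption: the default TPC-B-like read/write script was used.
write_cmd() {
    echo "pgbench -c 2 -j 2 -T 180 ${1:-pgbench}"
}
```

To actually run one of them against a test database, something like `eval "$(readonly_cmd mydb)"` would do, where `mydb` is a placeholder.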

Overall, performance was _much_ worse in 3.2 by almost every metric,
except under very low contention. More CPU for fewer transactions, and
wildly inaccurate load reporting. The 3.2 kernel in its current state
should be considered detrimental and potentially malicious under high
task contention.
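
As a quick check on the figures above, the relative change in TPS can be computed from the reported numbers. This is just a small helper sketch (the function name `pct_change` is made up for illustration), treating the 3.4 result as the baseline.

```shell
# pct_change OLD NEW: percentage change from OLD to NEW, one decimal place.
pct_change() {
    awk -v old="$1" -v new="$2" \
        'BEGIN { printf "%.1f\n", (new - old) / old * 100 }'
}

# Read-only test, 3.4 -> 3.2:
#   pct_change 68933 17583    prints -74.5
# Write test, 3.4 -> 3.2:
#   pct_change 4548 4639      prints 2.0
```

So the read-only run lost about three quarters of its throughput, while the two-client write run was within about 2% either way.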

I'll admit not letting the tests run for more than 10 iterations, but I
didn't really need more than that. Even one iteration is enough to see
this in action. At least every Ubuntu 3.2 kernel since 3.2.0-31 exhibits
this, but I haven't tested further back. I've also examined both
official Ubuntu 3.2 and Ubuntu mainline kernels as obtained from here:

http://kernel.ubuntu.com/~kernel-ppa/mainline

The 3.2.34 mainline also has these problems. For reference, I tested the
3.4.20 Quantal release on Precise because the Precise 3.4 kernel hasn't
been maintained.

Again, anyone running 12.04 LTS, take a good hard look at your systems.
Hopefully you have a spare machine to test with. I'm frankly appalled
this thing is in an LTS release.

I'll also note that on all kernels, client threads inflate the load
reports to some extent. In a pgbench for-loop (run, sleep 1, repeat),
load will sometimes jump to some very high number between iterations,
but on 3.4 it settles down again. On 3.2, it just jumps randomly. I
tested that with this script:

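#!/bin/bash
# Print the number of threads in D and R state plus the 1-minute load
# average once per second, repeating the header every 20 samples.
nLoop=0

while true; do

   if [ $((nLoop % 20)) -eq 0 ]; then
     echo -e "Stat Time\t\tSleep\tRun\tLoad Avg"
   fi

   stattime=$(date +"%Y-%m-%d %H:%M:%S")
   # -m lists threads; count uninterruptible sleep (D) and runnable (R).
   nsleep=$(ps -emo stat | grep -c 'D')
   nrun=$(ps -emo stat | grep -c 'R')
   loadavg=$(cut -d ' ' -f 1 /proc/loadavg)

   echo -e "${stattime}\t${nsleep}\t${nrun}\t${loadavg}"
   sleep 1

   nLoop=$((nLoop + 1))

done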

The jumps look like this:

Stat Time              Sleep    Run    Load Avg
2012-12-05 12:23:13    0        16     7.66
2012-12-05 12:23:14    0        12     7.66
2012-12-05 12:23:15    0        7      7.66
2012-12-05 12:23:16    0        17     7.66
2012-12-05 12:23:17    0        1      24.51
2012-12-05 12:23:18    0        2      24.51

It's much harder to trigger on 3.4, but still happens.
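
To pick such jumps out of a longer log, a small filter along these lines might help. It is only a sketch: the function name `find_load_jumps` and the threshold are made up for illustration, and it assumes the tab-separated output of the monitoring script has been saved to a file.

```shell
# find_load_jumps THRESHOLD < logfile
# Prints each sample whose load average rose by more than THRESHOLD
# compared with the previous sample. Expects tab-separated lines with
# the timestamp in the first field and the load average in the last.
find_load_jumps() {
    awk -F'\t' -v limit="$1" '
        $NF !~ /^[0-9.]+$/ { next }    # skip the repeated header lines
        seen && ($NF - prev) > limit {
            printf "%s\t%s -> %s\n", $1, prev, $NF
        }
        { prev = $NF; seen = 1 }
    '
}
```

For example, `find_load_jumps 5 < load.log` would report the 12:23:17 sample above, where the load average went from 7.66 to 24.51 in one second.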

If anyone has tested against 3.6 or 3.7, I'd love to hear your input.
Inconsistent load reports are one thing... strangled performance and
inflated CPU usage are quite another.

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
sthomas@optionshouse.com

______________________________________________

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to this email


Re: Ubuntu 12.04 / 3.2 Kernel Bad for PostgreSQL Performance

From
Niels Kristian Schjødt
Date:
While I can't say I've tried the 3.4 kernel yet, I am running 3.2 too,
and maybe there is a connection to the strange CPU behavior I have been
having (which, as you know, you have been kind enough to try to help me
solve). I will definitely try 3.4 or 3.6 within the coming days and
report back on the topic.





Re: Ubuntu 12.04 / 3.2 Kernel Bad for PostgreSQL Performance

From
Daniel Farina
Date:
On Wed, Dec 5, 2012 at 10:28 AM, Shaun Thomas <sthomas@optionshouse.com> wrote:
> Hey guys,
>
> This isn't a question, but a kind of summary over a ton of investigation
> I've been doing since a recent "upgrade". Anyone else out there with
> "big iron" might want to confirm this, but it seems pretty reproducible.
> This seems to affect the latest 3.2 mainline and by extension, any
> platform using it. My tests are restricted to Ubuntu 12.04, but it may
> apply elsewhere.
>
> Comparing the latest official 3.2 kernel to the latest official 3.4
> kernel (both Ubuntu), there are some rather striking differences. I'll
> start with some pgbench tests.

Is 3.2 a significant regression from previous releases, or is 3.4 just
faster?  Your wording only indicates that "older kernel is slow," but
your tone would suggest that you feel this is a regression, cf. being
unhappy that 3.2 made its way into a LTS release (why wouldn't it? it
was a relatively current kernel at the time).

--
fdr


Re: Ubuntu 12.04 / 3.2 Kernel Bad for PostgreSQL Performance

From
Shaun Thomas
Date:
On 12/05/2012 04:19 PM, Daniel Farina wrote:

> Is 3.2 a significant regression from previous releases, or is 3.4 just
> faster?  Your wording only indicates that "older kernel is slow," but
> your tone would suggest that you feel this is a regression, cf.

It's definitely a regression. I'm trying to pin it down, but the
3.2.0-24 kernel didn't do the CPU drain down to single-digits on that
client load test. I'm working on 3.2.0-30 and going down to figure out
which patch might have done it.
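
Walking down one release at a time works, but with a known-good and a known-bad build the search can also be halved at each step. The helper below is only a sketch of that bisection idea, not what was actually done here: the function name is made up, it works on the Ubuntu ABI number (the "24" in 3.2.0-24), and it assumes a kernel package exists for the number it suggests, which may not be true for every value.

```shell
# next_abi GOOD BAD: print the ABI number halfway between a known-good
# and a known-bad kernel, or fail when there is nothing left between them.
next_abi() {
    if [ $(( $2 - $1 )) -le 1 ]; then
        return 1
    fi
    echo $(( ($1 + $2) / 2 ))
}

# Example: 3.2.0-24 was fine and 3.2.0-31 shows the problem:
#   next_abi 24 31    prints 27, so test 3.2.0-27 next
```

If 3.2.0-27 turns out bad, the next call would be `next_abi 24 27`; if good, `next_abi 27 31`.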

Older kernels performed better. And by older, I mean 2.6. Still not 3.4
levels, but that's expected. I haven't checked 3.0, but other threads
I've read suggest it had less problems. Sorry if I wasn't clear.

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
sthomas@optionshouse.com



Re: Ubuntu 12.04 / 3.2 Kernel Bad for PostgreSQL Performance

From
Bruce Momjian
Date:
On Wed, Dec  5, 2012 at 04:25:28PM -0600, Shaun Thomas wrote:
> On 12/05/2012 04:19 PM, Daniel Farina wrote:
>
> >Is 3.2 a significant regression from previous releases, or is 3.4 just
> >faster?  Your wording only indicates that "older kernel is slow," but
> >your tone would suggest that you feel this is a regression, cf.
>
> It's definitely a regression. I'm trying to pin it down, but the
> 3.2.0-24 kernel didn't do the CPU drain down to single-digits on
> that client load test. I'm working on 3.2.0-30 and going down to
> figure out which patch might have done it.
>
> Older kernels performed better. And by older, I mean 2.6. Still not
> 3.4 levels, but that's expected. I haven't checked 3.0, but other
> threads I've read suggest it had less problems. Sorry if I wasn't
> clear.

Ah, that is interesting about 2.6.  I had wondered how Debian stable
would have performed, 2.6.32-5.  This relates to a recent discussion
about the appropriateness of Ubuntu for database servers:

    http://archives.postgresql.org/pgsql-performance/2012-11/msg00358.php

Thanks.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +


Re: Ubuntu 12.04 / 3.2 Kernel Bad for PostgreSQL Performance

From
Shaun Thomas
Date:
On 12/05/2012 04:41 PM, Bruce Momjian wrote:

> Ah, that is interesting about 2.6.  I had wondered how Debian stable
> would have performed, 2.6.32-5.  This relates to a recent discussion
> about the appropriateness of Ubuntu for database servers:

Hmm. I may have to recant. I just removed our fusionIO driver from the
loop and suddenly everything is honey and roses. It would appear that
some recent 3.2 kernel patch borks the driver in some horrible way.
Without it, I see 50-ish percent CPU, 70k tps even with 800 clients...
Just like 3.4.

So I jumped the gun a bit. Stupid drivers.

I'm still curious why only recent 3.2 kernels trigger it, but 3.4
doesn't. That's mighty odd.
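
For anyone repeating this comparison, it helps to confirm whether the driver module is actually loaded before each benchmark run. A minimal sketch follows; the helper name is made up, and the module name `iomemory_vsl` in the example is an assumption about how the Fusion-io driver shows up on a given system, so check `lsmod` for the real name.

```shell
# module_loaded NAME [FILE]: succeed if kernel module NAME is listed in
# FILE (default /proc/modules). Names there use underscores, not dashes.
module_loaded() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }' \
        "${2:-/proc/modules}"
}

# Example (module name is an assumption):
#   if module_loaded iomemory_vsl; then
#       echo "driver loaded"
#   else
#       echo "driver not loaded"
#   fi
```

Recording that alongside each pgbench result makes it easier to tell driver effects apart from kernel effects.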

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
sthomas@optionshouse.com



Re: Ubuntu 12.04 / 3.2 Kernel Bad for PostgreSQL Performance

From
Scott Marlowe
Date:
On Wed, Dec 5, 2012 at 4:04 PM, Shaun Thomas <sthomas@optionshouse.com> wrote:
> On 12/05/2012 04:41 PM, Bruce Momjian wrote:
>
>> Ah, that is interesting about 2.6.  I had wondered how Debian stable
>> would have performed, 2.6.32-5.  This relates to a recent discussion
>> about the appropriateness of Ubuntu for database servers:
>
>
> Hmm. I may have to recant. I just removed our fusionIO driver from the loop
> and suddenly everything is honey and roses. It would appear that some recent
> 3.2 kernel patch borks the driver in some horrible way. Without it, I see
> 50-ish percent CPU, 70k tps even with 800 clients... Just like 3.4.
>
> So I jumped the gun a bit. Stupid drivers.
>
> I'm still curious why only recent 3.2's cause it, but 3.4 don't. That's
> mighty odd.

Have you got a support contract with the Fusion-io guys? Where I work we
have Fusion-io cards and a support contract, and we are about to start
doing some testing on Ubuntu 12.04 as well, so I'll let you know what
we find out.


Re: Ubuntu 12.04 / 3.2 Kernel Bad for PostgreSQL Performance

From
Stuart Bishop
Date:
On Thu, Dec 6, 2012 at 1:28 AM, Shaun Thomas <sthomas@optionshouse.com> wrote:

> This isn't a question, but a kind of summary over a ton of investigation
> I've been doing since a recent "upgrade". Anyone else out there with
> "big iron" might want to confirm this, but it seems pretty reproducible.
> This seems to affect the latest 3.2 mainline and by extension, any
> platform using it. My tests are restricted to Ubuntu 12.04, but it may
> apply elsewhere.

I'm not seeing this on our production systems. I haven't run benchmarks.

One of our systems currently has a mixture of PG 8.4 shards running on
Ubuntu 10.04 (2.6 kernel) and PG 9.1 shards running on Ubuntu 12.04
(3.2 kernel). Load & cpu utilization (per 'top') are comparable.
Shards have 64GB of RAM, shared_buffers=3GB, 60 active connections.

Another production PG 9.1 system with shared_buffers=5GB also seems
fine. Old load graphs show the load is comparable from when it was
running Ubuntu 10.04.

My big systems are still all on Ubuntu 10.04 (cut over in January I expect).

--
Stuart Bishop <stuart@stuartbishop.net>
http://www.stuartbishop.net/


Re: Ubuntu 12.04 / 3.2 Kernel Bad for PostgreSQL Performance

From
John Lister
Date:
On 05/12/2012 18:28, Shaun Thomas wrote:
> Hey guys,
>
> This isn't a question, but a kind of summary over a ton of investigation
> I've been doing since a recent "upgrade". Anyone else out there with
> "big iron" might want to confirm this, but it seems pretty reproducible.
> This seems to affect the latest 3.2 mainline and by extension, any
> platform using it. My tests are restricted to Ubuntu 12.04, but it may
> apply elsewhere.
>

Very interesting results. I've been trying to benchmark my box and have
been getting what I would call poor performance given the setup, and I
am also running 3.2 on Ubuntu 12.04. Could I ask what hardware (offline
if you wish) was used for the results below?

> Comparing the latest official 3.2 kernel to the latest official 3.4
> kernel (both Ubuntu), there are some rather striking differences. I'll
> start with some pgbench tests.
>
> * This test is 800 read-only clients, with 2 controlling threads on a
> 55GB database (scaling factor of 3600) for 3 minutes.
>   * With 3.4:
>     * Max TPS was 68933.
>     * CPU was between 50 and 55% idle.
>     * Load average was between 10 and 15.
>   * With 3.2:
>     * Max TPS was 17583. A total loss of 75% performance.
>     * CPU was between 12 and 25% idle.
>     * Load average was between 10 and 60---effectively random.
>   * Next, we checked minimal write tests. This time, with only two
> clients. All other metrics are the same.
>     * With 3.4:
>       * Max TPS was 4548.
>       * CPU was between 88 and 92% idle.
>       * Load average was between 1.7 and 2.5.
>     * With 3.2:
>       * Max TPS was 4639.
>       * CPU was between 88 and 92% idle.
>       * Load average was between 3 and 4.
>

Time to see what a 3.4 kernel does to my setup, I think.

Thanks

John