Re: Best OS & Configuration for Dual Xeon w/4GB & - Mailing list pgsql-performance

From Scott Marlowe
Subject Re: Best OS & Configuration for Dual Xeon w/4GB &
Date
Msg-id 1142636433.1141.33.camel@state.g2switchworks.com
Whole thread Raw
In response to Best OS & Configuration for Dual Xeon w/4GB & Adaptec RAID 2200S  (Kenji Morishige <kenjim@juniper.net>)
Responses Re: Best OS & Configuration for Dual Xeon w/4GB &
Re: Best OS & Configuration for Dual Xeon w/4GB &
List pgsql-performance
On Fri, 2006-03-17 at 16:11, Kenji Morishige wrote:
> About a year ago we decided to migrate our central database that powers various
> intranet tools from MySQL to PostgreSQL. We have about 130 tables and about
> 10GB of data that stores various status information for a variety of services
> for our intranet.  We generally have somewhere between 150-200 connections to
> the database at any given time and probably anywhere between 5-10 new
> connections being made every second and about 100 queries per second. Most
> of the queries and transactions are very small due to the fact that the tools
> were designed to work around the small functionality of MySQL 3.23 DB.
> Our company primarily uses FreeBSD and we are stuck on FreeBSD 4.X series due
> to IT support issues,

There were a LOT of performance enhancements to FreeBSD with the 5.x
series release.  I'd recommend fast tracking the database server to the
5.x branch.  4-stable was release 6 years ago.  5-stable was released
two years ago.

> but I believe I may be able to get more performance out
> of our server by reconfiguring and setting up the postgresql.conf file up
> better.

Can't hurt.  But if your OS isn't doing the job, postgresql.conf can
only do so much, nee?

>   The performance is not as good as I was hoping at the moment and
> it seems as if the database is not making use of the available ram.
> snapshot of active server:
> last pid:  5788;  load averages:  0.32,  0.31,  0.28                                                     up
127+15:16:0813:59:24 
> 169 processes: 1 running, 168 sleeping
> CPU states:  5.4% user,  0.0% nice,  9.9% system,  0.0% interrupt, 84.7% idle
> Mem: 181M Active, 2632M Inact, 329M Wired, 179M Cache, 199M Buf, 81M Free
> Swap: 4096M Total, 216K Used, 4096M Free
>
>   PID USERNAME      PRI NICE  SIZE    RES STATE  C   TIME   WCPU    CPU COMMAND
> 14501 pgsql           2   0   254M   242M select 2  76:26  1.95%  1.95% postgre
>  5720 root           28   0  2164K  1360K CPU0   0   0:00  1.84%  0.88% top
>  5785 pgsql           2   0   255M 29296K sbwait 0   0:00  3.00%  0.15% postgre
>  5782 pgsql           2   0   255M 11900K sbwait 0   0:00  3.00%  0.15% postgre
>  5772 pgsql           2   0   255M 11708K sbwait 2   0:00  1.54%  0.15% postgre

That doesn't look good.  Is this machine freshly rebooted, or has it
been running postgres for a while?  179M cache and 199M buffer with 2.6
gig inactive is horrible for a machine running a 10gig databases.

For comparison, here's what my production linux boxes show in top:
 16:42:27  up 272 days, 14:49,  1 user,  load average: 1.02, 1.04, 1.00
162 processes: 161 sleeping, 1 running, 0 zombie, 0 stopped
CPU states:  cpu    user    nice  system    irq  softirq  iowait    idle
           total    0.2%    0.0%    0.4%   0.0%     0.0%    0.4%   98.7%
           cpu00    0.4%    0.0%    0.4%   0.0%     0.0%    0.0%   99.0%
           cpu01    0.0%    0.0%    0.4%   0.0%     0.0%    0.9%   98.5%
Mem: 6096912k av, 4529208k used, 1567704k free, 0k shrd,  306884k buff
                  2398948k actv, 1772072k in_d,   78060k in_c
Swap: 4192880k av,  157480k used, 4035400k free        3939332k cached

PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME CPU COMMAND
24000 postgres  15 0  752  524  456 S   0.0  0.0   0:00   1 rotatelogs
24012 postgres  15 0 1536 1420 1324 S   0.0  0.0   7:11   0 postmaster
24015 postgres  15 0 2196 2032  996 S   0.0  0.0  56:07   0 postmaster
24016 postgres  15 0 1496 1352 1004 S   0.0  0.0 233:46   1 postmaster

Note that the kernel here is caching ~3.9 gigs of data.  so, postgresql
doesn't have to.   Also, the disk buffers are sitting at > 300 Megs.

If FreeBSD 4.x can't or won't cache more than that, there's an OS issue
here, either endemic to FreeBSD 4.x, or your configuration of it.


> Dual Xeon 3.06Ghz 4GB RAM

Make sure hyperthreading is disabled, it's generally a performance loss
for pgsql.

> Adaptec 2200S 48MB cache & 4 disks configured in RAID5

I'm not a huge fan of adaptec RAID controllers, and 48 Megs ain't much.
But for what you're doing, I'd expect it to run well enough.  Have you
tested this array with bonnie++ to see what kind of performance it gets
in general?  There could be some kind of hardware issue going on you're
not seeing in the logs.

Is that memory cache set to write back not through, and does it have
battery backup (the cache, not the machine)?

> The OS is installed on the local single disk and postgres data directory
> is on the RAID5 partition.  Maybe Adaptec 2200S RAID5 performance is not as
> good as the vendor claimed.  It was my impression that the raid controller
> these days are optimized for RAID5 and going RAID10 would not benefit me much.

You have to be careful about RAID 10, since many controllers serialize
access through multiple levels of RAID, and therefore wind up being
slower in RAID 10 or 50 than in RAID 1 or 5.

> Also, I may be overlooking a postgresql.conf setting.  I have attached the
> config file.

If you're doing a lot of small transactions you might see some gain from
increasing commit_delay to 100 to 1000 and commit siblings to 25 to
100.  It won't set the world on fire, but it's given me a 25% boost on
certain loads with lots of small transactions

>
> In summary, my questions:
>
> 1. Would running PG on FreeBSD 5.x or 6.x or Linux improve performance?

It most probably would.  I'd at least test it out.

> 2. Should I change SCSI controller config to use RAID 10 instead of 5?

Maybe.  With that controller, and many others in its league, you may be
slowing things down doing that.  You may be better off with a simple
RAID 1 instead as well.  Also, if you've got a problem with the
controller serializing multiple raid levels, you might see the best
performance with one raid level on the controller and the other handled
by the kernel.  BSD does do kernel level RAID, right?

> 3. Why isn't postgres using all 4GB of ram for at least caching table for reads?

Because that's your Operating System's job.

> 4. Are there any other settings in the conf file I could try to tweak?

With the later versions of PostgreSQL, it's gotten better at doing the
OS job of caching, IF you set it to use enough memory.  You might try
cranking up shared memory / shared_buffers to something large like 75%
of the machine memory and see if that does help.  With 7.4 and before,
it's generally a really bad idea.   Looking at your postgresql.conf it
appears you're running a post-7.4 version, so you might be able to get
away with handing over all the ram to the database.

Now that the tuning stuff is out of the way.  Have you been using the
logging to look for individual slow queries and run explain analyze on
them? Are you analyzing your database and vacuuming it too?

pgsql-performance by date:

Previous
From: Kenji Morishige
Date:
Subject: Best OS & Configuration for Dual Xeon w/4GB & Adaptec RAID 2200S
Next
From: "Claus Guttesen"
Date:
Subject: Re: Best OS & Configuration for Dual Xeon w/4GB & Adaptec RAID 2200S