Thread: Hardware suggestions for Linux/PGSQL server

Hardware suggestions for Linux/PGSQL server

From
Jeff Bohmer
Date:
Hi everyone,

I want to pick your brains for hardware suggestions about a
Linux-based PostgreSQL 7.4 server.  It will be a dedicated DB server
backing our web sites and hit by application servers (which do
connection pooling).  I've hopefully provided all relevant
information below.  Any thoughts, comments or suggestions are welcome.

Our current server and database:
    Mac OS X Server 10.2.8
    single 1.25GHz G4
    2 GB 333MHz RAM
    7200 rpm SCSI drive for OS, logs
    15k rpm SCSI drive for data

    PostgreSQL 7.3.4
    1 database, 1.1 GB in size, growing by ~15 MB / week
    60 tables, 1 schema, largest is 1m rows, 1 at 600k, 3 at 100k
    Peak traffic:
        500 UPDATEs, INSERTs and DELETEs / minute
        6000 SELECTs / minute
        90 connections

Performance is fine most of the time, but not during peak loads.
We're never swapping and disk IO during the SELECT peaks is hardly
anything (under 3MB/sec).  I think UPDATE peaks might be saturating
disk IO.  Normally, most queries finish in under .05 seconds.  Some
take 2-3 seconds.  During peaks, the fast queries are just OK and the
slower ones take too long (like over 8 seconds).

We're moving to Linux from OS X for improved stability and more
hardware options.  We need to do this soon.  The current server is
max'd out at 2GB RAM and I'm afraid might start swapping in a month.

Projected database/traffic in 12 months:
    Database size will be at least 2.5 GB
    Largest table still 1m rows, but 100k tables will grow to 250k
    Will be replicated to a suitable standby slave machine
    Peak traffic:
        2k UPDATEs, INSERTs, DELETEs / minute
        20k SELECTs / minute
        150 - 200 connections
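
As a rough sketch of what these peak figures mean per second, the per-minute numbers from this post can be converted as below. The dictionary and function names are made up for illustration, and a per-minute average hides any bursts within the minute, so treat the results as a lower bound on the instantaneous load.

```python
# Peak statement rates quoted in the post, in statements per minute.
CURRENT_PEAK = {"writes": 500, "selects": 6000}
PROJECTED_PEAK = {"writes": 2000, "selects": 20000}


def per_second(per_minute):
    """Convert a per-minute rate to a per-second rate."""
    return per_minute / 60.0


def growth_factor(now, later):
    """How many times larger the projected rate is than the current one."""
    return later / now


if __name__ == "__main__":
    for kind in ("writes", "selects"):
        now, later = CURRENT_PEAK[kind], PROJECTED_PEAK[kind]
        print(f"{kind}: {per_second(now):.1f}/s now, "
              f"{per_second(later):.1f}/s projected "
              f"({growth_factor(now, later):.1f}x)")
```

The write rate grows by a larger factor than the read rate, which matters because the writes are the suspected disk IO bottleneck.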

We're willing to shell out extra bucks to get something that will
undoubtedly handle the projected peak load in 12 months with
excellent performance.  But we're not familiar with PG's performance
on Linux and don't like to waste money.

I've been thinking of this (overkill? not enough?):
    2 Intel 32-bit CPUs
    Lowest clock speed chip for the fastest available memory bus
    4 GB RAM (maybe we only need 3 GB to start with?)
    SCSI RAID 1 for OS
    For PostgreSQL data and logs ...
        15k rpm SCSI disks
        RAID 5, 7 disks, 256MB battery-backed write cache
        (Should we save $ and get a 4-disk RAID 10 array?)

I wonder about the 32bit+bigmem vs. 64bit question.  At what database
size will we need more than 4GB RAM?

We'd like to always have enough RAM to cache the entire database.
While 64bit is in our long-term future, we're willing to stick with
32bit Linux until 64bit Linux on Itanium/Opteron and 64bit PostgreSQL
"settle in" to proven production-quality.
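
One way to put a number on the "when do we outgrow the RAM" question is a simple linear projection from the figures given earlier (1.1 GB today, about 15 MB per week). This is only a sketch: linear growth is an assumption, 1 GB is taken as 1024 MB, and the function names are hypothetical. The 12-month projection of at least 2.5 GB in this post implies growth faster than 15 MB per week, so the real dates would come sooner.

```python
MB_PER_GB = 1024.0  # assumption: binary gigabytes


def size_after_weeks(weeks, current_gb=1.1, growth_mb_per_week=15.0):
    """Database size in GB after a number of weeks of linear growth."""
    return current_gb + weeks * growth_mb_per_week / MB_PER_GB


def weeks_until(target_gb, current_gb=1.1, growth_mb_per_week=15.0):
    """Weeks of linear growth needed to reach target_gb (0 if already there)."""
    if target_gb <= current_gb:
        return 0.0
    return (target_gb - current_gb) * MB_PER_GB / growth_mb_per_week


if __name__ == "__main__":
    print(f"size after 52 weeks: {size_after_weeks(52):.2f} GB")
    for target in (2.0, 2.5, 4.0):
        print(f"weeks until {target} GB: {weeks_until(target):.0f}")
```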

TIA,
- Jeff

--

Jeff Bohmer
VisionLink, Inc.
_________________________________
303.402.0170
www.visionlink.org
_________________________________
People. Tools. Change. Community.

Re: Hardware suggestions for Linux/PGSQL server

From
William Yu
Date:
Jeff Bohmer wrote:
> We're willing to shell out extra bucks to get something that will
> undoubtedly handle the projected peak load in 12 months with excellent
> performance.  But we're not familiar with PG's performance on Linux and
> don't like to waste money.

Properly tuned, PG on Linux runs really nice. A few people have
mentioned the VM swapping algorithm on Linux is semi-dumb. I get around
that problem by having a ton of memory and almost no swap.

> I've been thinking of this (overkill? not enough?):
>     2 Intel 32-bit CPUs
>     Lowest clock speed chip for the fastest available memory bus
>     4 GB RAM (maybe we only need 3 GB to start with?)
>     SCSI RAID 1 for OS
>     For PostgreSQL data and logs ...
>         15k rpm SCSI disks
>         RAID 5, 7 disks, 256MB battery-backed write cache
>         (Should we save $ and get a 4-disk RAID 10 array?)
>
> I wonder about the 32bit+bigmem vs. 64bit question.  At what database
> size will we need more than 4GB RAM?

With 4GB of RAM, you're already running into bigmem. By default, Linux
gives 2GB of address space to programs and 2GB to kernel. I usually see
people quote 5%-15% penalty in general for using PAE versus a flat
address space. I've seen simple MySQL benchmarks where 64-bit versions
run 35%+ faster versus 32-bit+PAE but how that translates to PG, I dunno
yet.
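
The address-space arithmetic behind this can be sketched as follows. A 32-bit virtual address covers 4 GiB, and PAE widens physical addresses to 36 bits (64 GiB) while each process still sees a 32-bit virtual space. The user/kernel split is a kernel configuration choice; the 2 GB kernel share used in the example is simply the figure quoted above, and the function names are hypothetical.

```python
GIB = 2 ** 30


def addressable_bytes(address_bits):
    """Bytes reachable with an address of the given width."""
    return 2 ** address_bits


def user_space_gib(kernel_gib, virtual_bits=32):
    """Virtual address space left for a process after the kernel's share."""
    return addressable_bytes(virtual_bits) / GIB - kernel_gib


if __name__ == "__main__":
    print("32-bit virtual space:", addressable_bytes(32) // GIB, "GiB")
    print("36-bit PAE physical space:", addressable_bytes(36) // GIB, "GiB")
    # Assumption: the 2 GB kernel share mentioned in this message.
    print("user space with a 2 GiB kernel share:", user_space_gib(2), "GiB")
```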

> We'd like to always have enough RAM to cache the entire database. While
> 64bit is in our long-term future, we're willing to stick with 32bit
> Linux until 64bit Linux on Itanium/Opteron and 64bit PostgreSQL "settle
> in" to proven production-quality.

Well if this is the case, you probably should get an Opteron server
*now* and just run 32-bit Linux on it until you're sure about the
software. No point in buying a Xeon and then throwing the machine away
in a year when you decide you need 64-bit for more speed.


Re: Hardware suggestions for Linux/PGSQL server

From
Jeff Bohmer
Date:
>Properly tuned, PG on Linux runs really nice. A few people have
>mentioned the VM swapping algorithm on Linux is semi-dumb. I get
>around that problem by having a ton of memory and almost no swap.

I think we want your approach: enough RAM to avoid swapping altogether.



>With 4GB of RAM, you're already running into bigmem. By default,
>Linux gives 2GB of address space to programs and 2GB to kernel.

It seems I don't fully understand the bigmem situation.  I've
searched the archives, googled, checked RedHat's docs, etc.  But I'm
getting conflicting, incomplete and/or out of date information.  Does
anyone have pointers to bigmem info or configuration for the 2.4
kernel?

If Linux is setup with 2GB for kernel and 2GB for user, would that be
OK with a DB size of 2-2.5 GB?  I'm figuring the kernel will cache
most/all of the DB in its 2GB and there's 2GB left for PG processes.
Where does PG's SHM buffers live, kernel or user?  (I don't plan on
going crazy with buffers, but will guess we'd need about 128MB, 256MB
at most.)
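
For reference, PostgreSQL 7.x takes shared_buffers as a count of buffers rather than a size in megabytes. A minimal sketch of the conversion, assuming the default 8 kB block size (the function name is made up for illustration):

```python
BLOCK_SIZE_BYTES = 8192  # assumption: default PostgreSQL block size


def shared_buffers_for(megabytes, block_size=BLOCK_SIZE_BYTES):
    """Number of buffers that add up to the requested amount of memory."""
    return megabytes * 1024 * 1024 // block_size


if __name__ == "__main__":
    for mb in (128, 256):
        print(f"{mb} MB -> shared_buffers = {shared_buffers_for(mb)}")
```

The kernel's shared memory limit (kernel.shmmax on Linux) would typically need to be at least this large, plus some overhead, before the server will start with such a setting.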



>I usually see people quote 5%-15% penalty in general for using PAE
>versus a flat address space. I've seen simple MySQL benchmarks where
>64-bit versions run 35%+ faster versus 32-bit+PAE but how that
>translates to PG, I dunno yet.
>
>>We'd like to always have enough RAM to cache the entire database.
>>While 64bit is in our long-term future, we're willing to stick with
>>32bit Linux until 64bit Linux on Itanium/Opteron and 64bit
>>PostgreSQL "settle in" to proven production-quality.
>
>Well if this is the case, you probably should get an Opteron server
>*now* and just run 32-bit Linux on it until you're sure about the
>software. No point in buying a Xeon and then throwing the machine
>away in a year when you decide you need 64-bit for more speed.

That's a good point.  I had forgotten about the option to run 32bit
on an Opteron.  If we had 3GB or 4GB initially on an Opteron, we'd
need bigmem for 32bit Linux, right?

This might work nicely since we'd factor in the penalty from PAE for
now and have the performance boost from moving to 64bit available on
demand.  Not having to build another DB server in a year would also
be nice.

FYI, we need stability first and performance second.

Thank you,
- Jeff


Re: Hardware suggestions for Linux/PGSQL server

From
"scott.marlowe"
Date:
Just one more piece of advice, you might want to look into a good battery
backed cache hardware RAID controller.  They work quite well for heavily
updated databases.  The more drives you throw at the RAID array the faster
it will be.


Re: Hardware suggestions for Linux/PGSQL server

From
William Yu
Date:
Jeff Bohmer wrote:
> It seems I don't fully understand the bigmem situation.  I've searched
> the archives, googled, checked RedHat's docs, etc.  But I'm getting
> conflicting, incomplete and/or out of date information.  Does anyone
> have pointers to bigmem info or configuration for the 2.4 kernel?

Bigmem is the name for Linux's PAE support.

> If Linux is setup with 2GB for kernel and 2GB for user, would that be OK
> with a DB size of 2-2.5 GB?  I'm figuring the kernel will cache most/all
> of the DB in its 2GB and there's 2GB left for PG processes. Where does
> PG's SHM buffers live, kernel or user?  (I don't plan on going crazy
> with buffers, but will guess we'd need about 128MB, 256MB at most.)

PG's SHM buffers live in user space. As for Linux's OS cache, I think it
lives in kernel space, and I remember reading a max of ~950MB w/o bigmem,
which means your 3.5GB of available OS memory will definitely have to be
swapped in and out of kernel space using PAE.

>> Well if this is the case, you probably should get an Opteron server
>> *now* and just run 32-bit Linux on it until you're sure about the
>> software. No point in buying a Xeon and then throwing the machine away
>> in a year when you decide you need 64-bit for more speed.
>
> That's a good point.  I had forgotten about the option to run 32bit on
> an Opteron.  If we had 3GB or 4GB initially on an Opteron, we'd need
> bigmem for 32bit Linux, right?
>
> This might work nicely since we'd factor in the penalty from PAE for now
> and have the performance boost from moving to 64bit available on
> demand.  Not having to build another DB server in a year would also be
> nice.
>
> FYI, we need stability first and performance second.

We ordered a 2x Opteron server the moment the CPU was released and it's
been perfect -- except for one incident where the PCI riser card had
drifted out of the PCI slot due to the heavy SCSI cables connected to
the card.

I think most of the Opteron server MBs are pretty solid, but if you want
extra peace of mind, you could get a server from Newisys as they pack in
a cartload of extra monitoring features.


Re: Hardware suggestions for Linux/PGSQL server

From
Shridhar Daithankar
Date:
Jeff Bohmer wrote:
>> Well if this is the case, you probably should get an Opteron server
>> *now* and just run 32-bit Linux on it until you're sure about the
>> software. No point in buying a Xeon and then throwing the machine away
>> in a year when you decide you need 64-bit for more speed.
>
>
> That's a good point.  I had forgotten about the option to run 32bit on
> an Opteron.  If we had 3GB or 4GB initially on an Opteron, we'd need
> bigmem for 32bit Linux, right?
>
> This might work nicely since we'd factor in the penalty from PAE for now
> and have the performance boost from moving to 64bit available on
> demand.  Not having to build another DB server in a year would also be
> nice.

FWIW, there are only two pieces of software that need to be 64bit aware for a
typical server job: the kernel and glibc. The rest of the apps can do fine as
32 bits unless you are Oracle and insist on outsmarting the OS.

In fact running 32 bit apps on 64 bit OS has plenty of advantages like
effectively using the cache. Unless you need 64bit, going for 64bit software is
not advised.

  Shridhar

--
-----------------------------
Shridhar Daithankar
LIMS CPE Team Member, PSPL.
mailto:shridhar_daithankar@persistent.co.in
Phone:- +91-20-5676700 Extn.270
Fax  :- +91-20-5676701
-----------------------------


Re: Hardware suggestions for Linux/PGSQL server

From
William Yu
Date:
Shridhar Daithankar wrote:
>
> FWIW, there are only two pieces of software that need 64bit aware for a
> typical server job. Kernel and glibc. Rest of the apps can do fine as 32
> bits unless you are oracle and insist on outsmarting OS.
>
> In fact running 32 bit apps on 64 bit OS has plenty of advantages like
> effectively using the cache. Unless you need 64bit, going for 64bit
> software is not advised.

This is a good point. While doing research on this matter a few months
back, I saw comments by people testing 64-bit MySQL that some operations
would run faster and some slower due to the use of 64-bit datatypes
versus 32-bit. The best solution in the end is probably to run 32-bit
Postgres under a 64-bit kernel -- unless your DB tends to have a lot of
64-bit datatypes.
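
A small illustration of the cache argument: a value that is twice as wide means half as many values fit in each cache line. The sketch below uses Python's struct module only to get the byte widths of 32-bit and 64-bit integers; the 64-byte cache line is an assumption, not a figure from this thread, and the function name is hypothetical.

```python
import struct

CACHE_LINE_BYTES = 64  # assumption: a common cache line size

INT32 = "<i"  # 4-byte integer in struct's standard sizing
INT64 = "<q"  # 8-byte integer in struct's standard sizing


def values_per_cache_line(fmt, line_bytes=CACHE_LINE_BYTES):
    """How many values of the given struct format fit in one cache line."""
    return line_bytes // struct.calcsize(fmt)


if __name__ == "__main__":
    print("32-bit values per line:", values_per_cache_line(INT32))
    print("64-bit values per line:", values_per_cache_line(INT64))
```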


Re: Hardware suggestions for Linux/PGSQL server

From
Jeff Bohmer
Date:
>Just one more piece of advice, you might want to look into a good battery
>backed cache hardware RAID controller.  They work quite well for heavily
>updated databases.  The more drives you throw at the RAID array the faster
>it will be.

I've seen this list often recommend such a setup.  We'll probably
get battery-backed write cache and start out with a 4 disk RAID 10
array.  Then add more disks and change to RAID 5 if more read
performance is needed.
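
To compare the two layouts discussed in this thread, here is a back-of-the-envelope sketch using the usual rules of thumb: RAID 5 gives up one disk to parity and costs about four disk operations per small random write, while RAID 10 gives up half the disks to mirroring and costs two. The disk size (36 GB) and per-disk rate (150 IOPS) are hypothetical placeholders, not numbers from the thread, and the model ignores the controller's write cache, which can absorb much of the penalty.

```python
# Rule-of-thumb disk operations per small random write.
WRITE_PENALTY = {"raid5": 4, "raid10": 2}


def usable_gb(level, disks, disk_gb):
    """Usable capacity for the given RAID level."""
    if level == "raid5":
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (disks - 1) * disk_gb
    if level == "raid10":
        if disks < 4 or disks % 2:
            raise ValueError("RAID 10 needs an even number of disks, 4 or more")
        return disks // 2 * disk_gb
    raise ValueError(f"unknown RAID level: {level}")


def random_write_iops(level, disks, iops_per_disk):
    """Approximate small random write rate, ignoring controller cache."""
    return disks * iops_per_disk / WRITE_PENALTY[level]


if __name__ == "__main__":
    # Hypothetical disks: 36 GB each, 150 random IOPS each.
    for level, disks in (("raid5", 7), ("raid10", 4)):
        print(level, disks, "disks:",
              usable_gb(level, disks, 36), "GB usable,",
              random_write_iops(level, disks, 150), "write IOPS")
```

Under these assumptions the 4-disk RAID 10 array has far less space but a comparable random write rate to the 7-disk RAID 5 array.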

Thanks,
- Jeff

Re: Hardware suggestions for Linux/PGSQL server

From
Jeff Bohmer
Date:
>Shridhar Daithankar wrote:
>>
>>FWIW, there are only two pieces of software that need 64bit aware
>>for a typical server job. Kernel and glibc. Rest of the apps can do
>>fine as 32 bits unless you are oracle and insist on outsmarting OS.
>>
>>In fact running 32 bit apps on 64 bit OS has plenty of advantages
>>like effectively using the cache. Unless you need 64bit, going for
>>64bit software is not advised.
>
>This is a good point. While doing research on this matter a few
>months back, I saw comments by people testing 64-bit MySQL that some
>operations would run faster and some slower due to the use of 64-bit
>datatypes versus 32-bit. The best solution in the end is probably to
>run 32-bit Postgres under a 64-bit kernel -- unless your DB tends to
>have a lot of 64-bit datatypes.


Thanks Shridhar and William,

This advice has been very helpful.  I would imagine a lot of folks
are, or will soon be, looking at 32- vs. 64-bit just for memory
reasons and not for 64-bit apps.

- Jeff

Re: Hardware suggestions for Linux/PGSQL server

From
"Andrew G. Hammond"
Date:
I don't know what your budget is, but there are now 10k RPM SATA 150
drives on the market. Their price/performance is impressive. You may
want to consider going with a bunch of these instead of SCSI disks (more
spindles vs. faster spindles). 3ware makes a hardware raid card that can
drive up to 12 SATA disks. I have been told by a few people who have
used it that the linux driver is very solid.

Drew


Jeff Bohmer wrote:

>> Just one more piece of advice, you might want to look into a good
>> battery
>> backed cache hardware RAID controller.  They work quite well for heavily
>> updated databases.  The more drives you throw at the RAID array the
>> faster
>> it will be.
>
>
> I've seen this list often recommend such a setup.  We'll probably
> get battery-backed write cache and start out with a 4 disk RAID 10
> array.  Then add more disks and change to RAID 5 if more read performance
> is needed.
>
> Thanks,
> - Jeff




Re: Hardware suggestions for Linux/PGSQL server

From
Christopher Browne
Date:
In the last exciting episode, drew@xyzzy.dhs.org ("Andrew G. Hammond") wrote:
> I don't know what your budget is, but there are now 10k RPM SATA 150
> drives on the market. Their price/performance is impressive. You may
> want to consider going with a bunch of these instead of SCSI disks
> (more spindles vs. faster spindles). 3ware makes a hardware raid
> card that can drive up to 12 SATA disks. I have been told by a few
> people who have used it that the linux driver is very solid.

We got a couple of those in for testing purposes; when opportunity
presents itself, I'll have to check to see if they are any more honest
about commits than traditional IDE drives.
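
One rough way to run such a check is to time repeated small synchronous writes to the same spot in a file. If the drive really waits for the platter, it can complete at most about one such write per revolution, so a measured rate far above rpm/60 suggests something in the path is acknowledging writes from cache. This is only a sketch: the function names and the slack factor are made up for illustration, and filesystems, journals and battery-backed controller caches can legitimately change the numbers.

```python
import os
import tempfile
import time


def max_honest_commits_per_sec(rpm):
    """Upper bound on same-block commits per second: one per revolution."""
    return rpm / 60.0


def measure_fsync_rate(path, writes=200, block=b"x" * 512):
    """Rewrite the same block and fsync each time; return writes per second."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(writes):
            os.lseek(fd, 0, os.SEEK_SET)
            os.write(fd, block)
            os.fsync(fd)  # ask the OS to push the data to the device
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return writes / elapsed if elapsed > 0 else float("inf")


def looks_suspicious(measured_rate, rpm, slack=1.5):
    """True if the measured rate is well above what the spindle allows."""
    return measured_rate > slack * max_honest_commits_per_sec(rpm)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        rate = measure_fsync_rate(os.path.join(tmp, "probe.dat"))
    print(f"measured {rate:.0f} fsyncs/s; "
          f"a 10k rpm disk allows about {max_honest_commits_per_sec(10000):.0f}")
    print("suspicious for a 10k rpm disk:", looks_suspicious(rate, 10000))
```

The probe file has to live on the array under test for the result to mean anything; the temporary directory above is just a placeholder.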

If they still "lie" the same way IDE drives do, it is entirely
possible that they are NOT nearly as impressive as you presently
imagine.  It's not much good if they're "way fast" if you can't trust
them to actually store data when they claim it is stored...
--
(reverse (concatenate 'string "gro.gultn" "@" "enworbbc"))
http://www.ntlug.org/~cbbrowne/lisp.html
"Much of this software was user-friendly, meaning that it was intended
for users who did not know anything about computers, and furthermore
had absolutely no intention whatsoever of learning."
-- A. S. Tanenbaum, "Modern Operating Systems, ch 1.2.4"

Re: Hardware suggestions for Linux/PGSQL server

From
Jeff Bohmer
Date:
>In the last exciting episode, drew@xyzzy.dhs.org ("Andrew G. Hammond") wrote:
>>  I don't know what your budget is, but there are now 10k RPM SATA 150
>>  drives on the market. Their price/performance is impressive. You may
>>  want to consider going with a bunch of these instead of SCSI disks
>>  (more spindles vs. faster spindles). 3ware makes a hardware raid
>>  card that can drive up to 12 SATA disks. I have been told by a few
>>  people who have used it that the linux driver is very solid.
>
>We got a couple of those in for testing purposes; when opportunity
>presents itself, I'll have to check to see if they are any more honest
>about commits than traditional IDE drives.
>
>If they still "lie" the same way IDE drives do, it is entirely
>possible that they are NOT nearly as impressive as you presently
>imagine.  It's not much good if they're "way fast" if you can't trust
>them to actually store data when they claim it is stored...

We lost data because of this very problem when a UPS didn't signal
the shutdown before it ran out of juice.

Here's an excellent explanation of the problem:
http://archives.postgresql.org/pgsql-general/2003-10/msg01343.php

This post indicates that SATA drives still have problems, but a new
ATA standard might fix things in the future:
http://archives.postgresql.org/pgsql-general/2003-10/msg01395.php

SATA RAID is a good option for a testing server, though.


- Jeff