Re: Planning a new server - help needed - Mailing list pgsql-performance

From Greg Smith
Subject Re: Planning a new server - help needed
Date
Msg-id Pine.GSO.4.64.0803281300240.2521@westnet.com
Whole thread Raw
In response to Planning a new server - help needed  (Laszlo Nagy <gandalf@shopzeus.com>)
Responses Re: Planning a new server - help needed  (James Mansion <james@mansionfamily.plus.com>)
List pgsql-performance
On Fri, 28 Mar 2008, Laszlo Nagy wrote:

> We already have a server but it is becoming slow and we would like to
> have something that is faster.

What's it slow at?  Have you identified the bottlenecks and current
sources of sluggish behavior?  That sort of thing is much more informative
to look into in regards to redesigning for new hardware than trivia like
disk layout.  For all we know you're CPU bound.

> The database itself is an OLTP system. There are many smaller tables, and
> some bigger ones (biggest table with 1.2 million records, table size 966MB,
> indexes size 790MB).

The total database size is the interesting number you left out here.  And
you didn't mention how much RAM either.  That ratio has a lot of impact on
how hard you'll push the disks.

> Question 1. We are going to use PostgreSQL 3.1 with FreeBSD. The pg docs say
> that it is better to use FreeBSD because it can alter the I/O priority of
> processes dynamically.

You shouldn't make an OS decision based on a technical detail that small.
I won't knock FreeBSD because it's a completely reasonable choice, but
there's no credible evidence it's a better performer for the workload you
expect than, say, Linux or even Solaris x64.  (The benchmarks the FreeBSD
team posted as part of their 7.0 fanfare are not representative of real
PostgreSQL performance, and are read-only as well).

All the reasonable OS choices here are close enough to one another (as
long as you get FreeBSD 7, earlier versions are really slow) that you
should be thinking in terms of reliability, support, and features rather
than viewing this from a narrow performance perspective.  There's nothing
about what you've described that sounds like it needs bleeding-edge
performance to achieve. For reliability, I first look at how good the disk
controller and its matching driver in the OS used is, which brings us to:

> Question 2. SCSI or SATA? I plan to buy a RocketRAID 3520 controller with 8
> SATA 2 disks. The operating system would be on another disk pair, connected
> to the motherboard's controller. I wonder if I can get more performance with
> SCSI, for the same amount of money? (I can spend about $1500 on the
> controller and the disks, that would cover 10 SATA 2 disks and the
> controller.)

Highpoint has traditionally made disk controllers that were garbage.  The
3520 is from a relatively new series of products from them, and it seems
like a reasonable unit.  However:  do you want to be be deploying your
system on a new card with zero track record for reliability, and from a
company that has never done a good job before?  I can't think of any
reason at all why you should take that risk.

The standard SATA RAID controller choices people suggest here are 3ware,
Areca, and LSI Logic.  Again, unless you're really pushing what the
hardware is capable of these are all close to each other performance-wise
(see http://femme.tweakblogs.net/blog/196/highpoint-rocketraid-3220.html
for something that include the Highpoint card).  You should be thinking in
terms of known reliability and stability when you select a database
controller card, and Highpoint isn't even on the list of vendors to
consider yet by those standards.

As for SCSI vs. SATA, I collected up the usual arguments on both sides at
http://www.postgresqldocs.org/index.php/SCSI_vs._IDE/SATA_Disks

> However, the transaction log file should be on a separate disk and maybe I
> could gain more performance by putting indexes on a separate drive, but I do
> not want to reduce the number of disks in the RAID 0+1 array.

If you're looking at 8+ disks and have a caching controller with a battery
backup, which appears to be your target configuration, there little reason
to expect a big performance improvement from splitting the transaction log
out onto a seperate disk.  As you note, doing that will reduce the spread
of disk for the database which may cost you more in performance than
seperate transaction logs gain.

It is worth considering creating a seperate filesystem on the big array to
hold the xlog data through, because that gives you more flexibility in
terms of mount parameters there.  For example, you can always turn off
atime updates on the transaction log filesystem, and in many cases the
filesystem journal updates can be optimized more usefully (the xlog
doesn't require them).

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

pgsql-performance by date:

Previous
From: Gavin Sherry
Date:
Subject: Re: using like in a prepare doesnt' use the right index
Next
From: Dan Harris
Date:
Subject: Re: Planning a new server - help needed