SSD options, small database, ZFS - Mailing list pgsql-performance

From:      CSS
Subject:   SSD options, small database, ZFS
Msg-id:    0464974E-CF82-41D7-A266-5BAE3CBCEC51@morefoo.com
Responses: Re: SSD options, small database, ZFS (Arjen van der Meijden <acmmailing@tweakers.net>)
List:      pgsql-performance

Hello all,

I've spent some time looking through previous posts regarding
postgres and SSD drives and have also been reading up on the
subject of SSDs in general elsewhere.

Some quick background:

We're currently migrating from some rather old (circa 2006)
hardware to more current gear, and rethinking our basic database
setup in the process.  The previous setup consisted of two types
of 1U boxes: dual-Xeon machines with a low-end Adaptec RAID card
and two 10K U320 or U160 SCSI drives, handling the databases plus
light web frontend and mail submission duties, and single-P4 and
dual-PIII boxes with commodity drives handling mail delivery.
Most of the delivery servers have already been migrated to current
hardware (single and dual quad-core Xeons, 4 WD RE3 SATA drives,
FreeBSD with ZFS), and we've also moved a handful of the db/web
services onto one of these servers just to see what a large
increase in RAM and CPU can do for us.  As we more or less
expected, overall performance was better, but we hit a serious
disk I/O wall when running the nightly jobs, which are write-heavy
and long-running.
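
For what it's worth, the way I've been watching disk saturation
while those jobs run looks roughly like this (FreeBSD tools; the
pool name "tank" is just a placeholder):

  # per-disk busy %, queue depth, and latency
  gstat -a

  # extended per-device stats, 5-second samples
  iostat -x -w 5

  # per-vdev breakdown for the pool holding the database
  zpool iostat -v tank 5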

There are currently four db/web servers with a combined total of
133GB of on-disk Postgres data.  Given the age of the servers, the
fact that each has only two drives on a rather anemic RAID
controller with no cache/BBU, and that none has more than 2GB of
RAM, I think it's a safe bet that consolidating this down to two
modern servers can give us better performance, allow for growth
over the next few years, and let us split db duties from web/mail
submission.  One live db server, one standby.

And this is where SSDs come in.  We're not looking at terabytes of
data here, and I don't see us growing much beyond our current
database size in the next few years.  SSDs are getting cheap
enough that this feels like a good match - we can afford CPU and
RAM, but we can't afford some massive 24-drive monster and a pile
of SAS drives.  The current db boxes top out at about 300 IOPS
each, the SATA boxes at maybe half that.  If I can achieve 4x300
IOPS (or better, preferably...) on 2 or 4 SSDs, I'm pretty much
sold.
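
My rough plan for validating a candidate box is nothing fancier
than stock pgbench against a scratch database, something like the
following (the scale factor is my own estimate of what it takes to
get a data set comfortably bigger than 32GB of RAM, so adjust as
needed):

  # build a test database larger than RAM (scale 4000 is roughly 60GB)
  pgbench -i -s 4000 testdb

  # write-heavy TPC-B-style run: 16 clients, 4 worker threads, 10 minutes
  pgbench -c 16 -j 4 -T 600 testdb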

From what I've been reading here, this sounds quite possible.  I
understand the Intel 320 series is the best "bargain" option since
it can survive an unexpected shutdown, so I'm not going to look
beyond that - OCZ makes me a bit nervous, and finding a clear list
of which drives have the "supercapacitor" has been difficult.  I
have no qualms about buying enough SSDs to mirror them all.  I am
aware that I need automated alerts on various SMART stats so I
know when danger is imminent.  I know I need to have this box
replicated even with mirrors and monitoring up the wazoo.
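
The monitoring I have in mind is just a cron job wrapping
smartctl; the device and attribute names below are what I believe
smartmontools reports for the Intel drives, so treat them as a
guess:

  #!/bin/sh
  # crude nightly SSD health check - device names and attribute
  # names are assumptions, alert thresholds left as an exercise
  for dev in /dev/ada0 /dev/ada1; do
      echo "== ${dev} =="
      smartctl -A ${dev} | \
          egrep 'Media_Wearout_Indicator|Available_Reservd_Space|Reallocated_Sector'
  done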

Here are my remaining questions:

-I'm calling our combined databases, at 133GB, "small" - is that a
fair assumption?

-Is there any chance that a server with dual quad-core Xeons, 32GB
RAM, and 2 or 4 SSDs (assume mirrored) could be slower than the 4
old servers described above?  I'm beating those on raw CPU,
quadrupling the amount of RAM (and consolidating said RAM), and
going from disks that top out at 4x300 IOPS to SSDs that should
conservatively provide 2000 IOPS.

-We're also finally automating more of our setup and trying to
standardize server configs.  One tough decision that has paid off
quite well was the move to ZFS.  We find that the features that
help with admin tasks outweigh the drawbacks, and RAM is cheap
enough that we can live with its tendency to eat RAM.  Is ZFS +
Postgres + SSDs a bad combo?
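
For reference, the dataset settings I was planning to start from
are just the usual suggestions I've seen for Postgres on ZFS - 8K
recordsize to match Postgres' block size, atime off - so please
tell me if these are off base (pool/dataset names are
placeholders):

  # dedicated dataset for the Postgres cluster
  zfs create tank/pgdata
  zfs set recordsize=8K tank/pgdata
  zfs set atime=off tank/pgdata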

-Should I even be looking at the option of ZFS on SATA or low-end
SAS drives, with the ZIL and L2ARC on SSDs?  Initially this
intrigued me, but I can't quite get my head around how an
SSD-based ZIL can deal with flushing the metadata out when the
whole system is under any sort of extreme write-heavy load - I
mean, if the ZIL is absorbing 2000 IOPS of metadata writes, at
some point it has to get full as it tries to flush this data to
much slower spinning drives.
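
To be concrete, the layout I'm picturing is something like the
following - spinning disks for the pool itself, SSDs split between
a mirrored log device and cache (device names made up):

  # pool of mirrored spinning drives for the data
  zpool create tank mirror da0 da1 mirror da2 da3

  # mirrored SSD pair as a separate ZIL (slog)
  zpool add tank log mirror da4 da5

  # one more SSD (or partition) as L2ARC read cache
  zpool add tank cache da6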

-Should my standby box be the same configuration, or should I look
at actual spinning disks for that?  How rough is replication on
the underlying storage?  Would the total data written on the slave
be less than or equal to that written on the master?

Any input is appreciated.  I did really mean for this to be a much
shorter post...

Thanks,

Charles
