Thread: Install Postgres on a SAN volume?

Install Postgres on a SAN volume?

From
William Garrison
Date:
We are using PostgreSQL 8.2.9 on Windows, and we are setting up some new
machines.  We used to install PostgreSQL on C: and then we put the
tablespaces onto our SAN drive (Z:).  When we tried to mount the
snapshots of the SAN we learned that they were useless since we only had
the tablespaces, not all the other stuff.  So we are considering two
options:

1) We could install PostgreSQL directly onto the SAN volume.  That puts
the data directory, the application files, tools, EXEs, DLLs, etc. on
the SAN volume
2) We could install PostgreSQL onto the C: drive and then configure the
data folder to be on the SAN volume (Z:)

I'm not the hardware guy, but our SAN is supposedly superly-duperly
fast, RAID with mirrors stripes and even plaid.  The connection is dual
fiber channels, is hardened against nuclear strikes, and includes a
laser defense system.  So I am assured it is fast.  Option #1 is the
simplest, and offers us lots of advantages.  But I can't help but wonder
if putting the actual application files and transaction logs on there is
smart.  It is really nice because it supports instant snapshots so we
can, in theory, snapshot a volume and re-mount it elsewhere.  But I am
looking for any down sides to doing it this way.  Any suggestions from
Postgres veterans?

Thanks.

Re: Install Postgres on a SAN volume?

From
Greg Smith
Date:
On Mon, 8 Sep 2008, William Garrison wrote:

> 2) We could install PostgreSQL onto the C: drive and then configure the data
> folder to be on the SAN volume (Z:)

Do that.  You really don't want to get into the situation where you can't
run anything related to the PostgreSQL service just because the SAN isn't
available.  You may have internal SAN fans that will swear that never
happens, but it does.  Also, it allows installing a later PostgreSQL
version upgrade on another system and testing against the SAN data files
in a way that said system could become the new server.  There's all kinds
of systems management reasons you should separate the database application
from the database files.

> So I am assured it is fast.

Compared to what?  The same amount spent on direct storage would be widly
faster.

The thing to remember about SANs is that they are complicated, and there
are many ways you can misconfigure them so that their database performance
sucks.  Make sure you actually benchmark the SAN and compare it to direct
connected disks to see if it's acting sanely; don't just believe what
people tell you.

I personally can't understand why anybody would spend SAN $ and then
hobble the whole thing by running PostgreSQL on Windows.  The Win32 port
is functional, but it's really not fast.

> It is really nice because it supports instant snapshots so we can, in
> theory, snapshot a volume and re-mount it elsewhere.

You'll still need to setup basic PITR recovery to know you got a useful
snapshot.  See
http://lethargy.org/~jesus/archives/92-PostgreSQL-warm-standby-on-ZFS-crack.html
for a nice intro to that that uses ZFS as the snapshot implementation.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: Install Postgres on a SAN volume?

From
"Douglas McNaught"
Date:
On Mon, Sep 8, 2008 at 6:18 PM, William Garrison <postgres@mobydisk.com> wrote:
> 2) We could install PostgreSQL onto the C: drive and then configure the data folder to be
> on the SAN volume (Z:)

You want this.  If you're going to take snapshots, you need all the
data files AND the transaction logs to be in the snapshot.  Putting
the application files on the SAN probably won't hurt, but won't really
help either (as it's likely that all the application code will stay in
memory once the system is running).

You can put the text logfiles on a local drive if you want.

-Doug

Re: Install Postgres on a SAN volume?

From
William Garrison
Date:
Thanks.

I notice that the link you provided says:
"Per best practices, my postgres data directory, xlogs and WAL archives
are on different filesystems (ZFS of course). "

Why is this a best practice?  Is there a reference for that?

Greg Smith wrote:
> On Mon, 8 Sep 2008, William Garrison wrote:
>
>> 2) We could install PostgreSQL onto the C: drive and then configure
>> the data folder to be on the SAN volume (Z:)
>
> Do that.  You really don't want to get into the situation where you
> can't run anything related to the PostgreSQL service just because the
> SAN isn't available.  You may have internal SAN fans that will swear
> that never happens, but it does.  Also, it allows installing a later
> PostgreSQL version upgrade on another system and testing against the
> SAN data files in a way that said system could become the new server.
> There's all kinds of systems management reasons you should separate
> the database application from the database files.
>
>> So I am assured it is fast.
>
> Compared to what?  The same amount spent on direct storage would be
> widly faster.
>
> The thing to remember about SANs is that they are complicated, and
> there are many ways you can misconfigure them so that their database
> performance sucks.  Make sure you actually benchmark the SAN and
> compare it to direct connected disks to see if it's acting sanely;
> don't just believe what people tell you.
>
> I personally can't understand why anybody would spend SAN $ and then
> hobble the whole thing by running PostgreSQL on Windows.  The Win32
> port is functional, but it's really not fast.
>
>> It is really nice because it supports instant snapshots so we can, in
>> theory, snapshot a volume and re-mount it elsewhere.
>
> You'll still need to setup basic PITR recovery to know you got a
> useful snapshot.  See
> http://lethargy.org/~jesus/archives/92-PostgreSQL-warm-standby-on-ZFS-crack.html
> for a nice intro to that that uses ZFS as the snapshot implementation.
>
> --
> * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
>


Re: Install Postgres on a SAN volume?

From
Greg Smith
Date:
On Mon, 8 Sep 2008, William Garrison wrote:

> I notice that the link you provided says:
> "Per best practices, my postgres data directory, xlogs and WAL archives are
> on different filesystems (ZFS of course). "
>
> Why is this a best practice?  Is there a reference for that?

Those all have different ways they are used, and it's not unusual for
people to mount each with slightly different parameters or otherwise tune
them for their respective role.

There's also some physical properties involved.  You probably want the WAL
files on the fastest parts of the disk (drive speed varies considerably
from the inside to the outside tracks), while the WAL archives can go onto
the slowest filesystem without a problem.  The easy way to do that is to
partition the volume--normally the first partition you make will be
fastest, while the logical end of the disk is the slowest part.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: Install Postgres on a SAN volume?

From
Magnus Hagander
Date:
Greg Smith wrote:
> On Mon, 8 Sep 2008, William Garrison wrote:
>
>> 2) We could install PostgreSQL onto the C: drive and then configure
>> the data folder to be on the SAN volume (Z:)
>
> Do that.  You really don't want to get into the situation where you
> can't run anything related to the PostgreSQL service just because the
> SAN isn't available.  You may have internal SAN fans that will swear
> that never happens, but it does.  Also, it allows installing a later
> PostgreSQL version upgrade on another system and testing against the SAN
> data files in a way that said system could become the new server.
> There's all kinds of systems management reasons you should separate the
> database application from the database files.

The counter-argument is that keeping the database software on the same
drive will ensure you always run the same version in say a two node
failover cluster. But that's a fairly specific use-case. For the general
use-case, I agree with Greg's recommendation.


>> So I am assured it is fast.
>
> Compared to what?  The same amount spent on direct storage would be
> widly faster.

Counterpoint: SAN is already in place...


> The thing to remember about SANs is that they are complicated, and there
> are many ways you can misconfigure them so that their database
> performance sucks.  Make sure you actually benchmark the SAN and compare
> it to direct connected disks to see if it's acting sanely; don't just
> believe what people tell you.
>
> I personally can't understand why anybody would spend SAN $ and then
> hobble the whole thing by running PostgreSQL on Windows.  The Win32 port
> is functional, but it's really not fast.

As the guy who did much of the work on the Win32 port, I have to +1 this
several times over ;-) PostgreSQL on Win32 will always be slower than
it's on Unix, for architectural reasons. This difference can increase
drastically under high load.

These are facts, not just the general recommendation from Unix people to
stay away from Windows, btw :-)


>> It is really nice because it supports instant snapshots so we can, in
>> theory, snapshot a volume and re-mount it elsewhere.
>
> You'll still need to setup basic PITR recovery to know you got a useful
> snapshot.  See
> http://lethargy.org/~jesus/archives/92-PostgreSQL-warm-standby-on-ZFS-crack.html
> for a nice intro to that that uses ZFS as the snapshot implementation.

Say what? As long as your SAN guarantees an atomic snapshot of all your
data (which every SAN I've ever heard of guarantees if you're on a
single volume - entry level SANs often don't have the functionality to
do multi-volume atomic snapshots, though), you don't need to set up PITR
 for simple backups, AFAIK. It's just simple crash recovery...

It's still good idea, though, since it gives you the PIT part of PITR,
which you don't get with just a snapshot.

//Magnus

Re: Install Postgres on a SAN volume?

From
Greg Smith
Date:
On Tue, 9 Sep 2008, Magnus Hagander wrote:

> As long as your SAN guarantees an atomic snapshot of all your data
> (which every SAN I've ever heard of guarantees if you're on a single
> volume - entry level SANs often don't have the functionality to do
> multi-volume atomic snapshots, though), you don't need to set up PITR
> for simple backups

It's all those ifs in there that leave me still recommending it.  It's
certainly possible to get a consistant snapshot with the right hardware
and setup.  What concerns me about recommending that without a long list
of caveats is the kinds of corruption you'd get if all those conditions
aren't perfect will of course not ever happen during testing.  Murphy says
that it will happen only when you find yourself really needing that
snapshot to work one day.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: Install Postgres on a SAN volume?

From
Magnus Hagander
Date:
Greg Smith wrote:
> On Tue, 9 Sep 2008, Magnus Hagander wrote:
>
>> As long as your SAN guarantees an atomic snapshot of all your data
>> (which every SAN I've ever heard of guarantees if you're on a single
>> volume - entry level SANs often don't have the functionality to do
>> multi-volume atomic snapshots, though), you don't need to set up PITR
>> for simple backups
>
> It's all those ifs in there that leave me still recommending it.  It's
> certainly possible to get a consistant snapshot with the right hardware
> and setup.  What concerns me about recommending that without a long list
> of caveats is the kinds of corruption you'd get if all those conditions
> aren't perfect will of course not ever happen during testing.  Murphy
> says that it will happen only when you find yourself really needing that
> snapshot to work one day.

Well, I agree one should be careful, but I don't see the risk if you
just change all those ifs into a single one, which is "if all your data
*and* WAL is on the same SAN LUN".

(heck, you don't need hardware to do it, you can do software snapshot
just fine - as long as you keep all your stuff on the same mountpoint
there as well)

//Magnus

Re: Install Postgres on a SAN volume?

From
Robert Treat
Date:
On Tuesday 09 September 2008 04:37:09 Magnus Hagander wrote:
> Greg Smith wrote:
> > On Tue, 9 Sep 2008, Magnus Hagander wrote:
> >> As long as your SAN guarantees an atomic snapshot of all your data
> >> (which every SAN I've ever heard of guarantees if you're on a single
> >> volume - entry level SANs often don't have the functionality to do
> >> multi-volume atomic snapshots, though), you don't need to set up PITR
> >> for simple backups
> >
> > It's all those ifs in there that leave me still recommending it.  It's
> > certainly possible to get a consistant snapshot with the right hardware
> > and setup.  What concerns me about recommending that without a long list
> > of caveats is the kinds of corruption you'd get if all those conditions
> > aren't perfect will of course not ever happen during testing.  Murphy
> > says that it will happen only when you find yourself really needing that
> > snapshot to work one day.
>
> Well, I agree one should be careful, but I don't see the risk if you
> just change all those ifs into a single one, which is "if all your data
> *and* WAL is on the same SAN LUN".
>
> (heck, you don't need hardware to do it, you can do software snapshot
> just fine - as long as you keep all your stuff on the same mountpoint
> there as well)
>

That's pretty key, but there can be advantages to doing it using the pitr
tools, and I think in most cases it would be hard to argue it isn't safer.

As a counter example to theo's zfs based post, I posted a linux/lvm script
that can work as the basis of a simple snapshot backup tool, available at
http://people.planetpostgresql.org/xzilla/index.php?/archives/344-ossdb-snapshot,-lvm-database-snapshot-tool.html

And yes, I prefer working on the zfs based one :-)

--
Robert Treat
http://www.omniti.com
Database: Scalability: Consulting: