Thread: Best high availability solution ?

Best high availability solution ?

From
Arnaud Lesauvage
Date:
Hi list !

I have a small enterprise network (~15 workstations, 1 server),
all running windows OSes. Most of our work is done on a PostgreSQL
DB (on the windows server).
I am the only IT here, and my boss asked me to find a way to have
the database always online, without my intervention.
Last time I went on vacation, the server crashed and no one was
able to repair it.

Our application connects to PostgreSQL through ODBC, with a simple
TCP/IP connection.
I though that I would first install a Slony-I cluster. That would
be fine for data replication, but still if the main server
crashes, the database connections will not work anymore because
the name of the backup-server will be different than the name of
the master, so all ODBC connection should be changed to use the
new machine name.
Since I cannot ask anyone to do some DNS changes or things like
that, I am looking for a simple way to have my database always
online (note that I already have a UPS and RAID1 on the server to
prevent most failures).

After some searches, I found LifeKeeper, which looks very good but
is quite expensive !
Are there easier and/or better solutions than that ?

Thanks for your advices on this matter !

--
Arnaud


Re: Best high availability solution ?

From
Markus Schiltknecht
Date:
Hi Arnaud,

perhaps you can still use Slony-I for replication and have another tool
automatically handle connections (check out PgPool[1] or SQLRelay[2]).

Or go for a middleware replication solution. Check C-JDBC[3], perhaps
there is something similar for ODBC?

LifeKeeper seems to handle replication at a lower level (filesystem,
distributed memory, or such) and does not seem to be very well suited
for database replication. However, I've had just a quick glance at the
website.

Hope that helps

Markus

[1]: http://pgfoundry.org/projects/pgpool/
[2]: http://sqlrelay.sourceforge.net/
[3]: http://c-jdbc.objectweb.org/

On Wed, 2006-05-31 at 09:36 +0200, Arnaud Lesauvage wrote:
> Hi list !
>
> I have a small enterprise network (~15 workstations, 1 server),
> all running windows OSes. Most of our work is done on a PostgreSQL
> DB (on the windows server).
> I am the only IT here, and my boss asked me to find a way to have
> the database always online, without my intervention.
> Last time I went on vacation, the server crashed and no one was
> able to repair it.
>
> Our application connects to PostgreSQL through ODBC, with a simple
> TCP/IP connection.
> I though that I would first install a Slony-I cluster. That would
> be fine for data replication, but still if the main server
> crashes, the database connections will not work anymore because
> the name of the backup-server will be different than the name of
> the master, so all ODBC connection should be changed to use the
> new machine name.
> Since I cannot ask anyone to do some DNS changes or things like
> that, I am looking for a simple way to have my database always
> online (note that I already have a UPS and RAID1 on the server to
> prevent most failures).
>
> After some searches, I found LifeKeeper, which looks very good but
> is quite expensive !
> Are there easier and/or better solutions than that ?
>
> Thanks for your advices on this matter !
>
> --
> Arnaud
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>        choose an index scan if your joining column's datatypes do not
>        match


Re: Best high availability solution ?

From
Arnaud Lesauvage
Date:
Markus Schiltknecht a écrit :
> Hi Arnaud,
>
> perhaps you can still use Slony-I for replication and have another tool
> automatically handle connections (check out PgPool[1] or SQLRelay[2]).
>
> Or go for a middleware replication solution. Check C-JDBC[3], perhaps
> there is something similar for ODBC?
>
> LifeKeeper seems to handle replication at a lower level (filesystem,
> distributed memory, or such) and does not seem to be very well suited
> for database replication. However, I've had just a quick glance at the
> website.
>
> Hope that helps
>
> Markus
>
> [1]: http://pgfoundry.org/projects/pgpool/
> [2]: http://sqlrelay.sourceforge.net/
> [3]: http://c-jdbc.objectweb.org/


Hi Markus !

Thanks for your suggestions, they are very helpful !
pgpool and SQLRelay looked really great, but there are no ports
for win32... :(
Maybe some kind of virtual-ip software would do the trick, but I
am still looking for it !

--
Arnaud


Re: Best high availability solution ?

From
"Magnus Hagander"
Date:
> Hi list !
>
> I have a small enterprise network (~15 workstations, 1
> server), all running windows OSes. Most of our work is done
> on a PostgreSQL DB (on the windows server).
> I am the only IT here, and my boss asked me to find a way to
> have the database always online, without my intervention.
> Last time I went on vacation, the server crashed and no one
> was able to repair it.
>
> Our application connects to PostgreSQL through ODBC, with a
> simple TCP/IP connection.
> I though that I would first install a Slony-I cluster. That
> would be fine for data replication, but still if the main
> server crashes, the database connections will not work
> anymore because the name of the backup-server will be
> different than the name of the master, so all ODBC connection
> should be changed to use the new machine name.
> Since I cannot ask anyone to do some DNS changes or things
> like that, I am looking for a simple way to have my database
> always online (note that I already have a UPS and RAID1 on
> the server to prevent most failures).
>
> After some searches, I found LifeKeeper, which looks very
> good but is quite expensive !
> Are there easier and/or better solutions than that ?
>
> Thanks for your advices on this matter !

Since you're a Windows shop, you may already have the experience (and
even liceneses perhaps?) to run Microsoft Cluster Service (part of 2003
Enterprise Edition or 2000 Advanced Server). PostgreSQL will work fine
with it. Works with shared disks using either fibrechannel or iSCSI.

If you don't have the licenses for it already, it might turn out very
expensive. And if you don't already have fibrechannel, that part is
definitly expensive - but iSCSI could help you.


If you're willing to move off Windows for the server platform, you could
look at one of the solutions like slony+pgpool, or maybe DRBD+linux/ha.
That'll be less expensive in both hardware and licenses, but if you
don't have the people to maintain a new platform for it that's likely to
be prohibitive.

//Magnus

Re: Best high availability solution ?

From
Arnaud Lesauvage
Date:
Magnus Hagander a écrit :
> Since you're a Windows shop, you may already have the experience (and
> even liceneses perhaps?) to run Microsoft Cluster Service (part of 2003
> Enterprise Edition or 2000 Advanced Server). PostgreSQL will work fine
> with it. Works with shared disks using either fibrechannel or iSCSI.
>
> If you don't have the licenses for it already, it might turn out very
> expensive. And if you don't already have fibrechannel, that part is
> definitly expensive - but iSCSI could help you.

I am running Windows 2000 Server (the "normal" edition, not the
"advanced" one), so I don't have the cluster service available.

> If you're willing to move off Windows for the server platform, you could
> look at one of the solutions like slony+pgpool, or maybe DRBD+linux/ha.
> That'll be less expensive in both hardware and licenses, but if you
> don't have the people to maintain a new platform for it that's likely to
> be prohibitive.

I don't have the Linux knowledge to move from windows unfortunately.

Otherwise, is the clustering service really necessary ?
Aren't there simple "virtual IP addresses" solutions available ?
I could have a single virtual IP for the 2 servers, and having an
automatic failover if the master server is down ? The slave would
never be accessed by clients directly (only after a failover) but
would by synchronized with the master (slony-I).

Does this kind of software exist in the windows world ?

--
Arnaud


Re: Best high availability solution ?

From
"Dave Page"
Date:

> -----Original Message-----
> From: pgsql-general-owner@postgresql.org
> [mailto:pgsql-general-owner@postgresql.org] On Behalf Of
> Arnaud Lesauvage
> Sent: 31 May 2006 10:39
> To: Magnus Hagander
> Cc: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] Best high availability solution ?
>
> Otherwise, is the clustering service really necessary ?
> Aren't there simple "virtual IP addresses" solutions available ?
> I could have a single virtual IP for the 2 servers, and having an
> automatic failover if the master server is down ? The slave would
> never be accessed by clients directly (only after a failover) but
> would by synchronized with the master (slony-I).
>
> Does this kind of software exist in the windows world ?

Slony-I only exists for Windows in a not-even-beta state at the moment,
so even if you get things up and running using a virtual IP solution you
will be hand-holding Slony until it gets properly released.

The 'virtual IP' solution you are looking for is called Network Load
Balancing btw, and is on standard edition. I really don't see how you
would make it work safely though - it's designed more for load
distribution, and though you can weight rules towards one server,
there's no guarantee that it won't send a connection to the backup
server.

You would also need some way of automatically handling the Slony
failover which is definitely not recommended.

If I'm honest, I think your boss is going to be disappointed. You would
add a *lot* of complexity to the system to make it handle failures with
zero intervention, and that extra complexity is probably more likely to
go wrong than a single server. I'd spend your time and money on making
sure your raid & ups are good, that you are running on server grade
hardware with ECC RAM, and that you have good out of band management
facilities so even if you are away from the office you can connect via
VPN/modem or whatever and fix things.

Regards, Dave.

Re: Best high availability solution ?

From
Arnaud Lesauvage
Date:
Dave Page a écrit :
> Slony-I only exists for Windows in a not-even-beta state at the moment,
> so even if you get things up and running using a virtual IP solution you
> will be hand-holding Slony until it gets properly released.

OK, I thought it was in production stage.

> If I'm honest, I think your boss is going to be disappointed. You would
> add a *lot* of complexity to the system to make it handle failures with
> zero intervention, and that extra complexity is probably more likely to
> go wrong than a single server. I'd spend your time and money on making
> sure your raid & ups are good, that you are running on server grade
> hardware with ECC RAM, and that you have good out of band management
> facilities so even if you are away from the office you can connect via
> VPN/modem or whatever and fix things.

I think you're right.
Is there a simple replication solution for windows then ? Or will
I have to stop the master postgresql at night to pgdump and
pgrestore on the backup server ?

If I have a synchronized backup server (even if it synchronizes
once or twice a day, this might be OK), I could just write a
simple script that would change the host file of the clients to
have the postgresql's name to point to the backup IP...

--
Arnaud


Re: Best high availability solution ?

From
Tino Wildenhain
Date:
Arnaud Lesauvage schrieb:
...
>
> I think you're right.
> Is there a simple replication solution for windows then ? Or will I have
> to stop the master postgresql at night to pgdump and pgrestore on the
> backup server ?

pg_dump does not require you to stop the master database anyway.
(in fact it cannot even dump a stopped database :-)

> If I have a synchronized backup server (even if it synchronizes once or
> twice a day, this might be OK), I could just write a simple script that
> would change the host file of the clients to have the postgresql's name
> to point to the backup IP...

I think you can use PITR for this use-case:

  http://www.postgresql.org/docs/current/static/backup-online.html

Regards
Tino

Re: Best high availability solution ?

From
"Dave Page"
Date:

> -----Original Message-----
> From: Arnaud Lesauvage [mailto:thewild@freesurf.fr]
> Sent: 31 May 2006 11:27
> To: Dave Page
> Cc: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] Best high availability solution ?
>
> Dave Page a écrit :
> > Slony-I only exists for Windows in a not-even-beta state at
> the moment,
> > so even if you get things up and running using a virtual IP
> solution you
> > will be hand-holding Slony until it gets properly released.
>
> OK, I thought it was in production stage.

The code is written, and is good as far as we are aware, but has not been through a beta/release cycle yet.

> Is there a simple replication solution for windows then ? Or will
> I have to stop the master postgresql at night to pgdump and
> pgrestore on the backup server ?

You don't need to stop the master server to pg_dump (in fact, it won't work if you do). On the slave, you can just drop
theDBs and pg_restore them. 

PITR is another solution you might consider.

> If I have a synchronized backup server (even if it synchronizes
> once or twice a day, this might be OK), I could just write a
> simple script that would change the host file of the clients to
> have the postgresql's name to point to the backup IP...

Might be better if the users know they are being switched over - that way they will know to check for lost transactions
etc.If you're using an ODBC app, consider using a separate DSN for the live and backup servers - similar solutions
wouldapply to other interfaces of course. 

Regards, Dave

Re: Best high availability solution ?

From
Tino Wildenhain
Date:
Arnaud Lesauvage schrieb:
> Tino Wildenhain a écrit :
>
>> pg_dump does not require you to stop the master database anyway.
>> (in fact it cannot even dump a stopped database :-)
>
>
> Hello Tino,
>
> I think I might just use this pg_dump solution...
> Seems to be quite simple to set up.

personally I think the WAL approach is by far easier
to set up and maintain - the pg_dump is in fact easy,
but the restore to another database can be tricky
if you want it unattended and bullit-proof the same
time.

Re: Best high availability solution ?

From
Arnaud Lesauvage
Date:
Dave Page a écrit :
> The code is written, and is good as far as we are aware, but has not been through a beta/release cycle yet.

OK, that's already pretty good then.
Are there binary releases available ?

>> Is there a simple replication solution for windows then ? Or will
>> I have to stop the master postgresql at night to pgdump and
>> pgrestore on the backup server ?
>
> You don't need to stop the master server to pg_dump (in fact, it won't work if you do). On the slave, you can just
dropthe DBs and pg_restore them. 
> PITR is another solution you might consider.

It looks quite harder to script though... But I'll consider it as
a better solution.

> Might be better if the users know they are being switched over - that way they will know to check for lost
transactionsetc. If you're using an ODBC app, consider using a separate DSN for the live and backup servers - similar
solutionswould apply to other interfaces of course. 

There are no critical transactions here.
If the DB server fails and users are switched over to a backup
server with data of the day before, it is just fine.
Losing 1/2 of work is OK, but not working for 4 days is not.

So I might just write a script so that the person in charge when I
am away can run it and alert everyone.

--
Arnaud


Re: Best high availability solution ?

From
"Dave Page"
Date:

> -----Original Message-----
> From: Arnaud Lesauvage [mailto:thewild@freesurf.fr]
> Sent: 31 May 2006 11:53
> To: Dave Page
> Cc: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] Best high availability solution ?
>
> Dave Page a écrit :
> > The code is written, and is good as far as we are aware,
> but has not been through a beta/release cycle yet.
>
> OK, that's already pretty good then.
> Are there binary releases available ?

http://developer.pgadmin.org/~hiroshi/Slony-I/

That's built against 8.1 iirc.

USE AT YOUR OWN RISK!! IT MAY EAT YOUR DATA AND SET YOUR SERVER ON FIRE :-)

Regards, Dave.

Re: Best high availability solution ?

From
Arnaud Lesauvage
Date:
Tino Wildenhain a écrit :
> personally I think the WAL approach is by far easier
> to set up and maintain - the pg_dump is in fact easy,
> but the restore to another database can be tricky
> if you want it unattended and bullit-proof the same
> time.

I'll have to study this more in-depth then.
If I got it right, the procedure would be :
- wal archiving enabled
- base backup once a day (pg_start_backup, copy the 'data'
directory, pg_stop_backup)
- create restore-script to be run on the second server, which would :
   - copy the backup to the 'data' directory
   - copy the wal files to the 'pg_xlog' directory
   - create the recovery.conf in the data directory (should always
stay there maybe)
   - start the postmaster

Then anyone could just run this script in case of a failure of the
master server to have an up-to-date database running.

Then with a script that would change my DNS so that
mypgserver.domain.tld (used in ODBC connection string) points to
CNAME mybackupserver.domain.tld instead of CNAME
mymasterserver.domain.tld, getting back to production ould be
quite easy...?

--
Arnaud


Re: Best high availability solution ?

From
Arnaud Lesauvage
Date:
Dave Page a écrit :
> http://developer.pgadmin.org/~hiroshi/Slony-I/
> That's built against 8.1 iirc.

Great !

> USE AT YOUR OWN RISK!! IT MAY EAT YOUR DATA AND SET YOUR SERVER ON FIRE :-)

I'll keep an eye on it then ! ;-)

--
Arnaud


Re: Best high availability solution ?

From
Tino Wildenhain
Date:
Arnaud Lesauvage schrieb:
> Tino Wildenhain a écrit :
>
>> personally I think the WAL approach is by far easier
>> to set up and maintain - the pg_dump is in fact easy,
>> but the restore to another database can be tricky
>> if you want it unattended and bullit-proof the same
>> time.
>
>
> I'll have to study this more in-depth then.
> If I got it right, the procedure would be :
> - wal archiving enabled
> - base backup once a day (pg_start_backup, copy the 'data' directory,
> pg_stop_backup)

I'd think you can skip that and just do it once at the very beginning.
But if you like to use the WAL files to recover a 3rd, new system, this
would be a good approach.

> - create restore-script to be run on the second server, which would :
>   - copy the backup to the 'data' directory
>   - copy the wal files to the 'pg_xlog' directory
>   - create the recovery.conf in the data directory (should always stay
> there maybe)
>   - start the postmaster

> Then anyone could just run this script in case of a failure of the
> master server to have an up-to-date database running.

Actually you would let the (master-) server run the script
and let it (trigger) copy and import of the WAL segment as
it gets ready. This way your backup server is very close
to the current state and you dont loose much if the first
machine completely dies suddenly.

> Then with a script that would change my DNS so that
> mypgserver.domain.tld (used in ODBC connection string) points to CNAME
> mybackupserver.domain.tld instead of CNAME mymasterserver.domain.tld,
> getting back to production ould be quite easy...?

I guess the users would start over anyway. So easiest if you
provide a copy of the app with that other connection and
signal them if the first server dies to just close the first
and start the backup-application.

Regards
Tino

Re: Best high availability solution ?

From
"Dave Page"
Date:

> -----Original Message-----
> From: pgsql-general-owner@postgresql.org
> [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Tino
> Wildenhain
> Sent: 31 May 2006 12:14
> To: Arnaud Lesauvage
> Cc: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] Best high availability solution ?
>
> > Then with a script that would change my DNS so that
> > mypgserver.domain.tld (used in ODBC connection string)
> points to CNAME
> > mybackupserver.domain.tld instead of CNAME
> mymasterserver.domain.tld,
> > getting back to production ould be quite easy...?
>
> I guess the users would start over anyway. So easiest if you
> provide a copy of the app with that other connection and
> signal them if the first server dies to just close the first
> and start the backup-application.

Yes - the DNS method would work, but you might run into caching issues
requiring the users to reboot or do a 'ipconfig /flushdns' before they
see the change.

Regards, Dave.

Re: Best high availability solution ?

From
Arnaud Lesauvage
Date:
Dave Page a écrit :
> Yes - the DNS method would work, but you might run into caching issues
> requiring the users to reboot or do a 'ipconfig /flushdns' before they
> see the change.

Yes, but I am not a very nice adminsitrator, and when there is any
problem with a datawase, my users HAVE to reboot ! ;-)
So that might just be fine !

--
Arnaud


Re: Best high availability solution ?

From
"Dave Page"
Date:

> -----Original Message-----
> From: Arnaud Lesauvage [mailto:thewild@freesurf.fr]
> Sent: 31 May 2006 12:33
> To: Dave Page
> Cc: Tino Wildenhain; pgsql-general@postgresql.org
> Subject: Re: [GENERAL] Best high availability solution ?
>
> Dave Page a écrit :
> > Yes - the DNS method would work, but you might run into
> caching issues
> > requiring the users to reboot or do a 'ipconfig /flushdns'
> before they
> > see the change.
>
> Yes, but I am not a very nice adminsitrator, and when there is any
> problem with a datawase, my users HAVE to reboot ! ;-)
> So that might just be fine !

Ah, but wasn't this intended for when you are not there standing over them with a bat?

:-)

Regards, Dave.

Re: Best high availability solution ?

From
Arnaud Lesauvage
Date:
Dave Page a écrit :
> Ah, but wasn't this intended for when you are not there standing over them with a bat?
>
> :-)


It was, but they know I will come back from vacation one day, and
then....

;-)

--
Arnaud


Re: Best high availability solution ?

From
Arnaud Lesauvage
Date:
Tino Wildenhain a écrit :
>> If I understand you right, you suggest that the wal files should be
>> automatically copied to the backup server, which should parse them as
>> soon as they arrive ?
>
> eactly.

Indeed, it seems to be the best solution !
After reading the backup-online page, I see that the recovery
process will end after at the end of the wal log. So with your
suggestion, I should script the copy of the wal files to the
backup server (say, every 5 minutes), and script on the backup
server a recovery process every 5 minutes also ?

--
Arnaud




Re: Best high availability solution ?

From
Christopher Browne
Date:
In the last exciting episode, dpage@vale-housing.co.uk ("Dave Page") wrote:
> If I'm honest, I think your boss is going to be disappointed. You
> would add a *lot* of complexity to the system to make it handle
> failures with zero intervention, and that extra complexity is
> probably more likely to go wrong than a single server. I'd spend
> your time and money on making sure your raid & ups are good, that
> you are running on server grade hardware with ECC RAM, and that you
> have good out of band management facilities so even if you are away
> from the office you can connect via VPN/modem or whatever and fix
> things.

We have found something of the same thing with trying to get improved
reliability out of HACMP (an IBM product that automatically fails over
applications between servers).

We had previously experienced too-frequent problems due to lack of
reliability of our servers.  (Sun high end stuff, as it happened...)

Moving to HACMP on AIX, well, the IBM AIX servers have been way more
reliable.  Unfortunately, HACMP is all too fragile.  It has a lot of
"moving parts" (instances of the "extra complexity" that Dave
mentioned), and apparently you have to have enough outages to upgrade
components to keep it reliable that it rather undermines the uptime.

My suspicion (not actually confirmable with real numbers; all I can do
is hand-wave) is that if we had spent the costs put into HACMP on
otherwise beefing up the Golden Servers, we'd probably have had better
reliability out of depending on the individual boxes to be reliable.

In any case, whatever you use for this, whether Slony-I, with
"automatic failover" scripts, or some sort of "heartbeating/server
takeover" scheme, will suffer from the "too many complex components"
problem.

A vital problem is that it's really hard to validate that the
production configuration is correct.  If you made a mistake, it'll all
blow up.  And you don't want to run tests that might blow everything
up, do you?  :-)
--
let name="cbbrowne" and tld="gmail.com" in name ^ "@" ^ tld;;
http://linuxdatabases.info/info/linuxdistributions.html
"Now, if someone proposed using people who spam comp.sys.* groups with
political  screeds  in  place  of  lab  rats  for  drug  testing,  I'd
wholeheartedly concur".  -- John C. Randolph

Re: Best high availability solution ?

From
Arnaud Lesauvage
Date:
Stuart Bishop a écrit :
> If your application is normally reliable, I think the best, cheapest and
> simplest way of keeping the system online when you are on leave is to give
> your work the phone number of a company offering PostgreSQL support
> services. I would avoid adding the extra complexity and additional support
> and maintenance burdens unless you really need it - can work afford to be
> without the system for a day if it crashes? And do you expect it to happen
> infrequently enough that the outages will not be a problem? The advantage of
> having a human available to restore functionality is that they will be able
> to deal with the situations you haven't thought of, whereas an automated
> solution will likely only deal with the situations you have thought of as
> well as making the system more complex, thus creating more things that could
> go wrong.


I agree with you, but I will have two servers : one will be the
file server, the primary controller of the active directory, and
the backup of postgresql; the other one will be the postgresql
server and the backup of the file server and active directory.
So restoring from this situation does not only require postgresql
knoledge, it requires a full understanding of our network
structure, applications, configurations, etc...
But you have a point that having a contact for postgresql support
is a very good idea anyway !

--
Arnaud


Re: Best high availability solution ?

From
"Magnus Hagander"
Date:
> >Since you're a Windows shop, you may already have the
> experience (and
> >even liceneses perhaps?) to run Microsoft Cluster Service
> (part of 2003
> >Enterprise Edition or 2000 Advanced Server). PostgreSQL will
> work fine
> >with it. Works with shared disks using either fibrechannel or iSCSI.
>
> Are you sure that will really work?

Yes. I have used it.


> I thought Postgresql requires shared memory amongst the
> processes. Is that not true on the Windows platform?

Oh it does. Makes no change.

Microsoft Cluster Service is an active/passive failover clustering
solutino. PostgreSQL will only be *active* on one node at a time. So
shared memory stuff is not affected in any way.

(You can make it actiev/active by running two separate postgresql
installations on the two nodes, with failover-with-lower-performance,
but it's not a load-sharing cluster solution of any time)

//Magnus

Re: Best high availability solution ?

From
Stuart Bishop
Date:
Arnaud Lesauvage wrote:
> Hi list !
>
> I have a small enterprise network (~15 workstations, 1 server), all
> running windows OSes. Most of our work is done on a PostgreSQL DB (on
> the windows server).
> I am the only IT here, and my boss asked me to find a way to have the
> database always online, without my intervention.
> Last time I went on vacation, the server crashed and no one was able to
> repair it.

If your application is normally reliable, I think the best, cheapest and
simplest way of keeping the system online when you are on leave is to give
your work the phone number of a company offering PostgreSQL support
services. I would avoid adding the extra complexity and additional support
and maintenance burdens unless you really need it - can work afford to be
without the system for a day if it crashes? And do you expect it to happen
infrequently enough that the outages will not be a problem? The advantage of
having a human available to restore functionality is that they will be able
to deal with the situations you haven't thought of, whereas an automated
solution will likely only deal with the situations you have thought of as
well as making the system more complex, thus creating more things that could
go wrong.


--
Stuart Bishop <stuart@stuartbishop.net>
http://www.stuartbishop.net/


Attachment

Re: Best high availability solution ?

From
Lincoln Yeoh
Date:
At 10:38 AM 5/31/2006 +0200, Magnus Hagander wrote:

>Since you're a Windows shop, you may already have the experience (and
>even liceneses perhaps?) to run Microsoft Cluster Service (part of 2003
>Enterprise Edition or 2000 Advanced Server). PostgreSQL will work fine
>with it. Works with shared disks using either fibrechannel or iSCSI.

Are you sure that will really work?

I thought Postgresql requires shared memory amongst the processes. Is that
not true on the Windows platform?

Does Microsoft Cluster Service somehow help with that?

Regards,
Link.




Re: Best high availability solution ?

From
Bruno Wolff III
Date:
On Wed, May 31, 2006 at 09:36:23 +0200,
  Arnaud Lesauvage <thewild@freesurf.fr> wrote:
> I am the only IT here, and my boss asked me to find a way to have
> the database always online, without my intervention.
> Last time I went on vacation, the server crashed and no one was
> able to repair it.

Your boss should either cross train a current employee or hire an additional
employee. What happens if you get hit by a bus? It sounds like no one else
there will know what's going on. A person brought in after you are gone
is going to take a significant amount of time to figure things out and the
business may have serious probelms while that is happening.
On the plus side you should be in a good position when it comes to salary
negotiations.

Re: Best high availability solution ?

From
Arnaud Lesauvage
Date:
Bruno Wolff III a écrit :
> On Wed, May 31, 2006 at 09:36:23 +0200,
>   Arnaud Lesauvage <thewild@freesurf.fr> wrote:
>> I am the only IT here, and my boss asked me to find a way to have
>> the database always online, without my intervention.
>> Last time I went on vacation, the server crashed and no one was
>> able to repair it.
>
> Your boss should either cross train a current employee or hire an additional
> employee. What happens if you get hit by a bus? It sounds like no one else
> there will know what's going on. A person brought in after you are gone
> is going to take a significant amount of time to figure things out and the
> business may have serious probelms while that is happening.
> On the plus side you should be in a good position when it comes to salary
> negotiations.

We ARE going to hire a new employee, but we like to do many things
at the same time ! ;-)

--
Arnaud


Re: Best high availability solution ?

From
"Joshua D. Drake"
Date:
> I agree with you, but I will have two servers : one will be the file
> server, the primary controller of the active directory, and the backup
> of postgresql; the other one will be the postgresql server and the
> backup of the file server and active directory.
> So restoring from this situation does not only require postgresql
> knoledge, it requires a full understanding of our network structure,
> applications, configurations, etc...


> But you have a point that having a contact for postgresql support is a
> very good idea anyway !

Here ya go:

http://www.commandprompt.com/support

(DISCLAIMER: I represent the company)

Joshua D. Drake

>
> --
> Arnaud
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>       choose an index scan if your joining column's datatypes do not
>       match
>


--

    === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
    Providing the most comprehensive  PostgreSQL solutions since 1997
              http://www.commandprompt.com/



Re: Best high availability solution ?

From
Michael Meskes
Date:
> >But you have a point that having a contact for postgresql support is a
> >very good idea anyway !
>
> Here ya go:
>
> http://www.commandprompt.com/support
>
> (DISCLAIMER: I represent the company)

Josh, don't you think the better starting point when looking for support
would be http://www.postgresql.org/support/professional_support ? :-)

There are support companies in France and other parts of Europe which
might be a better idea for a company in France. After all I think a
situation like this might need some on-site availability.

Michael
--
Michael Meskes
Email: Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: meskes@jabber.org
Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL!

Re: Best high availability solution ?

From
Arnaud Lesauvage
Date:
Michael Meskes a écrit :
> Josh, don't you think the better starting point when looking for support
> would be http://www.postgresql.org/support/professional_support ? :-)
>
> There are support companies in France and other parts of Europe which
> might be a better idea for a company in France. After all I think a
> situation like this might need some on-site availability.

Indeed !
I see that Pervasive has offices in Belgium, and that is very good
to know !

--
Arnaud


Re: Best high availability solution ?

From
Scott Ribe
Date:
> I guess the users would start over anyway. So easiest if you
> provide a copy of the app with that other connection and
> signal them if the first server dies to just close the first
> and start the backup-application.

This requires that all the users do the right thing. Problem is, what if
there is a network problem for some users only, and they incorrectly
conclude that the server is down?

I prefer the approach of keeping the backup server up to date, whether using
PITR or Slony or your own home-grown synching, and then changing IPs. My
process involves someone making the decision that server is indeed down,
then UNPLUGGING it from the network, then changing the IP of the backup.
Actually bringing the backup online involves a bit more than changing the
IP, because PG is not the only service running on it, but there's a simple
script for the user to run.

Of course this requires that the backup be on the same subnet as the server,
a restriction which is not shared by the alternate app/DSN nor the DNS
techniques...


--
Scott Ribe
scott_ribe@killerbytes.com
http://www.killerbytes.com/
(303) 722-0567 voice



Re: Best high availability solution ?

From
Arnaud Lesauvage
Date:
Scott Ribe a écrit :
> I prefer the approach of keeping the backup server up to date, whether using
> PITR or Slony or your own home-grown synching, and then changing IPs. My
> process involves someone making the decision that server is indeed down,
> then UNPLUGGING it from the network, then changing the IP of the backup.
> Actually bringing the backup online involves a bit more than changing the
> IP, because PG is not the only service running on it, but there's a simple
> script for the user to run.
>
> Of course this requires that the backup be on the same subnet as the server,
> a restriction which is not shared by the alternate app/DSN nor the DNS
> techniques...

I am going towards this solution.
Quite hard to test the backup script in real situation, but I
believe this is the way to go.

--
Arnaud


Re: Best high availability solution ?

From
Bruce Momjian
Date:
Stuart Bishop wrote:
-- Start of PGP signed section.
> Arnaud Lesauvage wrote:
> > Hi list !
> >
> > I have a small enterprise network (~15 workstations, 1 server), all
> > running windows OSes. Most of our work is done on a PostgreSQL DB (on
> > the windows server).
> > I am the only IT here, and my boss asked me to find a way to have the
> > database always online, without my intervention.
> > Last time I went on vacation, the server crashed and no one was able to
> > repair it.
>
> If your application is normally reliable, I think the best, cheapest and
> simplest way of keeping the system online when you are on leave is to give
> your work the phone number of a company offering PostgreSQL support
> services. I would avoid adding the extra complexity and additional support
> and maintenance burdens unless you really need it - can work afford to be
> without the system for a day if it crashes? And do you expect it to happen
> infrequently enough that the outages will not be a problem? The advantage of
> having a human available to restore functionality is that they will be able
> to deal with the situations you haven't thought of, whereas an automated
> solution will likely only deal with the situations you have thought of as
> well as making the system more complex, thus creating more things that could
> go wrong.

Right.  There was a great 1998 article in the Atlantic Monthly about the
Valujet crash:

    http://www.theatlantic.com/issues/98mar/valujet1.htm

Unfortunately that requires a subscription so I pulled a relivant
paragraph:

    http://candle.pha.pa.us/main/valujet

The issue is that reducing risk can increase it, as seen with Valujet,
Three Mile Island, and Chernobyl.

--
  Bruce Momjian   http://candle.pha.pa.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: Best high availability solution ?

From
"Joshua D. Drake"
Date:
Bruce Momjian wrote:
> Stuart Bishop wrote:
> -- Start of PGP signed section.
>> Arnaud Lesauvage wrote:
>>> Hi list !
>>>
>>> I have a small enterprise network (~15 workstations, 1 server), all
>>> running windows OSes. Most of our work is done on a PostgreSQL DB (on
>>> the windows server).
>>> I am the only IT here, and my boss asked me to find a way to have the
>>> database always online, without my intervention.
>>> Last time I went on vacation, the server crashed and no one was able to
>>> repair it.

Then your boss needs to ante up for a support contract. Replication
won't fix the problem you describe above. The problem you describe above
  is lack of planning and resources.

You can find support contracts for reasonable rates from

Command Prompt (us) http://www.commandprompt.com/
Pervasive (them) http://www.pervasive.com/

Sincerely,

Joshua D. Drake


>> If your application is normally reliable, I think the best, cheapest and
>> simplest way of keeping the system online when you are on leave is to give
>> your work the phone number of a company offering PostgreSQL support
>> services. I would avoid adding the extra complexity and additional support
>> and maintenance burdens unless you really need it - can work afford to be
>> without the system for a day if it crashes? And do you expect it to happen
>> infrequently enough that the outages will not be a problem? The advantage of
>> having a human available to restore functionality is that they will be able
>> to deal with the situations you haven't thought of, whereas an automated
>> solution will likely only deal with the situations you have thought of as
>> well as making the system more complex, thus creating more things that could
>> go wrong.
>
> Right.  There was a great 1998 article in the Atlantic Monthly about the
> Valujet crash:
>
>     http://www.theatlantic.com/issues/98mar/valujet1.htm
>
> Unfortunately that requires a subscription so I pulled a relivant
> paragraph:
>
>     http://candle.pha.pa.us/main/valujet
>
> The issue is that reducing risk can increase it, as seen with Valujet,
> Three Mile Island, and Chernobyl.
>


--

    === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
    Providing the most comprehensive  PostgreSQL solutions since 1997
              http://www.commandprompt.com/



Re: Best high availability solution ?

From
Bruce Momjian
Date:
Joshua D. Drake wrote:
> Bruce Momjian wrote:
> > Stuart Bishop wrote:
> > -- Start of PGP signed section.
> >> Arnaud Lesauvage wrote:
> >>> Hi list !
> >>>
> >>> I have a small enterprise network (~15 workstations, 1 server), all
> >>> running windows OSes. Most of our work is done on a PostgreSQL DB (on
> >>> the windows server).
> >>> I am the only IT here, and my boss asked me to find a way to have the
> >>> database always online, without my intervention.
> >>> Last time I went on vacation, the server crashed and no one was able to
> >>> repair it.
>
> Then your boss needs to ante up for a support contract. Replication
> won't fix the problem you describe above. The problem you describe above
>   is lack of planning and resources.
>
> You can find support contracts for reasonable rates from
>
> Command Prompt (us) http://www.commandprompt.com/
> Pervasive (them) http://www.pervasive.com/

Of course, the only community-approved URL for professional support is:

    http://www.postgresql.org/support/professional_support

--
  Bruce Momjian   http://candle.pha.pa.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +