Thread: Postgres failover implementation

Postgres failover implementation

From

"Peter Schmidt"

Date:

12 December 2000, 11:47:00

PostgreSQL v7.0.2

My company is looking for a way to implement failover w/Postgres.

I've determined that two postmasters running on different machines (FreeBSD)
can share a single $PGDATA directory(NFS mount) as long as only one
postmaster is running at a time. Originally I thought I might be able to use
postmaster.pid to lock out the second postmaster, but the pid file is
overwritten by the second postmaster when it starts. I've tested a similar
approach using a "lockfile" i.e. create a file in $PGDATA and keep it locked
for the duration of postmaster. If a second postmaster tries to start up, it
can't because the lockfile is locked. When postmaster exits, the file is
unlinked, and the second postmaster is able to start.

Anyone have any thoughts or experiences with this? Comments?

TIA
Peter Schmidt

Re: Postgres failover implementation

From

Tom Lane

Date:

12 December 2000, 13:48:46

"Peter Schmidt" <peterjs@home.com> writes:
> My company is looking for a way to implement failover w/Postgres.
> I've determined that two postmasters running on different machines (FreeBSD)
> can share a single $PGDATA directory(NFS mount) as long as only one
> postmaster is running at a time.

Performance across an NFS mount will doubtless suck badly.  That might
be acceptable as an emergency backup mode of operation ... but if the
machine with the disk is up, you might as well be running the postmaster
there.

It sounds like you intend to have both the primary and secondary
database servers access an NFS server.  Seems like this still means a
single point of failure, ie the NFS box.  So what's the point?

> Originally I thought I might be able to use
> postmaster.pid to lock out the second postmaster, but the pid file is
> overwritten by the second postmaster when it starts.

The lockfile code assumes that if the PID in the file doesn't belong to
a live process *on the local machine*, then it's left over from a
crashed postmaster.  You could remove that check, perhaps, but then
you'd have to remove the PID file manually anytime you had a postmaster
crash.  (However, postmaster crashes are rare, so this might be OK.)

            regards, tom lane

Re: Postgres failover implementation

From

Laurentiu Drob

Date:

13 December 2000, 03:13:14

Peter Schmidt wrote:
>
> PostgreSQL v7.0.2
>
> My company is looking for a way to implement failover w/Postgres.
>
Hi Peter,

We have "fight" with this kind of stuff and found two directions:
-software RAID 1 with nbd (for nbd you can search http://freshmeat.net
site)
-another toy named drbd (http://www.complang.tuwien.ac.at/reisner/drbd/)
which is much easier in our opinion
For any questions email us at admin@cfrcta.ro

Best luck!

lwd.

Re: Postgres failover implementation

From

Tom Lane

Date:

13 December 2000, 14:59:42

"Schmidt, Peter" <peter.schmidt@prismedia.com> writes:
>> Seems like this still means a single point of failure, ie the NFS box.  So
>> what's the point?

> The idea is to have a failover for postmaster itself. I realize you stated
> that postmaster crashes are rare, but if the primary machine goes down we
> will want a secondary to come up with postmaster and other processes
> running.

I'm just wondering why you think that the NFS box will be more reliable
than either of the database-server boxes...

>> You could remove that check, perhaps, but then you'd have to remove the
> PID file manually anytime you had a postmaster crash.

> I don't want to touch postmaster.pid code, but I am working on similar code
> for a seperate lockfile. From what I understand, one of the only options is
> to use fcntl to lock a file on NFS mount. If I create the file, lock it, and
> postmaster machine dies, I'm hoping the lock will go away and the secondary
> will be able to lock it. That way I wouldn't need to manually remove
> it.

Hm.  I dunno if fcntl works across NFS or not.  Even if it does, I'll
bet the lock would not get released if the postmaster machine dies ---
AFAIK NFS does not have a notion of connections, so the NFS server would
be unlikely even to notice that one of its clients went away.  What
mechanism are you planning to use to get the secondary machine to
realize that the primary has died and it needs to do something?  Seems
like it might be better to rely on that mechanism for locking, too.

> Which brings me to another question - does postgres use file locking for
> isolation level or other database operations? If so, am I going to run into
> problems if the database is on NFS mount?

No, we don't use fcntl for locking (except for the Unix socket file, and
even that usage is going away in 7.1).

            regards, tom lane

RE: Postgres failover implementation

From

"Schmidt, Peter"

Date:

13 December 2000, 15:05:29

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Tuesday, December 12, 2000 10:10 AM

>Performance across an NFS mount will doubtless suck badly.

It's a fact of life at this point. I'm hoping performance won't suck that much with 1 GB ethernet and NAS/RAID. In any case, we can't run postmaster on NFS mount machine.

> Seems like this still means a single point of failure, ie the NFS box. So what's the point?

The idea is to have a failover for postmaster itself. I realize you stated that postmaster crashes are rare, but if the primary machine goes down we will want a secondary to come up with postmaster and other processes running.

> You could remove that check, perhaps, but then you'd have to remove the PID file manually anytime you had a postmaster crash.

I don't want to touch postmaster.pid code, but I am working on similar code for a seperate lockfile. From what I understand, one of the only options is to use fcntl to lock a file on NFS mount. If I create the file, lock it, and postmaster machine dies, I'm hoping the lock will go away and the secondary will be able to lock it. That way I wouldn't need to manually remove it. Which brings me to another question - does postgres use file locking for isolation level or other database operations? If so, am I going to run into problems if the database is on NFS mount?

Thanks again for your comments.
Peter Schmidt

"Peter Schmidt" <peterjs@home.com> writes:
> My company is looking for a way to implement failover w/Postgres.
> I've determined that two postmasters running on different machines (FreeBSD)
> can share a single $PGDATA directory(NFS mount) as long as only one
> postmaster is running at a time.

Performance across an NFS mount will doubtless suck badly. That might
be acceptable as an emergency backup mode of operation ... but if the
machine with the disk is up, you might as well be running the postmaster
there.

It sounds like you intend to have both the primary and secondary
database servers access an NFS server. Seems like this still means a
single point of failure, ie the NFS box. So what's the point?

> Originally I thought I might be able to use
> postmaster.pid to lock out the second postmaster, but the pid file is
> overwritten by the second postmaster when it starts.

The lockfile code assumes that if the PID in the file doesn't belong to
a live process *on the local machine*, then it's left over from a
crashed postmaster. You could remove that check, perhaps, but then
you'd have to remove the PID file manually anytime you had a postmaster
crash. (However, postmaster crashes are rare, so this might be OK.)

regards, tom lane

Re: Postgres failover implementation

From

Anand Raman

Date:

14 December 2000, 06:29:21

hi all,
the discussion on failover made me look into qmail installation i did a
few months back..
I think qmail uses something called svscan which supervises the qmail
process and execs it again if it fails..
Couldnt some such thing be done with postgresql implementation which
checks if pg_ctl is alive and automatically restart it if it dies..
Thanks
Anand

On Wed, Dec 13, 2000 at 08:30:31AM -0800, Schmidt, Peter wrote:
>-----Original Message-----
>From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
>Sent: Tuesday, December 12, 2000 10:10 AM
>
>>Performance across an NFS mount will doubtless suck badly.
>
>It's a fact of life at this point. I'm hoping performance won't suck that
>much with 1 GB ethernet and NAS/RAID. In any case, we can't run postmaster
>on NFS mount machine.
>
>> Seems like this still means a single point of failure, ie the NFS box.  So
>what's the point?
>
>The idea is to have a failover for postmaster itself. I realize you stated
>that postmaster crashes are rare, but if the primary machine goes down we
>will want a secondary to come up with postmaster and other processes
>running.
>
>> You could remove that check, perhaps, but then you'd have to remove the
>PID file manually anytime you had a postmaster crash.
>
>I don't want to touch postmaster.pid code, but I am working on similar code
>for a seperate lockfile. From what I understand, one of the only options is
>to use fcntl to lock a file on NFS mount. If I create the file, lock it, and
>postmaster machine dies, I'm hoping the lock will go away and the secondary
>will be able to lock it. That way I wouldn't need to manually remove it.
>Which brings me to another question - does postgres use file locking for
>isolation level or other database operations? If so, am I going to run into
>problems if the database is on NFS mount?
>
>Thanks again for your comments.
>Peter Schmidt
>
>
>"Peter Schmidt" <peterjs@home.com> writes:
>> My company is looking for a way to implement failover w/Postgres.
>> I've determined that two postmasters running on different machines
>(FreeBSD)
>> can share a single $PGDATA directory(NFS mount) as long as only one
>> postmaster is running at a time.
>
>Performance across an NFS mount will doubtless suck badly.  That might
>be acceptable as an emergency backup mode of operation ... but if the
>machine with the disk is up, you might as well be running the postmaster
>there.
>
>It sounds like you intend to have both the primary and secondary
>database servers access an NFS server.  Seems like this still means a
>single point of failure, ie the NFS box.  So what's the point?
>
>> Originally I thought I might be able to use
>> postmaster.pid to lock out the second postmaster, but the pid file is
>> overwritten by the second postmaster when it starts.
>
>The lockfile code assumes that if the PID in the file doesn't belong to
>a live process *on the local machine*, then it's left over from a
>crashed postmaster.  You could remove that check, perhaps, but then
>you'd have to remove the PID file manually anytime you had a postmaster
>crash.  (However, postmaster crashes are rare, so this might be OK.)
>
>            regards, tom lane

Re: Postgres failover implementation

From

Anand Raman

Date:

14 December 2000, 08:21:33

as a continuation to my post i found out that all these supervise tools
come packaged in a package called daemontools available at
http://cr.yp.to/daemontools.html.. Unfortunatly i am unable to access
these pages but some one could try
Thanx.. Hope this is of some help
Anand


On Thu, Dec 14, 2000 at 04:51:00PM +0530, Anand Raman wrote:
>hi all,
>the discussion on failover made me look into qmail installation i did a
>few months back..
>I think qmail uses something called svscan which supervises the qmail
>process and execs it again if it fails..
>Couldnt some such thing be done with postgresql implementation which
>checks if pg_ctl is alive and automatically restart it if it dies..
>Thanks
>Anand
>
>On Wed, Dec 13, 2000 at 08:30:31AM -0800, Schmidt, Peter wrote:
>>-----Original Message-----
>>From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
>>Sent: Tuesday, December 12, 2000 10:10 AM
>>
>>>Performance across an NFS mount will doubtless suck badly.
>>
>>It's a fact of life at this point. I'm hoping performance won't suck that
>>much with 1 GB ethernet and NAS/RAID. In any case, we can't run postmaster
>>on NFS mount machine.
>>
>>> Seems like this still means a single point of failure, ie the NFS box.  So
>>what's the point?
>>
>>The idea is to have a failover for postmaster itself. I realize you stated
>>that postmaster crashes are rare, but if the primary machine goes down we
>>will want a secondary to come up with postmaster and other processes
>>running.
>>
>>> You could remove that check, perhaps, but then you'd have to remove the
>>PID file manually anytime you had a postmaster crash.
>>
>>I don't want to touch postmaster.pid code, but I am working on similar code
>>for a seperate lockfile. From what I understand, one of the only options is
>>to use fcntl to lock a file on NFS mount. If I create the file, lock it, and
>>postmaster machine dies, I'm hoping the lock will go away and the secondary
>>will be able to lock it. That way I wouldn't need to manually remove it.
>>Which brings me to another question - does postgres use file locking for
>>isolation level or other database operations? If so, am I going to run into
>>problems if the database is on NFS mount?
>>
>>Thanks again for your comments.
>>Peter Schmidt
>>
>>
>>"Peter Schmidt" <peterjs@home.com> writes:
>>> My company is looking for a way to implement failover w/Postgres.
>>> I've determined that two postmasters running on different machines
>>(FreeBSD)
>>> can share a single $PGDATA directory(NFS mount) as long as only one
>>> postmaster is running at a time.
>>
>>Performance across an NFS mount will doubtless suck badly.  That might
>>be acceptable as an emergency backup mode of operation ... but if the
>>machine with the disk is up, you might as well be running the postmaster
>>there.
>>
>>It sounds like you intend to have both the primary and secondary
>>database servers access an NFS server.  Seems like this still means a
>>single point of failure, ie the NFS box.  So what's the point?
>>
>>> Originally I thought I might be able to use
>>> postmaster.pid to lock out the second postmaster, but the pid file is
>>> overwritten by the second postmaster when it starts.
>>
>>The lockfile code assumes that if the PID in the file doesn't belong to
>>a live process *on the local machine*, then it's left over from a
>>crashed postmaster.  You could remove that check, perhaps, but then
>>you'd have to remove the PID file manually anytime you had a postmaster
>>crash.  (However, postmaster crashes are rare, so this might be OK.)
>>
>>            regards, tom lane

RE: Postgres failover implementation

From

Maarten Boekhold

Date:

15 December 2000, 02:35:45



Hi,

What you'll really want is a disk array that is shared by 2 machines. The
primary database is allowed to modify the data in the array. When it dies,
the secondary database machine is allowed write access to it. This
basically how Oracle does things like this. Don't know if FreeBSD allows
you to do things like this though.

This probably wont work without some twiddling. You probably can't have
both postmasters running at the same time because of caching issues.
However, a detection mechanism that checks whether the primary postmaster
is still running and starts the secondary if the primary dies should be
possible.

You'd also want some way of switching the IP-address of the machines so
that the whole fail-over is transparent to clients.

Maarten

On 13/12/2000 17:30:31 Schmidt, Peter wrote:

>-----Original Message-----
>From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
>Sent: Tuesday, December 12, 2000 10:10 AM
>
>>Performance across an NFS mount will doubtless suck badly.
>
>It's a fact of life at this point. I'm hoping performance won't suck that
>much with 1 GB ethernet and NAS/RAID. In any case, we can't run
postmaster
>on NFS mount machine.
>
>> Seems like this still means a single point of failure, ie the NFS box.
So
>what's the point?

>"Peter Schmidt" <peterjs@home.com> writes:
>> My company is looking for a way to implement failover w/Postgres.
>> I've determined that two postmasters running on different machines
>(FreeBSD)
>> can share a single $PGDATA directory(NFS mount) as long as only one
>> postmaster is running at a time.
>
>Performance across an NFS mount will doubtless suck badly.  That might
>be acceptable as an emergency backup mode of operation ... but if the
>machine with the disk is up, you might as well be running the postmaster
>there.
>
>It sounds like you intend to have both the primary and secondary
>database servers access an NFS server.  Seems like this still means a
>single point of failure, ie the NFS box.  So what's the point?
>
>> Originally I thought I might be able to use
>> postmaster.pid to lock out the second postmaster, but the pid file is
>> overwritten by the second postmaster when it starts.
>
>The lockfile code assumes that if the PID in the file doesn't belong to
>a live process *on the local machine*, then it's left over from a
>crashed postmaster.  You could remove that check, perhaps, but then
>you'd have to remove the PID file manually anytime you had a postmaster
>crash.  (However, postmaster crashes are rare, so this might be OK.)


----

Maarten Boekhold, maarten.boekhold@reuters.com

Reuters Consulting
Entrada 308
1096 ED Amsterdam
The Netherlands
tel: +31 (0)20-6601000 (switchboard)
      +31 (0)20-6601066 (direct)
      +31 (0)20-6601005 (fax)
      +31 (0)651585137 (mobile)


-----------------------------------------------------------------
        Visit our Internet site at http://www.reuters.com

Any views expressed in this message are those of  the  individual
sender,  except  where  the sender specifically states them to be
the views of Reuters Ltd.