Thread: file-locking and postmaster.pid
Hi all.

I've experienced several times that PG has died somehow and the postmaster.pid file still exists 'cause PG hasn't had the ability to delete it upon proper shutdown. Upon start-up, after such an incident, PG tells me another PG is running and that I either have to shut down the other instance, or delete the postmaster.pid file if there really isn't an instance running. This seems totally unnecessary to me.

Why doesn't PG use file-locking to tell if another PG is running or not? If PG holds an exclusive lock on the pid-file and the process crashes, or shuts down, then the lock (which is process-based and controlled by the kernel) will be removed and another PG which tries to start up can detect that. Using the existence of the pid-file as the only evidence gives too many false positives IMO.

I'm sure there's a good reason for having it the way it is, having so many smart knowledgeable people working on this project. Could someone please explain the rationale of the current solution to me?

--
Andreas Joseph Krogh <andreak@officenet.no>
Senior Software Developer / Manager
gpg public_key: http://dev.officenet.no/~andreak/public_key.asc
------------------------+---------------------------------------------+
OfficeNet AS            | The most difficult thing in the world is to |
Hoffsveien 17           | know how to do a thing and to watch         |
PO. Box 425 Skøyen      | somebody else doing it wrong, without       |
0213 Oslo               | comment.                                    |
NORWAY                  |                                             |
Phone : +47 22 13 01 00 |                                             |
Direct: +47 22 13 10 03 |                                             |
Mobile: +47 909 56 963  |                                             |
------------------------+---------------------------------------------+
Andreas Joseph Krogh <andreak@officenet.no> writes:
> I've experienced several times that PG has died somehow and the postmaster.pid
> file still exists 'cause PG hasn't had the ability to delete it upon proper
> shutdown. Upon start-up, after such an incident, PG tells me another PG is
> running and that I either have to shut down the other instance, or delete the
> postmaster.pid file if there really isn't an instance running. This seems
> totally unnecessary to me.

The postmaster does check to see whether the PID mentioned in the file is still alive, so it's not that easy for the above to happen. If you can provide details of a scenario where a failure is likely, we'd like to know about it. Also, what PG version are you talking about?

> Why doesn't PG use file-locking to tell if another
> PG is running or not?

Portability.

regards, tom lane
On Tuesday 23 May 2006 17:54, Tom Lane wrote:
> Andreas Joseph Krogh <andreak@officenet.no> writes:
> > I've experienced several times that PG has died somehow and the
> > postmaster.pid file still exists 'cause PG hasn't had the ability to
> > delete it upon proper shutdown. Upon start-up, after such an incident,
> > PG tells me another PG is running and that I either have to shut down the
> > other instance, or delete the postmaster.pid file if there really isn't
> > an instance running. This seems totally unnecessary to me.
>
> The postmaster does check to see whether the PID mentioned in the file
> is still alive, so it's not that easy for the above to happen. If you
> can provide details of a scenario where a failure is likely, we'd like
> to know about it. Also, what PG version are you talking about?

I have experienced this with PG-8.1.3 and will provide details if I can make it happen. Basically it has happened when I have had to "hard-reset" my laptop due to some strange bugs in Linux which have made it hang.

> > Why doesn't PG use file-locking to tell if another
> > PG is running or not?
>
> Portability.

Ok.

--
Andreas Joseph Krogh <andreak@officenet.no>
Andreas Joseph Krogh <andreak@officenet.no> writes:
> On Tuesday 23 May 2006 17:54, Tom Lane wrote:
>> The postmaster does check to see whether the PID mentioned in the file
>> is still alive, so it's not that easy for the above to happen. If you
>> can provide details of a scenario where a failure is likely, we'd like
>> to know about it. Also, what PG version are you talking about?

> I have experienced this with PG-8.1.3 and will provide details if I can make
> it happen. Basically it has happened when I have had to "hard-reset" my
> laptop due to some strange bugs in Linux which have made it hang.

If you're talking about a postmaster that's auto-started during the boot sequence, then there is a risk depending on what start script you use. The problem is that depending on what else runs during the system startup, the PID assigned to the postmaster might be the same as in the last boot cycle, or it might be different by one or two counts. The postmaster disregards a pidfile containing its own PID, or its parent process' PID, or a PID not belonging to a postgres-owned process. That covers most cases but if your start script does something like

    su -l postgres -c "pg_ctl start ..."

then you have a situation where not only the parent process (pg_ctl) but also the grandparent (a shell) is postgres-owned, and if the pidfile PID happens to match the grandparent then you lose. Solution is to either not use pg_ctl here, or write "exec pg_ctl start ...", so that there's only one postgres-owned process besides the postmaster itself.

Initscripts published by PGDG itself and by Red Hat have gotten this right for a while, but I suspect the word has not propagated to all distros.

regards, tom lane
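The rule Tom describes can be sketched roughly as follows. This is a minimal illustration in Python, not PostgreSQL's actual C code. The function names `probe_pid` and `pidfile_blocks_startup` are made up for the example, and "postgres-owned" is approximated by "we are allowed to send it signal 0", which is an assumption that does not hold when running as root.

```python
import os


def probe_pid(pid: int) -> bool:
    """Return True if `pid` is a live process that we are allowed to signal."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 only checks existence and permission
    except ProcessLookupError:
        return False     # no such process: the pidfile is stale
    except PermissionError:
        return False     # process exists but belongs to another user
    return True


def pidfile_blocks_startup(pidfile_pid: int, my_pid: int, parent_pid: int) -> bool:
    """Decide whether a PID found in the pidfile should prevent start-up."""
    # A PID equal to our own or our parent's must be left over from a
    # previous boot cycle, so it is disregarded.
    if pidfile_pid in (my_pid, parent_pid):
        return False
    # Otherwise the pidfile counts only if that process is alive and ours.
    return probe_pid(pidfile_pid)


if __name__ == "__main__":
    me, parent = os.getpid(), os.getppid()
    print(pidfile_blocks_startup(me, me, parent))
    print(pidfile_blocks_startup(parent, me, parent))
```

Under these assumptions the grandparent case is visible directly: a shell started by `su -l postgres -c "pg_ctl start ..."` is alive, signalable, and neither self nor parent, so a stale pidfile that happens to carry its PID makes the check refuse start-up.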
On Tue, May 23, 2006 at 05:23:16PM +0200, Andreas Joseph Krogh wrote:
> Hi all.
>
> I've experienced several times that PG has died somehow and the postmaster.pid
> file still exists 'cause PG hasn't had the ability to delete it upon proper
> shutdown. Upon start-up, after such an incident, PG tells me another PG is
> running and that I either have to shut down the other instance, or delete the
> postmaster.pid file if there really isn't an instance running. This seems
> totally unnecessary to me. Why doesn't PG use file-locking to tell if another
> PG is running or not? If PG holds an exclusive-lock on the pid-file and the
> process crashes, or shuts down, then the lock (which is process-based and
> controlled by the kernel) will be removed and another PG which tries to start
> up can detect that. Using the existence of the pid-file as the only evidence
> gives too many false positives IMO.

Well, maybe you could tweak the postgres startup script, add a check for the postmaster (either 'pgrep postmaster' or 'ps -axu | grep [p]ostmaster'), and delete the pid file on negative results.

i.e.

    #!/bin/bash
    # Look for a running postmaster by its full path.
    PID=$(pgrep -f /usr/bin/postmaster)
    if [[ -n $PID ]]; then
        echo "'$PID'"    # postgres is already running
    else
        echo "Postmaster is not running"    # delete stale PID file
    fi
Adis Nezirovic <adis@linux.org.ba> writes:
> Well, maybe you could tweak postgres startup script, add check for post
> master (either 'pgrep postmaster' or 'ps -axu | grep [p]ostmaster'), and
> delete pid file on negative results.

This is exactly what you should NOT do.

A start script that thinks it is smarter than the postmaster is almost certainly wrong. It is certainly dangerous, too, because auto-deleting that pidfile destroys the interlock against having two postmasters running in the same data directory (which WILL corrupt your data, quickly and irretrievably). All it takes to cause a problem is to use the start script to start a postmaster, forgetting that you already have one running ...

regards, tom lane
On Tue, May 23, 2006 at 01:36:41PM -0400, Tom Lane wrote:
> This is exactly what you should NOT do.
>
> A start script that thinks it is smarter than the postmaster is almost
> certainly wrong. It is certainly dangerous, too, because auto-deleting
> that pidfile destroys the interlock against having two postmasters
> running in the same data directory (which WILL corrupt your data,
> quickly and irretrievably). All it takes to cause a problem is to
> use the start script to start a postmaster, forgetting that you already
> have one running ...

I do agree with you that we should not play games with the postmaster. Better to be safe than sorry. (So, manually deleting the pid file is the only safe option.) I was just suggesting a (possibly dangerous) workaround.

Btw, I do check for a running postmaster using the full path (I don't want to kill every postmaster on the system). Is this safe, or could there be a race condition?
On Tuesday 23 May 2006 19:36, Tom Lane wrote:
> Adis Nezirovic <adis@linux.org.ba> writes:
> > Well, maybe you could tweak postgres startup script, add check for post
> > master (either 'pgrep postmaster' or 'ps -axu | grep [p]ostmaster'), and
> > delete pid file on negative results.
>
> This is exactly what you should NOT do.
>
> A start script that thinks it is smarter than the postmaster is almost
> certainly wrong. It is certainly dangerous, too, because auto-deleting
> that pidfile destroys the interlock against having two postmasters
> running in the same data directory (which WILL corrupt your data,
> quickly and irretrievably). All it takes to cause a problem is to
> use the start script to start a postmaster, forgetting that you already
> have one running ...

My PG is not started with startup-scripts, but with this command:

    pg_ctl -D $PGDATA -l $PGDIR/log/logfile-`date +%Y-%m-%d`.log start

--
Andreas Joseph Krogh <andreak@officenet.no>
On Wednesday 24 May 2006 11:36, Andreas Joseph Krogh wrote:
> On Tuesday 23 May 2006 19:36, Tom Lane wrote:
> > Adis Nezirovic <adis@linux.org.ba> writes:
> > > Well, maybe you could tweak postgres startup script, add check for post
> > > master (either 'pgrep postmaster' or 'ps -axu | grep [p]ostmaster'),
> > > and delete pid file on negative results.
> >
> > This is exactly what you should NOT do.
> >
> > A start script that thinks it is smarter than the postmaster is almost
> > certainly wrong. It is certainly dangerous, too, because auto-deleting
> > that pidfile destroys the interlock against having two postmasters
> > running in the same data directory (which WILL corrupt your data,
> > quickly and irretrievably). All it takes to cause a problem is to
> > use the start script to start a postmaster, forgetting that you already
> > have one running ...
>
> My PG is not started with startup-scripts, but with this command:
>
> pg_ctl -D $PGDATA -l $PGDIR/log/logfile-`date +%Y-%m-%d`.log start

... and manually after login, i.e. not at boot-time.

--
Andreas Joseph Krogh <andreak@officenet.no>
On 5/24/06, Andreas Joseph Krogh <andreak@officenet.no> wrote:
> > My PG is not started with startup-scripts, but with this command:
> >
> > pg_ctl -D $PGDATA -l $PGDIR/log/logfile-`date +%Y-%m-%d`.log start
>
> ... and manually after login, ie. not at boot-time.

I'd suggest trying to fix your Linux install instead of mucking about with Postgres, and this is really a pgsql-novice question, not a -hackers thing.

Cheers,
Andrej

--
Please don't top post, and don't use HTML e-Mail :}
Make your quotes concise.
http://www.american.edu/econ/notes/htmlmail.htm
On Wednesday 24 May 2006 21:03, korry wrote:
> > I'm sure there's a good reason for having it the way it is, having so
> > many smart knowledgeable people working on this project. Could someone
> > please explain the rationale of the current solution to me?
>
> We've ignored Andreas' original question. Why not use a lock to
> indicate that the postmaster is still running? At first blush, that
> seems more reliable than checking for a (possibly recycled) process ID.

As Tom replied: Portability.

--
Andreas Joseph Krogh <andreak@officenet.no>
On Wednesday 24 May 2006 20:52, Andrej Ricnik-Bay wrote:
> On 5/24/06, Andreas Joseph Krogh <andreak@officenet.no> wrote:
> > > My PG is not started with startup-scripts, but with this command:
> > >
> > > pg_ctl -D $PGDATA -l $PGDIR/log/logfile-`date +%Y-%m-%d`.log start
> >
> > ... and manually after login, ie. not at boot-time.
>
> I'd suggest trying to fix your Linux-install instead of mucking
> about with Postgres, and this really a pgsql-novice question,
> not a -hackers thing.

I'm sorry, can't resist, but this has to be *the* dumbest reply to this sort of question. What makes you think it *only* happens when Linux freezes? (Btw, I suspect my NVIDIA driver to be the problem on my laptop, not Linux itself.) Still - PG *should* handle that situation too, it's like a power outage.

I've been using Linux exclusively since '96 and PG since 6.5, so I don't consider myself a novice in either. Why PG doesn't use locking *is* definitely a -hackers thing.

--
Andreas Joseph Krogh <andreak@officenet.no>
> I'm sure there's a good reason for having it the way it is, having so many
> smart knowledgeable people working on this project. Could someone please
> explain the rationale of the current solution to me?

We've ignored Andreas' original question. Why not use a lock to indicate that the postmaster is still running? At first blush, that seems more reliable than checking for a (possibly recycled) process ID.

-- Korry
> On Wednesday 24 May 2006 21:03, korry wrote:
> > > I'm sure there's a good reason for having it the way it is, having so
> > > many smart knowledgeable people working on this project. Could someone
> > > please explain the rationale of the current solution to me?
> >
> > We've ignored Andreas' original question. Why not use a lock to
> > indicate that the postmaster is still running? At first blush, that
> > seems more reliable than checking for a (possibly recycled) process ID.
>
> As Tom replied: Portability.

Thanks - I missed that part of Tom's message.

The only platform (although certainly not a minor issue) that I can think of that would have a portability issue would be Win32. You can't even read a locked byte in Win32. I usually solve that problem by locking a byte past the end of the file (which is portable).

Is there some other portability issue that I'm missing?

-- Korry
korry wrote:
> The only platform (although certainly not a minor issue) that I can
> think of that would have a portability issue would be Win32. You can't
> even read a locked byte in Win32. I usually solve that problem by
> locking a byte past the end of the file (which is portable).

Certainly on all platforms there must be *some* locking primitive. We just need to figure out the appropriate parameters to fcntl() or flock() or lockf() on each.

The Win32 API for locking seems mighty strange to me.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
> Certainly on all platforms there must be *some* locking primitive. We
> just need to figure out the appropriate parameters to fcntl() or flock()
> or lockf() on each.

Right.

> The Win32 API for locking seems mighty strange to me.

Linux/Unix byte locking is advisory (meaning that one lock can block another lock, but it can't block a read). Win32 locking is mandatory (at least in the most portable form) so a lock blocks a reader. To avoid that problem, you lock a byte that you never intend to read (that is, you lock a byte past the end of the file). Locking past the end-of-file is portable to all Unix/Linux systems that I've seen (that way, you can lock a region of a file before you grow the file).

-- Korry
Alvaro Herrera wrote:
> korry wrote:
>
>> The only platform (although certainly not a minor issue) that I can
>> think of that would have a portability issue would be Win32. You can't
>> even read a locked byte in Win32. I usually solve that problem by
>> locking a byte past the end of the file (which is portable).
>
> Certainly on all platforms there must be *some* locking primitive. We
> just need to figure out the appropriate parameters to fcntl() or flock()
> or lockf() on each.
>
> The Win32 API for locking seems mighty strange to me.

We use file locking on Win32 (and on all other platforms) in the buildfarm ... it's done from perl so maybe perl does some magic under the hood. The call looks just the same, and works fine on W32, I believe. It is roughly:

    use Fcntl qw(:flock);
    open($lockfile,">builder.LCK") || die "opening lockfile";
    exit(0) unless flock($lockfile,LOCK_EX|LOCK_NB);

cheers

andrew
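For comparison, here is a rough Python counterpart of the Perl fragment above, assuming a POSIX system where `fcntl.flock` is available. The helper name `try_exclusive_lock` is invented for this sketch, and the file is opened in append mode rather than truncated, which is a deliberate difference from the Perl `>` open.

```python
import fcntl


def try_exclusive_lock(path: str):
    """Open `path` and try a non-blocking exclusive flock.

    Returns the open file object while the lock is held, or None if
    someone else already holds a conflicting lock.
    """
    handle = open(path, "a")  # create the file if needed, never truncate it
    try:
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()        # lock is held elsewhere; give up immediately
        return None
    return handle             # caller keeps this open for as long as it runs


if __name__ == "__main__":
    lock = try_exclusive_lock("builder.LCK")
    if lock is None:
        raise SystemExit(0)   # another run is in progress, as in the Perl code
    print("lock acquired")
```

The lock disappears when the holder closes the file or exits, including on a crash, which is the property the original question was after. Whether that guarantee survives on every filesystem is exactly what the rest of the thread disputes.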
korry wrote:
> > The Win32 API for locking seems mighty strange to me.
>
> Linux/Unix byte locking is advisory (meaning that one lock can block
> another lock, but it can't block a read).

No -- it is advisory meaning that a process that does not try to acquire the lock is not locked out. You can certainly block a file in exclusive mode, using the LOCK_EX flag. (And at least on my Linux system, there is mandatory locking too, using the fcntl() interface).

I think the next question is -- how would the lock interface be used? We could acquire an exclusive lock on postmaster start (to make sure no backend is running), then reduce it to a shared lock. Every backend would inherit the shared lock. But the lock exchange is not guaranteed to be atomic so a new postmaster could start just after we acquire the lock and acquire the shared lock. It'd need to be complemented with another lock.

> Win32 locking is mandatory (at least in the most portable form) so a
> lock blocks a reader.

There is also shared/exclusive locking of a file on Win32. My comment was more directed at the fact that you have to "create some sort of lock handle" from a file handle and then lock the lock handle, or something like that. I don't recall the exact details but it was strange (as opposed to just open and then flock).

--
Alvaro Herrera
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Certainly on all platforms there must be *some* locking primitive. We
> just need to figure out the appropriate parameters to fcntl() or flock()
> or lockf() on each.

Quite aside from the hassle factor of needing to deal with N variants of the syscalls, I'm not convinced that it's guaranteed to work. ISTR that for instance NFS file locking is pretty much Alice-in-Wonderland :-( Since the entire point here is to have a guaranteed bulletproof check, locks that work most of the time on most platforms/filesystems aren't gonna be an improvement.

regards, tom lane
Andrew Dunstan wrote:
> We use file locking on Win32 (and on all other platforms) in the
> buildfarm ... it's done from perl so maybe perl does some magic under
> the hood. The call looks just the same, and works fine on W32, I
> believe. It is roughly:
>
> use Fcntl qw(:flock);
> open($lockfile,">builder.LCK") || die "opening lockfile";
> exit(0) unless flock($lockfile,LOCK_EX|LOCK_NB);

flock on Perl is implemented using platform-dependent system calls. Per the docs,

    flock FILEHANDLE,OPERATION
        Calls flock(2), or an emulation of it, on FILEHANDLE. Returns true
        for success, false on failure. Produces a fatal error if used on a
        machine that doesn't implement flock(2), fcntl(2) locking, or
        lockf(3). "flock" is Perl's portable file locking interface,
        although it locks only entire files, not records.

Note that it may fail! This seems to indicate that some platforms do not provide either locking mechanism.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera wrote:
> Note that it may fail! This seems to indicate that some platforms do
> not provide either locking mechanism.

(Which means the whole discussion is a waste of time)

--
Alvaro Herrera
Alvaro Herrera wrote:
> Alvaro Herrera wrote:
>
>> Note that it may fail! This seems to indicate that some platforms do
>> not provide either locking mechanism.
>
> (Which means the whole discussion is a waste of time)

Umm, no, I don't think so. It will block instead of failing unless you request a non-blocking call. Failure means someone else holds the lock.

But what Tom says about NFS is probably true, and a good enough reason not to trust locking in general for this purpose, I think.

cheers

andrew
On Wed, 2006-05-24 at 16:34 -0400, Alvaro Herrera wrote:
> korry wrote:
> > > The Win32 API for locking seems mighty strange to me.
> >
> > Linux/Unix byte locking is advisory (meaning that one lock can block
> > another lock, but it can't block a read).
>
> No -- it is advisory meaning that a process that does not try to acquire
> the lock is not locked out.

Right, that's why I said "can block" instead of "will block". An advisory lock will only block another locker, not another reader (except in Win32).

> You can certainly block a file in exclusive
> mode, using the LOCK_EX flag. (And at least on my Linux system, there
> is mandatory locking too, using the fcntl() interface).

My fault - I'm not really talking about "file locking", I'm talking about byte-range locking (via lockf() and family).

I don't believe that you can use byte-range locking to block read-access to a file, you can only use byte-range locking to block other locks.

A simple exclusive lock on the first byte past the end of the file will do.

> I think the next question is -- how would the lock interface be used?
> We could acquire an exclusive lock on postmaster start (to make sure no
> backend is running), then reduce it to a shared lock. Every backend
> would inherit the shared lock. But the lock exchange is not guaranteed
> to be atomic so a new postmaster could start just after we acquire the
> lock and acquire the shared lock. It'd need to be complemented with
> another lock.

You never need to reduce it to a shared lock. On postmaster startup, try to lock the sentinel byte (one byte past the end-of-file). If you can lock it, you know that no other postmaster has that byte locked. If you can't lock it, another postmaster is running. It is an atomic operation.

However, Tom may be correct about NFS locking, but I guess I'm surprised that anyone would care :-)

> > Win32 locking is mandatory (at least in the most portable form) so a
> > lock blocks a reader.
>
> There is also shared/exclusive locking of a file on Win32.

Yes, but Win32 shared locking only works on NTFS-type file systems. And you don't need shared locking anyway.

-- Korry
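The sentinel-byte idea described in this message could look roughly like the sketch below, written in Python on top of POSIX record locks (`fcntl.lockf`). The function name `try_lock_sentinel` is hypothetical, and taking the current file size as the sentinel offset is an assumption of this sketch; if the file can change size between two starts, a fixed agreed-upon offset would be needed instead.

```python
import errno
import fcntl
import os


def try_lock_sentinel(path: str):
    """Try to lock one byte just past end-of-file.

    Returns the open file descriptor while the lock is held, or None if
    another process already holds the sentinel byte.
    """
    fd = os.open(path, os.O_RDWR)
    size = os.fstat(fd).st_size
    try:
        # Exclusive, non-blocking lock on 1 byte starting at offset `size`.
        # Locking past end-of-file does not grow the file.
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, 1, size, os.SEEK_SET)
    except OSError as exc:
        # Closing any descriptor for the file drops this process's record
        # locks on it, so this is only safe because we hold none here.
        os.close(fd)
        if exc.errno in (errno.EACCES, errno.EAGAIN):
            return None
        raise
    return fd
```

Record locks of this kind belong to the process: they are not inherited by forked children and they vanish when the process exits. That is why the later replies ask what happens when the postmaster dies but a backend keeps running.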
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > Certainly on all platforms there must be *some* locking primitive. We
> > just need to figure out the appropriate parameters to fcntl() or flock()
> > or lockf() on each.

I use lockf() (not fcntl() or flock()) on every platform other than Win32. Of course, I may not run on every system that PostgreSQL supports.

> Quite aside from the hassle factor of needing to deal with N variants of
> the syscalls, I'm not convinced that it's guaranteed to work. ISTR that
> for instance NFS file locking is pretty much Alice-in-Wonderland :-(
> Since the entire point here is to have a guaranteed bulletproof check,
> locks that work most of the time on most platforms/filesystems aren't
> gonna be an improvement.

NFS file locking may certainly be problematic. I don't know about NFS byte-range locking.

What we currently have in place is not bulletproof. I think holding a byte-range lock in addition to the "is there some process with the right pid?" check might be a little more bullet resistant :-)

-- Korry
Andrew Dunstan wrote:
> Alvaro Herrera wrote:
> >Alvaro Herrera wrote:
> >
> >>Note that it may fail! This seems to indicate that some platforms do
> >>not provide either locking mechanism.
> >
> >(Which means the whole discussion is a waste of time)
>
> Umm, no, I don't think so. It will block instead of failing unless you
> request a non blocking call. Failure means someone else holds the lock.

I removed the part of the manual I had written which said that it will raise an error if the platform it's running doesn't have any locking primitive.

--
Alvaro Herrera
korry <korry@appx.com> writes:
> However, Tom may be correct about NFS locking, but I guess I'm surprised
> that anyone would care :-)

Whether we think it's a real good idea or not, *plenty* of people run databases across NFS. We can't blow off that set of users.

regards, tom lane
korry wrote:
> > I think the next question is -- how would the lock interface be used?
> > We could acquire an exclusive lock on postmaster start (to make sure no
> > backend is running), then reduce it to a shared lock. Every backend
> > would inherit the shared lock. But the lock exchange is not guaranteed
> > to be atomic so a new postmaster could start just after we acquire the
> > lock and acquire the shared lock. It'd need to be complemented with
> > another lock.
>
> You never need to reduce it to a shared lock. On postmaster startup,
> try to lock the sentinel byte (one byte past the end-of-file). If you
> can lock it, you know that no other postmaster has that byte locked. If
> you can't lock it, another postmaster is running. It is an atomic
> operation.

This doesn't work if the postmaster dies but a backend continues to run, which is arguably the most important case we need to protect against.

> However, Tom may be correct about NFS locking, but I guess I'm surprised
> that anyone would care :-)

Quite a lot of people run NFS-mounted data directories ...

--
Alvaro Herrera
korry <korry@appx.com> writes:
> What we currently have in place is not bulletproof.

Well, it fails in the safe direction: the postmaster may occasionally refuse to start when it should, but it won't ever start when it should not. It appears to me that anything relying on file locking will tend to fail in the other direction, and that's not acceptable IMHO.

regards, tom lane
> > What we currently have in place is not bulletproof.
>
> Well, it fails in the safe direction: the postmaster may occasionally
> refuse to start when it should, but it won't ever start when it should
> not. It appears to me that anything relying on file locking will tend
> to fail in the other direction, and that's not acceptable IMHO.

I was suggesting that we keep the current check in place too - if the lock exists, another postmaster must be running, if the lock doesn't exist, check the pid.

However...

Thinking a little harder about Andreas' original suggestion... what he's really suggesting is an exclusion mechanism that relies on the kernel to clean up after a shared process (with no danger of recycling, like a pid will do).

How about a semaphore with a SEM_UNDO? That's guaranteed atomic (or it better be :-), the kernel automatically cleans up after a failure, if the mechanism fails, it fails in the safe direction (the kernel may not have cleaned up the semaphore before a new postmaster starts). And, I think it would be reasonably portable - I haven't carefully eyeballed the Win32 semaphore code so I don't know if it supports SEM_UNDO.

(Sorry if this has been suggested before)

-- Korry
> > You never need to reduce it to a shared lock. On postmaster startup,
> > try to lock the sentinel byte (one byte past the end-of-file). If you
> > can lock it, you know that no other postmaster has that byte locked. If
> > you can't lock it, another postmaster is running. It is an atomic
> > operation.
>
> This doesn't work if the postmaster dies but a backend continues to run,
> which is arguably the most important case we need to protect against.

I may be confused here, but I don't see the problem - byte-range locks are not inherited across a fork. A backend would never hold the lock, a backend would never even look for the lock.

> > However, Tom may be correct about NFS locking, but I guess I'm surprised
> > that anyone would care :-)
>
> Quite a lot of people run NFS-mounted data directories ...

I'm happy to take your word for that, and I agree that if NFS is important and locking is brain-dead on NFS, then relying solely on a lock is unacceptable.

-- Korry
korry <korry@appx.com> writes: >> Well, it fails in the safe direction: the postmaster may occasionally >> refuse to start when it should, but it won't ever start when it should >> not. It appears to me that anything relying on file locking will tend >> to fail in the other direction, and that's not acceptable IMHO. > I was suggesting that we keep the current check in place too - if the > lock exists, another postmaster must be running, if the lock doesn't > exist, check the pid. But then you've not accomplished anything. The complaints about the pid-based mechanism are about false positives, not false negatives. Adding an independent check won't eliminate the false positives. > How about a semaphore with a SEM_UNDO? That's guaranteed atomic (or it > better be :-), the kernel automatically cleans up after a failure, if > the mechanism fails, it fails in the safe direction (the kernel may not > have cleaned up the semaphore before a new postmaster starts). And, I > think it would be reasonably portable - I haven't carefully eyeballed > the Win32 semaphore code so I don't know if it supports SEM_UNDO. We already have two platforms that don't use the SysV semaphore interface, and even on ones that have it, I wouldn't want to assume they all support SEM_UNDO. But aside from any portability issues, ISTM this would have its own failure modes. In particular you still have to rely on a pid-file (only now it's holding a semaphore ID not a PID), and there's still a bit of a leap of faith required to get from the observation that somebody is holding a lock on semaphore X to the conclusion that that somebody is a conflicting postmaster. It doesn't look to me like this is any better than the PID solution, really, as far as false positives go. As for false negatives: ipcrm. regards, tom lane
korry wrote: > > > You never need to reduce it to a shared lock. On postmaster startup, > > > try to lock the sentinel byte (one byte past the end-of-file). If you > > > can lock it, you know that no other postmaster has that byte locked. If > > > you can't lock it, another postmaster is running. It is an atomic > > > operation. > > > > This doesn't work if the postmaster dies but a backend continues to run, > > which is arguably the most important case we need to protect against. > > I may be confused here, but I don't see the problem - byte-range locks > are not inherited across a fork. A backend would never hold the lock, a > backend would never even look for the lock. Well, you are wrong here. We _want_ every backend to hold a shared lock. We need to stop a postmaster from starting if there is a backend running that was started by a no-longer-running postmaster. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
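A rough sketch of the scheme described above, in which every backend holds a shared lock and a starting postmaster asks for an exclusive one. This is only an illustration under assumptions: POSIX `fcntl` locks that behave correctly on the filesystem in question, and hypothetical helper names (`start_backend_holding_shared_lock`, `new_postmaster_may_start`, `demo`). It is not how PostgreSQL implements its interlock.

```python
import fcntl
import os
import tempfile


def start_backend_holding_shared_lock(path: str):
    """Fork a stand-in backend that holds a shared lock until told to exit."""
    ready_r, ready_w = os.pipe()
    stop_r, stop_w = os.pipe()
    pid = os.fork()
    if pid == 0:
        os.close(ready_r)
        os.close(stop_w)
        fd = os.open(path, os.O_RDWR)
        fcntl.lockf(fd, fcntl.LOCK_SH)  # each backend takes its own shared lock
        os.write(ready_w, b"1")         # tell the parent the lock is held
        os.read(stop_r, 1)              # returns once the parent closes stop_w
        os._exit(0)                     # process exit releases the lock
    os.close(ready_w)
    os.close(stop_r)
    os.read(ready_r, 1)                 # wait until the child holds the lock
    os.close(ready_r)
    return pid, stop_w


def new_postmaster_may_start(path: str) -> bool:
    """An exclusive lock is only granted when no shared locks remain."""
    fd = os.open(path, os.O_RDWR)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except OSError:
        return False
    finally:
        os.close(fd)  # closing also releases the lock if it was granted


def demo() -> tuple:
    """Return startup verdicts (before, during, after) a backend's lifetime."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        before = new_postmaster_may_start(path)
        pid, stop_w = start_backend_holding_shared_lock(path)
        during = new_postmaster_may_start(path)
        os.close(stop_w)        # let the stand-in backend exit
        os.waitpid(pid, 0)
        after = new_postmaster_may_start(path)
    finally:
        os.unlink(path)
    return before, during, after
```

In this sketch the lock follows the backend process rather than the postmaster that forked it, which is the property wanted for the orphaned-backend case.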
Alvaro Herrera <alvherre@commandprompt.com> writes: > Well, you are wrong here. We _want_ every backend to hold a shared > lock. We need to stop a postmaster from starting if there is a backend > running that was started by a no-longer-running postmaster. Note that we currently rely on checking for SysV shared memory attach counts to protect against this case; the postmaster PID doesn't enter into it. We don't have to insist on the postmaster interlock handling this too. (Although surely it'd be nice to not depend on SysV attach counts for this, because that's a portability issue in itself.) regards, tom lane
> We already have two platforms that don't use the SysV semaphore
> interface, and even on ones that have it, I wouldn't want to assume they
> all support SEM_UNDO.

Which platforms, just out of curiosity? I assume that Win32 is one of them.

> But aside from any portability issues, ISTM this would have its own
> failure modes. In particular you still have to rely on a pid-file
> (only now it's holding a semaphore ID not a PID)

You've lost me... why would you store the semid and *not* the pid? I was thinking that the semid might be a postgresql.conf thingie.

> and there's still
> a bit of a leap of faith required to get from the observation that
> somebody is holding a lock on semaphore X to the conclusion that that
> somebody is a conflicting postmaster.

Isn't that sort of like saying that if a postmaster.pid file exists, it must have been written by a postmaster? Pick a semaphore id and dedicate it to postmaster exclusion.

> It doesn't look to me like this
> is any better than the PID solution, really, as far as false positives
> go.

As long as the kernel cleans up SEM_UNDO semaphores, I guess I don't see how you would have a false positive. Oh, I guess I should say that if you use a SEM_UNDO semaphore, you don't need the pid check anymore. And, no worry about NFS.

> As for false negatives: ipcrm.

Yes, that's a problem, but I think it's the same as "rm postmaster.pid", isn't it?
> > > > You never need to reduce it to a shared lock. On postmaster startup,
> > > > try to lock the sentinel byte (one byte past the end-of-file). If you
> > > > can lock it, you know that no other postmaster has that byte locked. If
> > > > you can't lock it, another postmaster is running. It is an atomic
> > > > operation.
> > >
> > > This doesn't work if the postmaster dies but a backend continues to run,
> > > which is arguably the most important case we need to protect against.
> >
> > I may be confused here, but I don't see the problem - byte-range locks
> > are not inherited across a fork. A backend would never hold the lock, a
> > backend would never even look for the lock.
>
> Well, you are wrong here. We _want_ every backend to hold a shared
> lock. We need to stop a postmaster from starting if there is a backend
> running that was started by a no-longer-running postmaster.

Oh... didn't know that. How is that accomplished now? There must be some code besides the pid file check.

-- Korry
korry <korry@appx.com> writes: > Isn't that sort of like saying that if a postmaster.pid file exists, it > must have been written by a postmaster? Pick a semaphore id and > dedicate it to postmaster exclusion. That's not workable, unless you want to assume that nothing on the system except Postgres uses SysV semaphores. Otherwise something else could randomly gobble up the semid you want to use. I don't care very much for requiring a distinct semid to be hand-specified for each postmaster on a machine, either. At least for my use, that would be a grade-A PITA: I normally have several postmasters of different vintages running on the same development machine, and having to configure each one with its own semid is an extra step I'd rather not deal with. > As long as the kernel cleans up SEM_UNDO semaphores, I guess I don't see > have you would have a false positive. My point was that you couldn't reliably tell a postmaster interested in a different data directory from a postmaster interested in your own data directory. Even with a configured semid, I don't see that that's real reliable. I know the first thing I'd do is fix my postmaster start scripts to specify semid on the command line rather than requiring it to be in the conf file, and as soon as I do that, the connection to the data directory is gone :-( --- now my security is utterly dependent on not screwing up by launching a postmaster with the wrong semid for the data directory it's pointed at. The only scenario where the PID-based solution is at serious risk of false positives is where there are multiple postmasters on the same machine, so unless you've got a bulletproof answer for this case, you haven't made an improvement over what we've got. Anyway the real problem here is that neither PIDs nor semids are strongly wired to a particular data directory, which is the thing you're really trying to protect. 
File locks would really be much nicer all around, if we could trust them, because they *would* be directly connected to a data directory. regards, tom lane
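To make the false-positive concern concrete, here is a small sketch of a PID-file liveness check of the kind discussed in this thread. It assumes a POSIX `kill(pid, 0)` probe and a PID on the first line of the file; the function names `pid_is_alive` and `pidfile_blocks_startup` are hypothetical, and the real postmaster check may differ in its details.

```python
import os


def pid_is_alive(pid: int) -> bool:
    """Return True if some process with this PID currently exists."""
    if pid <= 0:
        return False  # 0 and negative values address process groups, not a PID
    try:
        os.kill(pid, 0)  # signal 0 only probes; nothing is delivered
    except ProcessLookupError:
        return False     # no such process: the PID file is stale
    except PermissionError:
        return True      # process exists but belongs to another user
    return True


def pidfile_blocks_startup(pidfile_path: str) -> bool:
    """Refuse to start if the PID recorded in the file belongs to a live process."""
    try:
        with open(pidfile_path) as f:
            pid = int(f.readline().strip())
    except (FileNotFoundError, ValueError):
        return False     # no usable PID file, nothing to conflict with
    return pid_is_alive(pid)
```

The check cannot tell what the live process is. If the recorded PID has been recycled by an unrelated program, or belongs to a postmaster serving a different data directory, startup is still refused, because a PID carries no link to the directory being protected.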
> That's not workable, unless you want to assume that nothing on the
> system except Postgres uses SysV semaphores. Otherwise something else
> could randomly gobble up the semid you want to use. I don't care very
> much for requiring a distinct semid to be hand-specified for each
> postmaster on a machine, either.

Yeah, that does suck. Ok, naming problems seem to make semaphores useless.

I'm back to byte-range locking, but if NFS is important and is truly unreliable, then that's out too.

I've never had locking problems on NFS (probably because we tell our users not to use NFS), but now that I think about it, SMB locking is very unreliable so Win32 would be an issue too.

-- Korry
On Thursday 25 May 2006 14:35, korry wrote: > > That's not workable, unless you want to assume that nothing on the > > system except Postgres uses SysV semaphores. Otherwise something else > > could randomly gobble up the semid you want to use. I don't care very > > much for requiring a distinct semid to be hand-specified for each > > postmaster on a machine, either. > > Yeah, that does suck. Ok, naming problems seem to make semaphores > useless. > > I'm back to byte-range locking, but if NFS is important and is truly > unreliable, then that's out too. > > I've never had locking problems on NFS (probably because we tell our > users not to use NFS), but now that I think about it, SMB locking is > very unreliable so Win32 would be an issue too. What I don't get is why everybody thinks that because one solution doesn't fit all needs on all platforms (or NFS), it shouldn't be implemented on those platforms it *does* work on. Why can't those platforms (like Linux) benefit from a better solution, if one exists? There are plenty of examples of software providing better solutions on platforms supporting more features. -- Andreas Joseph Krogh <andreak@officenet.no>
Andreas Joseph Krogh <andreak@officenet.no> writes: > What I don't get is why everybody think that because one solution doesn't fit > all needs on all platforms(or NFS), it shouldn't be implemented on those > platforms it *does* work on. (1) Because we're not really interested in supporting multiple fundamentally different approaches to postmaster interlocking. The system is complicated enough already. (2) Because according to discussion so far, we can't rely on this "solution" anywhere. Postgres can't easily tell whether its data directory is mounted over NFS, for example. regards, tom lane