On Thu, Apr 14, 2011 at 10:26:33AM -0400, A.M. wrote:
> 1) the SysV nattch method's primary purpose is to protect the shmem
> region. This is no longer necessary in my patch because the shared
> memory is unlinked immediately after creation, so only the initial
> postmaster and its children have access.
Umm, you don't unlink SysV shared memory. All IPC_RMID does is make
sure the segment goes away when the last user detaches. In the meantime
people can still connect to it.
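A minimal sketch of what I mean, assuming a throwaway IPC_PRIVATE
segment (the function name is made up for illustration). It marks the
segment with IPC_RMID while still attached and shows the mapping stays
usable until the final shmdt(). Whether new attaches are still accepted
after IPC_RMID differs between platforms, so the sketch only covers the
"destroyed on last detach" part:

```c
#include <assert.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Hypothetical demo: returns 0 if the segment stays usable after IPC_RMID. */
int rmid_keeps_segment_alive(void)
{
    /* Private 4 kB segment, accessible to the owner only. */
    int shmid = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
    if (shmid < 0)
        return -1;

    char *mem = shmat(shmid, NULL, 0);
    if (mem == (char *) -1) {
        shmctl(shmid, IPC_RMID, NULL);
        return -1;
    }

    /* Mark for destruction. We are still attached, so nothing is freed yet. */
    if (shmctl(shmid, IPC_RMID, NULL) < 0) {
        shmdt(mem);
        return -1;
    }

    /* The mapping is still valid and writable at this point. */
    strcpy(mem, "still here");
    int ok = (strcmp(mem, "still here") == 0);

    /* The last detach is what actually destroys the segment. */
    if (shmdt(mem) < 0)
        return -1;
    return ok ? 0 : -1;
}
```

Calling rmid_keeps_segment_alive() should return 0 on a system with
SysV shared memory enabled.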
> The lock file contents are currently important to get the pid of a
> potential, conflicting postmaster. With the fcntl API, we can return
> a live conflicting PID (whether a postmaster or a stuck child), so
> that's an improvement. This could be used, for example, for STONITH,
> to reliably kill a dying replication clone- just loop on the pids
> returned from the lock.
SysV shared memory also gives you a PID, that's the point.
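For illustration, here is a rough sketch of reading those PIDs with
shmctl(IPC_STAT). The helper and demo names are hypothetical, and the
demo uses a private scratch segment rather than anything the
postmaster creates. shm_cpid is the creator and shm_lpid is whoever
last called shmat() or shmdt():

```c
#include <assert.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: fetch the creator PID and the PID of the last
 * process that attached to or detached from the segment. */
int segment_pids(int shmid, pid_t *creator, pid_t *last_op)
{
    struct shmid_ds ds;

    assert(creator != NULL && last_op != NULL);
    if (shmctl(shmid, IPC_STAT, &ds) < 0)
        return -1;
    *creator = ds.shm_cpid;
    *last_op = ds.shm_lpid;
    return 0;
}

/* Demo on a scratch segment: returns 0 if both PIDs are our own. */
int segment_pids_demo(void)
{
    int shmid = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
    if (shmid < 0)
        return -1;

    void *mem = shmat(shmid, NULL, 0);
    if (mem == (void *) -1) {
        shmctl(shmid, IPC_RMID, NULL);
        return -1;
    }

    pid_t creator = 0, last_op = 0;
    int rc = segment_pids(shmid, &creator, &last_op);

    /* Clean up: detach first, then remove the now-unused segment. */
    shmdt(mem);
    shmctl(shmid, IPC_RMID, NULL);

    if (rc < 0)
        return -1;
    return (creator == getpid() && last_op == getpid()) ? 0 : 1;
}
```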
>
> Even if the fcntl check passes, the pid in the lock file is checked, so the lock file behavior remains the same.
The interlock is to make sure there are no living postmaster children.
The lockfile won't tell you that. So the issue is that while fcntl can
work, SysV can do better.
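As a sketch of the idea (not the actual postmaster code), the check
boils down to looking at shm_nattch. The function names below are
invented, and a forked child that attaches and then hangs stands in
for a stuck backend:

```c
#include <assert.h>
#include <signal.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical check: 1 if any process is attached, 0 if none, -1 on error. */
int segment_in_use(int shmid)
{
    struct shmid_ds ds;

    assert(shmid >= 0);
    if (shmctl(shmid, IPC_STAT, &ds) < 0)
        return -1;
    return ds.shm_nattch > 0 ? 1 : 0;
}

/* Demo: returns 0 if the segment is seen as busy while a child is
 * attached and as free once that child is gone. */
int interlock_demo(void)
{
    int fds[2];
    int shmid = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
    if (shmid < 0)
        return -1;
    if (pipe(fds) < 0) {
        shmctl(shmid, IPC_RMID, NULL);
        return -1;
    }

    pid_t child = fork();
    if (child < 0) {
        close(fds[0]);
        close(fds[1]);
        shmctl(shmid, IPC_RMID, NULL);
        return -1;
    }
    if (child == 0) {
        /* Stand-in for a stuck backend: attach, report, then hang. */
        close(fds[0]);
        char attached = (shmat(shmid, NULL, 0) != (void *) -1);
        if (write(fds[1], &attached, 1) != 1)
            _exit(1);
        pause();
        _exit(0);
    }

    /* Parent never attaches; it only waits for the child's report. */
    close(fds[1]);
    char attached = 0;
    ssize_t n = read(fds[0], &attached, 1);
    close(fds[0]);

    int while_alive = (n == 1 && attached) ? segment_in_use(shmid) : -1;

    /* Kill and reap the child; its attachment is dropped when it exits. */
    kill(child, SIGKILL);
    waitpid(child, NULL, 0);
    int after_exit = segment_in_use(shmid);

    shmctl(shmid, IPC_RMID, NULL);
    return (while_alive == 1 && after_exit == 0) ? 0 : -1;
}
```

The point of the demo is that the parent learns about the child
without the child cooperating at all, and deleting a lockfile would
not change the answer.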
Also, I think you underestimate the value of the current interlock.
Before it existed, people managed to trash their databases regularly
this way. Lockfiles can be deleted and yes, people do it all the time.
Actually, it occurs to me you can solve the NFS problem by putting the
lockfile in the socket dir. That can't possibly be on NFS.
Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> Patriotism is when love of your own people comes first; nationalism,
> when hate for people other than your own comes first.
> - Charles de Gaulle