Thread: BUG #15196: bogus data in lock file "postmaster.pid"

BUG #15196: bogus data in lock file "postmaster.pid"

From: PG Bug reporting form

The following bug has been logged on the website:

Bug reference:      15196
Logged by:          Daniel Migowski
Email address:      dmigowski@gmail.com
PostgreSQL version: 9.5.13
Operating system:   Windows
Description:

Hello,

my system crashed and after a reboot PostgreSQL would not come up again.
After a while I found out that there was information in the System event
logs and it was immediately clear that I had to remove the stale lock file.


On my Linux production systems it is not that bad, because systemd removes the
pid files on reboot, but I somehow feel that on Windows the database
shouldn't be that bad either. I always tell my customers that with a cluster of
MS SQL Servers, Oracle Servers and PostgreSQL servers, after a power outage only
the PostgreSQL servers will all come up flawlessly. And now I see that the
system is broken because of a stale pid file. This is somehow absurd and
unbelievable.

I know this topic has been brought up more than once, and from a user's point
of view it just feels as if the database is extremely fragile (which in
my experience isn't the case).

Why don't you implement one of the tons of other possibilities to log the
current installation on a Windows system? One possibility would be named
pipes, which are unique to a system, no matter which user created them. I
personally implemented that for one of my applications, by encoding the
directory somehow and instantiating a named pipe called
"//./pipe/PostgreSQL@c_Program_Files_PostgrSQL_10.5".

The server that holds the pipe is the running server. If the server crashes,
the pipe is closed automatically. All problems gone.

Please consider this. Would have spared me a small headache and panic today.


PS: Have a look at the CreateNamedPipe function at
https://msdn.microsoft.com/en-us/library/windows/desktop/aa365150(v=vs.85).aspx,
call it with FILE_FLAG_FIRST_PIPE_INSTANCE and nMaxInstances=1, and you have
a working check. Looks even simpler than the code I imagine you are using for
the pid files.
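
A minimal sketch of the check described above, not code from PostgreSQL or
from the reporter's application. The helper names make_pipe_name and
claim_data_dir are hypothetical, and so is the encoding rule (every character
other than a letter, digit or '.' becomes '_'), since the report only says the
directory is encoded "somehow". The Windows part uses CreateNamedPipeA with
FILE_FLAG_FIRST_PIPE_INSTANCE and nMaxInstances = 1 as suggested; the
name-building part is portable C.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: build "\\.\pipe\PostgreSQL@<encoded datadir>".
 * Returns 1 on success, 0 if the result does not fit into out. */
int make_pipe_name(const char *datadir, char *out, size_t outlen)
{
    static const char prefix[] = "\\\\.\\pipe\\PostgreSQL@";
    size_t plen = sizeof prefix - 1;
    size_t dlen;
    size_t i;

    assert(datadir != NULL && out != NULL);
    dlen = strlen(datadir);
    if (plen + dlen + 1 > outlen)
        return 0;

    memcpy(out, prefix, plen);
    for (i = 0; i < dlen; i++) {
        unsigned char c = (unsigned char) datadir[i];
        /* Assumed encoding: keep letters, digits and '.', map the rest to '_'. */
        out[plen + i] = (isalnum(c) || c == '.') ? (char) c : '_';
    }
    out[plen + dlen] = '\0';
    return 1;
}

#ifdef _WIN32
#include <windows.h>

/* Hypothetical helper: try to become the only owner of the pipe for datadir.
 * Returns a handle to keep open while the server runs, or
 * INVALID_HANDLE_VALUE if the name is too long or the pipe already exists. */
HANDLE claim_data_dir(const char *datadir)
{
    char name[256];   /* Windows limits pipe names to 256 characters */

    if (!make_pipe_name(datadir, name, sizeof name))
        return INVALID_HANDLE_VALUE;

    return CreateNamedPipeA(name,
                            PIPE_ACCESS_DUPLEX | FILE_FLAG_FIRST_PIPE_INSTANCE,
                            PIPE_TYPE_BYTE | PIPE_WAIT,
                            1,      /* nMaxInstances */
                            0, 0,   /* out/in buffer sizes: system default */
                            0,      /* default time-out */
                            NULL);  /* default security attributes */
}
#endif
```

Under this assumed encoding, "C:\a b" and "C:\a_b" map to the same pipe name,
and two different spellings of the same directory map to different names, so
a real implementation would need a canonical path and a collision-free
encoding.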


Re: BUG #15196: bogus data in lock file "postmaster.pid"

From: Tom Lane

PG Bug reporting form <noreply@postgresql.org> writes:
> my system crashed and after a reboot PostgreSQL would not come up again.
> After a while I found out that there was information in the System event
> logs and it was immediately clear that I had to remove the stale lock file.
> On my Linux production systems it is not that bad, because systemd removes the
> pid files on reboot, but I somehow feel that on Windows the database
> shouldn't be that bad either.

TBH, my opinion about that is "don't use Windows for production"; but
especially not an installation you haven't vetted for plug-pull safety.
A database can't be any more robust than the platform it sits on top of.

(That goes just as much for non-Windows of course.  If you haven't
verified fsync safety of a Unix-ish system, it probably isn't safe.)

> Why don't you implement one of the tons of other possibilities to log the
> current installation on a Windows system?

I have no particular desire to do this differently on Windows than
elsewhere, especially since there's exactly zero evidence that doing
it differently would improve anything.  Filesystem corruption after
a crash can affect lots of things.

            regards, tom lane