Re: pg_listener entries deleted under heavy NOTIFY load only on Windows - Mailing list pgsql-bugs

From Tom Lane
Subject Re: pg_listener entries deleted under heavy NOTIFY load only on Windows
Date
Msg-id 20137.1233168073@sss.pgh.pa.us
In response to Re: pg_listener entries deleted under heavy NOTIFY load only on Windows  ("Marshall, Steve" <smarshall@wsi.com>)
Responses Re: pg_listener entries deleted under heavy NOTIFY load only on Windows
List pgsql-bugs
"Marshall, Steve" <smarshall@wsi.com> writes:
> I don't think a check for process existence is a bad idea, or even a
> bandaid.  The comment in the code block in async.c says it is removing
> the entry in pg_listener because the backend process does not exist.

In general, the way to see if a process exists is to try to kill() it;
the fact that the kill failed is sufficient proof, at least in
Unix-land.  If it's possible for kill() to fail for transient reasons
in our Windows implementation, that's a bug in the Windows emulation
of kill.

Another reason behind the async.c coding is that even if the process
does still exist, there's no point in maintaining a pg_listener entry
for it if we can't signal it.
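
To make the two points above concrete, here is a minimal sketch of the
Unix-side idiom, assuming a POSIX kill().  The names probe_target() and
listener_entry_is_useful() are hypothetical and are not taken from
async.c; the sketch also uses signal 0, which only performs the
existence and permission checks, rather than delivering a real signal.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical names for illustration; not taken from async.c. */
typedef enum
{
    TARGET_SIGNALABLE,      /* kill() succeeded */
    TARGET_GONE,            /* ESRCH: no such process */
    TARGET_UNSIGNALABLE     /* any other failure, e.g. EPERM */
} TargetState;

static TargetState
probe_target(pid_t pid)
{
    /* Signal 0 sends nothing; it only runs the error checks. */
    if (kill(pid, 0) == 0)
        return TARGET_SIGNALABLE;

    return (errno == ESRCH) ? TARGET_GONE : TARGET_UNSIGNALABLE;
}

/*
 * Nonzero if a listener entry for pid is still worth keeping.
 * A process that exists but cannot be signaled is treated the
 * same as one that is gone, since it can never be notified.
 */
static int
listener_entry_is_useful(pid_t pid)
{
    return probe_target(pid) == TARGET_SIGNALABLE;
}
```

Note that this only holds up if kill() never fails for transient
reasons; an emulation that can time out would make the same check
report a live, signalable process as useless.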

Thirdly, this is hardly the only place where we expect kill() to work
reliably.  You've managed to create a reproducible case illustrating
that it's not being reliable, but the same bug might account for other
failures much harder to reproduce and investigate.

So my opinion is that the real issue here is why the kill()
implementation is failing when it should not.  We need to fix that,
not put band-aids in async.c.

As to how to fix it, I'll defer to people who are more
knowledgeable about Windows.  Maybe taking out the timeout is really
the best answer.

            regards, tom lane
