Re: walreceiver is uninterruptible on win32 - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: walreceiver is uninterruptible on win32
Date
Msg-id p2h3f0b79eb1004142013heb708f31v52d7f20220f4f1df@mail.gmail.com
Whole thread Raw
In response to Re: walreceiver is uninterruptible on win32  (Magnus Hagander <magnus@hagander.net>)
Responses Re: walreceiver is uninterruptible on win32  (Magnus Hagander <magnus@hagander.net>)
Re: walreceiver is uninterruptible on win32  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Wed, Apr 14, 2010 at 11:15 PM, Magnus Hagander <magnus@hagander.net> wrote:
>> http://archives.postgresql.org/pgsql-hackers/2010-04/msg00077.php
>>> As for the code itself, don't we need a CHECK_FOR_INTERRUPTS in there
>>> for it to be actually useful?
>>
>> Since the backend version of select() receives any incoming signals
>> on Win32, CHECK_FOR_INTERRUPTS seems not to be needed in the loop,
>> at least in walreceiver. No? The patch doesn't put it in there, and
>> I was able to interrupt walreceiver executing libpqrcv_PQexec() when
>> I tested the patch on Win32.
>
> It will call the signal handler, yes. Normally, the signal handler
> just sets a flag somewhere, which needs to be checked with
> CHECK_FOR_INTERRUPTS.
>
> From how I read the walreceiver.c code (which I'm not that familiar
> with), the signal handlers call ProcessWalRcvInterrupts() which in
> turn has CHECK_FOR_INTERRUPTS in it, and this is where it ends up
> being called.

Yes. While establishing replication connection (i.e., executing
walrcv_connect function), the SIGTERM signal handler directly calls
ProcessWalRcvInterrupts() which does CHECK_FOR_INTERRUPTS() and
elog(FATAL).

>>> The patch also seems confused about whether it's intending to be a
>>> general-purpose solution or not.  You can maybe take some shortcuts
>>> if it's only going to be for walreceiver, but if you're going to put
>>> it in dblink it is *not* acceptable to take shortcuts like not
>>> processing errors completely.
>>
>> The patch still takes some shortcuts since we decided to postpone
>> the fix for dblink to 9.1 or later.
>
> Given those shortcuts, can't we simplify it even further like
> attached?

No, we need to repeat PQgetResult() at least two times. The first call
of it reads the RowDescription, DataRow and CommandComplete messages
and switches the state to PGASYNC_READY. The second one reads the
ReadyForQuery message and switches the state to PGASYNC_IDLE. So if we
don't repeat it, libpqrcv_PQexec() would end in a half-finished state.

> (If nothing else, your code did PQclear() on an
> uninitialized pointer - must've been pure luck that it worked)

PQclear(NULL) might be called in my patch, but this is not a problem
since PQclear() does nothing if the specified PGresult argument is NULL.

> Looking at the call-sites, there are bugs now - if PQexec() returns
> NULL, we don't deal with it. It also doesn't always free the result
> properly. I've added checks for that.

As Tom pointed out in another post, we don't need to treat the
"result is NULL" case as special. OTOH, as you pointed out, I
forgot calling PQclear() when the second call of libpqrcv_PQexec()
in libpqrcv_connect() fails. I added it to the patch. Thanks!

> Finally, I've updated some of the comments.

Thanks a lot! I applied that improvements to the patch.

I attached the revised patch.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachment

pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: Thoughts on pg_hba.conf rejection
Next
From: Greg Smith
Date:
Subject: Re: Rogue TODO list created