Re: NOTIFY does not work as expected - Mailing list pgsql-bugs

From Andres Freund
Subject Re: NOTIFY does not work as expected
Msg-id 20181019204542.jemuckjxo7ga7q5g@alap3.anarazel.de
In response to Re: NOTIFY does not work as expected  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: NOTIFY does not work as expected
List pgsql-bugs
On 2018-10-19 13:36:31 -0400, Tom Lane wrote:
> I wrote:
> > Andres Freund <andres@anarazel.de> writes:
> >> That distinction was introduced because people (IIRC you actually) were
> >> worried that we'd be less likely to get error messages out to the
> >> client. Especially when you check unconditionally before actually doing
> >> the write, it's going to be far less likely that we are able to send
> >> something out to the client.
> 
> > Far less likely than what?  If we got a ProcDie signal we'd more often
> > than not have handled it long before reaching here.  If we hadn't, though,
> > we could arrive here with ProcDiePending set but the latch clear, in which
> > case we fail to honor the interrupt until the client does something.
> > Don't really think that's acceptable :-(.  I'm also not seeing why it's
> > okay to commit ProcDie hara-kiri immediately if the socket is
> > write-blocked but not otherwise --- the two cases are going to look about
> > the same from the client side.

The reason we ended up there is that, before the change that made it
behave like that, a backend trying to write something to a client that
wasn't accepting any writes (hung, forgotten, intentional DOS, whatnot)
became unkillable as soon as the network buffers filled up.

But if we always checked for ProcDiePending, even when not blocked, we'd
be able to get an error message out to the client in fewer cases.

There's no problem with being unkillable while writing out data without
blocking, because copying a few bytes into the kernel buffers takes a
bounded amount of time.

Obviously it's not ideal that we can't send a message in cases where a
backend is killed while blocked on the network, but there's not really a
way around that.
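To make the tradeoff concrete, here is a minimal sketch (not actual
PostgreSQL source; all names, including the die_pending flag standing in
for ProcDiePending, are illustrative) of a nonblocking write loop that
honors the die interrupt only at the point where the socket would block:

```c
/* Sketch of a write loop that checks a "die" flag only when the socket
 * would block.  Illustrative only; not PostgreSQL's actual secure_write. */
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

static volatile bool die_pending = false;   /* stands in for ProcDiePending */

/* Returns bytes written, or -1 on error or if the peer blocked us while
 * a die interrupt was pending. */
ssize_t write_with_die_check(int fd, const char *buf, size_t len)
{
    size_t done = 0;

    while (done < len)
    {
        ssize_t n = write(fd, buf + done, len - done);

        if (n >= 0)
        {
            /* Fast path: the write made progress; no interrupt check,
             * since copying into kernel buffers takes bounded time. */
            done += (size_t) n;
            continue;
        }
        if (errno == EAGAIN || errno == EWOULDBLOCK)
        {
            /* Would block: this is the only point where we honor the
             * interrupt, so an unresponsive client can't pin us forever. */
            if (die_pending)
                return -1;
            /* Real code would wait on a latch/socket event here rather
             * than spinning. */
            continue;
        }
        return -1;              /* genuine write error */
    }
    return (ssize_t) done;
}
```

With a responsive peer the loop never consults die_pending at all; only
once the client stops draining the socket does the interrupt get a chance
to fire, which is exactly the asymmetry discussed above.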

You can pretty easily trigger these cases and observe the difference:
COPY TO STDOUT a large table, SIGSTOP the psql process, attach strace to
the backend, and then terminate the backend.  Without checking for
interrupts while blocked, the backend doesn't get terminated.


> If we're willing to accept a ProcDie interrupt during secure_read at all,
> I don't see why not to do it even if we got some data.  We'll accept the
> interrupt anyway the next time something happens to do
> CHECK_FOR_INTERRUPTS; and it's unlikely that that would not be till after
> we'd completed the query, so the net effect is just going to be that we
> waste some cycles first.

I don't immediately see a problem with changing this for reads.


> Likewise, I see little merit in holding off ProcDie during secure_write.
> If we could try to postpone the interrupt until a message boundary, so as
> to avoid losing protocol sync, there would be value in that --- but this
> code is at the wrong level to implement any such behavior, and it's
> not trying to.  So we still have the situation that the interrupt service
> is being postponed without any identifiable gain in predictability of
> behavior.

See earlier explanation.

Greetings,

Andres Freund

