Thread: NOTIFY in multi-statement PQexec() not sent outside of transaction

NOTIFY in multi-statement PQexec() not sent outside of transaction

From
John Muehlhausen
Date:
macOS 10.13.6 (17G11023)
PostgreSQL 10.5 (Debian 10.5-2.pgdg90+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit

Recreate with the following script.  Expected behavior with TEST_DSN set to your database.  To see the broken behavior also set TEST_BREAK=1
---

# python 3.6
# set TEST_DSN env var

import os
import sys
import select
import psycopg2 # 2.8.4
from multiprocessing import Process, Queue

def notifier(input,output):
with psycopg2.connect(os.environ['TEST_DSN']) as conn:
conn.autocommit=True
with conn.cursor() as cur:
input.get()
if 'TEST_BREAK' in os.environ:
# docs seem to indicate that an implied transaction
# will not exist if there is an explicit begin/commit?
cur.execute(("notify __test; "
"begin; select pg_advisory_lock(7777); "
"select pg_advisory_unlock(7777); commit"))
else:
# this version works but introduces a race condition
# into my test suite where I am not sure that I am "in"
# a PQexec() but might instead be between such calls
cur.execute("notify __test")
cur.execute(( "select pg_advisory_lock(7777); "
"select pg_advisory_unlock(7777)"))
print('notifier unlocked', file=sys.stderr)
output.put(())

with psycopg2.connect(os.environ['TEST_DSN']) as conn:
conn.autocommit=True
with conn.cursor() as cur:
cur.execute("listen __test; select pg_advisory_lock(7777)")

input, output = Queue(), Queue()
p = Process(target=notifier, args=(output,input))
p.start()
output.put(())

print('waiting for notification',file=sys.stderr)
select.select([conn],[],[])
conn.poll()
print('notification received',file=sys.stderr)

cur.execute("select pg_advisory_unlock(7777)")

input.get()
p.join()

Re: NOTIFY in multi-statement PQexec() not sent outside of transaction

From
"David G. Johnston"
Date:
On Monday, April 20, 2020, John Muehlhausen <jgm@jgm.org> wrote:

# docs seem to indicate that an implied transaction
# will not exist if there is an explicit begin/commit?
cur.execute(("notify __test; "
"begin; select pg_advisory_lock(7777); "
"select pg_advisory_unlock(7777); commit"))


A more comprehensive reading of the docs finds:

“ If the BEGIN follows some statements that were executed as an implicit transaction block, those statements are not immediately committed; in effect, they are retroactively included into the new regular transaction block. “.

David J.

Re: NOTIFY in multi-statement PQexec() not sent outside of transaction

From
John Muehlhausen
Date:
Wrapping the notify in a transaction does not fix it.  Shouldn't the notify be available as soon as the containing transaction is committed?

cur.execute(("begin; notify __test; commit; "
                "begin; select pg_advisory_lock(7777); "
                "select pg_advisory_unlock(7777); commit"))


On Mon, Apr 20, 2020 at 1:26 PM David G. Johnston <david.g.johnston@gmail.com> wrote:
On Monday, April 20, 2020, John Muehlhausen <jgm@jgm.org> wrote:

# docs seem to indicate that an implied transaction
# will not exist if there is an explicit begin/commit?
cur.execute(("notify __test; "
"begin; select pg_advisory_lock(7777); "
"select pg_advisory_unlock(7777); commit"))


A more comprehensive reading of the docs finds:

“ If the BEGIN follows some statements that were executed as an implicit transaction block, those statements are not immediately committed; in effect, they are retroactively included into the new regular transaction block. “.

David J.

Re: NOTIFY in multi-statement PQexec() not sent outside of transaction

From
Tom Lane
Date:
John Muehlhausen <jgm@jgm.org> writes:
> Wrapping the notify in a transaction does not fix it.  Shouldn't the notify
> be available as soon as the containing transaction is committed?

A notify message will be sent to the client when the backend is
(a) idle (*not* in the middle of processing a transaction string) and
(b) not in an open transaction.

You appear to be unhappy with point (a), but I think changing that
would break more clients than it fixes.  It's intentional that
we don't send notifies mid-command, and this sure looks like a
variant of mid-command from here.

            regards, tom lane



Re: NOTIFY in multi-statement PQexec() not sent outside of transaction

From
John Muehlhausen
Date:
Tom, thanks for the explanation!

My perspective as a libpq user is that multi-statement PQexec() should have the same effects as multiple PQexec() calls other than for the former dropping the results of all but the most recent statement.  Rule (a) appears to break this assumption, and at minimum this difference should be documented?

As a practical matter, what I was trying to do is signal a test case that Postgres is in the middle of processing a transaction string.  Rule (a) seems to mean that NOTIFY cannot be used for this.  Is there some other mechanism?  I don't want to depend on fragile timer-based logic such as "I've waited long enough so surely it is blocked in PQexec()"... that kind of thing always gets my test cases into trouble.  This particular test case disrupts TCP during this state to ensure that reset logic works from this state.

Why do you think that it would break more clients than it fixes?  Surely it is rare to end an explicit transaction containing a notify in a multi-statement command?  For all other cases the behavior would be unchanged.

Thanks again,
John

On Mon, Apr 20, 2020 at 2:42 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
John Muehlhausen <jgm@jgm.org> writes:
> Wrapping the notify in a transaction does not fix it.  Shouldn't the notify
> be available as soon as the containing transaction is committed?

A notify message will be sent to the client when the backend is
(a) idle (*not* in the middle of processing a transaction string) and
(b) not in an open transaction.

You appear to be unhappy with point (a), but I think changing that
would break more clients than it fixes.  It's intentional that
we don't send notifies mid-command, and this sure looks like a
variant of mid-command from here.

                        regards, tom lane

Re: NOTIFY in multi-statement PQexec() not sent outside of transaction

From
Tom Lane
Date:
John Muehlhausen <jgm@jgm.org> writes:
> My perspective as a libpq user is that multi-statement PQexec() should have
> the same effects as multiple PQexec() calls other than for the former
> dropping the results of all but the most recent statement.

Well, that's not even close to true, because of the rules about statements
within a multi-statement string getting merged into single transactions.

> As a practical matter, what I was trying to do is signal a test case that
> Postgres is in the middle of processing a transaction string.

I'm not really sure why you want to go about that in this particular way;
under ordinary circumstances the client app wouldn't see the notify until
after PQexec returned, anyway.  (Yes, I realize that there are ways around
that, but still, you can't really do anything with the connection until
PQexec returns.)

Perhaps you could achieve a similar effect using single-row mode,
ie break the connection after collecting a few rows from a multi-row
query?  That's not PQexec at all of course, but if you use PQsendQuery
then the server doesn't know the difference.

            regards, tom lane



Re: NOTIFY in multi-statement PQexec() not sent outside of transaction

From
John Muehlhausen
Date:


> My perspective as a libpq user is that multi-statement PQexec() should have
> the same effects as multiple PQexec() calls other than for the former
> dropping the results of all but the most recent statement.

Well, that's not even close to true, because of the rules about statements
within a multi-statement string getting merged into single transactions.

My meaning was that with appropriate begin/commit augmentation, the user should be able to achieve the same effects with multi-statement, and that Rule (a) makes this unachievable, at least in the case of NOTIFY.

My goal is to write a test case where the service being tested (which is a child process of the testing process) is known to be blocking in a PQexec call, and during this exact state the testing process disrupts the TCP connection between the service and postgres in various ways, to be sure that the service recovers.

I'm not sure how to know that PQexec() has entered and not yet returned in my service process other than to examine the effects of a partially complete PQexec() itself.  If I can't use NOTIFY then perhaps I can sniff for the network traffic sent at the beginning of PQexec(), or perhaps there is some extra libpq logging I could enable to know that PQexec() is in process?

Still, even without considering my usage, I find it odd that this would not be considered a bug, for the reason in my first sentence above.  The current behavior means that the user has to think about transactions "and something else" when reasoning about the effects of statements.  I would vote to remove the "and something else" which would then remove the need to augment the current documentation.  As it stands, at minimum, the documentation needs to warn the user that similar notify effects are unachievable in the multi-statement realm?

-John

Re: NOTIFY in multi-statement PQexec() not sent outside of transaction

From
"David G. Johnston"
Date:

Thanks again,
John

On Mon, Apr 20, 2020 at 2:42 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
John Muehlhausen <jgm@jgm.org> writes:
> Wrapping the notify in a transaction does not fix it.  Shouldn't the notify
> be available as soon as the containing transaction is committed?

A notify message will be sent to the client when the backend is
(a) idle (*not* in the middle of processing a transaction string) and
(b) not in an open transaction.

You appear to be unhappy with point (a), but I think changing that
would break more clients than it fixes.  It's intentional that
we don't send notifies mid-command, and this sure looks like a
variant of mid-command from here.
 
On Mon, Apr 20, 2020 at 12:56 PM John Muehlhausen <jgm@jgm.org> wrote:

Rule (a) appears to break this assumption, and at minimum this difference should be documented?

It is [1] though the comment itself is in a place where LISTEN is the topic of interest, not NOTIFY.  It seems like a similar note or paragraph needs to be placed where people using LISTEN would be typically looking.

"At present, NotificationResponse can only be sent outside a transaction, and thus it will not occur in the middle of a command-response series, though it might occur just before ReadyForQuery." 
 

Specifically, in 52.2.2.1 Multiple Statements In A Simple Query:

... "Conversely, if a BEGIN appears in a multi-statement Query message, then it starts a regular transaction block that will only be terminated by an explicit COMMIT or ROLLBACK, whether that appears in this Query message or a later one. If the BEGIN follows some statements that were executed as an implicit transaction block, those statements are not immediately committed; in effect, they are retroactively included into the new regular transaction block."

Add something like:

Finally, if multiple explicit transactions are present in a single Simple Query command then statements that send events that take effect at commit (i.e., LISTEN and NOTIFY) will not happen until just before the ReadyToQuery event that ends the current Simple Query command.

(I'm now less certain this is a good change but I'll leave it out there for discussion)

Then, in the NOTIFY documentation [2]:

"Firstly, if a NOTIFY is executed inside a transaction, the notify events are not delivered until and unless the transaction is committed."

It feels like "... the notify events are not sent until ..." would be clearer.  And then add:

Furthermore, if the transaction containing the NOTIFY is part of a multi-transaction command the notify event will be sent post-command completion.

Specifically:

NOTIFY interacts with SQL transactions in some important ways. Firstly, if a NOTIFY is executed inside a transaction, the notify events are not delivered until and unless the transaction is committed (Furthermore, if the transaction containing the NOTIFY is part of a multi-transaction command the notify event will be sent post-command completion.) This is appropriate, since if the transaction is aborted, all the commands within it have had no effect, including NOTIFY. But it can be disconcerting if one is expecting the notification events to be delivered immediately. Secondly, 

 
David J.

Re: NOTIFY in multi-statement PQexec() not sent outside of transaction

From
"David G. Johnston"
Date:
On Mon, Apr 20, 2020 at 1:27 PM John Muehlhausen <jgm@jgm.org> wrote:


> My perspective as a libpq user is that multi-statement PQexec() should have
> the same effects as multiple PQexec() calls other than for the former
> dropping the results of all but the most recent statement.
 
Still, even without considering my usage, I find it odd that this would not be considered a bug, for the reason in my first sentence above.  The current behavior means that the user has to think about transactions "and something else" when reasoning about the effects of statements.  I would vote to remove the "and something else" which would then remove the need to augment the current documentation.  As it stands, at minimum, the documentation needs to warn the user that similar notify effects are unachievable in the multi-statement realm?

The deeper into the architecture the change the more resistant to change it should be.  This is pretty low-level and the benefits don't seem that great.  It is functional, and the usage pattern involved, multi-statement commands, is one that is actively discouraged - so from my perspective classifying this as a bug that the core team should fix is unappealing.  If someone wants to offer up a feature enhancement patch for critique then we'll see.

David J.

Re: NOTIFY in multi-statement PQexec() not sent outside of transaction

From
John Muehlhausen
Date:
Thanks for the thoughts David.  I assume there is a regression test suite for postgres?  In addition to documentation, can a test case be added that will ensure the present behavior remains unchanged until someone thinks about fixing it?  This will serve as another reminder about the issue?

On Mon, Apr 20, 2020 at 3:39 PM David G. Johnston <david.g.johnston@gmail.com> wrote:
On Mon, Apr 20, 2020 at 1:27 PM John Muehlhausen <jgm@jgm.org> wrote:


> My perspective as a libpq user is that multi-statement PQexec() should have
> the same effects as multiple PQexec() calls other than for the former
> dropping the results of all but the most recent statement.
 
Still, even without considering my usage, I find it odd that this would not be considered a bug, for the reason in my first sentence above.  The current behavior means that the user has to think about transactions "and something else" when reasoning about the effects of statements.  I would vote to remove the "and something else" which would then remove the need to augment the current documentation.  As it stands, at minimum, the documentation needs to warn the user that similar notify effects are unachievable in the multi-statement realm?

The deeper into the architecture the change the more resistant to change it should be.  This is pretty low-level and the benefits don't seem that great.  It is functional, and the usage pattern involved, multi-statement commands, is one that is actively discouraged - so from my perspective classifying this as a bug that the core team should fix is unappealing.  If someone wants to offer up a feature enhancement patch for critique then we'll see.

David J.