Thread: Listen / Notify - what to do when the queue is full
We still need to decide what to do with queue-full situations in the proposed
listen/notify implementation.

I have a new version of the patch that allows for a variable payload size.
However, the whole notification must fit into one page, so the payload needs
to be less than 8K. I have also added the XID, so that we can write to the
queue before committing to clog, which allows for rollback if we encounter
write errors (disk full, for example). Especially the implications of this
change make the patch a lot more complicated.

The queue is slru-based; slru uses int page numbers, so we can use up to
2147483647 (INT_MAX) pages with some small changes in slru.c.

When do we have a full queue? The idea is that notifications are written to
the queue and are read as soon as the notifying transaction commits. Only if
a listening backend is busy will it fail to read the notifications, and so it
won't update its pointer for some time. With the current space we can
accommodate at least 2147483647 notifications, or more depending on the
payload length. That gives us something between 214 GB (100 bytes per
notification) and 17 TB (8000 bytes per notification). So in order to fill
the queue, we need to generate that amount of notifications while one backend
is still busy and not reading the accumulating notifications. In general,
chances are not too high that anyone will ever have a full notification
queue, but we need to define the behavior anyway...

These are the solutions that I currently see:

1) drop new notifications if the queue is full (silently or with rollback)

2) block until readers catch up (what if the backend that tries to write the
notifications actually is the "lazy" reader that everybody is waiting for to
proceed?)

3) invent a new signal reason and send SIGUSR1 to the "lazy" readers; they
need to interrupt whatever they are doing and copy the notifications into
their own address space (without delivering the notifications, since they are
in a transaction at that moment).

For 1) there can be warnings well ahead of when the queue is actually full,
like one when it is 50% full, another when it is 75% full, and so on, and
they could point to the backend that is most behind in reading
notifications...

I think that 2) is the least practical approach. If there is a pile of at
least 2,147,483,647 notifications, then a backend hasn't read the
notifications for a long, long time... Chances are low that it will read them
within the next few seconds. In a sense, 2) implies 3) for the special case
that the writing backend is the one that everybody is waiting for to proceed
reading notifications; in the end this backend is waiting for itself.

For 3) the question is whether we can just invent a new signal reason
PROCSIG_NOTIFYCOPY_INTERRUPT or similar, and upon reception the backend
copies the notification data to its private address space. Would this
function be called by every backend after at most a few seconds, even if it
is processing a long-running query?

Admittedly, once 3) is in place we can also put a smaller queue into shared
memory and remove the slru thing altogether, but then we need to be sure that
we can interrupt the backends at any time, since the queue size would be a
lot smaller than 200 GB...

Joachim
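For concreteness, option 3 could piggyback on the existing SIGUSR1
multiplexing in procsignal.c. A minimal sketch, assuming the
PROCSIG_NOTIFYCOPY_INTERRUPT reason named above and a hypothetical handler -
none of this is actual patch code:

    /* Writer side: when the queue is nearly full, nudge a lagging reader.
     * PROCSIG_NOTIFYCOPY_INTERRUPT would be a new ProcSignalReason. */
    SendProcSignal(laggingPid, PROCSIG_NOTIFYCOPY_INTERRUPT, laggingBackendId);

    /* Receiver side, dispatched from procsignal_sigusr1_handler(): copy
     * queue entries into backend-local memory only -- actual delivery still
     * waits until the backend is outside its transaction. The handler name
     * is hypothetical. */
    if (CheckProcSignal(PROCSIG_NOTIFYCOPY_INTERRUPT))
        HandleNotifyCopyInterrupt();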
> We still need to decide what to do with queue full situations in
> the proposed listen/notify implementation. I have a new version
> of the patch to allow for a variable payload size. However, the
> whole notification must fit into one page so the payload needs
> to be less than 8K.

That sounds fine to me, FWIW.

> I have also added the XID, so that we can write to the queue before
> committing to clog which allows for rollback if we encounter write
> errors (disk full for example). Especially the implications of this
> change make the patch a lot more complicated.

Can you elaborate on the use case for this?

> so it won't update its pointer for some time. With the current space we can
> accommodate at least 2147483647 notifications or more, depending on the
> payload length.

That's a whole lot of notifications. I doubt any program out there is using
anywhere near that number at the moment. In my applications, having a few
hundred notifications active at one time is "a lot" in my book. :)

> These are the solutions that I currently see:
>
> 1) drop new notifications if the queue is full (silently or with rollback)

I like this one best, but not with silence of course. While it's not the most
polite thing to do, this is for a super extreme edge case. I'd rather just
throw an exception if the queue is full rather than start messing with the
readers. It's a possible denial of service attack too, but so is the current
implementation in a way - at least I don't think apps would perform very
optimally with 2147483647 entries in the pg_listener table :)

If you need some real-world use cases involving payloads, let me know, I've
been waiting for this feature for some time and have it all mapped out.

--
Greg Sabino Mullane greg@turnstep.com
PGP Key: 0x14964AC8 200911160902
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
On Mon, Nov 16, 2009 at 9:05 AM, Greg Sabino Mullane wrote:
>> We still need to decide what to do with queue full situations in
>> the proposed listen/notify implementation. I have a new version
>> of the patch to allow for a variable payload size. However, the
>> whole notification must fit into one page so the payload needs
>> to be less than 8K.
>
> That sounds fine to me, FWIW.

+1! I think this should satisfy everyone.

>> I have also added the XID, so that we can write to the queue before
>> committing to clog which allows for rollback if we encounter write
>> errors (disk full for example). Especially the implications of this
>> change make the patch a lot more complicated.
>
> Can you elaborate on the use case for this?

Tom specifically asked for it: "The old implementation was ACID, so the new
one should be too."

>> These are the solutions that I currently see:
>>
>> 1) drop new notifications if the queue is full (silently or with rollback)
>
> I like this one best, but not with silence of course. While it's not the
> most polite thing to do, this is for a super extreme edge case. I'd rather
> just throw an exception if the queue is full rather than start messing with
> the readers. It's a possible denial of service attack too, but so is the
> current implementation in a way - at least I don't think apps would perform
> very optimally with 2147483647 entries in the pg_listener table :)
>
> If you need some real-world use cases involving payloads, let me know, I've
> been waiting for this feature for some time and have it all mapped out.

me too.

Joachim: when I benchmarked the original patch, I was seeing a few log
messages that suggested there might be something going on inside. In any
event, the performance was fantastic.

merlin
Greg Sabino Mullane wrote:
>> We still need to decide what to do with queue full situations in
>> the proposed listen/notify implementation. I have a new version
>> of the patch to allow for a variable payload size. However, the
>> whole notification must fit into one page so the payload needs
>> to be less than 8K.
>
> That sounds fine to me, FWIW.

Agreed. Thank you for all your work.

>> 1) drop new notifications if the queue is full (silently or with rollback)
>
> I like this one best, but not with silence of course. While it's not the
> most polite thing to do, this is for a super extreme edge case. I'd rather
> just throw an exception if the queue is full rather than start messing with
> the readers.

+1

--
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/
On Sun, Nov 15, 2009 at 7:19 PM, Joachim Wieland <joe@mcknight.de> wrote:
> These are the solutions that I currently see:
> 1) [...]
> 2) [...]
> 3) [...]

4) Allow readers to read uncommitted notifications as well. Instead of
delivering them, the backends just copy them over into their own address
space and deliver them later on...

Going with option 4) allows readers to always read all notifications in the
queue... This also allows a backend to send more notifications than the
queue can hold, so we are only limited by the backends' memory. Every
notification that is sent will eventually be delivered.

The queue can still fill up if one of the backends is busy for a long, long
time... Then the next writer just blocks and waits.

The attached patch implements this behavior as well as a variable payload
size, limited to 8000 characters. The variable payload also offers an
automatic speed control... The smaller your notifications are, the more
efficiently a page can be used and the faster you are. :-)

Once we agree that this is the way to go, I'll submit a documentation patch.

Joachim
Joachim Wieland <joe@mcknight.de> writes:
> 4) Allow readers to read uncommitted notifications as well.

The question that strikes me here is one of timing --- apparently, readers
will now have to check the queue *without* having received a signal? That
could amount to an unpleasant amount of extra overhead when the notify
system isn't even in use. (Users who don't care about notify will define
"unpleasant amount" as "not zero".)

I haven't read the patch, so maybe you have some cute solution to that, but
if so please explain what.

			regards, tom lane
On Mon, Nov 16, 2009 at 2:35 PM, Andrew Chernow <ac@esilo.com> wrote:
>>> 1) drop new notifications if the queue is full (silently or with
>>> rollback)
>>
>> I like this one best, but not with silence of course. While it's not the
>> most polite thing to do, this is for a super extreme edge case. I'd
>> rather just throw an exception if the queue is full rather than start
>> messing with the readers.
>
> +1

So if you guys are going to insist on turning the notification mechanism
into a queueing mechanism, I think it at least behooves you to have it
degrade gracefully into a notification mechanism and not become entirely
useless by dropping notification messages.

That is, if the queue overflows, what you should do is drop the payloads and
condense all the messages for a given class into a single notification for
that class with "unknown payload". That way, if a cache which wants to
invalidate specific objects gets a queue overflow condition, then at least
it knows it should rescan the original data and rebuild the cache and not
just serve invalid data.

I still think you're on the wrong path entirely and will end up with a
mechanism which serves neither use case very well, instead of two separate
mechanisms that are properly designed for the two use cases.

--
greg
On Thu, Nov 19, 2009 at 1:48 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Joachim Wieland <joe@mcknight.de> writes:
>> 4) Allow readers to read uncommitted notifications as well.
>
> The question that strikes me here is one of timing --- apparently,
> readers will now have to check the queue *without* having received
> a signal? That could amount to an unpleasant amount of extra overhead
> when the notify system isn't even in use. (Users who don't care about
> notify will define "unpleasant amount" as "not zero".)

The sequence in CommitTransaction() is like that:

1) add notifications to queue
2) commit to clog
3) signal backends

Only those backends are signalled that listen to at least one channel; if
the notify system isn't in use, then nobody will ever be signalled anyway.

If a backend is reading a transaction id that has not yet committed, it will
not deliver the notification. It knows that it will eventually receive a
signal from that transaction, and then it first checks its list of
uncommitted notifications it has already read, and then checks the queue for
more pending notifications.

Joachim
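In code form, the sequence above would sit in CommitTransaction() roughly
like this; the two notify calls are the names used later in the thread, and
the comments are an outline rather than the patch itself:

    AtCommit_NotifyBeforeCommit();   /* 1) append our notifications to the queue */
    RecordTransactionCommit();       /* 2) commit to clog */
    AtCommit_NotifyAfterCommit();    /* 3) signal backends that LISTEN somewhere */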
> That is, if the queue overflows, what you should do is drop the
> payloads and condense all the messages for a given class into a single
> notification for that class with "unknown payload". That way, if a
> cache which wants to invalidate specific objects gets a queue overflow
> condition, then at least it knows it should rescan the original data
> and rebuild the cache and not just serve invalid data.

That's far more complicated than throwing an error, and it discards user
payload information. Let the error indicate a rescan is needed.

--
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/
On 11/16/09 3:19 AM, Joachim Wieland wrote:
> 1) drop new notifications if the queue is full (silently or with rollback)
> 2) block until readers catch up (what if the backend that tries to write
> the notifications actually is the "lazy" reader that everybody is waiting
> for to proceed?)
> 3) invent a new signal reason and send SIGUSR1 to the "lazy" readers, they
> need to interrupt whatever they are doing and copy the notifications into
> their own address space (without delivering the notifications since they
> are in a transaction at that moment).

(4) drop *old* notifications if the queue is full.

Since everyone has made the point that LISTEN is not meant to be a full
queueing system, I have no problem dropping notifications LRU-style. If
we've run out of room, the oldest notifications should go first; we probably
don't care about them anyway.

We should probably also log the fact that we ran out of room, so that the
DBA knows that they have a design issue. For volume reasons, I don't think
we want to log every dropped message.

Alternately, it would be great to have a configuration option which would
allow the DBA to choose any of 3 behaviors via GUC:

drop-oldest (as above)
drop-largest (if we run out of room, drop the largest payloads first to save
space)
error (if we run out of room, error and rollback)

--Josh Berkus
> We should probably also log the fact that we ran out of room, so that
> the DBA knows that they have a design issue.

Can't they just bump allowed memory and avoid a redesign?

> Alternately, it would be great to have a configuration option which
> would allow the DBA to choose any of 3 behaviors via GUC:
>
> drop-oldest (as above)
> drop-largest (if we run out of room, drop the largest payloads first to
> save space)
> error (if we run out of room, error and rollback)

I mentioned this upthread. I completely agree that overflow behavior should
be tunable.

--
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/
Joachim Wieland <joe@mcknight.de> writes:
> The sequence in CommitTransaction() is like that:
> 1) add notifications to queue
> 2) commit to clog
> 3) signal backends

> Only those backends are signalled that listen to at least one channel;
> if the notify system isn't in use, then nobody will ever be signalled
> anyway.

> If a backend is reading a transaction id that has not yet committed,
> it will not deliver the notification.

But you were saying that this patch would enable sending more data than
would fit in the queue. How will that happen if the other backends don't
look at the queue until you signal them?

			regards, tom lane
Josh Berkus <josh@agliodbs.com> writes:
> (4) drop *old* notifications if the queue is full.

> Since everyone has made the point that LISTEN is not meant to be a full
> queueing system, I have no problem dropping notifications LRU-style.

NO, NO, NO, a thousand times no!

That turns NOTIFY into an unreliable signaling system, and if I haven't made
this perfectly clear yet, any such change will be committed over my dead
body.

If we are unable to insert a new message into the queue, the correct
recourse is to fail the transaction that is trying to insert the *new*
message. Not to drop messages from already-committed transactions. Failing
the current transaction still leaves things in a consistent state, ie, you
don't get messages from aborted transactions but that's okay because they
didn't change the database state.

I think Greg has a legitimate concern about whether this redesign reduces
the usefulness of NOTIFY for existing use-cases, though. Formerly, since
pg_listener would effectively coalesce notifies across multiple sending
transactions instead of only one, it was impossible to "overflow the queue",
unless maybe you managed to bloat pg_listener to the point of being out of
disk space, and even that was pretty hard. There will now be a nonzero
chance of transactions failing at commit because of queue full. If the
chance is large this will be an issue. (Is it sane to wait for the queue to
be drained?)

BTW, did we discuss the issue of 2PC transactions versus notify? The current
behavior of 2PC with notify is pretty cheesy and will become more so if we
make this change --- you aren't really guaranteed that the notify will
happen, even though the prepared transaction did commit. I think it might be
better to disallow NOTIFY inside a prepared xact.

			regards, tom lane
Andrew Chernow <ac@esilo.com> writes:
> I mentioned this upthread. I completely agree that overflow behavior
> should be tunable.

There is only one correct overflow behavior.

			regards, tom lane
Tom Lane wrote:
> Andrew Chernow <ac@esilo.com> writes:
>> I mentioned this upthread. I completely agree that overflow behavior
>> should be tunable.
>
> There is only one correct overflow behavior.

I count three.

1. wait
2. error
3. skip

#1 and #2 are very similar to a file system. If FS buffers are full on
write, it makes you wait. In non-blocking mode, it throws an EAGAIN error.
IMHO those two behaviors are totally acceptable for handling notify
overflow. #3 is pretty weak, but I *think* there are uses for it.

--
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/
Andrew Chernow <ac@esilo.com> writes:
> Tom Lane wrote:
>> There is only one correct overflow behavior.

> I count three.

Waiting till you can insert is reasonable (especially if we have some
behavior that nudges other backends to empty the queue). If by "skip" you
mean losing the notify but still committing, that's incorrect. There is no
room for debate about that.

			regards, tom lane
Tom Lane wrote:
> Andrew Chernow <ac@esilo.com> writes:
>> Tom Lane wrote:
>>> There is only one correct overflow behavior.
>
>> I count three.
>
> Waiting till you can insert is reasonable (especially if we have some
> behavior that nudges other backends to empty the queue). If by "skip"
> you mean losing the notify but still committing, that's incorrect.
> There is no room for debate about that.

Yeah, like I said, skip felt weak.

In regards to waiting, what would happen if other backends couldn't help
empty the queue because they too are clogged? ISTM that any attempt to flush
to other non-disk queues is doomed to possible overflows as well. Then what?

Personally, I would just wait until room became available or the transaction
was canceled. We could get fancy and tack a timeout value onto the wait.

--
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/
Andrew Chernow <ac@esilo.com> writes:
> Personally, I would just wait until room became available or the
> transaction was canceled.

Works for me, as long as there's a CHECK_FOR_INTERRUPTS in there to allow a
cancel to happen. The current patch seems to have a lot of pointless logging
and no CHECK_FOR_INTERRUPTS ;-)

			regards, tom lane
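A minimal sketch of such a wait loop, assuming hypothetical queue-state
helpers; only CHECK_FOR_INTERRUPTS() and pg_usleep() are existing calls:

    while (asyncQueueIsFull())
    {
        CHECK_FOR_INTERRUPTS();       /* let a query cancel abort the wait */
        SignalLaggingListeners();     /* hypothetical: nudge slow readers */
        pg_usleep(100000L);           /* back off 100 ms before rechecking */
    }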
On Wed, 18 Nov 2009 22:12:18 -0500 Tom Lane wrote:
> Josh Berkus <josh@agliodbs.com> writes:
>> (4) drop *old* notifications if the queue is full.
>
>> Since everyone has made the point that LISTEN is not meant to be a full
>> queueing system, I have no problem dropping notifications LRU-style.
>
> NO, NO, NO, a thousand times no!
>
> That turns NOTIFY into an unreliable signaling system, and if I haven't
> made this perfectly clear yet, any such change will be committed over my
> dead body.
>
> If we are unable to insert a new message into the queue, the correct
> recourse is to fail the transaction that is trying to insert the *new*
> message. Not to drop messages from already-committed transactions.
> Failing the current transaction still leaves things in a consistent
> state, ie, you don't get messages from aborted transactions but that's
> okay because they didn't change the database state.

+1

And in addition I don't like the idea of having the sender sitting around
until there's room for more messages in the queue just because some very old
backends didn't remove their stuff from it. So, yes, just failing the
current transaction seems reasonable. We are talking about millions of
messages in the queue ...

Bye

--
Andreas 'ads' Scherbaum
German PostgreSQL User Group
European PostgreSQL User Group - Board of Directors
Volunteer Regional Contact, Germany - PostgreSQL Project
On Thu, Nov 19, 2009 at 4:12 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> There will now be a nonzero chance
> of transactions failing at commit because of queue full. If the
> chance is large this will be an issue. (Is it sane to wait for
> the queue to be drained?)

Exactly. The whole idea of putting the notification system into an slru
queue was to make this nonzero chance a very-close-to-zero nonzero chance.
Currently, with pages from 0..0xFFFF, we can have something between
160,000,000 (no payload) and 2,000,000 (biggest payload) notifications in
the queue at the same time. We are free to remove the slru limitation by
making slru.c work with 8-character file names; then you can multiply both
limits by 32,000, and it should be very-close-to-zero, at least in my point
of view...

The actual queue-full behavior is then (or maybe is already now) just a
theoretical aspect that we need to agree on to make the whole concept sound.

The current patch would just wait until some space becomes available in the
queue, and it guarantees that no notification is lost. Furthermore, it
guarantees that a transaction can listen on an unlimited number of channels
and that it can send an unlimited number of notifications, not related to
the size of the queue. It can also send that unlimited number of
notifications if it is one of the listeners of those notifications. The only
real limit is now the backend's memory, but as long as nobody proves that he
needs unlimited notifications with a limited amount of memory, we just keep
it like that.

I will add a CHECK_FOR_INTERRUPTS() and resubmit, so that you can cancel a
NOTIFY while the queue is full. Also, I've put in an optimization to only
signal those backends in a queue-full situation that are not yet up-to-date
(which will probably turn out to be only one backend - the slowest one, in a
long-running transaction - after some time...).

> BTW, did we discuss the issue of 2PC transactions versus notify?
> The current behavior of 2PC with notify is pretty cheesy and will
> become more so if we make this change --- you aren't really
> guaranteed that the notify will happen, even though the prepared
> transaction did commit. I think it might be better to disallow
> NOTIFY inside a prepared xact.

Yes, I have been thinking about that also. So what should happen when you
prepare a transaction that has sent a NOTIFY before?

Joachim
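For what it's worth, those figures can be reproduced under one reading of
the limits - if the 0..0xFFFF range refers to slru's four-hex-digit segment
file names, with 32 pages per segment and 8 KB pages (these are assumptions,
not statements from the patch):

    /* Back-of-the-envelope capacity under the stated assumptions:
     *
     *   0x10000 segments * 32 pages/segment * 8192 bytes/page = 16 GB queue
     *   16 GB / ~100 bytes per small notification    ~= 170 million entries
     *   16 GB / ~8000 bytes per maximal notification ~=   2 million entries
     *
     * which, allowing for per-entry headers, matches the 160,000,000 and
     * 2,000,000 figures above.
     */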
On Thu, Nov 19, 2009 at 1:51 PM, Andreas 'ads' Scherbaum
<adsmail@wars-nicht.de> wrote:
> And in addition I don't like the idea of having the sender sitting
> around until there's room for more messages in the queue just because
> some very old backends didn't remove their stuff from it.

The only valid reason for a backend not having processed the notifications
in the queue is that it has been in a transaction since then (and executed
LISTEN some time before).

Joachim
>> (4) drop *old* notifications if the queue is full.
>>
>> Since everyone has made the point that LISTEN is not meant to be a full
>> queueing system, I have no problem dropping notifications LRU-style.
>
> NO, NO, NO, a thousand times no!

+1. Don't even think about going there. </me gives horrified shudder>

...

> even that was pretty hard. There will now be a nonzero chance
> of transactions failing at commit because of queue full. If the
> chance is large this will be an issue. (Is it sane to wait for
> the queue to be drained?)

I think this chance will be pretty small - you need a *lot* of unread
notifications before this edge case is reached, so I think we can be pretty
severe in our response, and put the responsibility for cleanup on the user,
rather than having the backend try to move things around, clean up the queue
selectively, etc.

> BTW, did we discuss the issue of 2PC transactions versus notify?
> The current behavior of 2PC with notify is pretty cheesy and will
> become more so if we make this change --- you aren't really
> guaranteed that the notify will happen, even though the prepared
> transaction did commit. I think it might be better to disallow
> NOTIFY inside a prepared xact.

That's a tough one. On the one hand, simply stating that NOTIFY and 2PC
don't play together in the docs would be a straightforward solution (and not
a bad one, as 2PC is already rare and delicate and should not be used
lightly). But what I really don't like is the idea of a notify that *may*
work or may not - so let's keep it boolean: it either works 100% of the time
with 2PC, or doesn't at all. Should we throw a warning or error if a client
attempts to combine the two?

--
Greg Sabino Mullane greg@turnstep.com
End Point Corporation
PGP Key: 0x14964AC8 200911190857
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
Joachim Wieland wrote:
> On Thu, Nov 19, 2009 at 4:12 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> BTW, did we discuss the issue of 2PC transactions versus notify?
>> The current behavior of 2PC with notify is pretty cheesy and will
>> become more so if we make this change --- you aren't really
>> guaranteed that the notify will happen, even though the prepared
>> transaction did commit. I think it might be better to disallow
>> NOTIFY inside a prepared xact.

That will make anyone currently using 2PC with notify/listen unhappy.

> Yes, I have been thinking about that also. So what should happen
> when you prepare a transaction that has sent a NOTIFY before?

From the user's point of view, nothing should happen at prepare.

At a quick glance, it doesn't seem hard to support 2PC. Messages should be
put to the queue at prepare, as just before normal commit, but the backends
won't see them until they see that the XID has committed.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas wrote:
> Joachim Wieland wrote:
>> On Thu, Nov 19, 2009 at 4:12 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Yes, I have been thinking about that also. So what should happen
>> when you prepare a transaction that has sent a NOTIFY before?
>
> From the user's point of view, nothing should happen at prepare.
>
> At a quick glance, it doesn't seem hard to support 2PC. Messages should
> be put to the queue at prepare, as just before normal commit, but the
> backends won't see them until they see that the XID has committed.

Yeah, but if the server is restarted after the PREPARE but before the
COMMIT, the notification will be lost, since all notification queue entries
are lost upon restart with the slru design, no?

best regards,
Florian Pflug
Florian G. Pflug wrote:
> Heikki Linnakangas wrote:
>> At a quick glance, it doesn't seem hard to support 2PC. Messages should
>> be put to the queue at prepare, as just before normal commit, but the
>> backends won't see them until they see that the XID has committed.
>
> Yeah, but if the server is restarted after the PREPARE but before the
> COMMIT, the notification will be lost, since all notification queue
> entries are lost upon restart with the slru design, no?

That's why they're stored in the 2PC state file in pg_twophase. See
AtPrepare_Notify().

Hmm, thinking about this a bit more, I don't think the messages should be
sent until commit (ie. 2nd phase). Although the information is safe in the
state file, if anyone starts to LISTEN between the PREPARE TRANSACTION and
COMMIT PREPARED calls, he would miss the notifications. I'm not sure if it's
well-defined what happens if someone starts to LISTEN while another
transaction has already sent a notification, but it would be rather
surprising if such a window existed where it doesn't exist with non-prepared
transactions.

A better approach is to do something similar to what we do now: at prepare,
just store the notifications in the state file like we do already. In
notify_twophase_postcommit(), copy the messages to the shared queue.
Although it's the same approach we have now, it becomes a lot cleaner with
the patch, because we're not piggybacking the messages on the
backend-private queue of the current transaction, but sending the messages
directly on behalf of the prepared transaction being committed.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
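In rough outline, that approach would hang off the existing two-phase hooks
like this; AtPrepare_Notify() and notify_twophase_postcommit() are the hooks
named above, while the record layout and the queue call are assumptions, not
the actual code:

    /* At PREPARE: stash the pending notifications in the 2PC state file. */
    static void
    AtPrepare_Notify(void)
    {
        /* a serialized form of the pending notification list is assumed */
        RegisterTwoPhaseRecord(TWOPHASE_RM_NOTIFY_ID, 0,
                               pendingNotifyData, pendingNotifyLen);
    }

    /* At COMMIT PREPARED: replay the saved messages into the shared queue
     * on behalf of the prepared transaction being committed. */
    void
    notify_twophase_postcommit(TransactionId xid, uint16 info,
                               void *recdata, uint32 len)
    {
        asyncQueueAddEntries(xid, recdata, len);   /* hypothetical queue call */
    }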
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> A better approach is to do something similar to what we do now: at
> prepare, just store the notifications in the state file like we do
> already. In notify_twophase_postcommit(), copy the messages to the
> shared queue. Although it's the same approach we have now, it becomes a
> lot cleaner with the patch, because we're not piggybacking the messages
> on the backend-private queue of the current transaction, but sending the
> messages directly on behalf of the prepared transaction being committed.

This is still ignoring the complaint: you are creating a clear risk that
COMMIT PREPARED will fail.

I'm not sure that it's really worth it, but one way this could be made safe
would be for PREPARE to "reserve" the required amount of queue space, such
that nobody else could use it during the window from PREPARE to COMMIT
PREPARED.

On the whole I'd be just as happy to disallow NOTIFY in a 2PC transaction.
We have no evidence that anyone out there is using the combination, and if
they are, they can do the work to make it safe.

			regards, tom lane
On Thu, 19 Nov 2009 14:23:57 +0100 Joachim Wieland wrote:
> On Thu, Nov 19, 2009 at 1:51 PM, Andreas 'ads' Scherbaum
> <adsmail@wars-nicht.de> wrote:
>> And in addition I don't like the idea of having the sender sitting
>> around until there's room for more messages in the queue just because
>> some very old backends didn't remove their stuff from it.
>
> The only valid reason for a backend not having processed the
> notifications in the queue is that it has been in a transaction since
> then (and executed LISTEN some time before).

Yes, I know. The same backend is probably causing more trouble anyway
(blocking vacuum, xid wraparound, ...).

--
Andreas 'ads' Scherbaum
German PostgreSQL User Group
European PostgreSQL User Group - Board of Directors
Volunteer Regional Contact, Germany - PostgreSQL Project
Tom Lane wrote:
> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
>> A better approach is to do something similar to what we do now: at
>> prepare, just store the notifications in the state file like we do
>> already. In notify_twophase_postcommit(), copy the messages to the
>> shared queue. Although it's the same approach we have now, it
>> becomes a lot cleaner with the patch, because we're not
>> piggybacking the messages on the backend-private queue of the
>> current transaction, but sending the messages directly on behalf of
>> the prepared transaction being committed.
>
> This is still ignoring the complaint: you are creating a clear risk
> that COMMIT PREPARED will fail.
>
> I'm not sure that it's really worth it, but one way this could be
> made safe would be for PREPARE to "reserve" the required amount of
> queue space, such that nobody else could use it during the window
> from PREPARE to COMMIT PREPARED.

I'd see no problem with "COMMIT PREPARED" failing, as long as it was
possible to retry the COMMIT PREPARED at a later time. There surely are
other failure cases for COMMIT PREPARED too, like an IO error that prevents
the clog bit from being set, or a server crash half-way through COMMIT
PREPARED.

best regards,
Florian Pflug
"Florian G. Pflug" <fgp@phlo.org> writes: > Tom Lane wrote: >> This is still ignoring the complaint: you are creating a clear risk >> that COMMIT PREPARED will fail. > I'd see no problem with "COMMIT PREPARED" failing, as long as it was > possible to retry the COMMIT PREPARED at a later time. There surely are > other failure cases for COMMIT PREPARED too, like an IO error that > prevents the clog bit from being set, or a server crash half-way through > COMMIT PREPARED. Yes, there are failure cases that are outside our control. That's no excuse for creating one that's within our control. regards, tom lane
Tom Lane wrote:
> "Florian G. Pflug" <fgp@phlo.org> writes:
>> Tom Lane wrote:
>>> This is still ignoring the complaint: you are creating a clear
>>> risk that COMMIT PREPARED will fail.
>
>> I'd see no problem with "COMMIT PREPARED" failing, as long as it
>> was possible to retry the COMMIT PREPARED at a later time. There
>> surely are other failure cases for COMMIT PREPARED too, like an IO
>> error that prevents the clog bit from being set, or a server crash
>> half-way through COMMIT PREPARED.
>
> Yes, there are failure cases that are outside our control. That's no
> excuse for creating one that's within our control.

True. On the other hand, people might prefer having to deal with (very
unlikely) COMMIT PREPARED *transient* failures over not being able to use
NOTIFY together with 2PC at all. Especially since any credible distributed
transaction manager has to deal with COMMIT PREPARED failures anyway.

Just my $0.02, though.

best regards,
Florian Pflug
Tom Lane wrote:
> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
>> A better approach is to do something similar to what we do now: at
>> prepare, just store the notifications in the state file like we do
>> already. In notify_twophase_postcommit(), copy the messages to the
>> shared queue. Although it's the same approach we have now, it becomes a
>> lot cleaner with the patch, because we're not piggybacking the messages
>> on the backend-private queue of the current transaction, but sending the
>> messages directly on behalf of the prepared transaction being committed.
>
> This is still ignoring the complaint: you are creating a clear risk
> that COMMIT PREPARED will fail.
>
> I'm not sure that it's really worth it, but one way this could be made
> safe would be for PREPARE to "reserve" the required amount of queue
> space, such that nobody else could use it during the window from
> PREPARE to COMMIT PREPARED.

Hmm, ignoring 2PC for a moment, I think the patch suffers from a little race
condition:

Session 1: BEGIN;
Session 1: INSERT INTO foo ..;
Session 1: NOTIFY 'foo';
Session 1: COMMIT -- commit begins
Session 1: [commit processing runs AtCommit_NotifyBeforeCommit()]
Session 2: LISTEN 'foo';
Session 2: SELECT * FROM foo;
Session 1: [AtCommit_NotifyAfterCommit() signals listening backends]
Session 2: [waits for notifications]

Because session 2 began listening after session 1 had already sent its
notifications, it missed them. But the SELECT didn't see the INSERT, because
the inserting transaction hadn't fully finished yet. The window isn't as
small as it might seem at first glance, because the WAL is fsynced between
the BeforeCommit and AfterCommit actions.

I think we could fix that by arranging things so that a backend refrains
from advancing its own 'pos' beyond the first notification it has written
itself, until commit is completely finished. I'm not sure, but that might
already be true if we don't receive interrupts between BeforeCommit and
AfterCommit. LISTEN can then simply start reading from QUEUE_TAIL instead of
QUEUE_HEAD, and in the above example session 2 will see the notifications
sent by session 1.

That will handle 2PC as well. We can send the notifications in
prepare-phase, and any LISTEN that starts after the prepare-phase will see
the notifications because they're still in the queue. There is no risk of
running out of disk space in COMMIT PREPARED, because the notifications have
already been written to disk. However, the notification queue can't be
truncated until the prepared transaction finishes; does anyone think that's
a show-stopper?

> On the whole I'd be just as happy to disallow NOTIFY in a 2PC
> transaction. We have no evidence that anyone out there is using the
> combination, and if they are, they can do the work to make it safe.

Yeah, I doubt we'd hear many complaints in practice.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
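A bare sketch of those two rules - QUEUE_TAIL, QUEUE_HEAD and 'pos' are the
mail's own terms, while the variable names are invented for illustration:

    /* In LISTEN: start reading from the tail of the queue, so entries
     * already written by transactions that are mid-commit are not skipped. */
    MyQueuePos = QUEUE_TAIL;

    /* In the queue reader: refrain from advancing 'pos' past our own first
     * entry until our commit has completely finished. MyFirstNotifyPos and
     * CommitInProgress() are hypothetical bookkeeping. */
    if (QueuePosEqual(pos, MyFirstNotifyPos) && CommitInProgress())
        return;     /* resume once the commit is fully done */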
greg@turnstep.com ("Greg Sabino Mullane") writes:
>> BTW, did we discuss the issue of 2PC transactions versus notify?
>> The current behavior of 2PC with notify is pretty cheesy and will
>> become more so if we make this change --- you aren't really
>> guaranteed that the notify will happen, even though the prepared
>> transaction did commit. I think it might be better to disallow
>> NOTIFY inside a prepared xact.
>
> That's a tough one. On the one hand, simply stating that NOTIFY and 2PC
> don't play together in the docs would be a straightforward solution
> (and not a bad one, as 2PC is already rare and delicate and should not
> be used lightly). But what I really don't like is the idea of a
> notify that *may* work or may not - so let's keep it boolean: it either
> works 100% of the time with 2PC, or doesn't at all. Should we throw
> a warning or error if a client attempts to combine the two?

+1 from me... It should either work, or not work, as opposed to something
nondeterministic. While it's certainly a nice thing for features to be
orthogonal, and for interactions to "just work," I can see making a good
case for NOTIFY and 2PC not playing together.

--
select 'cbbrowne' || '@' || 'gmail.com';
http://linuxfinances.info/info/slony.html
Why isn't phonetic spelled the way it sounds?
On Thu, Nov 19, 2009 at 6:55 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> Hmm, ignoring 2PC for a moment, I think the patch suffers from a little
> race condition:
>
> Session 1: BEGIN;
> Session 1: INSERT INTO foo ..;
> Session 1: NOTIFY 'foo';
> Session 1: COMMIT -- commit begins
> Session 1: [commit processing runs AtCommit_NotifyBeforeCommit()]

----> Session 2 must not read uncommitted notifications selectively

> Session 2: LISTEN 'foo';
> Session 2: SELECT * FROM foo;
> Session 1: [AtCommit_NotifyAfterCommit() signals listening backends]
> Session 2: [waits for notifications]
>
> Because session 2 began listening after session 1 had already sent its
> notifications, it missed them.

I think you are right. However, note that session 1 does not actively send
notifications to anybody; it just puts them into the queue. It's every
backend's own job to process the queue and see which messages are
interesting and which are not. The example you brought up fails if Session 2
disregards the notifications based on the current set of channels that it is
listening to at this point. If I understand you correctly, what you are
suggesting is to not read uncommitted notifications from the queue in a
reading backend, or to read all notifications (regardless of which channel
they have been sent to), such that the backend can apply the check ("Am I
listening on this channel?") later on.

> I think we could fix that by arranging things so that a backend refrains
> from advancing its own 'pos' beyond the first notification it has
> written itself, until commit is completely finished.

In the end this is similar to the idea of not reading uncommitted
notifications, which is what I did at the beginning. However, then you run
into a full queue a lot faster. Imagine a queue length of 1000 with 3
transactions writing 400 notifications each... All three might fail if they
run in parallel, even though space would be sufficient for at least two of
them; and if they are executed in sequence, all of them could deliver their
notifications.

Given your example, what I am proposing now is to stop reading from the
queue once we see a not-yet-committed notification, but once the queue is
full, to read the uncommitted notifications anyway, effectively copying them
over into the backend's own memory... Once the transaction commits and sends
a signal, we can process, send and discard the previously copied
notifications. In the above example, at some point one, two or all three
backends would see that the queue is full, and everybody would read the
uncommitted notifications of the other one, copy them into their own memory,
and space would be freed in the queue.

> That will handle 2PC as well. We can send the notifications in
> prepare-phase, and any LISTEN that starts after the prepare-phase will
> see the notifications because they're still in the queue. There is no
> risk of running out of disk space in COMMIT PREPARED, because the
> notifications have already been written to disk. However, the
> notification queue can't be truncated until the prepared transaction
> finishes; does anyone think that's a show-stopper?

Note that we don't preserve notifications when the database restarts. But
2PC can cope with restarts. How would that fit together? Also, I am not sure
how you are going to deliver notifications that happen between the PREPARE
TRANSACTION and the COMMIT PREPARED (because you have only one queue pointer
which you are not going to advance...)?

Joachim
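The reading rule proposed here might look roughly like the following;
TransactionIdIsInProgress() and TransactionIdDidCommit() are existing calls,
while the queue helpers are assumed names rather than the patch's actual
functions:

    /* Per-backend queue scan under the proposed rule. */
    while (!QueuePosEqual(pos, QUEUE_HEAD))
    {
        AsyncQueueEntry *entry = asyncQueueReadEntry(&pos);  /* assumed helper */

        if (TransactionIdIsInProgress(entry->xid))
        {
            if (!asyncQueueIsNearlyFull())
                break;                 /* stop; the commit signal resumes us */
            saveUncommittedLocally(entry); /* queue full: copy into local memory */
        }
        else if (TransactionIdDidCommit(entry->xid))
            processNotification(entry);  /* deliver if we listen on the channel */
        /* entries from aborted transactions are simply skipped */
    }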
Joachim Wieland wrote:
> The example you brought up fails if Session 2 disregards the
> notifications based on the current set of channels that it is listening
> to at this point.

Right. Session 2 might not be listening at all yet.

> If I understand you correctly, what you are suggesting is to not read
> uncommitted notifications from the queue in a reading backend, or to read
> all notifications (regardless of which channel they have been sent to),
> such that the backend can apply the check ("Am I listening on this
> channel?") later on.

Right.

> Note that we don't preserve notifications when the database restarts.
> But 2PC can cope with restarts. How would that fit together?

The notifications are written to the state file at prepare. They can be
recovered from there and written to the queue again at server start (see
twophase_rmgr.c).

> Also, I am not sure how you are going to deliver notifications that
> happen between the PREPARE TRANSACTION and the COMMIT PREPARED (because
> you have only one queue pointer which you are not going to advance...)?

Yeah, that's a problem. One uncommitted notification will block all others
too. In theory you have the same problem without 2PC, but it's OK because
you don't expect one COMMIT to take much longer to finish than others.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
On Fri, Nov 20, 2009 at 7:51 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
>> Note that we don't preserve notifications when the database restarts.
>> But 2PC can cope with restarts. How would that fit together?
>
> The notifications are written to the state file at prepare. They can be
> recovered from there and written to the queue again at server start (see
> twophase_rmgr.c).

Okay, but which of the backends would then leave its pointer at that place
in the queue upon restart?

This is also an issue for the non-restart case: what if you prepare the
transaction in one backend and commit it in another?

Joachim
On Thu, Nov 19, 2009 at 11:04 PM, Joachim Wieland <joe@mcknight.de> wrote:
> Given your example, what I am proposing now is to stop reading from
> the queue once we see a not-yet-committed notification, but once the
> queue is full, to read the uncommitted notifications anyway, effectively
> copying them over into the backend's own memory... Once the transaction
> commits and sends a signal, we can process, send and discard the
> previously copied notifications. In the above example, at some point
> one, two or all three backends would see that the queue is full, and
> everybody would read the uncommitted notifications of the other one,
> copy them into their own memory, and space would be freed in the queue.

Attached is the patch that implements the described modifications.

Joachim
Joachim Wieland wrote:
> On Fri, Nov 20, 2009 at 7:51 AM, Heikki Linnakangas
> <heikki.linnakangas@enterprisedb.com> wrote:
>>> Note that we don't preserve notifications when the database restarts.
>>> But 2PC can cope with restarts. How would that fit together?
>> The notifications are written to the state file at prepare. They can be
>> recovered from there and written to the queue again at server start (see
>> twophase_rmgr.c).
>
> Okay, but which of the backends would then leave its pointer at that
> place in the queue upon restart?
>
> This is also an issue for the non-restart case: what if you prepare
> the transaction in one backend and commit it in another?

The dummy procs that represent prepared transactions need to be treated as
backends. Each prepared transaction needs a slot of its own in the backends
array.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Joachim Wieland wrote:
> On Thu, Nov 19, 2009 at 11:04 PM, Joachim Wieland <joe@mcknight.de> wrote:
>> Given your example, what I am proposing now is to stop reading from
>> the queue once we see a not-yet-committed notification, but once the
>> queue is full, to read the uncommitted notifications anyway, effectively
>> copying them over into the backend's own memory... [...]
>
> Attached is the patch that implements the described modifications.

That's still not enough if the session 2 that issues the LISTEN wasn't
previously subscribed to any channels. It will still miss the notification.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
On Fri, 2009-11-20 at 10:35 +0100, Joachim Wieland wrote:
> On Thu, Nov 19, 2009 at 11:04 PM, Joachim Wieland <joe@mcknight.de> wrote:
>> Given your example, what I am proposing now is to stop reading from
>> the queue once we see a not-yet-committed notification, but once the
>> queue is full, to read the uncommitted notifications anyway, effectively
>> copying them over into the backend's own memory... [...]
>
> Attached is the patch that implements the described modifications.

This is a first-pass review.

Comments:

* Why don't we read all notifications into backend-local memory at every
  opportunity? It looks like sometimes it's only reading the committed
  ones, and I don't see the advantage of leaving it in the SLRU.

Code comments:

* I see a compiler warning:
    ipci.c: In function ‘CreateSharedMemoryAndSemaphores’:
    ipci.c:222: warning: implicit declaration of function ‘AsyncShmemInit’
* sanity_check test fails, needs updating
* guc test fails, needs updating
* no docs

The overall design looks pretty good to me. From my very brief pass over the
code, it looks like:

* When the queue is full, the transaction waits until there is room before
  it's committed. This may have an effect on 2PC, but the consensus seemed
  to be that we could restrict the combination of 2PC + NOTIFY. This also
  introduces the possibility of starvation, I suppose, but that seems
  remote.
* When the queue is full, the inserter tries to signal the listening
  backends, and tries to make room in the queue.
* Backends read the notifications when signaled, or when inserting (in case
  the inserting backend is also the one preventing the queue from
  shrinking).

I haven't looked at everything yet, but this seems like it's in reasonable
shape from a high level. Joachim, can you clean the patch up, include docs,
and fix the tests? If so, I'll do a full review.

Regards,
Jeff Davis
Hi Jeff,

the current patch suffers from what Heikki recently spotted: if one backend
is putting notifications into the queue and meanwhile another backend
executes LISTEN and commits, then this listening backend committed earlier
and is supposed to receive the notifications of the notifying backend - even
though its transaction started later.

I have a new version that deals with this problem but I need to clean it up
a bit. I am planning to post it this week.

On Mon, Nov 30, 2009 at 6:15 AM, Jeff Davis <pgsql@j-davis.com> wrote:
> * Why don't we read all notifications into backend-local memory at
> every opportunity? It looks like sometimes it's only reading the
> committed ones, and I don't see the advantage of leaving it in the SLRU.

Exactly because of the problem above, we cannot do it. Once a notification
is removed from the queue, no other backend can execute a LISTEN anymore,
because there is no way for it to get at that information. Also, we'd need
to read _all_ notifications, not only the committed ones, because we don't
know what our backend will LISTEN to in the future.

On the other hand, reading uncommitted notifications guarantees that we can
send an unlimited number of notifications (limited by main memory) and that
we don't run into a full queue in this example: queue length 1000, 3
notifying backends, 400 notifications to be sent by each backend. If all of
them send their notifications at the same time, we risk that all three run
into a full queue... We could still preserve that behavior, at the cost of
allowing LISTEN to block until the queue is within its limits again.

> * When the queue is full, the inserter tries to signal the listening
> backends, and tries to make room in the queue.
> * Backends read the notifications when signaled, or when inserting (in
> case the inserting backend is also the one preventing the queue from
> shrinking).

Exactly, but it doesn't solve the problem described above. :-(

ISTM that we have two options:

a) allow LISTEN to block if the queue is full - NOTIFY will never fail (but
   block as well) and will eventually succeed
b) NOTIFY could fail and make the transaction roll back - LISTEN always
   succeeds immediately

Again: this is corner-case behavior and only happens after some hundreds of
gigabytes of notifications have been put into the queue and have not yet
been processed by all listening backends. I like a) better, but b) is easier
to implement...

> I haven't looked at everything yet, but this seems like it's in
> reasonable shape from a high level. Joachim, can you clean the patch up,
> include docs, and fix the tests? If so, I'll do a full review.

As soon as everybody is fine with the approach, I will work on the docs
patch.

Joachim
On Mon, 2009-11-30 at 14:14 +0100, Joachim Wieland wrote:
> I have a new version that deals with this problem but I need to clean
> it up a bit. I am planning to post it this week.

Are you planning to send a new version soon? As it is, we're 12 days from
the end of this commitfest, so we don't have much time to hand the patch
back and forth before we're out of time.

Regards,
Jeff Davis
Jeff Davis wrote:
> On Mon, 2009-11-30 at 14:14 +0100, Joachim Wieland wrote:
>> I have a new version that deals with this problem but I need to clean
>> it up a bit. I am planning to post it this week.
>
> Are you planning to send a new version soon? As it is, we're 12 days from
> the end of this commitfest, so we don't have much time to hand the patch
> back and forth before we're out of time.

Joachim has been making good progress here, but given the current code
maturity vs. implementation complexity of this work, my guess is that even
if we got an updated version today it's not going to hit commit quality
given the time left. I'm going to mark this one "returned with feedback",
and I do hope that work continues on this patch so it's early in the queue
for the final CommitFest for this version. It seems like it just needs a bit
more time to let the design mature and get the kinks worked out, and it
could turn into a useful feature--I know I've wanted NOTIFY with a payload
for years.

--
Greg Smith 2ndQuadrant Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com www.2ndQuadrant.com
Hi,

On Mon, Dec 7, 2009 at 5:38 AM, Greg Smith <greg@2ndquadrant.com> wrote:
> I'm going to mark this one "returned with feedback", and I do hope that
> work continues on this patch so it's early in the queue for the final
> CommitFest for this version. It seems like it just needs a bit more time
> to let the design mature and get the kinks worked out, and it could turn
> into a useful feature--I know I've wanted NOTIFY with a payload for years.

I am perfectly fine with postponing the patch to the next commitfest. To get
some more feedback and to allow everyone to play with it, I am attaching the
latest version of the patch.

What has changed:

Transactional processing is now hopefully correct. Examples:

Backend 1:                      Backend 2:

transaction starts
NOTIFY foo;
commit starts
                                transaction starts
                                LISTEN foo;
                                commit starts
commit to clog
                                commit to clog

=> Backend 2 will receive Backend 1's notification.

Backend 1:                      Backend 2:

transaction starts
NOTIFY foo;
commit starts
                                transaction starts
                                UNLISTEN foo;
                                commit starts
commit to clog
                                commit to clog

=> Backend 2 will not receive Backend 1's notification.

This is done by introducing an additional "AsyncCommitOrderLock". It is
grabbed exclusively by transactions that execute LISTEN / UNLISTEN, and in
shared mode by transactions that executed NOTIFY only. LISTEN/UNLISTEN
transactions then register the XIDs of the NOTIFYing transactions that are
about to commit at the same time, in order to later find out which
notifications are visible and which ones are not.

If the queue is full, any other transaction that is trying to place a
notification into the queue is rolled back! This is basically a consequence
of the former. There are two warnings that will show up in the log, one once
the queue is more than 50% full and another one if it is more than 75% full.
The biggest threat of running into a full queue are probably backends that
are LISTENing and are idle in transaction.

I have added a function pg_listening() which just returns the names of the
channels that a backend is listening to.

I especially invite people who know more about the transactional stuff than
I do to take a close look at what I have done regarding notification
visibility.

One open question regarding the payload is whether we need to limit it to
ASCII so as not to risk conversion issues between different backend
character sets.

The second open issue is what we should do regarding 2PC. These options have
been brought up so far:

1) allow NOTIFY in 2PC, but it can happen that the transaction needs to be
   rolled back if the queue is full
2) disallow NOTIFY in 2PC altogether
3) put notifications into the queue on PREPARE TRANSACTION and make backends
   not advance their pointers further than those notifications but wait for
   the 2PC transaction to commit. 2PC transactions would never fail, but you
   effectively stop the notification system until the 2PC transaction
   commits.

Comments?

Best regards,
Joachim
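Reading between the lines, the AsyncCommitOrderLock protocol might reduce to
something like this at commit time - the lock name is from the mail above,
while the list names and everything else are assumptions:

    /* Sketch of the commit-ordering protocol; not the patch's actual code. */
    if (pendingListenActions != NIL)
        LWLockAcquire(AsyncCommitOrderLock, LW_EXCLUSIVE); /* LISTEN/UNLISTEN */
    else if (pendingNotifies != NIL)
        LWLockAcquire(AsyncCommitOrderLock, LW_SHARED);    /* NOTIFY only */

    /* While holding it exclusively, a LISTEN/UNLISTEN transaction records
     * the XIDs of concurrently committing notifiers, so it can decide later
     * which of their notifications are visible to it. */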
On Wed, Dec 9, 2009 at 5:43 AM, Joachim Wieland <joe@mcknight.de> wrote:
> I am perfectly fine with postponing the patch to the next commitfest. To
> get some more feedback and to allow everyone to play with it, I am
> attaching the latest version of the patch.
> [...]
> Comments?

Joachim - This no longer applies - please rebase, repost, and add a link to
the new version to the commitfest app.

Jeff - Do you want to continue reviewing this for the next CommitFest, or
hand off to another reviewer?

Thanks,

...Robert
On Thu, 2010-01-07 at 13:40 -0500, Robert Haas wrote:
> Joachim - This no longer applies - please rebase, repost, and add a
> link to the new version to the commitfest app.
>
> Jeff - Do you want to continue reviewing this for the next CommitFest,
> or hand off to another reviewer?

Sure, I'll review it.

Regards,
Jeff Davis
On Thu, Jan 7, 2010 at 7:40 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> Joachim - This no longer applies - please rebase, repost, and add a
> link to the new version to the commitfest app.

Updated patch attached.

From my point of view these are the current open questions:

- is the general approach reasonable (i.e. to put notifications into an slru-based queue instead of into shared memory, to allow for better handling of notification bursts)
- is the transactional behavior correct (see upthread)
- do we need to limit the payload to pure ASCII? I think yes, we need to. I also think we need to reject other payloads with elog(ERROR...).
- how to deal with 2PC (see upthread)?
- how to deal with hot standby (has not been discussed yet)

Joachim
On Fri, Jan 8, 2010 at 7:48 AM, Joachim Wieland <joe@mcknight.de> wrote: > - do we need to limit the payload to pure ASCII ? I think yes, we need > to. I also think we need to reject other payloads with elog(ERROR...). Just noticed this...don't you mean UTF8? Are we going to force non English speaking users to send all payloads in English? merlin
Merlin Moncure wrote:
> On Fri, Jan 8, 2010 at 7:48 AM, Joachim Wieland <joe@mcknight.de> wrote:
>> - do we need to limit the payload to pure ASCII ? I think yes, we need
>> to. I also think we need to reject other payloads with elog(ERROR...).
>
> Just noticed this...don't you mean UTF8? Are we going to force non
> English speaking users to send all payloads in English?

Hmm, ASCII-only sounds weird, though UTF8 would have obvious conversion problems in some circumstances - what about bytea (or why _do_ we have to limit this to something?).

Stefan
Merlin Moncure <mmoncure@gmail.com> writes: > On Fri, Jan 8, 2010 at 7:48 AM, Joachim Wieland <joe@mcknight.de> wrote: >> - do we need to limit the payload to pure ASCII ? I think yes, we need >> to. I also think we need to reject other payloads with elog(ERROR...). > Just noticed this...don't you mean UTF8? Are we going to force non > English speaking users to send all payloads in English? No, he meant ASCII. Otherwise we're going to have to deal with encoding conversion issues. regards, tom lane
On Fri, Jan 8, 2010 at 1:24 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Merlin Moncure <mmoncure@gmail.com> writes:
>> On Fri, Jan 8, 2010 at 7:48 AM, Joachim Wieland <joe@mcknight.de> wrote:
>>> - do we need to limit the payload to pure ASCII ? I think yes, we need
>>> to. I also think we need to reject other payloads with elog(ERROR...).
>
>> Just noticed this...don't you mean UTF8? Are we going to force non
>> English speaking users to send all payloads in English?
>
> No, he meant ASCII. Otherwise we're going to have to deal with encoding
> conversion issues.

That seems pretty awkward... instead of forcing an ancient encoding that is useless to 90% of the world, why not send bytea (if necessary hex/b64 encoded)? I'm just trying to imagine how databases encoded in non-ASCII-superset encodings would use this feature...

If we must use ASCII, we should probably offer conversion functions to/from text, right? I definitely understand the principle of the utility of laziness, but is this a proper case of simply dumping the problem onto the user?

merlin
Andrew Chernow wrote:
>> conversion problems in some circumstances - what about bytea (or why
>> _do_ we have to limit this to something?).
>
> I agree with bytea. Zero conversions and the most flexible. Payload
> encoding/format should be decided by the user.

Yeah, that is exactly why I think this would be the most flexible option...

Stefan
> conversion problems in some circumstances - what about bytea (or why
> _do_ we have to limit this to something?).

I agree with bytea. Zero conversions and the most flexible. Payload encoding/format should be decided by the user.

-- 
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/
Merlin Moncure <mmoncure@gmail.com> writes:
> On Fri, Jan 8, 2010 at 1:24 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> No, he meant ASCII. Otherwise we're going to have to deal with encoding
>> conversion issues.

> That seems pretty awkward... instead of forcing an ancient encoding that
> is useless to 90% of the world, why not send bytea

You mean declare the notify parameter as bytea instead of text, and dump all encoding conversion issues onto the user? We could do that I guess. I'm not convinced that it's either more functional or more convenient to use than a parameter that's declared as text and restricted to ASCII.

> If we must use ASCII, we should probably offer conversion functions
> to/from text, right?

There's encode() and decode(), which is what people would wind up using quite a lot of the time if we declare the parameter to be bytea.

			regards, tom lane
On Fri, Jan 8, 2010 at 4:10 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Merlin Moncure <mmoncure@gmail.com> writes:
>> On Fri, Jan 8, 2010 at 1:24 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> No, he meant ASCII. Otherwise we're going to have to deal with encoding
>>> conversion issues.
>
>> That seems pretty awkward... instead of forcing an ancient encoding that
>> is useless to 90% of the world, why not send bytea
>
> You mean declare the notify parameter as bytea instead of text, and dump
> all encoding conversion issues onto the user? We could do that I guess.
> I'm not convinced that it's either more functional or more convenient to
> use than a parameter that's declared as text and restricted to ASCII.
>
>> If we must use ASCII, we should probably offer conversion functions
>> to/from text, right?
>
> There's encode() and decode(), which is what people would wind up using
> quite a lot of the time if we declare the parameter to be bytea.

All good points... IF the notify payload can be the result of an expression, so that for non-const cases where you are expecting non-ASCII data you can throw in a quick encode. I had assumed it couldn't (wrongly?), because the current notify relname takes a strict literal, so this would have to work (if the below works, disregard the rest and I concede ASCII is fine):

notify test encode('some_string', 'hex');

A quick look at the gram.y changes suggests this won't work, if I read it correctly. How would someone from Japan issue a notify/payload from the console?

merlin
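For reference, encode() and decode() are existing built-ins, so an ASCII-only payload can already carry arbitrary text if the application base64-wraps it. A minimal round trip, leaving the channel/payload framing aside and showing only the encoding step:

    -- Sending side: base64 turns any server-encoded string into pure ASCII.
    -- Receiving side: decode it back.  The whole expression round-trips
    -- losslessly:
    SELECT convert_from(
             decode(
               encode(convert_to('résumé für 東京', 'UTF8'), 'base64'),
               'base64'),
             'UTF8');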
-----BEGIN PGP SIGNED MESSAGE----- Hash: RIPEMD160 >>> - do we need to limit the payload to pure ASCII ? I think yes, we need >>> to. I also think we need to reject other payloads with elog(ERROR...). >...[snip other followups] On the one hand, I don't see the problem with ASCII here - the payload is meant as a quick shorthand convenience, not a literal payload of important information. On the other, it should at least match the current rules for the listen and notify names themselves, which means allowing more than ASCII. - -- Greg Sabino Mullane greg@turnstep.com PGP Key: 0x14964AC8 201001102303 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8 -----BEGIN PGP SIGNATURE----- iEYEAREDAAYFAktKo40ACgkQvJuQZxSWSshg9ACg2uiDYuhBnRQqFS6Ej3O9VLcC 2TgAn035OrYcdERn4I1VI4NRQFBIcXZ/ =yJmK -----END PGP SIGNATURE-----
On mån, 2010-01-11 at 04:05 +0000, Greg Sabino Mullane wrote: > On the one hand, I don't see the problem with ASCII here - the > payload is meant as a quick shorthand convenience, not a literal payload > of important information. Is it not? The notify name itself is already a quick shorthand convenience. Who knows what the payload is actually meant for. Have use cases been presented and analyzed?
A use case: use NOTIFY in a rule to send the primary key of a row that has been updated (for instance to manage a cache).

This requires a patch on top of this one, and it really is a separate concern, but I thought I'd give the use case anyway, since I believe it is relevant to the issues here.

I can see four kinds of NOTIFY statements:

1) The existing case: NOTIFY channel
2) With Joachim's patch: NOTIFY channel 'payload'
3) My use case: NOTIFY channel 'pay'||'load' (actually NOTIFY channel '<table_name>#'||OLD.id)
4) Taken one step further: NOTIFY channel (SELECT payload FROM payloads WHERE ...)

I'm working on a proof of concept patch to use Joachim's new notify function to introduce case 3. I think this means going through the planner and executor, so I might as well do case 4 as well. A use case I can see for case 4 is sending information in a rule or trigger about an updated object, when that information is stored in a separate table (versioning or audit information for example).

Cases 1 and 2 could remain utility commands, while cases 3 and 4 could go through the planner and the executor, the notify plan node calling Joachim's new notify function on execution.

Best regards,
Arnaud Betremieux

On 11/01/2010 07:58, Peter Eisentraut wrote:
> On mån, 2010-01-11 at 04:05 +0000, Greg Sabino Mullane wrote:
>> On the one hand, I don't see the problem with ASCII here - the
>> payload is meant as a quick shorthand convenience, not a literal payload
>> of important information.
>
> Is it not? The notify name itself is already a quick shorthand
> convenience. Who knows what the payload is actually meant for. Have
> use cases been presented and analyzed?
Arnaud Betremieux <arnaud.betremieux@keyconsulting.fr> writes: > 3) My use case : NOTIFY channel 'pay'||'load' (actually NOTIFY > channel '<table_name>#'||OLD.id) > 4) Taken one step further : NOTIFY channel (SELECT payload FROM payloads > WHERE ...) > I'm working on a proof of concept patch to use Joachim's new notify > function to introduce case 3. I think this means going through the > planner and executor, so I might as well do case 4 as well. It would be a lot less work to introduce a function like send_notify() that could be invoked within a regular SELECT. Pushing a utility statement through the planner/executor code path will do enough violence to the system design that such a patch would probably be rejected out of hand. regards, tom lane
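To make the suggestion concrete, a sketch of what such a function would permit; send_notify() is only proposed at this point, and the customers table and its columns are made-up examples:

    -- Hypothetical: send_notify(channel text, payload text) invoked
    -- once per qualifying row inside an ordinary SELECT.
    SELECT send_notify('cache_inval', 'customers#' || id)
      FROM customers
     WHERE updated_at > now() - interval '1 minute';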
Arnaud Betremieux wrote:
> A use case: use NOTIFY in a rule to send the primary key of a row that
> has been updated (for instance to manage a cache).
>
> This requires a patch on top of this one, and it really is a separate
> concern, but I thought I'd give the use case anyway, since I believe it
> is relevant to the issues here.
>
> I can see four kinds of NOTIFY statements:
>
> 1) The existing case: NOTIFY channel
> 2) With Joachim's patch: NOTIFY channel 'payload'
> 3) My use case: NOTIFY channel 'pay'||'load' (actually NOTIFY
> channel '<table_name>#'||OLD.id)
> 4) Taken one step further: NOTIFY channel (SELECT payload FROM payloads
> WHERE ...)

I know I'd be looking to send UTF8 and byteas. Can notify as it stands today take an expression for the payload (#4)?

The other issue is that libpq expects a string, so if non-C-string-safe data is to be sent, a protocol change is needed, or the server must hex-encode all payloads before transit and libpq must decode them; that also requires a 'payload_len' member to be added to PGnotify. The latter is better IMHO, as protocol changes are nasty. Although that is only needed to support bytea; if all we want is UTF8, then there is no issue with libpq.

-- 
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/
On Mon, Jan 11, 2010 at 8:25 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> I'm working on a proof of concept patch to use Joachim's new notify >> function to introduce case 3. I think this means going through the >> planner and executor, so I might as well do case 4 as well. > > It would be a lot less work to introduce a function like send_notify() > that could be invoked within a regular SELECT. Pushing a utility +1 IMO, this neatly solves the problem and addresses some of the concerns I was raising upthread. A bytea version would be nice but a text only version is workable. merlin
On 11/01/2010 14:25, Tom Lane wrote:
> Arnaud Betremieux<arnaud.betremieux@keyconsulting.fr> writes:
>> 3) My use case: NOTIFY channel 'pay'||'load' (actually NOTIFY
>> channel '<table_name>#'||OLD.id)
>> 4) Taken one step further: NOTIFY channel (SELECT payload FROM payloads
>> WHERE ...)
>
>> I'm working on a proof of concept patch to use Joachim's new notify
>> function to introduce case 3. I think this means going through the
>> planner and executor, so I might as well do case 4 as well.
>
> It would be a lot less work to introduce a function like send_notify()
> that could be invoked within a regular SELECT. Pushing a utility
> statement through the planner/executor code path will do enough violence
> to the system design that such a patch would probably be rejected out of
> hand.

Introducing a send_notify function does sound a lot simpler and cleaner, and I think I'll try it this way. The only thing that bothers me is the syntax:

... DO ALSO SELECT send_notify('payload')
... DO ALSO SELECT send_notify(a) FROM b

How about a new grammar for NOTIFY <channel> a_expr, which would go through the rewriter to be transformed into a SELECT? So NOTIFY (SELECT a FROM b) would become SELECT send_notify(SELECT a FROM b)?
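Spelled out, the function-based variant of the rule use case might look like this; send_notify() and the customers table are assumptions from this subthread, only CREATE RULE ... DO ALSO itself is existing syntax:

    -- Notify listeners with the primary key of every updated row.
    CREATE RULE customers_cache_inval AS
        ON UPDATE TO customers
        DO ALSO SELECT send_notify('customers', 'customers#' || OLD.id);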
Initial comments:

* compiler warnings

ipci.c: In function ‘CreateSharedMemoryAndSemaphores’:
ipci.c:228: warning: implicit declaration of function ‘AsyncShmemInit’

* 2PC

Adds complexity, and I didn't see any clear, easy solution after reading the thread. I don't see this as a showstopper, so I'd leave this until later.

* Hot Standby

It would be nice to have NOTIFY work with HS, but I don't think that's going to happen this cycle. I don't think this is a showstopper, either.

* ASCII?

I tend to think that encoding should be handled properly here, or we should use BYTEA. My reasoning is that TEXT is a datatype, whereas ASCII is not, and I don't see a reason to invent it now. We might as well use real TEXT (with encoding support) or use BYTEA, which works fine for ASCII anyway.

* send_notify()

Someone suggested a function like this, which sounds useful to me. Not a showstopper, but if it's not too difficult it might be worth including. (Incidentally, the function signature should match the type of the payload, which is unclear if we're going to use ASCII.)

* AsyncCommitOrderLock

This is a big red flag to me. The name by itself is scary. The fact that it is held across a series of fairly interesting operations is even scarier. As you say in the comments, it's acquired before recording the transaction commit in the clog, and released afterward. What you actually have is something like:

    AtCommit_NotifyBeforeCommit();
    HOLD_INTERRUPTS();
    s->state = TRANS_COMMIT;
    latestXid = RecordTransactionCommit(false);
    TRACE_POSTGRESQL_TRANSACTION_COMMIT(MyProc->lxid);
    ProcArrayEndTransaction(MyProc, latestXid);
    CallXactCallbacks(XACT_EVENT_COMMIT);
    ResourceOwnerRelease(TopTransactionResourceOwner,
                         RESOURCE_RELEASE_BEFORE_LOCKS,
                         true, true);
    AtEOXact_Buffers(true);
    AtEOXact_RelationCache(true);
    AtEarlyCommit_Snapshot();
    AtEOXact_Inval(true);
    smgrDoPendingDeletes(true);
    AtEOXact_MultiXact();
    AtCommit_NotifyAfterCommit();

That is a lot of stuff happening between the acquisition and release. There are two things particularly scary about this (to me):

* holding an LWLock while performing I/O (in smgrDoPendingDeletes())

* holding an LWLock while acquiring another LWLock (e.g. ProcArrayEndTransaction())

An LWLock-based deadlock is a hard deadlock -- not detected by the deadlock detector, and there's not much you can do even if it were -- right in the transaction-completing code. That means that the whole system would be locked up. I'm not sure that such a deadlock condition exists, but it seems likely (if not, it's only because other areas of the code avoided this practice), and it's hard to prove that it's safe. And it's probably bad for performance to increase the length of time transactions are serialized, even if only for cases that involve LISTEN/NOTIFY.

I believe this needs a re-think. What is the real purpose of AsyncCommitOrderLock, and can we achieve that another way? It seems that you're worried about a transaction that issues a LISTEN and commits not getting a notification from a NOTIFYing transaction that commits concurrently (and slightly after the LISTEN). But SignalBackends() is called after transaction commit, and should signal all backends who committed a LISTEN before that time, right?

* The transaction IDs are used because Send_Notify() is called before the AsyncCommitOrderLock acquire, and so the backend could potentially be reading uncommitted notifications that are "about" to be committed (or aborted). Then, the queue is not read further until that transaction completes. That's not really commented effectively, and I suspect the process could be simpler. For instance, why can't the backend always read all of the data from the queue, notifying if the transaction is committed and saving to a local list otherwise (which would be checked on the next wakeup)?

* Can you clarify the big comment at the top of async.c? It's a helpful overview, but it glosses over some of the touchy synchronization steps going on. I find myself trying to make a map of these steps.

Regards,
Jeff Davis
Hi Jeff,

thanks a lot for your review. I will reply to your review again in detail, but I'd like to answer your two main questions already now.

On Tue, Jan 19, 2010 at 8:08 AM, Jeff Davis <pgsql@j-davis.com> wrote:
> * AsyncCommitOrderLock
>
> I believe this needs a re-think. What is the real purpose of
> AsyncCommitOrderLock, and can we achieve that another way? It seems
> that you're worried about a transaction that issues a LISTEN and
> commits not getting a notification from a NOTIFYing transaction
> that commits concurrently (and slightly after the LISTEN).

Yes, that is exactly the point. However, I am not worried about a notification getting lost but rather about determining the visibility of the notifications. In the end we need to know the order of LISTEN, UNLISTEN and NOTIFY commits to find out who should receive which notifications. As you cannot determine retrospectively whether xid1 committed before or after xid2, I enforced the order with an LWLock and by saving the list of xids currently being committed. There are also two examples about that issue in http://archives.postgresql.org/pgsql-hackers/2009-12/msg00790.php

> But SignalBackends() is called after transaction commit, and should signal
> all backends who committed a LISTEN before that time, right?

Yes, any listening backend is being signaled, but that doesn't help to find out about the exact order of the almost-concurrent events that happened before.

> * The transaction IDs are used because Send_Notify() is called before
> the AsyncCommitOrderLock acquire, and so the backend could potentially
> be reading uncommitted notifications that are "about" to be committed
> (or aborted). Then, the queue is not read further until that transaction
> completes. That's not really commented effectively, and I suspect the
> process could be simpler. For instance, why can't the backend always
> read all of the data from the queue, notifying if the transaction is
> committed and saving to a local list otherwise (which would be checked
> on the next wakeup)?

It's true that the backends could always read up to the end of the queue and copy everything into local memory. However, you still need to apply the same checks before you deliver the notifications: you need to make sure that the transaction has committed and that you were listening to the channels of the notifications at the time they got sent / committed. Also you need to copy really _everything_, because you could start to listen to a channel after copying its uncommitted notifications. There are other reasons (tail pointer management and signaling strategy), but in the end it seemed more straightforward to stop as soon as we hit an uncommitted notification. We will receive a signal for it eventually anyway and can then start again and read further. Also I think (but I have no numbers about it) that it makes the backends work more on the same slru pages.

Joachim
Regarding the send_notify function, I have been working on it and have a patch (attached) that applies on top of Joachim's. It introduces a send_notify SQL function that calls the Async_Notify C function.

It's pretty straightforward and will need to be refined to take into account the encoding decisions that are made in Async_Notify, but it should be a good starting point, and it doesn't introduce much in the way of complexity.

On a side note, since that might be a discussion for another thread, it would be nice and more DBA friendly to have some kind of syntactic sugar around it, to be able to use

NOTIFY channel (SELECT col FROM table);

instead of

SELECT send_notify('channel', (SELECT col FROM table));

Best regards,
Arnaud Betremieux

On 19/01/2010 08:08, Jeff Davis wrote:
> [Jeff's full review snipped -- quoted in its entirety above]
On Wed, 2009-12-09 at 11:43 +0100, Joachim Wieland wrote:
> Examples:
>
> Backend 1:              Backend 2:
>
> transaction starts
> NOTIFY foo;
> commit starts
>                         transaction starts
>                         LISTEN foo;
>                         commit starts
> commit to clog
>                         commit to clog
>
> => Backend 2 will receive Backend 1's notification.

How does the existing notification mechanism solve this problem? Is it really a problem? Why would Backend 2 expect to receive the notification?

> Backend 1:              Backend 2:
>
> transaction starts
> NOTIFY foo;
> commit starts
>                         transaction starts
>                         UNLISTEN foo;
>                         commit starts
> commit to clog
>                         commit to clog
>
> => Backend 2 will not receive Backend 1's notification.

This is the same problem, except that it doesn't matter. A spurious notification is not a bug, right?

Regards,
Jeff Davis
Jeff Davis <pgsql@j-davis.com> writes:
> How does the existing notification mechanism solve this problem? Is it
> really a problem? Why would Backend 2 expect to receive the notification?

The intended way to use LISTEN/NOTIFY for status tracking is

1. LISTEN foo; (and commit the listen)
2. examine current database state
3. assume that we'll get a NOTIFY for any change that commits subsequently to what we saw in step 2

In the current implementation, a transaction that is in process of commit during step 1 might possibly not see your pg_listener record as committed, and so it might not send you a NOTIFY for whatever it did. If it still hasn't committed when you perform step 2, then you'd fail to see its changes as part of your initial state, *and* you'd not get a NOTIFY when the changes did become visible.

The way we prevent this race condition is that a listener takes exclusive lock on pg_listener before entering its record, and doesn't release the lock until after committing. Notifiers likewise take exclusive lock. This serializes things so that either the modifying transaction commits before the listener completes step 1 (and hence the listener will see its updates in step 2), or the listener is guaranteed to have a committed record in pg_listener when the modifying process determines whom to notify.

I guess Joachim is trying to provide a similar guarantee for the new implementation, but I'm not clear on why it would require locking. The new implementation is broadcast and ISTM it shouldn't require the modifying transaction to know which processes are listening.

I haven't read the patch but I agree that the description you give is pretty scary from a performance standpoint. More locks around transaction commit doesn't seem like a good idea. If they're only taken when an actual LISTEN or NOTIFY has happened in the current transaction, that'd be okay (certainly no worse than what happens now) but the naming suggested that this'd happen unconditionally.

			regards, tom lane
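As SQL, the recipe looks like this (inventory_state is a placeholder table; the point is only the ordering of the steps):

    LISTEN inventory;                        -- step 1, committed at once
    SELECT item, qty FROM inventory_state;   -- step 2: snapshot the state
    -- step 3: any change committed after step 2 is expected to arrive
    -- as a notification on 'inventory'; nothing in between may be lost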
On Tue, 2010-01-19 at 19:05 -0500, Tom Lane wrote:
> I guess Joachim is trying to provide a similar guarantee for the new
> implementation, but I'm not clear on why it would require locking.
> The new implementation is broadcast and ISTM it shouldn't require the
> modifying transaction to know which processes are listening.

I think there is a better way. I'll dig into it a little more.

> I haven't read the patch but I agree that the description you give is
> pretty scary from a performance standpoint. More locks around
> transaction commit doesn't seem like a good idea.

I was also worried about holding multiple LWLocks at once -- is such practice generally avoided in the rest of the code?

> If they're only taken
> when an actual LISTEN or NOTIFY has happened in the current transaction,
> that'd be okay (certainly no worse than what happens now) but the naming
> suggested that this'd happen unconditionally.

It appears that the locks are only taken when LISTEN or NOTIFY is involved.

Regards,
Jeff Davis
Jeff Davis <pgsql@j-davis.com> writes: > I was also worried about holding multiple LWLocks at once -- is such > practice generally avoided in the rest of the code? It's allowed but remember that there is no deadlock detection in lwlock.c. You must be very certain that there is only one possible order in which such locks could be taken. Interactions with heavyweight locks would be bad news as well. > It appears that the locks are only taken when LISTEN or NOTIFY is > involved. On the whole it might be better if a heavyweight lock were used, such that it'll automatically clean up after commit. (I'm still wondering if we couldn't do without the lock altogether though.) regards, tom lane
On Tue, 2010-01-19 at 19:24 -0500, Tom Lane wrote:
> Jeff Davis <pgsql@j-davis.com> writes:
>> I was also worried about holding multiple LWLocks at once -- is such
>> practice generally avoided in the rest of the code?
>
> It's allowed but remember that there is no deadlock detection in lwlock.c.
> You must be very certain that there is only one possible order in which
> such locks could be taken. Interactions with heavyweight locks would be
> bad news as well.

That was my worry initially.

> On the whole it might be better if a heavyweight lock were used,
> such that it'll automatically clean up after commit. (I'm still
> wondering if we couldn't do without the lock altogether though.)

Yes, I think there's a better way as well. I'll look into it.

Regards,
Jeff Davis
On Wed, Jan 20, 2010 at 1:05 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I guess Joachim is trying to provide a similar guarantee for the new > implementation, but I'm not clear on why it would require locking. > The new implementation is broadcast and ISTM it shouldn't require the > modifying transaction to know which processes are listening. It is rather about a listening backend seeing a notification in the global queue without knowing if it should deliver the notification to its frontend or not. The backend needs to know if its own LISTEN committed before or after the NOTIFY committed that it sees in the queue. As I have understood there is no way to find out if a transaction has committed before or after another transaction. If we had this, it would be easy without a lock. > I haven't read the patch but I agree that the description you give is > pretty scary from a performance standpoint. More locks around > transaction commit doesn't seem like a good idea. If they're only taken > when an actual LISTEN or NOTIFY has happened in the current transaction, > that'd be okay (certainly no worse than what happens now) but the naming > suggested that this'd happen unconditionally. The lock is taken exclusively by transactions doing LISTEN/UNLISTEN and in shared mode by transactions that execute only NOTIFY. It should really not degrade performance but I understand Jeff's concerns about deadlocks. Joachim
Joachim Wieland <joe@mcknight.de> writes: > On Wed, Jan 20, 2010 at 1:05 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I guess Joachim is trying to provide a similar guarantee for the new >> implementation, but I'm not clear on why it would require locking. > It is rather about a listening backend seeing a notification in the > global queue without knowing if it should deliver the notification to > its frontend or not. The backend needs to know if its own LISTEN > committed before or after the NOTIFY committed that it sees in the > queue. In that case I think you've way overcomplicated matters. Just deliver the notification. We don't really care if the listener gets additional notifications; the only really bad case would be if it failed to get an event that was generated after it committed a LISTEN. regards, tom lane
On Wed, Jan 20, 2010 at 5:14 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > In that case I think you've way overcomplicated matters. Just deliver > the notification. We don't really care if the listener gets additional > notifications; the only really bad case would be if it failed to get an > event that was generated after it committed a LISTEN. Okay, what about unprocessed notifications in the queue and a backend executing UNLISTEN: can we assume that it is not interested in notifications anymore once it executes UNLISTEN and discard all of them even though there might be notifications that have been sent (and committed) before the UNLISTEN committed? Joachim
Joachim Wieland <joe@mcknight.de> writes: > Okay, what about unprocessed notifications in the queue and a backend > executing UNLISTEN: can we assume that it is not interested in > notifications anymore once it executes UNLISTEN and discard all of > them even though there might be notifications that have been sent (and > committed) before the UNLISTEN committed? Yes. That is the case with the existing implementation as well, no? We don't consider sending notifies until transaction end, so anything that commits during the xact in which you UNLISTEN will get dropped. Again, a little bit of sloppiness here doesn't seem important. Issuing UNLISTEN implies the client is not interested anymore. regards, tom lane
On Wed, 2010-01-20 at 15:54 -0500, Tom Lane wrote:
> Joachim Wieland <joe@mcknight.de> writes:
>> Okay, what about unprocessed notifications in the queue and a backend
>> executing UNLISTEN: can we assume that it is not interested in
>> notifications anymore once it executes UNLISTEN and discard all of
>> them even though there might be notifications that have been sent (and
>> committed) before the UNLISTEN committed?
>
> Yes. That is the case with the existing implementation as well, no?
> We don't consider sending notifies until transaction end, so anything
> that commits during the xact in which you UNLISTEN will get dropped.

Only if the transaction containing UNLISTEN commits. Are you saying it would also be OK to drop NOTIFYs if a backend's UNLISTEN transaction aborts?

> Again, a little bit of sloppiness here doesn't seem important. Issuing
> UNLISTEN implies the client is not interested anymore.

Thinking out loud: if we're taking this approach, I wonder if it might be a good idea to PreventTransactionChain for LISTEN and UNLISTEN? It might simplify things for users because they wouldn't be expecting transaction-like behavior, except for the NOTIFYs themselves.

Regards,
Jeff Davis
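For context, the transaction-like behavior in question is simply that LISTEN and UNLISTEN only take effect at commit; a quick demonstration of today's semantics:

    BEGIN;
    LISTEN foo;
    ROLLBACK;      -- the LISTEN is discarded with the transaction;
                   -- this backend is not listening on 'foo' afterwards

    LISTEN foo;    -- outside a transaction block, the statement commits
                   -- on its own and the registration takes effect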
On Wed, Jan 20, 2010 at 11:08 PM, Jeff Davis <pgsql@j-davis.com> wrote: >> Yes. That is the case with the existing implementation as well, no? >> We don't consider sending notifies until transaction end, so anything >> that commits during the xact in which you UNLISTEN will get dropped. > > Only if the transaction containing UNLISTEN commits. Are you saying it > would also be OK to drop NOTIFYs if a backend's UNLISTEN transaction > aborts? If the backend's UNLISTEN transaction aborts, then it has never executed UNLISTEN... So it will continue to get notifications (if it has executed a LISTEN before). Joachim
Jeff Davis <pgsql@j-davis.com> writes: > On Wed, 2010-01-20 at 15:54 -0500, Tom Lane wrote: >> Yes. That is the case with the existing implementation as well, no? >> We don't consider sending notifies until transaction end, so anything >> that commits during the xact in which you UNLISTEN will get dropped. > Only if the transaction containing UNLISTEN commits. Are you saying it > would also be OK to drop NOTIFYs if a backend's UNLISTEN transaction > aborts? No, I would say not, but that wasn't being proposed was it? The decisions about what to do are only made at/after commit. > Thinking out loud: If we're taking this approach, I wonder if it might > be a good idea to PreventTransactionChain for LISTEN and UNLISTEN? That shouldn't be necessary IMO. There's never been such a restriction before. regards, tom lane
On Tue, 2010-01-19 at 19:24 -0500, Tom Lane wrote:
> (I'm still
> wondering if we couldn't do without the lock altogether though.)

Here's the problem as I see it: if we insert the notifications into the queue before actually recording the commit, there's a window in between where another backend could perform the expected sequence as you wrote:

1. LISTEN foo; (and commit the listen)
2. examine current database state
3. assume that we'll get a NOTIFY for any change that commits subsequently to what we saw in step 2

and miss the NOTIFYs, and not see the updated database state.

But I don't think that the NOTIFYs will actually be missed. Once put into the queue, the notification will only be removed from the queue after all backends have read it. But no backend will advance past it as long as the notification is from an uncommitted transaction. By the time the notifying transaction is committed, the listening transaction will also be committed, and therefore subscribed to the queue. The newly-listening backend will be awakened properly as well, because that's done after the notifying transaction commits, and therefore will wake up any listening transactions that committed earlier.

However, there's still a problem inserting into the queue when no backends are listening. Perhaps that can be solved right before we wake up the listening backends after the notifying transaction commits: if there are no listening backends, clear the queue.

We still might get spurious notifications if they were committed before the LISTEN transaction was committed. And we also might get spurious notifications if the UNLISTEN doesn't take effect quite quickly enough. Those are both acceptable.

If the above scheme is too complex, we can always use a heavyweight lock. However, there's no pg_listener, so it's not obvious what LOCKTAG to use. We can just pick something arbitrary, like the Oid of the new pg_listening() function, I suppose. Is there any precedent for that?

Thoughts?

Regards,
Jeff Davis
On Thu, Jan 21, 2010 at 3:06 AM, Jeff Davis <pgsql@j-davis.com> wrote:
> Here's the problem as I see it:

You are stating a lot of true facts, but I fail to see the actual problem... What exactly do you see as a problem? The only time you write "problem" is in this paragraph:

> However, there's still a problem inserting into the queue when no
> backends are listening. Perhaps that can be solved right before we wake
> up the listening backends after the notifying transaction commits: if
> there are no listening backends, clear the queue.

This gets already done: in SignalBackends(), a notifying transaction counts the number of listening backends. If no other backend is listening, then it signals itself so that the queue gets cleaned (i.e. the global pointer gets forwarded and the slru pages will be truncated if possible).

I have been working on simplifying the patch yesterday; I still need to adapt comments and review it again, but I am planning to post the new version tonight. Then we have a common base again to discuss further :-)

Thanks for your review,
Joachim
On Thu, 2010-01-21 at 10:14 +0100, Joachim Wieland wrote:
> On Thu, Jan 21, 2010 at 3:06 AM, Jeff Davis <pgsql@j-davis.com> wrote:
> > Here's the problem as I see it:
>
> You are stating a lot of true facts, but I fail to see the actual
> problem... What exactly do you see as a problem?

I worded that in a confusing way, I apologize. My point was that I don't think we need a lock, because I don't see any situation in which the notifications would be lost.

> I have been working on simplifying the patch yesterday; I still need
> to adapt comments and review it again, but I am planning to post the
> new version tonight. Then we have a common base again to discuss
> further :-)

Sounds good.

Regards,
Jeff Davis
On Thu, Jan 21, 2010 at 6:21 PM, Jeff Davis <pgsql@j-davis.com> wrote:
>> I have been working on simplifying the patch yesterday; I still need
>> to adapt comments and review it again, but I am planning to post the
>> new version tonight. Then we have a common base again to discuss
>> further :-)
>
> Sounds good.

Attached is a new version of the patch. I haven't changed it with regard to 2PC or the ASCII / bytea issues, but the transactional behavior is now less strict, which allows us to remove the AsyncCommitOrderLock.

I will leave the commitfest status on "Waiting on author", but I will be travelling for the weekend and wanted to share my current version.

Joachim
Comments:

* In standard_ProcessUtility(), case NotifyStmt, you add a comment:

    /* XXX the new listen/notify version can be enabled
     * for Hot Standby */

but I don't think that's correct. We may be able to support LISTEN on the standby, but not NOTIFY (right?). I don't think we should be adding speculative comments anyway, because there is some work still needed before HS can support LISTEN (notably, WAL-logging NOTIFY).

* You have a TODO list as a comment. Can you remove it and explain those items on list if they aren't already?

* You have the comment:

    /*
     * How long can a payload string possibly be? Actually it needs to be one
     * byte less to provide space for the trailing terminating '\0'.
     */

That should be written more simply, like "Maximum size of the payload, including terminating NULL."

* You have the Assert:

    Assert(strlen(n->payload) <= NOTIFY_PAYLOAD_MAX_LENGTH);

which is inconsistent with the earlier test:

    if (stmt->payload && strlen(stmt->payload) > NOTIFY_PAYLOAD_MAX_LENGTH - 1)
        ereport(ERROR, ...

* ASCII-only is still an open issue.

* 2PC is still an open issue (notifications are lost on restart, and there may be other problems as well). I think it's easy enough to throw an error on PREPARE TRANSACTION if there are any notifications, right?

* There's a bug where an UNLISTEN can abort, and yet you still miss the notification. This is because you unlisten before you actually commit the transaction, and an error between those times will cause the UNLISTEN to take effect even though the rest of the transaction fails. For example:

    -- session 1
    LISTEN foo;
    BEGIN;
    UNLISTEN foo;

    -- session 2
    NOTIFY foo;

    -- gdb in session 1
    (gdb) break AtCommit_NotifyBeforeCommit
    (gdb) c

    -- session 1
    COMMIT;

    -- gdb in session 1
    (gdb) finish
    (gdb) p op_strict(7654322)
    (gdb) quit

The notification is missed. It's fixed easily enough by doing the UNLISTEN step in AtCommit_NotifyAfterCommit.

I'm still looking through some of the queue stuff, and I'll send an update soon. I wanted to give you some feedback now though.

Regards,
Jeff Davis
On Sat, Jan 30, 2010 at 12:02 AM, Jeff Davis <pgsql@j-davis.com> wrote:
> Comments:
>
> * In standard_ProcessUtility(), case NotifyStmt, you add a comment:
>
>     /* XXX the new listen/notify version can be enabled
>      * for Hot Standby */
>
> but I don't think that's correct. We may be able to support LISTEN
> on the standby, but not NOTIFY (right?). I don't think we should
> be adding speculative comments anyway, because there is some work
> still needed before HS can support LISTEN (notably, WAL-logging
> NOTIFY).

I admit that it was not clear what I meant. The comment should only address LISTEN / NOTIFY on the standby server. Do you see any problems allowing it? Of course it's not all that useful because all transactions are read-only, but it still allows different clients to communicate. Listening on the standby server to notifications from the primary server is not currently possible, but from my point of view that is a completely new use case for LISTEN/NOTIFY, as it is not local but across servers.

> * You have a TODO list as a comment. Can you remove it and explain
> those items on list if they aren't already?

Sure. I was wondering if we should have a hard limit on the maximum number of notifications per transaction. You can now easily fill up your backend's memory with notifications. However, we had the same problem with the old implementation already and nobody complained. The difference is just that now you can fill it up a lot faster because you can send a large payload.

The second doubt I had is about the truncation behavior of slru. ISTM that it doesn't truncate at the end of the page range once the head pointer has already wrapped around. There is the following comment in slru.c describing this fact:

/*
 * While we are holding the lock, make an important safety check: the
 * planned cutoff point must be <= the current endpoint page. Otherwise we
 * have already wrapped around, and proceeding with the truncation would
 * risk removing the current segment.
 */

I wanted to check if we can do anything about it and if we need to do anything at all...

> * You have the comment:
> That should be written more simply [...]

done

> * You have the Assert:

done

> * ASCII-only is still an open issue.

still an open issue...

> * 2PC is still an open issue (notifications are lost on restart,
> and there may be other problems as well). I think it's easy enough
> to throw an error on PREPARE TRANSACTION if there are any
> notifications, right?

We could forbid them, yes. Notifications aren't lost on restart, however; they get recorded in the 2PC state files. Currently the patch works fine with 2PC (including server restart), with the exception that in the queue-full case we might need to roll back the prepared transaction. Now the question is, should we forbid NOTIFY for 2PC altogether only because in the unlikely event of a full queue we cannot guarantee that we can commit the transaction?

One solution is to treat a 2PC transaction like a backend with its own pointer to the queue. As long as the prepared transaction is not committed, its pointer does not move and so we don't move forward the global tail pointer. Here the NOTIFYs sent by the 2PC transaction are already in the queue and the transaction can always commit. The drawback of this idea is that if you forget to commit the prepared transaction and leave it around uncommitted, your queue will fill up inevitably because you do not truncate anymore...

> * There's a bug where an UNLISTEN can abort, and yet you still miss
> the notification.
> [...]
> The notification is missed. It's fixed easily enough by doing the
> UNLISTEN step in AtCommit_NotifyAfterCommit.

Thanks, very well spotted... Actually the same is true for LISTEN... I have reworked the patch to do the changes to listenChannels only in the post-commit functions.

I have also included Arnaud Betremieux's send_notify function. Should we really call the function send_notify? What about pg_send_notification or just pg_notify? I am not a native English speaker, but to me it sounds strange to send a verb (notify), and I'd rather prefix the name with pg_...? Currently the function always returns void. Is this the preferred return type?

Joachim
On Wed, 2010-02-03 at 00:10 +0100, Joachim Wieland wrote:
> I admit that it was not clear what I meant. The comment should only
> address LISTEN / NOTIFY on the standby server. Do you see any problems
> allowing it?

The original comment was a part of the NotifyStmt case, and I don't think we can support NOTIFY issued on a standby system -- surely there's no way for the standby to communicate the notification to the master. Anyway, this is getting a little sidetracked; I don't think we need to worry about HS right now.

> I was wondering if we should have a hard limit on the maximum number
> of notifications per transaction. You can now easily fill up your
> backend's memory with notifications. However, we had the same problem
> with the old implementation already and nobody complained. The
> difference is just that now you can fill it up a lot faster because
> you can send a large payload.

I don't see a need for that. The only way that is likely to happen is with triggers, and there are other ways to fill up the memory using triggers. Even if the option is there, all we can do is abort.

> The second doubt I had is about the truncation behavior of slru. ISTM
> that it doesn't truncate at the end of the page range once the head
> pointer has already wrapped around.
> There is the following comment in slru.c describing this fact:
>
> /*
>  * While we are holding the lock, make an important safety check: the
>  * planned cutoff point must be <= the current endpoint page. Otherwise we
>  * have already wrapped around, and proceeding with the truncation would
>  * risk removing the current segment.
>  */
>
> I wanted to check if we can do anything about it and if we need to do
> anything at all...

I'll have to take a look in more detail.

> Now the question is, should we forbid NOTIFY for 2PC
> altogether only because in the unlikely event of a full queue we
> cannot guarantee that we can commit the transaction?

I believe that was the consensus in the thread. The people who use 2PC are the people that want the most rock-solid guarantees. I don't like forbidding feature combinations, but I don't see a good alternative.

> One solution is to treat a 2PC transaction like a backend with its own
> pointer to the queue. As long as the prepared transaction is not
> committed, its pointer does not move and so we don't move forward the
> global tail pointer. Here the NOTIFYs sent by the 2PC transaction are
> already in the queue and the transaction can always commit. The
> drawback of this idea is that if you forget to commit the prepared
> transaction and leave it around uncommitted, your queue will fill up
> inevitably because you do not truncate anymore...

There's also a problem if the power goes out, you restart, and then the queue fills up before you COMMIT PREPARED.

> Thanks, very well spotted... Actually the same is true for LISTEN... I
> have reworked the patch to do the changes to listenChannels only in
> the post-commit functions.

I'm worried that this creates the opposite problem: that a LISTEN transaction might commit before a NOTIFY transaction, and yet miss the notification. It seems safest to me to add a backend (LISTEN) to the list before commit, and remove a backend (UNLISTEN) after commit. That way we are sure to only receive spurious notifications, and can't miss any.

> I have also included Arnaud Betremieux's send_notify function. Should
> we really call the function send_notify? What about
> pg_send_notification or just pg_notify?

Agreed: pg_notify sounds good to me.

> Is [void] the preferred return type?

I can't think of anything else that it might return. NOTIFY doesn't return much ;)

Regards,
Jeff Davis
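With the rename agreed here, usage would look like this (assuming the signature discussed in this subthread: pg_notify(channel text, payload text) returning void):

    -- Equivalent to the utility statement, but usable wherever an
    -- expression is allowed -- e.g. per row in a SELECT, or inside
    -- a PL/pgSQL trigger body:
    SELECT pg_notify('jobs', 'id=42;action=refresh');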
On Wed, Feb 3, 2010 at 2:05 AM, Jeff Davis <pgsql@j-davis.com> wrote:
>> Thanks, very well spotted... Actually the same is true for LISTEN... I
>> have reworked the patch to do the changes to listenChannels only in
>> the post-commit functions.
>
> I'm worried that this creates the opposite problem: that a LISTEN
> transaction might commit before a NOTIFY transaction, and yet miss the
> notification.

See the following comment and let me know if you agree...

! /*
!  * Exec_ListenBeforeCommit --- subroutine for AtCommit_NotifyBeforeCommit
!  *
!  * Note that we do only set our pointer here and do not yet add the channel to
!  * listenChannels. Since our transaction could still roll back we do this only
!  * after commit. We know that our tail pointer won't move between here and
!  * directly after commit, so we won't miss a notification.
!  */

However, this introduces a new problem when an initial LISTEN aborts: then we are not listening to anything, but to other backends it looks like we were. This is tracked by the boolean variable backendExecutesInitialListen and gets cleaned up in AtAbort_Notify().

> It seems safest to me to add a backend (LISTEN) to the list before
> commit, and remove a backend (UNLISTEN) after commit. That way we are
> sure to only receive spurious notifications, and can't miss any.

If a LISTEN aborted, we would not only receive a few spurious notifications from it but would receive notifications on this channel forever, even though we have never executed LISTEN on it successfully.

Joachim
On Wed, Feb 3, 2010 at 4:34 AM, Joachim Wieland <joe@mcknight.de> wrote:
> [discussion of Exec_ListenBeforeCommit snipped -- quoted in its entirety above]

Jeff, do you think this patch is ready for committer? If so, please mark it as such on commitfest.postgresql.org - otherwise, please clarify what you think the action items are.

Thanks,

...Robert
On Sun, 2010-02-07 at 00:18 -0500, Robert Haas wrote:
> Jeff, do you think this patch is ready for committer? If so, please
> mark it as such on commitfest.postgresql.org - otherwise, please
> clarify what you think the action items are.

I'll post an update tomorrow.

Regards,
Jeff Davis
On Wed, Feb 3, 2010 at 2:05 AM, Jeff Davis <pgsql@j-davis.com> wrote:
> The original comment was a part of the NotifyStmt case, and I don't
> think we can support NOTIFY issued on a standby system -- surely there's
> no way for the standby to communicate the notification to the master.
> Anyway, this is getting a little sidetracked; I don't think we need to
> worry about HS right now.

True, but I was not talking about moving any notifications to different servers. Clients listening on one server should receive the notifications from NOTIFYs executed on this server, no matter if it is a standby or the master server. It seems to me that currently we cannot support Listen/Notify on the Hot Standby because it calls GetCurrentTransactionId(), which is not available there. On the other hand, this is only done because we fear that a transaction could roll back while actually committing. On the read-only standby, however, a transaction that commits always commits successfully because it has not changed any data (right?)...

>> The second doubt I had is about the truncation behavior of slru. ISTM
>> that it doesn't truncate at the end of the page range once the head
>> pointer has already wrapped around.
>> There is the following comment in slru.c describing this fact:
>>
>> /*
>>  * While we are holding the lock, make an important safety check: the
>>  * planned cutoff point must be <= the current endpoint page. Otherwise we
>>  * have already wrapped around, and proceeding with the truncation would
>>  * risk removing the current segment.
>>  */
>>
>> I wanted to check if we can do anything about it and if we need to do
>> anything at all...
>
> I'll have to take a look in more detail.

This is still kind of an open item, but it's an slru issue and should also be true for other functionality that uses slru queues.

>> Now the question is, should we forbid NOTIFY for 2PC
>> altogether only because in the unlikely event of a full queue we
>> cannot guarantee that we can commit the transaction?
>
> I believe that was the consensus in the thread. The people who use 2PC
> are the people that want the most rock-solid guarantees. I don't like
> forbidding feature combinations, but I don't see a good alternative.

If this was the consensus, then fine... I have forbidden it in the latest patch now.

>> One solution is to treat a 2PC transaction like a backend with its own
>> pointer to the queue. As long as the prepared transaction is not
>> committed, its pointer does not move and so we don't move forward the
>> global tail pointer. Here the NOTIFYs sent by the 2PC transaction are
>> already in the queue and the transaction can always commit. The
>> drawback of this idea is that if you forget to commit the prepared
>> transaction and leave it around uncommitted, your queue will fill up
>> inevitably because you do not truncate anymore...
>
> There's also a problem if the power goes out, you restart, and then the
> queue fills up before you COMMIT PREPARED.

This I don't understand... If power goes out and we restart, we'd first put all notifications from the prepared transactions into the queue. We know that they fit because they have fit earlier as well (we wouldn't allow user connections until we have worked through all 2PC state files).

> I'm worried that this creates the opposite problem: that a LISTEN
> transaction might commit before a NOTIFY transaction, and yet miss the
> notification.

(see my other email on this item...)

>> I have also included Arnaud Betremieux's send_notify function. Should
>> we really call the function send_notify? What about
>> pg_send_notification or just pg_notify?
>
> Agreed: pg_notify sounds good to me.

Function name changed.

There was another problem that the slru files did not all get deleted at server restart, which is fixed now.

Regarding the famous ASCII-restriction open item, I have now realized what I hadn't thought of previously: notifications are not transferred between databases, they always stay in one database. Since libpq does the conversion between server and client encoding, it is questionable if we really need to restrict this at all... But I am not an encoding expert, so whoever feels like he can confirm or refute this, please speak up.

Joachim
Joachim Wieland wrote:

> + typedef struct AsyncQueueEntry
> + {
> +     /*
> +      * this record has the maximal length, but usually we limit it to
> +      * AsyncQueueEntryEmptySize + strlen(payload).
> +      */
> +     Size            length;
> +     Oid             dboid;
> +     TransactionId   xid;
> +     int32           srcPid;
> +     char            channel[NAMEDATALEN];
> +     char            payload[NOTIFY_PAYLOAD_MAX_LENGTH];
> + } AsyncQueueEntry;
> + #define AsyncQueueEntryEmptySize \
> +     (sizeof(AsyncQueueEntry) - NOTIFY_PAYLOAD_MAX_LENGTH + 1)

These are the on-disk notifications, right? It seems to me a bit wasteful to store the channel name always as NAMEDATALEN bytes. Can we truncate it at its strlen? I realize that this would make the struct definition uglier (you will no longer be able to have both channel and payload pointers, only a char[1] pointer to a data area to which you write both). Typical channel names should be short, so IMHO this is worthwhile. Besides, I think the uglification of code this causes should be fairly contained ...

-- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
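To make the suggestion concrete, here is a compilable toy sketch of such a single-data-area layout. All names (SketchQueueEntry, entry_size, entry_payload) are invented stand-ins, plain C types replace Size/Oid/TransactionId, and this is not the patch's actual code:

    /* Toy sketch, not PostgreSQL code: channel and payload share one
     * variable-length data area, NUL-terminated back to back. */
    #include <stddef.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    typedef struct SketchQueueEntry
    {
        size_t      length;     /* total size of this entry on disk */
        unsigned    dboid;      /* stand-in for Oid */
        unsigned    xid;        /* stand-in for TransactionId */
        int         srcPid;
        char        data[1];    /* channel '\0' payload '\0', variable */
    } SketchQueueEntry;

    /* bytes needed for a given channel/payload pair */
    static size_t
    entry_size(const char *channel, const char *payload)
    {
        return offsetof(SketchQueueEntry, data)
             + strlen(channel) + 1 + strlen(payload) + 1;
    }

    /* the payload starts right after the channel's terminating NUL */
    static const char *
    entry_payload(const SketchQueueEntry *qe)
    {
        return qe->data + strlen(qe->data) + 1;
    }

    int
    main(void)
    {
        const char *channel = "jobs";       /* example names only */
        const char *payload = "id=42";
        SketchQueueEntry *qe = malloc(entry_size(channel, payload));

        qe->length = entry_size(channel, payload);
        strcpy(qe->data, channel);
        strcpy(qe->data + strlen(channel) + 1, payload);

        printf("channel=%s payload=%s size=%zu\n",
               qe->data, entry_payload(qe), qe->length);
        free(qe);
        return 0;
    }

The entry shrinks to the channel's actual length, at the price of locating the payload with a strlen() walk instead of reading a named field.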
In this version of the patch, there is a compiler warning:

    async.c: In function ‘AtPrepare_Notify’:
    async.c:593: warning: unused variable ‘p’

and also two trivial merge conflicts: an OID conflict for the functions you added, and a minor code conflict.

On Sun, 2010-02-07 at 17:32 +0100, Joachim Wieland wrote:
> On Wed, Feb 3, 2010 at 2:05 AM, Jeff Davis <pgsql@j-davis.com> wrote:
> > The original comment was a part of the NotifyStmt case, and I don't
> > think we can support NOTIFY issued on a standby system -- surely there's
> > no way for the standby to communicate the notification to the master.
> > Anyway, this is getting a little sidetracked; I don't think we need to
> > worry about HS right now.
>
> True but I was not talking about moving any notifications to different
> servers. Clients listening on one server should receive the
> notifications from NOTIFYs executed on this server, no matter if it is
> a standby or the master server.

I'm not sure I agree with that philosophy. If the driving use-case for LISTEN/NOTIFY is cache invalidation, then a NOTIFY on the master should make it to all listening backends on the slaves.

> This is still kind of an open item but it's an slru issue and should
> also be true for other functionality that uses slru queues.

I haven't found out anything new here.

> This I don't understand... If power goes out and we restart, we'd
> first put all notifications from the prepared transactions into the
> queue. We know that they fit because they have fit earlier as well (we
> wouldn't allow user connections until we have worked through all 2PC
> state files).

OK, it appears you've thought the 2PC interaction through more than I have. Even if we don't include it this time, I'm glad to hear that there is a plan to do so. Feel free to include support if you have it ready, but it's late in the CF so I don't want you to get sidetracked on that issue.

> There was another problem that the slru files did not all get deleted
> at server restart, which is fixed now.

Good catch.

> Regarding the famous ASCII-restriction open item I have now realized
> what I haven't thought of previously: notifications are not
> transferred between databases, they always stay in one database. Since
> libpq does the conversion between server and client encoding, it is
> questionable if we really need to restrict this at all... But I am not
> an encoding expert so whoever feels like he can confirm or refute
> this, please speak up.

I would like to support encoded text, but I think there are other problems. For instance, what if one server has a client_encoding that doesn't support some of the glyphs being sent by the notifying backend? Then we have a mess, because we can't deliver it. I think we just have to say "ASCII only" here, because there's no reasonable way to handle this, regardless of implementation. If the user issues SELECT and the client_encoding can't support some of the glyphs, we can throw an error. But for a notification? We just have no mechanism for that.

Regards, Jeff Davis
On Mon, 2010-02-08 at 22:13 -0800, Jeff Davis wrote:
> I would like to support encoded text, but I think there are other
> problems. For instance, what if one server has a client_encoding that
> doesn't support some of the glyphs being sent by the notifying backend?
> Then we have a mess, because we can't deliver it.

I was thinking more about this. It seems clear that we want the backend that issues the notify to only put 7-bit ASCII in the payload. But if the client sends the letters 'string' as a payload, and the representation in the server encoding is something other than the normal 7-bit ASCII representation of 'string', it will be incorrect, right?

Looking at the documentation, it appears that all of the server encodings represent 7-bit ASCII characters using the same 7-bit ASCII representation. Is that true?

Regards, Jeff Davis
Jeff Davis <pgsql@j-davis.com> writes: > Looking at the documentation, it appears that all of the server > encodings represent 7-bit ascii characters using the same 7-bit ascii > representation. Is that true? Correct. We only support ASCII-superset encodings, both for frontend and backend. Limiting NOTIFY payloads to 7-bit would definitely avoid the issue. The question is if that's more of a pain than a benefit. regards, tom lane
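Concretely, the restriction under discussion amounts to a per-byte check like the following sketch; payload_is_ascii is a made-up name for illustration, not the patch's actual code:

    /* Illustrative sketch, not PostgreSQL code. */
    #include <stdbool.h>
    #include <stdio.h>

    /* true if every byte of s is 7-bit ASCII, and therefore has the same
     * representation in all supported server and client encodings */
    static bool
    payload_is_ascii(const char *s)
    {
        for (; *s != '\0'; s++)
        {
            if ((unsigned char) *s > 127)
                return false;
        }
        return true;
    }

    int
    main(void)
    {
        printf("%d\n", payload_is_ascii("plain ascii payload"));   /* 1 */
        printf("%d\n", payload_is_ascii("caf\xc3\xa9"));           /* 0 */
        return 0;
    }

Because every supported encoding maps bytes 0-127 identically, a payload that passes this check reads the same under any client_encoding.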
On Tue, Feb 9, 2010 at 4:51 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Jeff Davis <pgsql@j-davis.com> writes: >> Looking at the documentation, it appears that all of the server >> encodings represent 7-bit ascii characters using the same 7-bit ascii >> representation. Is that true? > > Correct. We only support ASCII-superset encodings, both for frontend > and backend. > > Limiting NOTIFY payloads to 7-bit would definitely avoid the issue. > The question is if that's more of a pain than a benefit. I think it's a reasonable restriction for now. We have limited time remaining here; and we can always consider relaxing the restriction in the future when we have more time to think through the issues. It'll still be a big improvement over what we have now. ...Robert
On Tue, 2010-02-09 at 16:51 -0500, Tom Lane wrote:
> Limiting NOTIFY payloads to 7-bit would definitely avoid the issue.
> The question is if that's more of a pain than a benefit.

I don't see any alternative. If one backend sends a NOTIFY payload that contains a non-ASCII character, there's a risk that we won't be able to deliver it to another backend with a client_encoding that can't represent that character.

Also, just the fact that client_encoding can be changed at pretty much any time is a potential problem, because it's difficult to know whether a particular notification was sent using the old client_encoding or the new one (because it's asynchronous).

Regards, Jeff Davis
Jeff Davis wrote: > On Tue, 2010-02-09 at 16:51 -0500, Tom Lane wrote: >> Limiting NOTIFY payloads to 7-bit would definitely avoid the issue. >> The question is if that's more of a pain than a benefit. > > I don't see any alternative. If one backend sends a NOTIFY payload that Wouldn't binary payloads be an alternative? NOTE: I may have missed this discussion. Sorry if it has already been covered. -- Andrew Chernow eSilo, LLC every bit counts http://www.esilo.com/
On Tue, 2010-02-09 at 19:02 -0500, Andrew Chernow wrote:
> Wouldn't binary payloads be an alternative? NOTE: I may have missed this
> discussion. Sorry if it has already been covered.

The PGnotify struct has a "char *" field, which can't hold embedded NULL bytes, so it can't really be binary. But it can't be arbitrary text either, because it has to be encoded in a way that works for every possible client encoding (otherwise there's a possibility of an error, and no way to handle it).

Also, the query starts out as text, so we need a way to interpret the text in an encoding-independent way.

So, I think ASCII is the natural choice here.

Regards, Jeff Davis
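For reference, this is roughly how a libpq client receives notifications; PQconsumeInput(), PQnotifies(), and PQfreemem() are the documented libpq calls, while the channel name "jobs" is just an example. It also shows why the payload must be text: the extra field is a NUL-terminated char *, so embedded zero bytes cannot survive:

    /* Minimal libpq listener; channel name is an example only. */
    #include <stdio.h>
    #include <sys/select.h>
    #include <libpq-fe.h>

    int
    main(void)
    {
        PGconn   *conn = PQconnectdb("");   /* connection info from environment */
        PGnotify *notify;

        if (PQstatus(conn) != CONNECTION_OK)
        {
            fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
            return 1;
        }
        PQclear(PQexec(conn, "LISTEN jobs"));

        for (;;)
        {
            int    sock = PQsocket(conn);
            fd_set input_mask;

            FD_ZERO(&input_mask);
            FD_SET(sock, &input_mask);
            if (select(sock + 1, &input_mask, NULL, NULL, NULL) < 0)
                break;                      /* select() failed */

            PQconsumeInput(conn);           /* read whatever arrived */
            while ((notify = PQnotifies(conn)) != NULL)
            {
                /* relname is the channel; extra is the (possibly empty) payload */
                printf("notify on \"%s\" from pid %d: %s\n",
                       notify->relname, notify->be_pid, notify->extra);
                PQfreemem(notify);
            }
        }
        PQfinish(conn);
        return 0;
    }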
Jeff Davis wrote:
> Also, the query starts out as text, so we need a way to interpret the
> text in an encoding-independent way.
>
> So, I think ASCII is the natural choice here.

It's not worth hanging up this facility over this issue, ISTM. If we want something more than ASCII, then a base64 or hex encoded string could possibly meet the need in the first instance.

cheers

andrew
Andrew Dunstan <andrew@dunslane.net> writes:
> Jeff Davis wrote:
>> So, I think ASCII is the natural choice here.

> It's not worth hanging up this facility over this issue, ISTM. If we
> want something more than ASCII then a base64 or hex encoded string could
> possibly meet the need in the first instance.

Yeah, that would serve people who want to push either binary or non-ASCII data through the pipe. It would leave all risks of encoding problems on the user's head, though.

regards, tom lane
Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> Jeff Davis wrote:
>>> So, I think ASCII is the natural choice here.
>> It's not worth hanging up this facility over this issue, ISTM. If we
>> want something more than ASCII then a base64 or hex encoded string could
>> possibly meet the need in the first instance.
> Yeah, that would serve people who want to push either binary or
> non-ASCII data through the pipe. It would leave all risks of encoding
> problems on the user's head, though.

True. It's a workaround, but I think it's acceptable at this stage. We need to get some experience with this facility before we can refine it.

cheers

andrew
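A sketch of that workaround in practice, with a made-up channel name and sample bytes; encode()/decode() are the standard built-ins, and pg_notify() is the function form of NOTIFY added by this patch:

    /* Sketch of the base64 workaround; channel and data are examples. */
    #include <stdio.h>
    #include <libpq-fe.h>

    int
    main(void)
    {
        PGconn   *conn = PQconnectdb("");   /* connection info from environment */
        PGresult *res;

        if (PQstatus(conn) != CONNECTION_OK)
        {
            fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
            return 1;
        }

        /*
         * decode() builds a bytea from hex digits; encode(..., 'base64')
         * turns it into 7-bit-safe text that fits the ASCII-only payload.
         * Listeners recover the bytes with decode(payload, 'base64').
         */
        res = PQexec(conn,
                     "SELECT pg_notify('jobs', "
                     "encode(decode('00ff10', 'hex'), 'base64'))");
        if (PQresultStatus(res) != PGRES_TUPLES_OK)
            fprintf(stderr, "notify failed: %s", PQerrorMessage(conn));

        PQclear(res);
        PQfinish(conn);
        return 0;
    }

The listener receives 7-bit-safe text and applies decode(payload, 'base64') to recover the original bytes, so any encoding risk stays with the application, as noted above.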
Andrew Dunstan wrote:
> Tom Lane wrote:
>> Andrew Dunstan <andrew@dunslane.net> writes:
>>> Jeff Davis wrote:
>>>> So, I think ASCII is the natural choice here.
>>> It's not worth hanging up this facility over this issue, ISTM.
>>> If we want something more than ASCII then a base64 or hex
>>> encoded string could possibly meet the need in the first
>>> instance.
>> Yeah, that would serve people who want to push either binary or
>> non-ASCII data through the pipe. It would leave all risks of encoding
>> problems on the user's head, though.
>
> True. It's a workaround, but I think it's acceptable at this stage.
> We need to get some experience with this facility before we can
> refine it.

Hmm? If we decide now that it's not going to have encoding conversion, we won't be able to change it later.

-- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Andrew Dunstan wrote:
>> True. It's a workaround, but I think it's acceptable at this stage.
>> We need to get some experience with this facility before we can
>> refine it.

> Hmm? If we decide now that it's not going to have encoding conversion,
> we won't be able to change it later.

How so? If the feature currently allows only ASCII data, the behavior would be upward compatible with a future version that is able to accept and convert non-ASCII characters.

regards, tom lane
On Mon, Feb 8, 2010 at 5:16 PM, Alvaro Herrera <alvherre@commandprompt.com> wrote:
> These are the on-disk notifications, right? It seems to me a bit
> wasteful to store channel name always as NAMEDATALEN bytes. Can we
> truncate it at its strlen?

Attached is a new and hopefully more or less final patch for LISTEN / NOTIFY. The following items have been addressed in this patch:

- only store strlen(channel) instead of NAMEDATALEN bytes on disk
- limit to 7-bit ASCII
- forbid 2PC and LISTEN/NOTIFY for now
- documentation changes
- add missing tab completion for NOTIFY
- fix pg_notify() behavior with respect to NULLs, too long and too short parameters
- rebased to current HEAD, OID conflicts resolved

Joachim
On Thu, 2010-02-11 at 00:52 +0100, Joachim Wieland wrote:
> Attached is a new and hopefully more or less final patch for LISTEN / NOTIFY.

Some minor review comments without having taken in much of previous discussion:

* Patch doesn't apply cleanly anymore, close.

* In async.c you say "new async notification model". Please say "async notification model in 9.0 is". In (2) you say there is a central queue, but don't describe the purpose of the queue or what it contains. Some other typos in header comments.

* There is no mention of what to do with pg_notify at checkpoint. Look at how pg_subtrans handles this. Should pg_notify do the same?

* Is there a lazy backend avoidance scheme as exists for relcache invalidation messages? See storage/ipc/sinval... OK, I see asyncQueueFillWarning(), but nowhere else says it exists and there aren't any comments in it to say what it does or why.

* What do you expect the DBA to do when they receive a queue fill warning? Where is that written down?

* Not clear on the purpose of backendSendsNotifications. In AtCommit_NotifyAfterCommit() the logic seems strange. The code is

    /* Allow transactions that have not executed LISTEN/UNLISTEN/NOTIFY to
     * return as soon as possible */
    if (!pendingActions && !backendSendsNotifications)
        return;

but since backendSendsNotifications is set true earlier, there is no fast path.

* AtSubCommit_Notify doesn't seem to have a fast path when no notify commands have been executed.

* In Send_Notify() you throw ERROR while still holding the lock. It seems better practice to drop the lock, then ERROR.

* Why is Send_Notify() a separate function? It's only called from one point in the code.

* We know that sometimes a FATAL error can occur without properly processing the abort. Do we depend upon this never happening?

* Can we make NUM_ASYNC_BUFFERS = 8? 4 just seems too small.

* backendSendsNotifications is future tense. The variable should be called something like hasSentNotifications. A couple of other variables have similar names.

Hope that helps

-- Simon Riggs www.2ndQuadrant.com
On Sun, Feb 14, 2010 at 2:46 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Some minor review comments without having taken in much of previous
> discussion:
>
> * Patch doesn't apply cleanly anymore, close.

New version is rebased against current HEAD.

> * In async.c you say "new async notification model". Please say "async
> notification model in 9.0 is".

I haven't changed that line at all... ;-) But the fact that the former "new async model" is now the old one really shows that a version number makes sense here.

> In (2) you say there is a central queue,

fixed with some additional comments.

> * There is no mention of what to do with pg_notify at checkpoint. Look
> at how pg_subtrans handles this. Should pg_notify do the same?

Actually we don't care... We even hope that the pg_notify pages are not flushed at all. Notifications don't survive a server restart anyway, and upon restart we just delete whatever is in the directory.

> * Is there a lazy backend avoidance scheme as exists for relcache
> invalidation messages? see storage/ipc/sinval...

We cannot avoid a lazy backend. If a listening backend is (probably idle) in a running transaction, we don't clean up the notification queue anymore but wait for the slowest backend to do it. But there are other effects of long-running transactions that you will probably notice long before your notification queue gets filled up.

> OK, I see asyncQueueFillWarning() but nowhere else says it exists and
> there aren't any comments in it to say what it does or why
> * What do you expect the DBA to do when they receive a queue fill
> warning? Where is that written down?

Comment added, as well as an errdetail for whoever sees this message:

    WARNING: pg_notify queue is more than 50% full. Among the slowest backends: 27744
    DETAIL: Cleanup can only proceed if this backend ends its current transaction

I have also added a note in the NOTIFY documentation about this.

> * Not clear of the purpose of backendSendsNotifications. In
> AtCommit_NotifyAfterCommit() the logic seems strange.

What exactly is the strange logic that you're seeing? The fast path is for transactions that do neither LISTEN/UNLISTEN nor NOTIFY. Without LISTEN/UNLISTEN they have pendingActions = NIL, and without NOTIFY they have backendSendsNotifications = false and will return immediately. backendSendsNotifications is only set to true if the backend has executed NOTIFY...

> * AtSubCommit_Notify doesn't seem to have a fastpath when no notify
> commands have been executed.

I haven't changed this function at all. I don't see much benefit either, as it's just concatenating empty lists. This immediately returns from the respective list handling function.

> * In Send_Notify() you throw ERROR while still holding the lock. It
> seems better practice to drop the lock then ERROR.

Explicit LWLockRelease() added.

> * Why is Send_Notify() a separate function? It's only called from one
> point in the code.

I have now inlined Send_Notify().

> * We know that sometimes a FATAL error can occur without properly
> processing the abort. Do we depend upon this never happening?

What exactly could happen? I thought that a backend either does a clean shutdown or that the postmaster starts over. We depend on a listening backend to execute its on_shmem_exit() handlers before leaving the game...

> * Can we make NUM_ASYNC_BUFFERS = 8? 4 just seems too small

I have done some tests with NUM_ASYNC_BUFFERS and couldn't find any measurable performance difference at all, so I kept it at 4. 8 isn't exaggerated either and seems to be a fine number, though. Changed to 8.

> * backendSendsNotifications is future tense. The variable should be
> called something like hasSentNotifications. Couple of other variables
> with similar names

Variable names changed as proposed.

New patch attached, thanks for the review.

Joachim
On Sun, 2010-02-14 at 17:22 +0100, Joachim Wieland wrote: > > * There is no mention of what to do with pg_notify at checkpoint. > Look > > at how pg_subtrans handles this. Should pg_notify do the same? > > Actually we don't care... We even hope that the pg_notify pages are > not flushed at all. Notifications don't survive a server restart > anyway and upon restart we just delete whatever is in the directory. Suspected that was true, just checking it was commented somewhere. -- Simon Riggs www.2ndQuadrant.com
Joachim Wieland <joe@mcknight.de> writes: > + #define ERRCODE_TOO_MANY_ENTRIES MAKE_SQLSTATE('5','4', '0','3','1') Do you have any evidence that there is actually a DB2 error code matching this, or is this errcode just invented? The one page Google finds doesn't list it: http://publib.boulder.ibm.com/iseries/v5r1/ic2924/index.htm?info/rzala/rzalastc.html regards, tom lane
On Sun, 2010-02-14 at 17:22 +0100, Joachim Wieland wrote:
> New patch attached, thanks for the review.

Next set of questions:

* Will this work during Hot Standby now? The barrier was that it wrote to a table and so we could not allow that. ISTM this new version can and should work with Hot Standby. Can you test that and, if so, remove the explicit barrier code and change tests and docs to enable it?

* We also discussed the idea of having a NOTIFY command that would work from Primary to Standby. All this would need is some code to WAL-log the NOTIFY if not in Hot Standby, plus some recovery code to send the NOTIFY to any listeners on the standby. I would suggest that would be an option on NOTIFY to WAL-log the notification, e.g. NOTIFY me 'with_payload' FOR STANDBY ALSO;

* Don't really like pg_listening() as a name. Perhaps pg_listening_to() or pg_listening_on() or pg_listening_for() or pg_listening_channels() or pg_listen_channels()

* I think it's confusing that pg_notify is both a data structure and a function. Suggest changing one of those to avoid issues in understanding. "Use pg_notify" might be confused by a DBA.

-- Simon Riggs www.2ndQuadrant.com
On Sun, Feb 14, 2010 at 11:44 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Next set of questions
>
> * Will this work during Hot Standby now? The barrier was that it wrote
> to a table and so we could not allow that. ISTM this new version can and
> should work with Hot Standby. Can you test that and if so, remove the
> explicit barrier code and change tests and docs to enable it?

I have tested it already. The point where it currently fails is the following line:

    qe->xid = GetCurrentTransactionId();

We record the TransactionId (of the notifying transaction) in the notification in order to later check if this transaction has committed successfully or not. If you tell me how we can find this out in HS, we might be done...

The reason we are doing all this is that we fear we cannot write the notifications to disk once we have committed to clog... So we write them to disk before committing to clog and therefore need to record the TransactionId.

> * We also discussed the idea of having a NOTIFY command that would work
> from Primary to Standby. All this would need is some code to WAL log the
> NOTIFY if not in Hot Standby and for some recovery code to send the
> NOTIFY to any listeners on the standby. I would suggest that would be an
> option on NOTIFY to WAL log the notification:
> e.g. NOTIFY me 'with_payload' FOR STANDBY ALSO;

What should happen if you wanted to replay a NOTIFY WAL record on the standby but cannot write to the pg_notify/ directory?

> * Don't really like pg_listening() as a name. Perhaps pg_listening_to()
> or pg_listening_on() or pg_listening_for() or pg_listening_channels() or
> pg_listen_channels()

pg_listen_channels() sounds best to me, but I leave this decision to a native speaker.

> * I think it's confusing that pg_notify is both a data structure and a
> function. Suggest changing one of those to avoid issues in
> understanding. "Use pg_notify" might be confused by a DBA.

You are talking about the libpq data structure PGnotify, I suppose... I don't see it as overly confusing, but I wouldn't object to changing it. There was a previous discussion about the name, see the last paragraph of http://archives.postgresql.org/message-id/dc7b844e1002021510i4aaa879fy8bbdd003729d28da@mail.gmail.com

Joachim
On Mon, 2010-02-15 at 12:59 +0100, Joachim Wieland wrote:
> > * I think it's confusing that pg_notify is both a data structure and
> > a function. Suggest changing one of those to avoid issues in
> > understanding. "Use pg_notify" might be confused by a DBA.
>
> You are talking about the libpq data structure PGnotify, I suppose... I
> don't see it as overly confusing, but I wouldn't object to changing it.
> There was a previous discussion about the name, see the last paragraph of
> http://archives.postgresql.org/message-id/dc7b844e1002021510i4aaa879fy8bbdd003729d28da@mail.gmail.com

No, which illustrates the confusion nicely! Function and data structure.

-- Simon Riggs www.2ndQuadrant.com
On Mon, 2010-02-15 at 12:59 +0100, Joachim Wieland wrote: > On Sun, Feb 14, 2010 at 11:44 PM, Simon Riggs <simon@2ndquadrant.com> wrote: > > Next set of questions > > > > * Will this work during Hot Standby now? The barrier was that it wrote > > to a table and so we could not allow that. ISTM this new version can and > > should work with Hot Standby. Can you test that and if so, remove the > > explicit barrier code and change tests and docs to enable it? > > I have tested it already. The point where it currently fails is the > following line: > > qe->xid = GetCurrentTransactionId(); > > We record the TransactionId (of the notifying transaction) in the > notification in order to later check if this transaction has committed > successfully or not. If you tell me how we can find this out in HS, we > might be done... > > The reason why we are doing all this is because we fear that we can > not write the notifications to disk once we have committed to clog... > So we write them to disk before committing to clog and therefore need > to record the TransactionId. That's a shame. So it will never work in Hot Standby mode unless you can think of a different way. > > * We also discussed the idea of having a NOTIFY command that would work > > from Primary to Standby. All this would need is some code to WAL log the > > NOTIFY if not in Hot Standby and for some recovery code to send the > > NOTIFY to any listeners on the standby. I would suggest that would be an > > option on NOTIFY to WAL log the notification: > > e.g. NOTIFY me 'with_payload' FOR STANDBY ALSO; > > What should happen if you wanted to replay a NOTIFY WAL record in the > standby but cannot write to the pg_notify/ directory? Same thing that happens to any action that cannot be replayed. Why should that be a problem? -- Simon Riggs www.2ndQuadrant.com
On Mon, Feb 15, 2010 at 1:48 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Mon, 2010-02-15 at 12:59 +0100, Joachim Wieland wrote:
>> I have tested it already. The point where it currently fails is the
>> following line:
>>
>> qe->xid = GetCurrentTransactionId();
>
> That's a shame. So it will never work in Hot Standby mode unless you can
> think of a different way.

We could probably fake this on the Hot Standby in the following way: we introduce a commit record for every notifying transaction and write it into the queue itself. So right before writing anything else, we write an entry which informs readers that the following records are not yet committed. Then we write the actual notifications and commit. In post-commit we go back to the commit record and flip its status. Reading backends would stop at the commit record, and we'd signal them so that they can continue.

This actually plays nicely with Tom's intent to not have interleaved notifications in the queue (makes things a bit easier but would probably work either way)... However, we'd need to make sure that we clean up that commit record even if something weird happens (similar to TransactionIdDidAbort() returning true) in order to allow the readers to proceed.

Comments?

Joachim
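A toy, single-process sketch of the proposed ordering, just to pin it down. Every name is invented, all shared-memory and locking details are elided, and (as the next reply argues) the scheme has an atomicity problem; this is illustration, not a working design:

    /* Toy sketch: status record first, notifications second, status
     * flipped in "post-commit", readers stop at an in-progress record. */
    #include <stdio.h>

    typedef enum { IN_PROGRESS, COMMITTED, ABORTED } TxStatus;

    typedef struct
    {
        TxStatus   *status;     /* non-NULL marks a commit record */
        const char *payload;    /* non-NULL marks a notification */
    } Entry;

    static Entry queue[16];
    static int   head;

    static void
    append(TxStatus *status, const char *payload)
    {
        queue[head].status = status;
        queue[head].payload = payload;
        head++;
    }

    int
    main(void)
    {
        TxStatus tx = IN_PROGRESS;
        int      i;

        append(&tx, NULL);               /* commit record, written first */
        append(NULL, "cache invalid");   /* notifications follow */
        append(NULL, "refresh stats");

        tx = COMMITTED;                  /* flipped in post-commit */

        for (i = 0; i < head; i++)
        {
            if (queue[i].status != NULL && *queue[i].status == IN_PROGRESS)
            {
                printf("reader stops at entry %d and waits\n", i);
                break;
            }
            if (queue[i].payload != NULL)
                printf("deliver: %s\n", queue[i].payload);
        }
        return 0;
    }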
Joachim Wieland <joe@mcknight.de> writes: > We could probably fake this on the Hot Standby in the following way: > We introduce a commit record for every notifying transaction and write > it into the queue itself. So right before writing anything else, we > write an entry which informs readers that the following records are > not yet committed. Then we write the actual notifications and commit. > In post-commit we return back to the commit record and flip its > status. This doesn't seem likely to work --- it essentially makes commit non atomic. There has to be one and only one authoritative reference as to whether transaction X committed. I think that having HS slave sessions issue notifies is a fairly silly idea anyway. They can't write the database, so exactly what condition are they going to be notifying others about? What *would* be useful is for HS slaves to be able to listen for notify messages issued by writing sessions on the master. This patch gets rid of the need for LISTEN to change on-disk state, so in principle we can do it. The only bit we seem to lack is WAL transmission of the messages (plus of course synchronization in case a slave session is too slow about picking up messages). Definitely a 9.1 project at this point though. regards, tom lane
On Sun, 2010-02-14 at 22:44 +0000, Simon Riggs wrote:
> * We also discussed the idea of having a NOTIFY command that would work
> from Primary to Standby. All this would need is some code to WAL log the
> NOTIFY if not in Hot Standby and for some recovery code to send the
> NOTIFY to any listeners on the standby. I would suggest that would be an
> option on NOTIFY to WAL log the notification:
> e.g. NOTIFY me 'with_payload' FOR STANDBY ALSO;

My first reaction is that it should not be optional. If we allow a slave system to LISTEN on a condition, what's the point if it doesn't receive the notifications from the master?

Cache invalidation seems to be the driving use case for LISTEN/NOTIFY. Only the master can invalidate the cache (as Tom points out downthread); and users on the slave system want to know about that invalidation if they are explicitly listening for it.

Regards, Jeff Davis
-----BEGIN PGP SIGNED MESSAGE----- Hash: RIPEMD160 > * We also discussed the idea of having a NOTIFY command that > would work from Primary to Standby. Just curious, what's a use case for this? - -- Greg Sabino Mullane greg@turnstep.com End Point Corporation http://www.endpoint.com/ PGP Key: 0x14964AC8 201002161102 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8 -----BEGIN PGP SIGNATURE----- iEYEAREDAAYFAkt6wZ4ACgkQvJuQZxSWSsjrYwCfSWvHlTBFT/fIYcBToX9C57GO toAAoOLQhBj6NdVTayaVtRH8L7nk16qM =LBAH -----END PGP SIGNATURE-----
On Tue, 2010-02-16 at 16:02 +0000, Greg Sabino Mullane wrote:
> > * We also discussed the idea of having a NOTIFY command that
> > would work from Primary to Standby.
>
> Just curious, what's a use case for this?

If you have some kind of cache above the DBMS, you need to invalidate it when a part of the database is updated. It makes sense that every reader would want to know about the update, not just those connected to the master.

Regards, Jeff Davis
Joachim Wieland <joe@mcknight.de> writes: > [ listen/notify patch ] Applied after rather a lot of hacking. Aside from the issues previously raised, I changed the NOTIFY syntax to include a comma between channel name and payload. The submitted syntax with no comma looked odd to me, and it would have been a real nightmare to extend if we ever decide we want to support expressions in NOTIFY. I found a number of implementation problems having to do with wraparound behavior and error recovery. I think they're all fixed, but any remaining bugs are probably my fault not yours. regards, tom lane
Simon Riggs <simon@2ndQuadrant.com> writes: > * Don't really like pg_listening() as a name. Perhaps pg_listening_to() > or pg_listening_on() or pg_listening_for() or pg_listening_channels() or > pg_listen_channels() BTW, I used pg_listening_channels() for that. > * I think it's confusing that pg_notify is both a data structure and a > function. Suggest changing one of those to avoid issues in > understanding. "Use pg_notify" might be confused by a DBA. I didn't change that. The data structure is PGnotify, which seems enough different from pg_notify to not be a real serious problem. There is a duplication with the $PGDATA subdirectory pg_notify/, but that one is not a user-facing name, so I thought it wasn't really an issue. regards, tom lane
On Tue, Feb 16, 2010 at 11:41 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Joachim Wieland <joe@mcknight.de> writes:
>> [ listen/notify patch ]
>
> I found a number of implementation problems having to do with wraparound
> behavior and error recovery. I think they're all fixed, but any
> remaining bugs are probably my fault not yours.

First, thanks for the rework you have done and thanks for applying this. While I can see a lot of improvements over my version, I think the logic in asyncQueueProcessPageEntries() needs to be reordered:

    + static bool
    + asyncQueueProcessPageEntries(QueuePosition *current,
    +                              QueuePosition stop,
    +                              char *page_buffer)
    [...]
    +     do
    +     {
    [...]
    +         /*
    +          * Advance *current over this message, possibly to the next page.
    +          * As noted in the comments for asyncQueueReadAllNotifications, we
    +          * must do this before possibly failing while processing the message.
    +          */
    +         reachedEndOfPage = asyncQueueAdvance(current, qe->length);
    [...]
    +         if (TransactionIdDidCommit(qe->xid))
    [...]
    +         else if (TransactionIdDidAbort(qe->xid))
    [...]
    +         else
    +         {
    +             /*
    +              * The transaction has neither committed nor aborted so far,
    +              * so we can't process its message yet. Break out of the loop.
    +              */
    +             reachedStop = true;
    +             break;

In the beginning you are advancing *current, but later on you could find out that the transaction is still running. As the position in the queue has already advanced, you would miss one notification here because you'd restart directly behind this notification in the queue...

Joachim
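One way to express the reordering being requested, as a standalone sketch with invented types and names (not the fix that was actually committed): remember the position before advancing, and back up when the entry's transaction is still in flight so the entry is re-examined on the next pass:

    /* Toy sketch of the save/restore idea; not PostgreSQL code. */
    #include <stdbool.h>
    #include <stdio.h>

    typedef struct { int page; int offset; } QueuePos;

    typedef enum { XID_COMMITTED, XID_ABORTED, XID_IN_PROGRESS } XidState;

    /* returns false when the caller should stop and retry later */
    static bool
    process_entry(QueuePos *current, int entry_len, XidState state)
    {
        QueuePos saved = *current;      /* where this entry starts */

        current->offset += entry_len;   /* tentatively advance */

        if (state == XID_IN_PROGRESS)
        {
            *current = saved;           /* fate unknown: back up and stop */
            return false;
        }
        /* committed: deliver; aborted: skip. Either way, keep the advance. */
        return true;
    }

    int
    main(void)
    {
        QueuePos pos = {0, 0};

        process_entry(&pos, 64, XID_COMMITTED);     /* advances past entry */
        process_entry(&pos, 32, XID_IN_PROGRESS);   /* backs up and stops  */
        printf("reader resumes at page %d, offset %d\n",
               pos.page, pos.offset);               /* prints 0, 64 */
        return 0;
    }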
Joachim Wieland <joe@mcknight.de> writes: > While I can see a lot of improvements over my version, I think the > logic in asyncQueueProcessPageEntries() needs to be reordered: Hmmm ... I was intending to cover the case of a failure in TransactionIdDidCommit too, but I can see it will take a bit more thought. regards, tom lane
On Mon, 2010-02-15 at 15:00 -0500, Tom Lane wrote: > Joachim Wieland <joe@mcknight.de> writes: > > We could probably fake this on the Hot Standby in the following way: > > > We introduce a commit record for every notifying transaction and write > > it into the queue itself. So right before writing anything else, we > > write an entry which informs readers that the following records are > > not yet committed. Then we write the actual notifications and commit. > > In post-commit we return back to the commit record and flip its > > status. > > This doesn't seem likely to work --- it essentially makes commit non > atomic. There has to be one and only one authoritative reference as > to whether transaction X committed. I thought a bit more about this and don't really understand why we need an xid at all. When we discussed this before the role of a NOTIFY was to remind us to refresh a cache, not as a way of delivering a transactional payload. If the cache refresh use case is still the objective why does it matter whether we commit or not when we issue a NOTIFY? Surely, the rare case where we actually abort right at the end of the transaction will just cause an unnecessary cache refresh. > I think that having HS slave sessions issue notifies is a fairly silly > idea anyway. They can't write the database, so exactly what condition > are they going to be notifying others about? Agreed > What *would* be useful is for HS slaves to be able to listen for notify > messages issued by writing sessions on the master. This patch gets rid > of the need for LISTEN to change on-disk state, so in principle we can > do it. The only bit we seem to lack is WAL transmission of the messages > (plus of course synchronization in case a slave session is too slow > about picking up messages). Definitely a 9.1 project at this point > though. OK -- Simon Riggs www.2ndQuadrant.com
On 2/18/10 9:58 AM, Simon Riggs wrote: > I thought a bit more about this and don't really understand why we need > an xid at all. When we discussed this before the role of a NOTIFY was to > remind us to refresh a cache, not as a way of delivering a transactional > payload. If the cache refresh use case is still the objective why does > it matter whether we commit or not when we issue a NOTIFY? Surely, the > rare case where we actually abort right at the end of the transaction > will just cause an unnecessary cache refresh. Actually, even for that use, it doesn't wash to have notifies being sent for transactions which have not yet committed. Think of cases (and I have two applications) where data is being pushed into an external non-transactional cache or queue (like Memcached, Redis or ApacheMQ) from PostgreSQL on the backend. If the transaction fails, it's important that the data not get pushed. I guess I'm not following why HS would be different from a single server in this regard, though? Mind you, this is all 9.1 discussion, no? --Josh Berkus
On Thu, Feb 18, 2010 at 12:58 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> I thought a bit more about this and don't really understand why we need
> an xid at all. When we discussed this before the role of a NOTIFY was to
> remind us to refresh a cache, not as a way of delivering a transactional
> payload. If the cache refresh use case is still the objective why does
> it matter whether we commit or not when we issue a NOTIFY? Surely, the
> rare case where we actually abort right at the end of the transaction
> will just cause an unnecessary cache refresh.

Notifications serve many more purposes than cache refreshes... it's a generic 'wake up and do something' signal to the client. For example, one of those things could be for the client to shut down. If the server errors out of the transaction that set up the client to shut down, you probably wouldn't want the client to shut down. I don't think that's a big deal really, but it conflicts with the old behavior.

However, being able to send notifications immediately (not at end of transaction) would be exceptionally useful in some cases. This happens when the notifying backend is waiting on some sort of response from the notified client. If you could NOTIFY IMMEDIATELY, then you could ping the client and get the response in a single transaction without using dblink-based hacks.

merlin
Merlin Moncure <mmoncure@gmail.com> writes: > On Thu, Feb 18, 2010 at 12:58 PM, Simon Riggs <simon@2ndquadrant.com> wrote: >> I thought a bit more about this and don't really understand why we need >> an xid at all. When we discussed this before the role of a NOTIFY was to >> remind us to refresh a cache, not as a way of delivering a transactional >> payload. If the cache refresh use case is still the objective why does >> it matter whether we commit or not when we issue a NOTIFY? Surely, the >> rare case where we actually abort right at the end of the transaction >> will just cause an unnecessary cache refresh. > notifications serve many more purposes than cache refreshes...it's a > generic 'wake up and do something' to the client. The point to my mind is that the previous implementation guaranteed that failed transactions would not send notifies. I don't think we can just drop that semantic consistency statement and not break applications. Also, as Josh notes, even for cache refresh uses it is *critical* that the notifies not be delivered to listeners till after the sender commits; else you have race conditions where the listeners look for changes before they can see them. So it's difficult to make it much simpler than this anyhow. regards, tom lane