Thread: notification: pg_notify ?

notification: pg_notify ?

From
Neil Conway
Date:
Jeff Davis asked on -general why NOTIFY doesn't take an optional
argument, specifying a message that is passed to the listening backend.
This feature is supported by Oracle and other databases and I think it's
quite useful, so I've started to implement it. Most of the modifications
have been pretty straightforward, except for 2 issues:

(1) Processing notifies. Currently, the only data that is passed from
the notifying backend to the listening one is the PID of the notifier,
which is stored in the "notification" column of pg_listener. In order to
pass messages from notifier to listener, I could add another column to
pg_listener, but IMHO that's a bad idea: there is really no reason for
this kind of data to be in pg_listener in the first place. pg_listener
should simply list the PIDs of listening backends, as well as the
conditions upon which they are listening -- any data that is related to
specific notifications should be put elsewhere.

(2) Multiple notifications on the same condition name in a short time
span are delivered as a single notification. This isn't currently a
problem because the NOTIFY itself doesn't carry any data (other than
backend PID), it just informs the listener that an event has occurred.
If we allow NOTIFY to send a message to the listener, this is not good
-- the listener should be notified for each and every notification,
since the contents of the message could be important.

Solution: Create a new system catalog, pg_notify. This should contain 4
columns:
relname:  the name of the NOTIFY condition that has been sent
message:  the optional message sent by the NOTIFY
sender:   the PID of the backend that sent the NOTIFY
receiver: the PID of the listening backend
 

AFAICT, this should resolve the two issues mentioned above. The actual
notification of a listening backend is still done at transaction commit,
by sending a SIGUSR2: however, all this does is to ask the backend to
scan through pg_notify, looking for tuples containing its PID in
"receiver". Therefore, even if Unix doesn't send multiple signals for
multiple notifications, a single signal should be enough to ensure a
scan of pg_notify, where any additional notifications will be found.

If we continued to add columns to pg_listener, there would be a limit of
1 tuple per listening backend: thus, we would still run into problems
with multiple notifications being ignored.
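
The scheme described above can be sketched as follows. This is only an illustration in Python, not backend code: the catalog is modelled as an in-memory list, and the names PgNotifyRow, send_notify and scan_for are hypothetical. The point it shows is that one row per notification lets a single wake-up (one SIGUSR2) pick up every pending message.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PgNotifyRow:
    relname: str            # name of the NOTIFY condition
    message: Optional[str]  # optional message sent with the NOTIFY
    sender: int             # PID of the notifying backend
    receiver: int           # PID of the listening backend


# Stand-in for the proposed pg_notify catalog (hypothetical, in memory only).
pg_notify: List[PgNotifyRow] = []


def send_notify(relname, message, sender, listeners):
    """Queue one row per listening backend, as would happen at commit."""
    for receiver in listeners:
        pg_notify.append(PgNotifyRow(relname, message, sender, receiver))


def scan_for(receiver):
    """What a backend would do after being signalled: take all of its rows."""
    mine = [row for row in pg_notify if row.receiver == receiver]
    pg_notify[:] = [row for row in pg_notify if row.receiver != receiver]
    return mine


# Two notifications on the same condition arrive before the listener wakes up.
send_notify("orders", "row 1 changed", sender=100, listeners=[200])
send_notify("orders", "row 2 changed", sender=100, listeners=[200])
pending = scan_for(200)  # a single scan returns both rows
```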

Can anyone see a better way to do this? Are there any problems with the
implementation I've outlined?

Any feedback would be appreciated.

Cheers,

Neil

-- 
Neil Conway <neilconway@rogers.com>
PGP Key ID: DB3C29FC



Re: notification: pg_notify ?

From
Tom Lane
Date:
Neil Conway <nconway@klamath.dyndns.org> writes:
> Solution: Create a new system catalog, pg_notify.

It's not apparent to me why that helps much.

There is a very significant performance problem with LISTEN/NOTIFY
via pg_listener: in any application that generates notifications at
a significant rate, pg_listener will accumulate dead tuples at that
same rate, and we will soon find ourselves wasting lots of time
scanning through dead tuples.  Frequent VACUUMs might help, but the
whole thing is really quite silly: why are we using a storage mechanism
that's designed entirely for *stable* storage of data to pass inherently
*transient* signals?  If the system crashes, we have absolutely zero
interest in the former contents of pg_listener (and indeed need to go
to some trouble to get rid of them).

So if someone wants to undertake a revision of the listen/notify code,
I think the first thing to do ought to be to throw away pg_listener
entirely and develop some lower-overhead, shared-memory-based
communication mechanism.  You could do worse than to use the shared
cache inval code as a model --- or perhaps even incorporate LISTEN
signaling into that mechanism.  (Actually that seems like a good plan,
so as not to use shared memory inefficiently by dedicating two separate
memory pools to parallel purposes.)

If you follow the SI model then NOTIFY messages would essentially be
broadcast to all backends, and whether any given backend pays attention
to one is its own problem; no one else cares.

A deficiency of the SI implementation (and probably anything else that
relies solely on shared memory) is that it can suffer from buffer
overrun, since there's a fixed-size message pool.  For the purposes
of cache inval, we cope with buffer overrun by just invalidating
everything in sight.  It might be a workable tradeoff to cope with
buffer overrun for LISTEN/NOTIFY by reporting notifies on all conditions
currently listened for.  Assuming that overrun is infrequent, the net
performance gain from being able to use shared memory is probably worth
the occasional episode of wasted work.
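
A rough sketch of that tradeoff, assuming a simplified broadcast queue loosely modelled on the idea described above and not on the actual SI code. The class BroadcastQueue and its methods are hypothetical. Every backend reads the same fixed-size pool at its own pace; a backend that falls behind far enough to miss a message is marked, and on its next read it reports every condition it listens for.

```python
class BroadcastQueue:
    """Fixed-size message pool read independently by every backend."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.messages = []   # pending condition names, oldest first
        self.base = 0        # sequence number of messages[0]
        self.next_read = {}  # backend PID -> next sequence number to read
        self.reset = set()   # PIDs that missed a message because of overrun

    def attach(self, pid):
        self.next_read[pid] = self.base + len(self.messages)

    def notify(self, condition):
        if len(self.messages) == self.capacity:
            # Overrun: the oldest message is dropped. Any backend that has
            # not read it yet is marked so that it over-notifies later.
            for pid, pos in self.next_read.items():
                if pos <= self.base:
                    self.reset.add(pid)
            self.messages.pop(0)
            self.base += 1
        self.messages.append(condition)

    def read(self, pid, listening):
        """Return the conditions to report to a backend."""
        end = self.base + len(self.messages)
        if pid in self.reset:
            # Report every listened condition rather than risk losing one.
            self.reset.discard(pid)
            self.next_read[pid] = end
            return sorted(listening)
        unread = self.messages[self.next_read[pid] - self.base:]
        self.next_read[pid] = end
        return [c for c in unread if c in listening]


queue = BroadcastQueue(capacity=2)
queue.attach(1)
queue.notify("a")
first = queue.read(1, {"a", "b"})          # normal delivery
for condition in ("a", "c", "c"):          # backend 1 falls behind
    queue.notify(condition)
after_overrun = queue.read(1, {"a", "b"})  # all listened conditions reported
```

The spurious notify on "b" is the wasted work; the notify on "a" that would otherwise have been lost is the reason for accepting it.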


BTW, I would like to see a spec for this "notify with parameter" feature
before it's implemented, not after.  Exactly what semantics do you have
in mind?
        regards, tom lane


Re: notification: pg_notify ?

From
Neil Conway
Date:
On Thu, 2002-03-21 at 22:41, Tom Lane wrote:
> Neil Conway <nconway@klamath.dyndns.org> writes:
> > Solution: Create a new system catalog, pg_notify.
> 
> It's not apparent to me why that helps much.

Well, it solves the functional problem at hand -- this feature can now
be implemented. However, I agree with you that there are still problems
with NOTIFY and pg_listener, as you have outlined.

> So if someone wants to undertake a revision of the listen/notify code,
> I think the first thing to do ought to be to throw away pg_listener
> entirely and develop some lower-overhead, shared-memory-based
> communication mechanism.  You could do worse than to use the shared
> cache inval code as a model --- or perhaps even incorporate LISTEN
> signaling into that mechanism.  (Actually that seems like a good plan,
> so as not to use shared memory inefficiently by dedicating two separate
> memory pools to parallel purposes.)

That's very interesting. I need to read the code you're referring to
before I can comment further, but I'll definitely look into this. That's
a good idea.

> If you follow the SI model then NOTIFY messages would essentially be
> broadcast to all backends,

My apologies, but what's the SI model?

> A deficiency of the SI implementation (and probably anything else that
> relies solely on shared memory) is that it can suffer from buffer
> overrun, since there's a fixed-size message pool.  For the purposes
> of cache inval, we cope with buffer overrun by just invalidating
> everything in sight.  It might be a workable tradeoff to cope with
> buffer overrun for LISTEN/NOTIFY by reporting notifies on all conditions
> currently listened for.

This assumes that the NOTIFY condition we're waiting for is fairly
routine (e.g. "table x is updated, refresh the cache"). If a NOTIFY
actually represents the occurrence of a non-trivial condition, this could
be a problem (e.g. "the site crashed, page the sys-admin", and the
buffer happens to overflow at 2 AM :-) ). However, it's questionable
whether that is an appropriate usage of NOTIFY.

> BTW, I would like to see a spec for this "notify with parameter" feature
> before it's implemented, not after.

What information would you like to know?

>  Exactly what semantics do you have in mind?

The current syntax I'm using is:
NOTIFY condition_name [ [WITH MESSAGE] 'my message' ];

But I'm open to suggestions for improvement.

Cheers,

Neil

-- 
Neil Conway <neilconway@rogers.com>
PGP Key ID: DB3C29FC



Re: notification: pg_notify ?

From
Tom Lane
Date:
Neil Conway <nconway@klamath.dyndns.org> writes:
>> BTW, I would like to see a spec for this "notify with parameter" feature
>> before it's implemented, not after.

> The current syntax I'm using is:
>     NOTIFY condition_name [ [WITH MESSAGE] 'my message' ];

Hm.  How are you going to transmit that to the client side without
changing the FE/BE protocol?  (While we will no doubt find reasons
to change the protocol in the future, I'm not eager to force a protocol
update right now; at least not without more reason than just NOTIFY
parameters.)  If we want to avoid a protocol break then it seems
like the value transmitted to the client has to be a single string.

I guess we could say that what's transmitted is a single string in
the form
    condition_name.additional_text
(or pick some other delimiter instead of dot, but doesn't seem like
it matters much).  Pretty grotty though.

Another thought that comes to mind is that we could reinterpret the
parameter of LISTEN as a pattern to match against the strings generated
by NOTIFY --- then there's no need to draw a hard-and-fast distinction
between condition name and parameter text; it's all in the eye of the
beholder.  However it's tough to see how to do this without breaking
backwards compatibility at the syntax level --- you'd really want LISTEN
to be accepting a string literal, rather than a name, to make this
happen.
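
A minimal sketch of LISTEN-as-pattern, in Python. The message above leaves the pattern syntax open; regular expressions are used here purely as an assumption, and the Listener class is hypothetical.

```python
import re


class Listener:
    """Holds the patterns a backend has issued LISTEN on."""

    def __init__(self):
        self.patterns = []

    def listen(self, pattern):
        # The LISTEN parameter is treated as a pattern, not a plain name.
        self.patterns.append(re.compile(pattern))

    def matches(self, notify_string):
        # A NOTIFY string is delivered if any pattern matches it entirely.
        return any(p.fullmatch(notify_string) for p in self.patterns)


listener = Listener()
listener.listen(r"inv\..*")  # everything below the "inv." prefix
listener.listen("orders")    # a plain name still works as a pattern
```

With this reading, whether "inv.42" is a condition name plus a parameter or just one long name is up to the listener.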

That brings up the more general point that you'd want at least
the "message" part of NOTIFY to be computable as an SQL expression,
not just a literal.  It might be entertaining to try to reimplement
NOTIFY as something that's internally like a SELECT, just with a
funny data destination.  I find this attractive because if it were
a SELECT then it could have (at least on the inside) a WHERE clause,
which'd make it possible to handle NOTIFYs in conditional rules in
a less broken fashion than we do now.
        regards, tom lane


Re: notification: pg_notify ?

From
"Christopher Kings-Lynne"
Date:
> >  Exactly what semantics do you have in mind?
>
> The current syntax I'm using is:
>
>     NOTIFY condition_name [ [WITH MESSAGE] 'my message' ];
>
> But I'm open to suggestions for improvement.

Have you considered visiting the oracle site and finding their documentation
for their NOTIFY statement and making sure you're using compatible syntax?
They might have extra stuff as well.

Chris



Re: notification: pg_notify ?

From
Neil Conway
Date:
On Thu, 2002-03-21 at 23:41, Christopher Kings-Lynne wrote:
> > >  Exactly what semantics do you have in mind?
> >
> > The current syntax I'm using is:
> >
> >     NOTIFY condition_name [ [WITH MESSAGE] 'my message' ];
> >
> > But I'm open to suggestions for improvement.
> 
> Have you considered visiting the oracle site and finding their documentation
> for their NOTIFY statement and making sure you're using compatible syntax?

Oracle's implementation uses a completely different syntax to begin
with: it's called DBMS_ALERT.

> They might have extra stuff as well.

From a brief scan of their docs, it doesn't look like it. In fact, their
implementation seems to be worse than PostgreSQL's in at least one
respect: "A waiting application is blocked in the database and cannot do
any other work."

Cheers,

Neil

-- 
Neil Conway <neilconway@rogers.com>
PGP Key ID: DB3C29FC



Re: notification: pg_notify ?

From
"Christopher Kings-Lynne"
Date:
> On Thu, 2002-03-21 at 23:41, Christopher Kings-Lynne wrote:
> > > >  Exactly what semantics do you have in mind?
> > >
> > > The current syntax I'm using is:
> > >
> > >     NOTIFY condition_name [ [WITH MESSAGE] 'my message' ];
> > >
> > > But I'm open to suggestions for improvement.
> >
> > Have you considered visiting the oracle site and finding their
> documentation
> > for their NOTIFY statement and making sure you're using
> compatible syntax?
>
> Oracle's implementation uses a completely different syntax to begin
> with: it's called DBMS_ALERT.

OK - not Oracle then.  Didn't you say some other db did it - what about
their syntax?

Chris



Re: notification: pg_notify ?

From
Hannu Krosing
Date:
On Fri, 2002-03-22 at 06:40, Tom Lane wrote:
> Neil Conway <nconway@klamath.dyndns.org> writes:
> >> BTW, I would like to see a spec for this "notify with parameter" feature
> >> before it's implemented, not after.
> 
> > The current syntax I'm using is:
> >     NOTIFY condition_name [ [WITH MESSAGE] 'my message' ];
> 
> Hm.  How are you going to transmit that to the client side without
> changing the FE/BE protocol?  (While we will no doubt find reasons
> to change the protocol in the future, I'm not eager to force a protocol
> update right now; at least not without more reason than just NOTIFY
> parameters.)  If we want to avoid a protocol break then it seems
> like the value transmitted to the client has to be a single string.
> 
> I guess we could say that what's transmitted is a single string in
> the form
>     condition_name.additional_text
> (or pick some other delimiter instead of dot, but doesn't seem like
> it matters much).  Pretty grotty though.
> 
> Another thought that comes to mind is that we could reinterpret the
> parameter of LISTEN as a pattern to match against the strings generated
> by NOTIFY --- then there's no need to draw a hard-and-fast distinction
> between condition name and parameter text; it's all in the eye of the
> beholder.

That's what I suggested a few weeks ago in a well hidden message at the
end of a reply to a somewhat related question ;)

>  However it's tough to see how to do this without breaking
> backwards compatibility at the syntax level --- you'd really want LISTEN
> to be accepting a string literal, rather than a name, to make this
> happen.

Can't we accept both - a name for simple things and a string for regexes?

> That brings up the more general point that you'd want at least
> the "message" part of NOTIFY to be computable as an SQL expression,
> not just a literal.

I think this should be any expression that returns text.

I wouldn't even mind if I had to use an explicit insert:

insert into pg_notify 
select relname || '.' || cast(myobjectid as text), listenerpid
from pg_listener
where 'inv' ~ relname 

Just the delivery has to be automatic.

> It might be entertaining to try to reimplement
> NOTIFY as something that's internally like a SELECT, just with a
> funny data destination.

I thought that NOTIFY is implemented as an INSERT internally, no ?

> I find this attractive because if it were
> a SELECT then it could have (at least on the inside) a WHERE clause,
> which'd make it possible to handle NOTIFYs in conditional rules in
> a less broken fashion than we do now.

--------------
Hannu




Re: notification: pg_notify ?

From
Neil Conway
Date:
On Thu, 2002-03-21 at 22:41, Tom Lane wrote:
> It might be a workable tradeoff to cope with
> buffer overrun for LISTEN/NOTIFY by reporting notifies on all conditions
> currently listened for.  Assuming that overrun is infrequent, the net
> performance gain from being able to use shared memory is probably worth
> the occasional episode of wasted work.

I've thought about this some more, and I don't think that solution will
be sufficient.

Spurious notifications seem like a pretty serious drawback, and I don't
think they solve anything. As I mentioned earlier, if the event a notify
signifies is non-trivial, this could have serious repercussions.

But more importantly, what happens when the buffer overruns and we
notify all backends? If a listening backend is in the middle of a
transaction when it is notified, it just sets a flag and goes back to
processing (i.e. it doesn't clear the buffer).

If a listening backend is idle when it is notified, it checks the
buffer: but since this is normal behavior, any idle & notified backend
will have already checked the buffer! I don't see how the "notify
everyone" scheme solves anything -- if a backend _could_ respond
quickly, it would already have done so and we wouldn't have an overrun
buffer in the first place.

If we notify all backends and then clear the notification buffer,
backends in the midst of a transaction will check the buffer when they
finish their transaction but find it empty. Since this has the potential
to destroy legitimate notifications, this is clearly not an option.

Ultimately, we're just coming up with kludges to work around a
fundamental flaw (we're using a static buffer for a dynamically sized
resource). (Am I the only one who keeps running into shared memory
limitations? :-)

I can see two viable solutions:

(1) Use the shared-memory-based buffer scheme you suggested. When a
backend executes a NOTIFY, it stores it until transaction commit (as in
current sources). When the transaction commits, it checks to see if
there would be a buffer overflow if it added the NOTIFY to the buffer --
if so, it complains loudly to the log, and sleeps. When it awakens, it
repeats (try to add to buffer; else, sleep).

(2) The pg_notify scheme I suggested. It only marginally improves the
situation, but it does preserve the behavior we have now.

I think #1 isn't as bad as it might at first seem. The notification
buffer only overflows in a rare (and arguably broken) situation: when
the listening backend is in a (very) long-lived transaction, so that the
notification buffer is never checked and eventually fills up. If we
strongly suggest to application developers that they avoid this
situation in the first place (by not starting long-running transactions
in listening backends), and we also make the size of the buffer
configurable, this situation is tolerable.
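
Option #1 could look roughly like this. The Python sketch below is only an illustration: the function add_with_retry is hypothetical, the buffer is a plain list, and the max_attempts parameter is an addition that exists only so the example can terminate; the proposal itself retries without limit.

```python
import logging
import time


def add_with_retry(buffer, capacity, notification,
                   sleep_seconds=0.01, max_attempts=None):
    """At commit: add to the buffer, or complain, sleep and try again."""
    attempts = 0
    while len(buffer) >= capacity:
        attempts += 1
        if max_attempts is not None and attempts > max_attempts:
            return False  # illustration only; the proposal never gives up
        logging.warning("notification buffer full; sleeping before retry")
        time.sleep(sleep_seconds)
    buffer.append(notification)
    return True


shared = ["pending-1"]
# The buffer is full and nothing drains it, so the sender keeps waiting.
blocked = add_with_retry(shared, 1, "pending-2",
                         sleep_seconds=0, max_attempts=2)
shared.clear()  # a listener finally catches up
delivered = add_with_retry(shared, 1, "pending-2")
```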

Comments? Can anyone see a better solution? Is #1 reasonable behavior?

Cheers,

Neil

-- 
Neil Conway <neilconway@rogers.com>
PGP Key ID: DB3C29FC



Re: notification: pg_notify ?

From
Tom Lane
Date:
Neil Conway <nconway@klamath.dyndns.org> writes:
> (1) Use the shared-memory-based buffer scheme you suggested. When a
> backend executes a NOTIFY, it stores it until transaction commit (as in
> current sources). When the transaction commits, it checks to see if
> there would be a buffer overflow if it added the NOTIFY to the buffer --
> if so, it complains loudly to the log, and sleeps. When it awakens, it
> repeats (try to add to buffer; else, sleep).

This is NOT an improvement over the current arrangement.  It implies
that a notification might be postponed indefinitely, thereby allowing
listeners to keep using stale data indefinitely.

LISTEN/NOTIFY is basically designed for invalidate-your-cache
arrangements (which is what led into this discussion originally, no?).
In *any* caching arrangement, it is far better to have the occasional
spurious data drop than to fail to drop stale data when you need to.
Accordingly, a forced cache clear is an appropriate response to
overrun of the communications buffer.

I can certainly imagine applications where the messages are too
important to trust to a not-fully-reliable transmission medium;
but I don't think that LISTEN/NOTIFY should be loaded down with
that sort of requirement.  You can easily build 100% reliable
(and correspondingly slow and expensive) communications mechanisms
using standard SQL operations.  I think the design center for
LISTEN/NOTIFY should be exactly the case of maintaining client-side
caches --- at least that's what I used it for when I had occasion
to use it, several years ago when I first got involved with Postgres.
And for that application, a cheap mechanism that never loses a
notification, but might occasionally over-notify, is just what you
want.
        regards, tom lane


Re: notification: pg_notify ?

From
Greg Copeland
Date:
What if we used a combination of the two approaches?  That is, when an
overflow occurs, overflow into a table?  That way, nothing is lost and
spurious random events don't have to occur.  That way, things are faster
when overflows are not occurring.  When the system gets too far behind,
it simply overflows into the existing table until the system can
catch up.  This way, we don't have to waste resources notifying listeners
that would otherwise not need to be notified.

Greg




On Fri, 2002-03-22 at 23:13, Tom Lane wrote:
> Neil Conway <nconway@klamath.dyndns.org> writes:
> > (1) Use the shared-memory-based buffer scheme you suggested. When a
> > backend executes a NOTIFY, it stores it until transaction commit (as in
> > current sources). When the transaction commits, it checks to see if
> > there would be a buffer overflow if it added the NOTIFY to the buffer --
> > if so, it complains loudly to the log, and sleeps. When it awakens, it
> > repeats (try to add to buffer; else, sleep).
>
> This is NOT an improvement over the current arrangement.  It implies
> that a notification might be postponed indefinitely, thereby allowing
> listeners to keep using stale data indefinitely.
>
> LISTEN/NOTIFY is basically designed for invalidate-your-cache
> arrangements (which is what led into this discussion originally, no?).
> In *any* caching arrangement, it is far better to have the occasional
> spurious data drop than to fail to drop stale data when you need to.
> Accordingly, a forced cache clear is an appropriate response to
> overrun of the communications buffer.
>
> I can certainly imagine applications where the messages are too
> important to trust to a not-fully-reliable transmission medium;
> but I don't think that LISTEN/NOTIFY should be loaded down with
> that sort of requirement.  You can easily build 100% reliable
> (and correspondingly slow and expensive) communications mechanisms
> using standard SQL operations.  I think the design center for
> LISTEN/NOTIFY should be exactly the case of maintaining client-side
> caches --- at least that's what I used it for when I had occasion
> to use it, several years ago when I first got involved with Postgres.
> And for that application, a cheap mechanism that never loses a
> notification, but might occasionally over-notify, is just what you
> want.
>
>             regards, tom lane


Re: notification: pg_notify ?

From
Tom Lane
Date:
Greg Copeland <greg@CopelandConsulting.Net> writes:
> What if we used a combination of the two approaches?  That is, when an
> overflow occurs, overflow into a table?

I think this is a really bad idea.

The major problem with it is that the overflow path would be complex,
infrequently exercised, and therefore almost inevitably buggy.  (Look
at all the problems we had for so long with SI overflow response.  I'd
still not like to have to swear there are none left.)

Also, I do not think you could get away with merging listen/notify with
the system cache inval mechanism if you wanted to have table overflow for
listen/notify.  SI is too low level --- to point out just one problem,
a new backend's access to the SI message queue has to be initialized
long before we are ready to do any table access.  So you'd be requiring
dedicated shared memory space just for listen/notify.  That's a hard
sell in my book.

> That way, nothing is lost and spurious random events don't have to
> occur.

I think this argument is spurious.  Almost any client-side caching
arrangement is going to have cases where it's best to issue a "flush
everything" kind of event rather than expend the effort to keep track
of exactly what has to be invalidated by particular kinds of changes.
As long as such changes are infrequent, you have better performance
and better reliability by not trying to do the extra bookkeeping for
exact invalidation.  Why shouldn't the signal transport mechanism 
be able to do the same thing?

Also, the notion that the NOTIFY mechanism can't be lossy misses the
fact that you've got a perfectly good non-lossy mechanism at hand
already: user tables.  The traditional way of using NOTIFY has been
to stick the important data into tables and use NOTIFY simply to
cue listeners to look in those tables.  I don't foresee this changing;
it'll simply be possible to give somewhat finer-grain notification of
what/where to look.  I don't think that forcing NOTIFY to have the
same kinds of semantics as SQL tables do is the right design approach.
IMHO the only reason NOTIFY exists at all is to provide a simpler,
higher-performance communication pathway than you can get with tables.
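
The traditional pattern described above can be sketched like this. It is a Python illustration in which SQLite stands in for an ordinary user table and a set stands in for NOTIFY delivery; the table job_queue and the functions producer and consumer are hypothetical.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE job_queue ("
    "id INTEGER PRIMARY KEY, payload TEXT, done INTEGER DEFAULT 0)"
)

pending_cues = set()  # stand-in for NOTIFY; cues may be merged or repeated


def producer(payload):
    # The important data goes into the table, which is the reliable part.
    db.execute("INSERT INTO job_queue (payload) VALUES (?)", (payload,))
    db.commit()
    pending_cues.add("job_queue")  # the cue only says "go and look"


def consumer():
    """On any cue, re-read the table and process whatever is pending."""
    pending_cues.discard("job_queue")
    rows = db.execute(
        "SELECT id, payload FROM job_queue WHERE done = 0 ORDER BY id"
    ).fetchall()
    db.executemany(
        "UPDATE job_queue SET done = 1 WHERE id = ?",
        [(row[0],) for row in rows],
    )
    db.commit()
    return [row[1] for row in rows]


producer("a")
producer("b")          # two cues collapse into one; nothing is lost
handled = consumer()
```

Because the consumer reads the table on every cue, a merged, repeated or spurious cue costs only an extra query.
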
        regards, tom lane


Re: notification: pg_notify ?

From
Neil Conway
Date:
On Sat, 2002-03-23 at 12:46, Tom Lane wrote:
> Also, the notion that the NOTIFY mechanism can't be lossy misses the
> fact that you've got a perfectly good non-lossy mechanism at hand
> already: user tables.  The traditional way of using NOTIFY has been
> to stick the important data into tables and use NOTIFY simply to
> cue listeners to look in those tables.  I don't foresee this changing;
> it'll simply be possible to give somewhat finer-grain notification of
> what/where to look.  I don't think that forcing NOTIFY to have the
> same kinds of semantics as SQL tables do is the right design approach.
> IMHO the only reason NOTIFY exists at all is to provide a simpler,
> higher-performance communication pathway than you can get with tables.

Okay, I agree (of course, it would be nice to have a more reliable
NOTIFY mechanism, but I can't see a way to implement a
high-performance, reliable mechanism without at least one serious
drawback). And as you rightly point out, there are other methods for
people who need more reliability.

So the new behavior of NOTIFY should be: when the notifying backend
commits its transaction, the notification is stored in a shared memory
buffer of fixed size, and the listening backend is sent a SIGUSR2. If
the shared memory buffer is full, it is completely emptied. In the
listening backend's SIGUSR2 signal handler, a flag is set and the
backend goes back to its current transaction. When it becomes idle, it
checks the shared buffer: if it can't find any matching elements in the
buffer, it knows an overrun has occurred. When informing the front-end,
a notification that results from an overrun is signified by a
notification with a NULL message and with the PID of the notifying
backend set to some constant (say, -1). This informs the front-end that
an overrun has occurred, so it can take appropriate action.

Is this behavior acceptable to everyone?

I can see 1 potential problem: there is a race condition in the "detect
an overrun" logic. If an overrun occurs and the buffer is flushed but
then another notification for one of the listening backends arrives, a
backend will only inform the front-end about the most recent
notification: there will be no indication that an overrun occurred, or
that there were other legitimate notifications in the buffer before the
overrun. It would be nice to be able to tell clients 100% "an overrun
just occurred, be careful", but apparently that's not even possible.
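
The proposed behavior and its race can be shown with a small Python sketch. It is a model under stated assumptions, not backend code: the SharedBuffer class is hypothetical, the notification that triggers the flush is assumed to be stored afterwards, and check does not consume entries.

```python
OVERRUN_PID = -1  # sentinel sender PID, as suggested above


class SharedBuffer:
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = []  # (condition, message, sender_pid)

    def add(self, condition, message, sender_pid):
        if len(self.entries) >= self.capacity:
            self.entries.clear()  # buffer full: completely emptied
        # Assumption: the new notification is stored after the flush.
        self.entries.append((condition, message, sender_pid))

    def check(self, listening):
        """Run by a backend once it is idle and its flag has been set."""
        found = [e for e in self.entries if e[0] in listening]
        if not found:
            # Nothing matching: conclude that an overrun occurred.
            return [(None, None, OVERRUN_PID)]
        return found


buf = SharedBuffer(capacity=2)
buf.add("x", "m1", 100)
buf.add("x", "m2", 100)
buf.add("y", "m3", 101)      # overrun: m1 and m2 are discarded
detected = buf.check({"x"})  # no entries for "x", so overrun is reported

buf.add("x", "m4", 100)      # arrives before the listener looks again
raced = buf.check({"x"})     # only m4 is seen; the overrun goes unnoticed
```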

Cheers,

Neil

-- 
Neil Conway <neilconway@rogers.com>
PGP Key ID: DB3C29FC



Re: notification: pg_notify ?

From
Tom Lane
Date:
Neil Conway <nconway@klamath.dyndns.org> writes:
> I can see 1 potential problem: there is a race condition in the "detect
> an overrun" logic.

Only if you do it that way :-(.  Take another look at the SI messaging
logic: it will *not* lose overrun notifications.
        regards, tom lane


Re: notification: pg_notify ?

From
Mikhail Terekhov
Date:

Tom Lane wrote:

> There is a very significant performance problem with LISTEN/NOTIFY
> via pg_listener: in any application that generates notifications at
> a significant rate, pg_listener will accumulate dead tuples at that
> same rate, and we will soon find ourselves wasting lots of time
> scanning through dead tuples.  Frequent VACUUMs might help, but the


That's unfortunate. Maybe it would help if the backend could reuse tuples on updates?


> whole thing is really quite silly: why are we using a storage mechanism
> that's designed entirely for *stable* storage of data to pass inherently
> *transient* signals?  If the system crashes, we have absolutely zero



Because there is no other easy way to guarantee message delivery?


> interest in the former contents of pg_listener (and indeed need to go
> to some trouble to get rid of them).


There is no free beer :)

Regards,
Mikhail Terekhov





Re: notification: pg_notify ?

From
Gavin Sherry
Date:
On Wed, 3 Apr 2002, Mikhail Terekhov wrote:

> 
> 
> Tom Lane wrote:
> 
>  
> > There is a very significant performance problem with LISTEN/NOTIFY
> > via pg_listener: in any application that generates notifications at
> > a significant rate, pg_listener will accumulate dead tuples at that
> > same rate, and we will soon find ourselves wasting lots of time
> > scanning through dead tuples.  Frequent VACUUMs might help, but the
> 
> 
> That's unfortunate. Maybe it would help if the backend could reuse tuples on updates?

There is already a TODO item to address this. But row reuse is the wrong
solution to the problem. See below.

> 
> 
> > whole thing is really quite silly: why are we using a storage mechanism
> > that's designed entirely for *stable* storage of data to pass inherently
> > *transient* signals?  If the system crashes, we have absolutely zero
> 
> 
> 
> Because there is no other easy way to guarantee message delivery?

Shared memory is much easier and, to all intents and purposes, as reliable
for this kind of usage. It is much faster and is the-right-way-to-do-it. 

I don't believe that the question 'what happens if there is a buffer
overrun?' is a valid criticism of this approach. In the case of the
backend cache invalidation system, the backends just blow away their cache
to be on the safe side. A buffer overrun (rare as it would be,
considering the different usage patterns of the shared memory for
notification) would result in an elog(ERROR) from within the backend which
has attempted to execute the notification. After all, running out of
memory is an error in this case.

Gavin



Re: notification: pg_notify ?

From
Tom Lane
Date:
Gavin Sherry <swm@linuxworld.com.au> writes:
>> Because there is no other easy way to guarantee message delivery?

> Shared memory is much easier and, to all intents and purposes, as reliable
> for this kind of usage. It is much faster and is the-right-way-to-do-it. 

Right.  Since we do not attempt to preserve NOTIFY messages over a
system crash, there's no good reason to keep the messages in a table.
Except for the problem that shared memory is of limited size.
But if we are willing to define the semantics in a way that allows
buffer overflow recovery, that can be dealt with.

> A buffer overrun (rare as it would be,
> considering the different usage patterns of the shared memory for
> notification) would result in an elog(ERROR) from within the backend which
> has attempted to execute the notification.

Hmm.  That's a different way of attacking the overflow problem.  I don't
much care for it, but I can see that some applications might prefer this
behavior to cache-style overrun response (ie, issue forced NOTIFYs on
all conditions).  Maybe support both ways?
        regards, tom lane


Re: notification: pg_notify ?

From
Mikhail Terekhov
Date:
Gavin Sherry wrote:

> On Wed, 3 Apr 2002, Mikhail Terekhov wrote:
>>
>>Tom Lane wrote:
>>
>>>There is a very significant performance problem with LISTEN/NOTIFY
>>>via pg_listener: in any application that generates notifications at
>>>a significant rate, pg_listener will accumulate dead tuples at that
>>>same rate, and we will soon find ourselves wasting lots of time
>>>scanning through dead tuples.  Frequent VACUUMs might help, but the
>>>
>>That's unfortunate. Maybe it would help if the backend could reuse tuples on updates?
> 
> There is already a TODO item to address this. But row reuse is the wrong
> solution to the problem. See below.
> 

It is not a solution to the whole LISTEN/NOTIFY problem, but it is a
solution to the accumulation of dead tuples.

> 
>>
>>>whole thing is really quite silly: why are we using a storage mechanism
>>>that's designed entirely for *stable* storage of data to pass inherently
>>>*transient* signals?  If the system crashes, we have absolutely zero
>>>
>>Because there is no other easy way to guarantee message delivery?
>>
> 
> Shared memory is much easier and, to all intents and purposes, as reliable
> for this kind of usage. It is much faster and is the-right-way-to-do-it. 
> 

That highly depends on WHAT-you-want-to-do :)
If the new shared memory implementation guarantees message delivery to the
same degree as the current implementation, then it is the-right-way-to-do-it.
If not, then let's not break existing functionality! Let's implement it as
additional functionality, say FASTNOTIFY or RIGHTNOTIFY ;)

> I don't believe that the question 'what happens if there is a buffer
> overrun?' is a valid criticism of this approach. In the case of the
> backend cache invalidation system, the backends just blow away their cache


Forgive my ignorance, but do you mean the sending backend?


> to be on the safe side. A buffer overrun (rare as it would be,

Regards,
Mikhail



Re: notification: pg_notify ?

From
Mikhail Terekhov
Date:

Tom Lane wrote:

> LISTEN/NOTIFY is basically designed for invalidate-your-cache
> arrangements (which is what led into this discussion originally, no?).


Why do you think so? Even if you are right and the original design was
just for invalidate-your-cache arrangements, the current implementation
has much more functionality and can be used as a reliable message
transmission mechanism (we use it that way). There is no reason to
break this reliability.


> In *any* caching arrangement, it is far better to have the occasional
> spurious data drop than to fail to drop stale data when you need to.
> Accordingly, a forced cache clear is an appropriate response to
> overrun of the communications buffer.
> 

Caching arrangements are not the only use case out there!
This reminds me of the difference between poll(2) and select(2).
They are both useful in different cases.


> I can certainly imagine applications where the messages are too
> important to trust to a not-fully-reliable transmission medium;


That is exactly what we are using LISTEN/NOTIFY for. We don't need a
separate message passing system, we don't need to waste system resources
polling the database, and the application is simpler and easier to maintain.


> but I don't think that LISTEN/NOTIFY should be loaded down with
> that sort of requirement.  You can easily build 100% reliable


This functionality is already in Postgres.
Maybe it is not perfect, but why remove it?


> (and correspondingly slow and expensive) communications mechanisms
> using standard SQL operations.  I think the design center for


Could you please elaborate on how to do that without polling?

> LISTEN/NOTIFY should be exactly the case of maintaining client-side
> caches --- at least that's what I used it for when I had occasion
> to use it, several years ago when I first got involved with Postgres.
> And for that application, a cheap mechanism that never loses a
> notification, but might occasionally over-notify, is just what you
> want.
> 
Again, a client-side cache is not the only application of LISTEN/NOTIFY.

If we need a cheap mechanism for maintaining a client-side cache, let's
implement one. Why remove existing functionality?



Re: notification: pg_notify ?

From
Tom Lane
Date:
Mikhail Terekhov <terekhov@emc.com> writes:
> Why do you think so? Even if you are right and original design was
> just for invalidate-your-cache arrangements, current implementation
> has much more functionality and can be used as a reliable message
> transmission mechanism (we use it that way).

It is *not* reliable, at least not in the sense of "the message is
guaranteed to be delivered even if there's a system crash".  Which is
the normal meaning of "reliable" in SQL environments.  If you want that
level of reliability, you need to pass your messages by storing them
in a regular table.

LISTEN/NOTIFY can optimize your message passing by avoiding unnecessary
polling of the table in the normal no-crash case.  But they are not a
substitute for having a table, and I don't see a reason to bog them down
with an intermediate level of reliability that isn't buying anything.
        regards, tom lane
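
A rough sketch of the pattern described above, as an in-process model
rather than real PostgreSQL code: the messages live in an ordinary table
(an in-memory sqlite3 database stands in for it here), and the notification
is only a hint telling the listener to re-read that table. The names
MessageQueue, send, drain, hint_pending and last_seen are hypothetical and
exist only for this illustration.

```python
import sqlite3


class MessageQueue:
    """Messages live in a table; the notify is only a wake-up hint."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE messages (id INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT)"
        )
        self.hint_pending = False  # stands in for a delivered NOTIFY
        self.last_seen = 0         # highest message id the listener has read

    def send(self, body, notify=True):
        with self.db:  # commit the row before signalling
            self.db.execute("INSERT INTO messages (body) VALUES (?)", (body,))
        if notify:
            self.hint_pending = True  # repeated hints collapse into one

    def drain(self):
        """Read every message newer than last_seen, in insertion order."""
        self.hint_pending = False
        rows = self.db.execute(
            "SELECT id, body FROM messages WHERE id > ? ORDER BY id",
            (self.last_seen,),
        ).fetchall()
        if rows:
            self.last_seen = rows[-1][0]
        return [body for _, body in rows]


q = MessageQueue()
q.send("first")
q.send("second")               # two sends, one pending hint
print(q.hint_pending, q.drain())
q.send("third", notify=False)  # hint lost, row still stored
print(q.hint_pending, q.drain())
```

Because the listener reads from the table, a collapsed or missing hint
costs only latency: the row is picked up on the next read, for example
after a reconnect.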


Re: notification: pg_notify ?

From
Mikhail Terekhov
Date:
Tom Lane wrote:

> It is *not* reliable, at least not in the sense of "the message is
> guaranteed to be delivered even if there's a system crash".  Which is
> the normal meaning of "reliable" in SQL environments.  If you want that


That is exactly what I mean by "reliable".


Please correct me if I'm wrong, but the buffer overrun problem in the new
LISTEN/NOTIFY mechanism means that it is perfectly possible that the sending
backend may drop all or some of the pending NOTIFY messages in case of such
an overrun. If this is the case, then this new mechanism would be a step
backward in terms of functionality relative to the current implementation.

There will be no guarantee even in the no-crash case.


> level of reliability, you need to pass your messages by storing them
> in a regular table.
> 

That is exactly what I do in my application. I store messages in a regular
table and then send a notify to other clients. But I'd like to have a
guarantee that, absent a system crash, all my notifies will be delivered.
I use this method when I need to send some additional information beyond
the notification's name. Another case is similar to your cache invalidation
example. The big difference is that I need to maintain a kind of cache for
a large number of big tables and I need to know promptly when these
tables change. With polling, I can't afford to update this cache frequently
enough. And when there is no NOTIFY delivery guarantee, the only
solution is polling. Occasional delivery of NOTIFY messages may only improve
the polling strategy in some sense. One cannot rely on them.

> LISTEN/NOTIFY can optimize your message passing by avoiding unnecessary
> polling of the table in the normal no-crash case.  But they are not a


Guaranteed delivery in the normal no-crash case avoids polling
completely in the cache invalidation scenario. DB crash recovery is a
very complex task for an application. Sometimes recovery is not possible
at all. But for cache invalidation a DB crash is nothing more than cache
reinitialisation (you will get this crash notification without a LISTEN/NOTIFY
message ;) Even stronger: you can't receive a crash notification with the
LISTEN/NOTIFY mechanism).

And again, this no-crash case guarantee is already here! We don't need to
do anything!


> substitute for having a table, and I don't see a reason to bog them down


Sure, they are not a substitute, and I'm not the one who proposed extending
the LISTEN/NOTIFY mechanism with additional information ;) This whole thread
was started to extend the LISTEN/NOTIFY mechanism to support optional messages.
If we agree that LISTEN/NOTIFY is not a substitute for having a table for
such messages, then what is the purpose of reimplementing this feature with
a loss of functionality?

> with an intermediate level of reliability that isn't buying anything.

If you mean reliability in the no-crash case, then it gives a lot: it
eliminates the need for polling completely. And once again, we already have
this level of reliability.

What exactly will PG gain from this new LISTEN/NOTIFY mechanism? If the benefit
is so valuable, let's implement it as an additional feature, not as a
replacement of the existing one with a loss of functionality.


Regards
Mikhail Terekhov




Re: notification: pg_notify ?

From
Tom Lane
Date:
Mikhail Terekhov <terekhov@emc.com> writes:
> Please correct me if I'm wrong but the buffer overrun problem in the new
> LISTEN/NOTIFY mechanism means that it is perfectly possible that sending
> backend may drop all or some of the pending NOTIFY messages in case of such
> an overrun.

You would be guaranteed to get *some* notify.  You wouldn't be
guaranteed to receive the auxiliary info that's proposed to be added to
the basic message type; also you might get notify reports for conditions
that hadn't actually been signaled.

> If this is the case then this new mechanism would be step
> backward in terms of functionality relative to the current implementation.

The current mechanism is hardly perfect; it drops multiple occurrences
of the same NOTIFY.  Yes, the behavior would be different, but that
doesn't immediately translate to "a step backwards".

> That is exactly what I do in my application. I store messages in a regular
> table and then send a notify to other clients. But I'd like to have a
> guarantee that without system crash all my notifies will be delivered.

Please re-read the proposal.  It will not break your application.
        regards, tom lane
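
To make those guarantees concrete, here is a small model of the cache-style
overrun response, assuming a single listener and a fixed-capacity buffer.
It is only an illustration of the semantics discussed in this thread, not
the proposed backend code; NotifyBuffer and its methods are hypothetical
names.

```python
class NotifyBuffer:
    """Bounded notification buffer with a cache-style overrun response."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = []        # (condition, message) pairs not yet read
        self.overflowed = False

    def notify(self, condition, message=None):
        if len(self.entries) >= self.capacity:
            # Overrun: discard the queued detail and remember only
            # that something was lost.
            self.entries.clear()
            self.overflowed = True
            return
        self.entries.append((condition, message))

    def receive(self, listening):
        """Return the (condition, message) reports for one listener."""
        if self.overflowed:
            # Forced notify on every condition the listener registered;
            # the messages are unknown, so report None for each.
            reports = [(cond, None) for cond in sorted(listening)]
        else:
            reports = [(c, m) for c, m in self.entries if c in listening]
        self.entries.clear()
        self.overflowed = False
        return reports


buf = NotifyBuffer(capacity=2)
for msg in ("1", "2", "3"):      # the third notify overruns the buffer
    buf.notify("a", msg)
print(buf.receive({"a", "b"}))
```

After the overrun the listener still hears about "a" (some notify is
guaranteed), loses the three messages, and also gets a report for "b",
which was never signaled.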


Re: notification: pg_notify ?

From
Gavin Sherry
Date:
On Tue, 9 Apr 2002, Tom Lane wrote:

> Mikhail Terekhov <terekhov@emc.com> writes:
> > Please correct me if I'm wrong but the buffer overrun problem in the new
> > LISTEN/NOTIFY mechanism means that it is perfectly possible that sending
> > backend may drop all or some of the pending NOTIFY messages in case of such
> > an overrun.
> 
> You would be guaranteed to get *some* notify.  You wouldn't be
> guaranteed to receive the auxiliary info that's proposed to be added to
> the basic message type; also you might get notify reports for conditions
> that hadn't actually been signaled.

I poked around the notify code and had a think about the ideas which have
been put forward. I think the buffer overrun issue can be addressed by
allowing users to define the importance of the notify they are making. Eg:

NOTIFY HARSH <condition>

If there is to be a buffer overrun, all conditions are notified and the
buffer is, eventually, reset.

NOTIFY SAFE <condition>

(Yes, bad keywords). This on the other hand would check if there is to be
a buffer overrun and (after a SendPostmasterSignal(PMSIGNAL_WAKEN_CHILDREN) 
fails to reduce the buffer) it would invalidate the transaction with an
elog(ERROR). This can be done since AtCommit_Notify() is run before
RecordTransactionCommit().

This does not deal with recovery from a crash. The only way it could is by
plugging the listen and notify signals into the xlog. This seems very
messy though.

Gavin
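
A sketch of why the SAFE variant can abort cleanly, assuming the ordering
stated above (notifies are published before the commit is recorded). This
is a toy model, not backend code: BufferFull stands in for elog(ERROR), and
Transaction, pending and committed are hypothetical names. The attempt to
wake other backends and drain the buffer first is left out.

```python
class BufferFull(Exception):
    """Stands in for elog(ERROR) raised in the notifying backend."""


class Transaction:
    def __init__(self, buffer, capacity):
        self.buffer = buffer      # shared list of (condition, message)
        self.capacity = capacity  # maximum entries the buffer may hold
        self.pending = []         # notifies queued until commit
        self.committed = False

    def notify(self, condition, message=None):
        self.pending.append((condition, message))

    def _at_commit_notify(self):
        # SAFE behaviour: refuse to publish if the buffer would overrun.
        if len(self.buffer) + len(self.pending) > self.capacity:
            raise BufferFull("notification buffer overrun")
        self.buffer.extend(self.pending)
        self.pending.clear()

    def commit(self):
        # Publishing happens before the commit is recorded, so an error
        # here leaves the transaction uncommitted.
        self._at_commit_notify()
        self.committed = True


shared = []
t1 = Transaction(shared, capacity=2)
t1.notify("a", "x")
t1.commit()

t2 = Transaction(shared, capacity=2)
t2.notify("a", "y")
t2.notify("a", "z")
try:
    t2.commit()
except BufferFull as err:
    print("aborted:", err)
print(t1.committed, t2.committed, shared)
```

The second transaction neither commits nor publishes anything, so the
sender learns of the overrun instead of the listener silently losing
messages.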