Thread: notification payloads
This feature (ability to add a message payload to a NOTIFY) is on the TODO list and I had undertaken to implement it. However, pressure of other work has conspired to make that difficult, and Abhijit Menon-Sen recently very kindly offered to help out.

There was some discussion of implementation late last year here:
http://groups.google.com/group/pgsql.hackers/browse_frm/thread/e63a5ac43e2508ce/ce47016235bd5a62?tvc=1&q=notify+payload&hl=en#ce47016235bd5a62

However, in various pieces of off-list discussion it appears that there is some opposition either to the design or to the feature itself. What is more, there will clearly be vigorous opposition to any implementation which does NOT remove the use of pg_listener (which I understand and, I think, largely agree with).

So, before an investment of any more time is made by either Abhijit or myself, I would like to get confirmation that a) there is broad agreement on the desirability of the feature and b) that there is broad agreement on the general design (i.e. to use a circular buffer in shared memory, of configurable size, to hold the outstanding message queue). Please speak up or forever ....

cheers

andrew
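For reference, here is a minimal sketch in C of the kind of shared-memory structure being proposed. All type and field names are hypothetical illustrations, not actual PostgreSQL symbols:

    /* Sketch: a fixed-size circular queue of notifications in shared memory */
    typedef struct NotifyQueueEntry
    {
        int32   len;        /* total entry length, for advancing past it */
        char    data[1];    /* "name\0payload\0", variable length */
    } NotifyQueueEntry;

    typedef struct NotifyQueueControl
    {
        int     head;       /* offset at which the next entry is written */
        int     tail;       /* oldest entry not yet read by every listener */
        int     size;       /* the configurable buffer size, in bytes */
        /* per-backend read positions would live alongside this */
    } NotifyQueueControl;

A writer appends at head; once every listening backend's read position has passed an entry, tail can advance over it and the space is reused.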
Andrew Dunstan wrote:
>
> So, before an investment of any more time is made by either Abhijit or
> myself, I would like to get confirmation that a) there is broad
> agreement on the desirability of the feature

Yes, absolutely desirable.

> and b) that there is broad agreement on the general design (i.e. to use
> a circular buffer in shared memory, of configurable size, to hold the
> outstanding message queue).

Would it spill out to disk and expand (and shrink again) as required? Loss of notifications should not occur, IMHO.

Regards, Dave.
Dave Page wrote:
> Andrew Dunstan wrote:
>>
>> So, before an investment of any more time is made by either Abhijit
>> or myself, I would like to get confirmation that a) there is broad
>> agreement on the desirability of the feature
>
> Yes, absolutely desirable.

good ;-)

>> and b) that there is broad agreement on the general design (i.e. to
>> use a circular buffer in shared memory, of configurable size, to hold
>> the outstanding message queue).
>
> Would it spill out to disk and expand (and shrink again) as required?
> Loss of notifications should not occur, IMHO.

No loss, but, per previous discussion, it would block and try to get other backends to collect their outstanding notifications.

Let's say we provide 100KB for this (which is not a heck of a lot), and that the average notification might be, say, 40 bytes of name plus 60 bytes of message. Then we have room for about 1000 messages in the queue. This would get ugly only if a backend, presumably in the middle of some very long transaction, refused to pick up its messages despite prodding. But ISTM that means we just need to pick a few strategic spots that will call CHECK_FOR_NOTIFICATIONS() even in the middle of a transaction and store them locally.

cheers

andrew
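Spelling out that back-of-envelope arithmetic (these figures are the illustrative ones from the message above, not proposed defaults):

    /* Illustrative sizing only -- not real configuration values */
    #define QUEUE_SIZE_BYTES    (100 * 1024)    /* 100KB shared buffer */
    #define AVG_NAME_BYTES      40              /* average condition name */
    #define AVG_PAYLOAD_BYTES   60              /* average message payload */

    /* QUEUE_SIZE_BYTES / (AVG_NAME_BYTES + AVG_PAYLOAD_BYTES)
     * = 102400 / 100 = 1024, i.e. room for roughly 1000 messages */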
Andrew Dunstan wrote:

> Let's say we provide 100KB for this (which is not a heck of a lot), and
> that the average notification might be, say, 40 bytes of name plus 60
> bytes of message. Then we have room for about 1000 messages in the
> queue. This would get ugly only if a backend, presumably in the middle
> of some very long transaction, refused to pick up its messages despite
> prodding. But ISTM that means we just need to pick a few strategic
> spots that will call CHECK_FOR_NOTIFICATIONS() even in the middle of a
> transaction and store them locally.

Why have the name on each message? Presumably names are going to be few compared to the total number of messages, so maybe store the names in a separate hash table and link them with a numeric identifier. That gives you room for a lot more messages.

--
Alvaro Herrera                         http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
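One way to picture Alvaro's suggestion (a sketch only; the structs are illustrative, though NAMEDATALEN is PostgreSQL's real limit on name length):

    /* Names interned once, in a shared hash table keyed by name... */
    typedef struct NotifyName
    {
        int32   name_id;                /* small numeric identifier */
        char    name[NAMEDATALEN];      /* the LISTEN/NOTIFY condition name */
    } NotifyName;

    /* ...so each queued message carries only the id, not the name */
    typedef struct NotifyMsg
    {
        int32   name_id;        /* points into the shared name table */
        int32   payload_len;
        char    payload[1];     /* variable length */
    } NotifyMsg;

With a 4-byte id replacing a 40-byte average name, the same 100KB buffer would hold roughly 1600 messages rather than 1000.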
Alvaro Herrera wrote:
> Andrew Dunstan wrote:
>
>> Let's say we provide 100KB for this (which is not a heck of a lot), and
>> that the average notification might be, say, 40 bytes of name plus 60
>> bytes of message. Then we have room for about 1000 messages in the
>> queue. This would get ugly only if a backend, presumably in the middle
>> of some very long transaction, refused to pick up its messages despite
>> prodding. But ISTM that means we just need to pick a few strategic
>> spots that will call CHECK_FOR_NOTIFICATIONS() even in the middle of a
>> transaction and store them locally.
>
> Why have the name on each message? Presumably names are going to be few
> compared to the total number of messages, so maybe store the names in a
> separate hash table and link them with a numeric identifier. That gives
> you room for a lot more messages.

Maybe, but at the cost of some considerable complexity, ISTM, especially as this all needs to be in shared memory. On any machine with a significant workload a few MB of memory would not be missed.

How many messages do we reasonably expect to be in the queue? Judging by our usage here it would be a handful at most, but maybe others have far more intensive uses. Is anyone really doing notifies at a rate of many per second?

cheers

andrew
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Why have the name on each message? Presumably names are going to be few
> compared to the total number of messages, so maybe store the names in a
> separate hash table and link them with a numeric identifier. That gives
> you room for a lot more messages.

That can be done by the application, if its notify payloads are such that that's a useful optimization. However, it seems entirely possible to me that the payload strings might be nonrepeating and the overhead of a separate table completely wasted.

			regards, tom lane
Andrew Dunstan wrote:
>
> No loss, but, per previous discussion, it would block and try to get
> other backends to collect their outstanding notifications.
>
> Let's say we provide 100KB for this (which is not a heck of a lot), and
> that the average notification might be, say, 40 bytes of name plus 60
> bytes of message. Then we have room for about 1000 messages in the
> queue. This would get ugly only if a backend, presumably in the middle
> of some very long transaction, refused to pick up its messages despite
> prodding. But ISTM that means we just need to pick a few strategic
> spots that will call CHECK_FOR_NOTIFICATIONS() even in the middle of a
> transaction and store them locally.

Sounds good.

Regards, Dave.
"Andrew Dunstan" <andrew@dunslane.net> writes: >>> and b) that there is broad agreement on the general design (i.e. to use a >>> circular buffer in shared memory, of configurable size, to hold the >>> outstanding message queue). >> >> Would it spill out to disk and expand (and shrink again) as required? Loss of >> notifications should not occur imho. > > No loss, but, per previous discussion, it would block and try to get other > backends to collect their outstanding notifications. > > Let's say we provide 100Kb for this (which is not a heck of a lot) , that the > average notification might be, say, 40 bytes of name plus 60 bytes of message. > Then we have room for about 1000 messages in the queue. This would get ugly > only if backend presumably in the middle of some very long transaction, refused > to pick up its messages despite prodding. But ISTM that means we just need to > pick a few strategic spots that will call CHECK_FOR_NOTIFICATIONS() even in the > middle of a transaction and store them locally. Keep in mind that the usual place you run into problems with this type of buffering is where you have two processes talking to each other. Say a producer-consumer type of design. You want to be sure you never deadlock with each process waiting for the other to consume a notification. I don't think this is a problem in this case because it just means the state you enter when you're blocked waiting for your buffer to have free space MUST be amongst the times you call CHECK_FOR_NOTIFICATIONS(). If you didn't plan to have this local storage in the backend it would be difficult to guarantee that clients would handle this situation correctly. Perhaps that was obvious already. If so, sorry for worrying for nothing. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Andrew Dunstan <andrew@dunslane.net> writes:
> ... But ISTM that means we just need to pick a few strategic spots
> that will call CHECK_FOR_NOTIFICATIONS() even in the middle of a
> transaction and store them locally.

Minor comment --- I don't believe in having a separate "sprinkle" of notify-specific checks. It needs to be set up so that CHECK_FOR_INTERRUPTS will deal with the catch-up-please signal. We've already done (most of) the work of making sure CHECK_FOR_INTERRUPTS is called often enough, and AFAICS we'd end up needing CHECK_FOR_NOTIFICATIONS in exactly those same loops anyway.

It definitely helps here that CHECK_FOR_NOTIFICATIONS need affect only localized state of a particular subsystem that nothing else depends on. I've been wishing we could handle SI inval at more places than we do now, but that seems a lot harder :-(

			regards, tom lane
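A sketch of the arrangement Tom describes; CHECK_FOR_INTERRUPTS and SIGNAL_ARGS are real PostgreSQL macros, but the flag and function names below are purely illustrative:

    #include <signal.h>

    extern void collect_outstanding_notifications(void);   /* hypothetical */

    /* The catch-up-please signal handler does nothing but set a flag... */
    static volatile sig_atomic_t notifyCatchupPending = false;

    static void
    catchup_signal_handler(SIGNAL_ARGS)
    {
        notifyCatchupPending = true;
    }

    /* ...and the interrupt-servicing code already reached via
     * CHECK_FOR_INTERRUPTS drains the queue, so no separate "sprinkle"
     * of notify-specific checks is needed. */
    static void
    service_interrupts_sketch(void)
    {
        if (notifyCatchupPending)
        {
            notifyCatchupPending = false;
            collect_outstanding_notifications();
        }
    }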
On one fine day, Mon, 2007-03-26 at 14:07, Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
>> Why have the name on each message? Presumably names are going to be few
>> compared to the total number of messages, so maybe store the names in a
>> separate hash table and link them with a numeric identifier. That gives
>> you room for a lot more messages.
>
> That can be done by the application, if its notify payloads are such
> that that's a useful optimization. However, it seems entirely possible
> to me that the payload strings might be nonrepeating and the overhead
> of a separate table completely wasted.

What we could do is use one name for many messages/listeners/notifies, so that in case we have 10 backends listening to 'ACCOUNTS_CHANGE', we keep the ACCOUNTS_CHANGE part only once, and reuse its id for LISTENs as well. That would get the same storage savings as Alvaro's proposed hash, and it would only be live while there are any listeners.

So perhaps Alvaro's proposal can be rephrased thus: "Why have the name on each message? The names are already stored in the listen table; just reuse a numeric identifier pointing to the item in that table. That gives you room for a lot more messages."

If there is no name in the listen table, it means that nobody is interested and the message can be dropped right away.

> 			regards, tom lane

--
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com
On one fine day, Mon, 2007-03-26 at 11:30, Andrew Dunstan wrote:
> This feature (ability to add a message payload to a NOTIFY) is on the
> TODO list and I had undertaken to implement it. However, pressure of
> other work has conspired to make that difficult, and Abhijit Menon-Sen
> recently very kindly offered to help out.
>
> There was some discussion of implementation late last year here:
> http://groups.google.com/group/pgsql.hackers/browse_frm/thread/e63a5ac43e2508ce/ce47016235bd5a62?tvc=1&q=notify+payload&hl=en#ce47016235bd5a62
>
> However, in various pieces of off-list discussion it appears that there
> is some opposition either to the design or to the feature itself. What
> is more, there will clearly be vigorous opposition to any implementation
> which does NOT remove the use of pg_listener (which I understand and,
> I think, largely agree with).
>
> So, before an investment of any more time is made by either Abhijit or
> myself, I would like to get confirmation that a) there is broad
> agreement on the desirability of the feature and b) that there is broad
> agreement on the general design (i.e. to use a circular buffer in shared
> memory, of configurable size, to hold the outstanding message queue).
> Please speak up or forever ....

I find the feature very useful, and have even done some preliminary design work for a shared memory implementation, where each listener is required to copy data to its own private memory ASAP and the notifier waits in case there is not enough room in the shared memory buffer.

Alas, I have lost my design notes (3 A4 pages) for organising things in a fixed-size buffer, but the basic operation, data-wise, should be as in the attached sql file.

--
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com
Attachment
On one fine day, Tue, 2007-03-27 at 11:17, Hannu Krosing wrote:
> On one fine day, Mon, 2007-03-26 at 11:30, Andrew Dunstan wrote:
>> This feature (ability to add a message payload to a NOTIFY) is on the
>> TODO list and I had undertaken to implement it. However, pressure of
>> other work has conspired to make that difficult, and Abhijit Menon-Sen
>> recently very kindly offered to help out.
>>
>> There was some discussion of implementation late last year here:
>> http://groups.google.com/group/pgsql.hackers/browse_frm/thread/e63a5ac43e2508ce/ce47016235bd5a62?tvc=1&q=notify+payload&hl=en#ce47016235bd5a62
>>
>> However, in various pieces of off-list discussion it appears that there
>> is some opposition either to the design or to the feature itself. What
>> is more, there will clearly be vigorous opposition to any implementation
>> which does NOT remove the use of pg_listener (which I understand and,
>> I think, largely agree with).
>>
>> So, before an investment of any more time is made by either Abhijit or
>> myself, I would like to get confirmation that a) there is broad
>> agreement on the desirability of the feature and b) that there is broad
>> agreement on the general design (i.e. to use a circular buffer in shared
>> memory, of configurable size, to hold the outstanding message queue).
>> Please speak up or forever ....

Now that I think about it again, maybe we should NOT go for a shared memory implementation after all. Now that we have HOT updates, and thanks to the fact that we have a 1:1 correspondence between the backends and the deleters in LISTEN/NOTIFY, we can have much more exact DEAD-ness conditions and can reuse space even in the presence of long-running transactions.

IOW, once we have deleted the message, we can be sure that no other backend will ever be interested in that row.

That means it may be possible to use a design similar to the one I just sent, and just make the tables not WAL-logged and have dead space reused in a HOT-like manner.

Straight HOT will not be useful here, as the usage is INSERT/DELETE instead of UPDATE, but similar principles, including heap space and index pointer reuse, could probably be applied.

--
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com
Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> ... But ISTM that means we just need to pick a few strategic spots
>> that will call CHECK_FOR_NOTIFICATIONS() even in the middle of a
>> transaction and store them locally.
>
> Minor comment --- I don't believe in having a separate "sprinkle" of
> notify-specific checks. It needs to be set up so that
> CHECK_FOR_INTERRUPTS will deal with the catch-up-please signal. We've
> already done (most of) the work of making sure CHECK_FOR_INTERRUPTS is
> called often enough, and AFAICS we'd end up needing
> CHECK_FOR_NOTIFICATIONS in exactly those same loops anyway.

OK, this works for me - it will make things simpler.

cheers

andrew
Hannu Krosing wrote:
> So perhaps Alvaro's proposal can be rephrased thus: "Why have the name
> on each message? The names are already stored in the listen table;
> just reuse a numeric identifier pointing to the item in that table.
> That gives you room for a lot more messages."
>
> If there is no name in the listen table, it means that nobody is
> interested and the message can be dropped right away.

Er, what listen table? The only thing that will be in shared memory is a queue and some bookkeeping (queue head per backend). Each backend will be responsible for catching the notifications it is interested in and discarding the rest (see earlier discussion).

cheers

andrew
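A sketch of the backend-local filtering Andrew describes (List, NIL, lappend, foreach and lfirst are PostgreSQL's real pg_list.h API; everything else here is illustrative):

    #include "postgres.h"
    #include "nodes/pg_list.h"

    /* Local to each backend -- nothing shared */
    static List *listenChannels = NIL;

    /* LISTEN foo would do:
     *     listenChannels = lappend(listenChannels, pstrdup("foo"));
     * and then, while draining the shared queue, each entry is kept or
     * discarded with a check like this: */
    static bool
    backend_listens_to(const char *name)
    {
        ListCell   *lc;

        foreach(lc, listenChannels)
        {
            if (strcmp((const char *) lfirst(lc), name) == 0)
                return true;    /* keep this notification */
        }
        return false;           /* not listening: discard and move on */
    }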
Hannu Krosing wrote:
> I find the feature very useful, and have even done some preliminary
> design work for a shared memory implementation, where each listener is
> required to copy data to its own private memory ASAP and the notifier
> waits in case there is not enough room in the shared memory buffer.
>
> Alas, I have lost my design notes (3 A4 pages) for organising things in
> a fixed-size buffer, but the basic operation, data-wise, should be as
> in the attached sql file.

[snip]

This looks somewhat like what I originally had in mind, but discussion led to something much simpler. The only thing we need to share is the queue. Everything else can be done in the individual backends.

cheers

andrew
Hannu Krosing wrote:
> Now that I think about it again, maybe we should NOT go for a shared
> memory implementation after all. Now that we have HOT updates, and
> thanks to the fact that we have a 1:1 correspondence between the
> backends and the deleters in LISTEN/NOTIFY, we can have much more
> exact DEAD-ness conditions and can reuse space even in the presence
> of long-running transactions.
>
> IOW, once we have deleted the message, we can be sure that no other
> backend will ever be interested in that row.
>
> That means it may be possible to use a design similar to the one I
> just sent, and just make the tables not WAL-logged and have dead space
> reused in a HOT-like manner.
>
> Straight HOT will not be useful here, as the usage is INSERT/DELETE
> instead of UPDATE, but similar principles, including heap space and
> index pointer reuse, could probably be applied.

The only advantage to this, ISTM, is that we would eliminate the possibility of blocking. But it still strikes me as rather more complex, and thus possibly more fragile, than what was previously discussed.

cheers

andrew
Gregory Stark wrote:
> "Andrew Dunstan" <andrew@dunslane.net> writes:
>
>>>> and b) that there is broad agreement on the general design (i.e. to
>>>> use a circular buffer in shared memory, of configurable size, to
>>>> hold the outstanding message queue).
>>>
>>> Would it spill out to disk and expand (and shrink again) as required?
>>> Loss of notifications should not occur, IMHO.
>>
>> No loss, but, per previous discussion, it would block and try to get
>> other backends to collect their outstanding notifications.
>>
>> Let's say we provide 100KB for this (which is not a heck of a lot), and
>> that the average notification might be, say, 40 bytes of name plus 60
>> bytes of message. Then we have room for about 1000 messages in the
>> queue. This would get ugly only if a backend, presumably in the middle
>> of some very long transaction, refused to pick up its messages despite
>> prodding. But ISTM that means we just need to pick a few strategic
>> spots that will call CHECK_FOR_NOTIFICATIONS() even in the middle of a
>> transaction and store them locally.
>
> Keep in mind that the usual place you run into problems with this type
> of buffering is where you have two processes talking to each other, say
> in a producer-consumer type of design. You want to be sure you never
> deadlock, with each process waiting for the other to consume a
> notification.
>
> I don't think this is a problem in this case, because it just means
> that the state you enter when you're blocked waiting for your buffer to
> have free space MUST be among the times you call
> CHECK_FOR_NOTIFICATIONS(). If you didn't plan to have this local
> storage in the backend, it would be difficult to guarantee that clients
> would handle this situation correctly.
>
> Perhaps that was obvious already. If so, sorry for worrying for nothing.

No, it's a good point. The pseudo-code might look something like this (function names are illustrative):

    for (;;)
    {
        bool ok = enqueue_notification();   /* try to append to the queue */

        collect_my_messages();  /* always drain our own backlog here, so
                                 * two writers blocked on a full buffer
                                 * can never deadlock on each other */
        if (ok)
            break;
        send_out_signal();      /* prod other backends to catch up */
        sleep_a_bit();          /* then retry */
    }

That should be enough, I think.

cheers

andrew
On one fine day, Tue, 2007-03-27 at 07:11, Andrew Dunstan wrote:
>
> Hannu Krosing wrote:
>> So perhaps Alvaro's proposal can be rephrased thus: "Why have the name
>> on each message? The names are already stored in the listen table;
>> just reuse a numeric identifier pointing to the item in that table.
>> That gives you room for a lot more messages."
>>
>> If there is no name in the listen table, it means that nobody is
>> interested and the message can be dropped right away.
>
> Er, what listen table?

The equivalent of pg_listener; see my mail with the attached .sql file.

> The only thing that will be in shared memory is a queue and some
> bookkeeping (queue head per backend). Each backend will be responsible
> for catching the notifications it is interested in and discarding the
> rest

At least the list of which backends listen to which events should also be in shared memory. How else would we know how many copies to make for each backend, or when we can release the memory in case we make one copy?

> (see earlier discussion).

Could you post a link to the archives? I could not find anything relevant in the discussion link from the head of this mail thread.

> cheers
>
> andrew

--
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com
On one fine day, Tue, 2007-03-27 at 16:13, Hannu Krosing wrote:
>
> How else would we know how many copies to make for each backend, or
> when we can release the memory in case we make one copy?
>
>> (see earlier discussion).
>
> Could you post a link to the archives?

Sorry, found it now. I was confused by the Google Groups thread display ;)

> I could not find anything relevant in the discussion link from the head
> of this mail thread.
>
>> cheers
>>
>> andrew

--
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com
Hannu Krosing <hannu@skype.net> writes:
> On one fine day, Tue, 2007-03-27 at 07:11, Andrew Dunstan wrote:
>> Er, what listen table?
>
> At least the list of which backends listen to which events should also
> be in shared memory.

No, the intent is specifically that there will be *no* such global structure. All it does is add complexity, not to mention make it harder to size shared memory.

> How else would we know how many copies to make for each backend, or
> when we can release the memory in case we make one copy?

The proposed design is essentially a clone of the sinval messaging system, which does not need to know either of those things and does not make "one copy per backend". There's one copy, period.

			regards, tom lane
Tom Lane wrote:
> Hannu Krosing <hannu@skype.net> writes:
>> On one fine day, Tue, 2007-03-27 at 07:11, Andrew Dunstan wrote:
>>> Er, what listen table?
>
>> At least the list of which backends listen to which events should also
>> be in shared memory.
>
> No, the intent is specifically that there will be *no* such global
> structure. All it does is add complexity, not to mention make it
> harder to size shared memory.
>
>> How else would we know how many copies to make for each backend, or
>> when we can release the memory in case we make one copy?
>
> The proposed design is essentially a clone of the sinval messaging
> system, which does not need to know either of those things and does not
> make "one copy per backend". There's one copy, period.

Further design notes:

What we will need to keep track of (a la sinval) is a queue pointer per backend. I think we can add an optimization to that by keeping a count of events listened to per backend, so that we only wake up active listeners. For non-listeners we can just catch them up without bothering to wake them. This will help to avoid the potential for a "thundering herd" effect that has apparently bothered some people.

We'll also need to store the database id along with the event name and message, since pg_listener is per db rather than per cluster.

cheers

andrew
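A sketch of the per-backend bookkeeping this implies (again a la sinval; all names hypothetical):

    /* One slot per backend, in shared memory */
    typedef struct QueueBackendStatus
    {
        int32   pid;            /* backend PID, for signalling it */
        Oid     dboid;          /* database the backend is attached to */
        int     pos;            /* this backend's read position in queue */
        int     listenCount;    /* number of names it LISTENs to; if zero,
                                 * just advance pos, don't wake it up */
    } QueueBackendStatus;

Only backends with listenCount > 0 need a wakeup signal; everyone else can be caught up silently, which is what avoids the thundering herd.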
Andrew Dunstan <andrew@dunslane.net> writes:
> We'll also need to store the database id along with the event name and
> message, since pg_listener is per db rather than per cluster.

Well, that's an artifact of the historical implementation ... does anyone want to argue that LISTEN should be cluster-wide, given the opportunity?

			regards, tom lane
Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> We'll also need to store the database id along with the event name and
>> message, since pg_listener is per db rather than per cluster.
>
> Well, that's an artifact of the historical implementation ... does
> anyone want to argue that LISTEN should be cluster-wide, given the
> opportunity?

That would be a problem if you tried to run multiple installations of an application that uses NOTIFY/LISTEN in separate databases in a single cluster: the applications would overhear each other. I'd consider that a bug, not a feature.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
On Mon, Mar 26, 2007 at 10:30 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
>
> This feature (ability to add a message payload to a NOTIFY) is on the
> TODO list and I had undertaken to implement it. However, pressure of
> other work has conspired to make that difficult, and Abhijit Menon-Sen
> recently very kindly offered to help out.

Was looking through old threads regarding various listen/notify reimplementation ideas. Did anything ever come out of this?

merlin
Merlin Moncure wrote:
> On Mon, Mar 26, 2007 at 10:30 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
>
>> This feature (ability to add a message payload to a NOTIFY) is on the
>> TODO list and I had undertaken to implement it. However, pressure of
>> other work has conspired to make that difficult, and Abhijit Menon-Sen
>> recently very kindly offered to help out.
>
> Was looking through old threads regarding various listen/notify
> reimplementation ideas. Did anything ever come out of this?

No, pressure of work :-( It's still very high on my TODO list.

cheers

andrew