Thread: Discussion on a LISTEN-ALL syntax
Howdy all,

NOTE: Grey-beard coder, pgsql newbie. All info/tips/suggestions welcome!

I have a use-case where I’d like to LISTEN for all NOTIFY channels. Right now I simply issue a LISTEN for every channel name of interest, but in production the channels will number in the low thousands. The current implementation uses a linked list, and a linear probe through the list of desired channels, which will always return true, becomes quite expensive at this scale.

I have a work-around available by creating the “ALL” channel and making the payload include the actual channel name, but this has a few drawbacks:

* it does not play nice with clients that actually want a small subset of channels;
* it requires code modification at every NOTIFY;
* it requires extra code on the client side.

The work-around subjects the developer (me :-) to significant risk of foot-gun disease, so I'd like to propose a 'LISTEN *' equivalent to 'UNLISTEN *'. The implementation in src/backend/commands/async.c seems straightforward enough, but it feels prudent to select a syntax that doesn't make some kind of actual pattern matching syntactically ugly in the future. Choosing 'LISTEN *' has a nice symmetry with 'UNLISTEN *', but I don't have enough SQL chops to know if it causes problems.

If anyone has a better work-around, please speak up! If not, and we can come to some resolution on a future-resistant syntax, I'd happily start working up a patch set.

Thanks,
-- Trey Boudreau
Trey Boudreau <trey@treysoft.com> writes:
> so I'd like to propose a 'LISTEN *' equivalent to 'UNLISTEN *'.

Seems reasonable in the abstract, and given the UNLISTEN * precedent it's hard to quibble with that syntax choice. I think what actually needs discussing are the semantics, specifically how this'd interact with other LISTEN/UNLISTEN actions. Explain what you think should be the behavior after:

    LISTEN foo;
    LISTEN *;
    UNLISTEN *;
    -- are we still listening on foo?

    LISTEN *;
    LISTEN foo;
    UNLISTEN *;
    -- how about now?

    LISTEN *;
    UNLISTEN foo;
    -- how about now?

    LISTEN *;
    LISTEN foo;
    UNLISTEN foo;
    -- does that make a difference?

I don't have any strong preferences about this, but we ought to have a clear idea of the behavior we want before we start coding.

regards, tom lane
> On Dec 20, 2024, at 2:58 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Seems reasonable in the abstract, and given the UNLISTEN * precedent
> it's hard to quibble with that syntax choice. I think what actually
> needs discussing are the semantics, specifically how this'd interact
> with other LISTEN/UNLISTEN actions.

My first pass at the documentation looks like this:

<para>
The special wildcard <literal>*</literal> cancels all listener
registrations for the current session and replaces them with a
virtual registration that matches all channels. Further
<command>LISTEN</command> and <command>UNLISTEN <replaceable
class="parameter">channel</replaceable></command> commands will
be ignored until the session sees the <command>UNLISTEN *</command>
command.
</para>

> Explain what you think should
> be the behavior after:
>
> LISTEN foo;
> LISTEN *;
> UNLISTEN *;
> -- are we still listening on foo?

No, as the ‘LISTEN *’ wipes existing registrations.

> LISTEN *;
> LISTEN foo;
> UNLISTEN *;
> -- how about now?

Not listening on ‘foo’ or anything else.

> LISTEN *;
> UNLISTEN foo;
> -- how about now?

‘UNLISTEN foo’ ignored.

> LISTEN *;
> LISTEN foo;
> UNLISTEN foo;
> -- does that make a difference?

‘LISTEN foo’ and ‘UNLISTEN foo’ ignored, leaving only the wildcard.

> I don't have any strong preferences about this, but we ought to
> have a clear idea of the behavior we want before we start coding.

These semantics made sense to me, but I have limited experience and a very specific use case in mind. Changing the behavior of ‘UNLISTEN *’ feels extremely impolite, and if we leave that alone I don’t see using the ‘LISTEN *’ syntax with behavior that leaves other LISTENs in place.

We could have a different set of keywords, like LISTEN_ALL/UNLISTEN_ALL that doesn’t interfere with the existing behavior.

-- Trey
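
The semantics described in the draft documentation above can be modelled in a few lines of C. This is only an illustrative sketch, not code from async.c: the `ListenState` type, the fixed-size array, and every function name are hypothetical, and `MAX_NAME` merely stands in for NAMEDATALEN.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_CHANNELS 16     /* arbitrary limit for this sketch */
#define MAX_NAME 64         /* stands in for NAMEDATALEN */

/* Hypothetical per-session state; not the layout used in async.c. */
typedef struct
{
    bool wildcard;                          /* true after LISTEN * */
    int  count;                             /* explicit registrations */
    char names[MAX_CHANNELS][MAX_NAME];
} ListenState;

static int find_channel(const ListenState *s, const char *name)
{
    for (int i = 0; i < s->count; i++)
        if (strcmp(s->names[i], name) == 0)
            return i;
    return -1;
}

/* LISTEN *: drop explicit registrations, install the virtual one. */
void listen_all(ListenState *s)
{
    s->count = 0;
    s->wildcard = true;
}

/* UNLISTEN *: back to listening to nothing. */
void unlisten_all(ListenState *s)
{
    s->count = 0;
    s->wildcard = false;
}

/* LISTEN name: ignored while the wildcard is active. */
void listen_channel(ListenState *s, const char *name)
{
    if (s->wildcard || find_channel(s, name) >= 0)
        return;
    assert(s->count < MAX_CHANNELS);
    assert(strlen(name) < MAX_NAME);
    strcpy(s->names[s->count++], name);
}

/* UNLISTEN name: ignored while the wildcard is active. */
void unlisten_channel(ListenState *s, const char *name)
{
    int i;

    if (s->wildcard || (i = find_channel(s, name)) < 0)
        return;
    /* Move the last entry into the freed slot. */
    if (i != s->count - 1)
        strcpy(s->names[i], s->names[s->count - 1]);
    s->count--;
}

bool is_listening(const ListenState *s, const char *name)
{
    return s->wildcard || find_channel(s, name) >= 0;
}
```

Under this model the wildcard check is a single flag test, which is the cost saving the original use case is after; the trade-off is that "all but X" cannot be expressed.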
On Fri, Dec 20, 2024 at 1:58 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Trey Boudreau <trey@treysoft.com> writes:
> > so I'd like to propose a 'LISTEN *' equivalent to 'UNLISTEN *'.
> Seems reasonable in the abstract, and given the UNLISTEN * precedent
> it's hard to quibble with that syntax choice. I think what actually
> needs discussing are the semantics, specifically how this'd interact
> with other LISTEN/UNLISTEN actions. Explain what you think should
> be the behavior after:

Answers premised on the framing explained below:

> LISTEN foo;
> LISTEN *;
> UNLISTEN *;
> -- are we still listening on foo?

Yes; the channels are orthogonal and thus order doesn't matter.

> LISTEN *;
> LISTEN foo;
> UNLISTEN *;
> -- how about now?

Yes

> LISTEN *;
> UNLISTEN foo;
> -- how about now?

The unlisten was a no-op since listen foo was not issued; * receives everything, always.

> LISTEN *;
> LISTEN foo;
> UNLISTEN foo;
> -- does that make a difference?

If any notify foo happened in between listen foo and unlisten foo the session would receive the notify message twice - once implicitly via * and once explicitly via foo.

Alternatively, the server could see that "foo" is explicitly subscribed to by the listening PID, send the message, and then skip looking for a * subscription for that PID. Basically, document that we won't send duplicates if both listen * and listen foo are present.

> I don't have any strong preferences about this, but we ought to
> have a clear idea of the behavior we want before we start coding.
I'm inclined to make this clearly distinct from the semantics of listen/notify. Both in command form, what is affected, and the message.
Something like:
MONITOR NOTIFICATION QUEUE;
UNMONITOR NOTIFICATION QUEUE;
Asynchronous notification "foo" [with payload ...] sent by server process with PID nnn.
If you also LISTEN foo you would also receive:
Asynchronous notification "foo" [with payload ...] received from server process with PID nnn.
Unlisten undoes Listen
Unmonitor undoes Monitor
Upon session disconnect both Unlisten * and Unmonitor are executed.
If we must shoehorn this into the existing syntax and messages I'd still want to say that * is simply a special channel name that the system recognizes and sends all notify messages to. There is no way to limit which messages get sent to you via unlisten and if you also listen to the channel foo explicitly you end up receiving multiple messages. (Alternatively, send it just to foo and have the server not look for a * listen for that specific session.)
Adding a "do not send" listing (or option) to the implementation doesn't seem beneficial enough to deal with, yet it would be the only way that "Listen *; Unlisten foo;" could keep foo messages from being sent to the * subscribing client. In short, keep a "deny (do not send) all" base posture with permit-only policies built on top of it. Listen * is the permit-all policy.
David J.
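
To make the duplicate-delivery question in the message above concrete, here is a small sketch that counts how many messages one session would receive for a single NOTIFY when a "monitor everything" subscription and explicit listens are kept independent. The `Session` type, the function names, and the `dedupe` switch are all hypothetical; the sketch only illustrates the two alternatives being weighed (send twice, or suppress the duplicate), not any existing server behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_CHANNELS 16     /* arbitrary limit for this sketch */
#define MAX_NAME 64         /* stands in for NAMEDATALEN */

/* Hypothetical session state: the monitor flag is independent of listens. */
typedef struct
{
    bool monitoring;                        /* MONITOR NOTIFICATION QUEUE */
    int  count;                             /* explicit LISTEN entries */
    char names[MAX_CHANNELS][MAX_NAME];
} Session;

static bool listens_to(const Session *s, const char *channel)
{
    for (int i = 0; i < s->count; i++)
        if (strcmp(s->names[i], channel) == 0)
            return true;
    return false;
}

/* LISTEN channel: a repeated LISTEN is a no-op, as in the current behavior. */
void add_listen(Session *s, const char *channel)
{
    if (listens_to(s, channel))
        return;
    assert(s->count < MAX_CHANNELS);
    assert(strlen(channel) < MAX_NAME);
    strcpy(s->names[s->count++], channel);
}

/*
 * Number of messages the session receives for one NOTIFY on "channel".
 * With dedupe set, an explicit listen suppresses the monitor copy.
 */
int deliveries(const Session *s, const char *channel, bool dedupe)
{
    int n = (s->monitoring ? 1 : 0) + (listens_to(s, channel) ? 1 : 0);

    return (dedupe && n > 1) ? 1 : n;
}
```

For a session that is monitoring and also listening on foo, `deliveries(&s, "foo", false)` yields 2 and `deliveries(&s, "foo", true)` yields 1.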
On Fri, Dec 20, 2024 at 2:42 PM Trey Boudreau <trey@treysoft.com> wrote:
> On Dec 20, 2024, at 2:58 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Seems reasonable in the abstract, and given the UNLISTEN * precedent
> it's hard to quibble with that syntax choice. I think what actually
> needs discussing are the semantics, specifically how this'd interact
> with other LISTEN/UNLISTEN actions.
> My first pass at the documentation looks like this:
>
> <para>
> The special wildcard <literal>*</literal> cancels all listener
> registrations for the current session and replaces them with a
> virtual registration that matches all channels. Further
> <command>LISTEN</command> and <command>UNLISTEN <replaceable
> class="parameter">channel</replaceable></command> commands will
> be ignored until the session sees the <command>UNLISTEN *</command>
> command.
> </para>
I just sent my thoughts here as well. The choice to "cancel all listener registrations" seems unintuitive and unnecessary - so long as we either document or handle deduplication internally.
As I noted in my email, * is a permit-all policy in a "deny by default" system. Such a system is allowed to have other, more targeted "allow" policies existing at the same time. If the permit-all policy gets removed then those individual allow policies immediately become useful again. If you want to remove those targeted allow policies, execute Unlisten * before executing Listen *.
I dislike the non-symmetric meaning of * in the command sequence above but it likely is better than inventing a whole new syntax.
David J.
On Fri, Dec 20, 2024 at 2:42 PM Trey Boudreau <trey@treysoft.com> wrote:
> We could have a different set of keywords, like LISTEN_ALL/UNLISTEN_ALL
> that doesn’t interfere with the existing behavior.
I think we will need something along these lines. We've given * a meaning in UNLISTEN * that doesn't match what this proposal wants to accomplish.
I suggested using monitor/unmonitor but I suppose any unquoted symbol or keyword that is invalid as a channel name would work within the Listen/Unlisten syntax.
Otherwise I mis-spoke in my previous design, since regardless of whether Listen * unregisters existing channels or not, Unlisten * will remove everything and leave the session back at nothing. In which case you might as well just remove the redundant channel listeners.
David J.
Trey Boudreau <trey@treysoft.com> writes:
> My first pass at the documentation looks like this:

> <para>
> The special wildcard <literal>*</literal> cancels all listener
> registrations for the current session and replaces them with a
> virtual registration that matches all channels. Further
> <command>LISTEN</command> and <command>UNLISTEN <replaceable
> class="parameter">channel</replaceable></command> commands will
> be ignored until the session sees the <command>UNLISTEN *</command>
> command.
> </para>

Hmph. After thinking about it a bit I have a different idea (and I see David has yet a third one). So maybe this is more contentious than it seems. But at any rate, I have two fundamental thoughts:

* "Listen to all but X" seems like a reasonable desire.

* The existing implementation already has the principle that you can't listen to a channel more than once; that is,

    LISTEN foo;
    LISTEN foo; -- this is a no-op, not a duplicate subscription

Therefore I propose:

* "LISTEN *" wipes away all previous listen state, and sets up a state where you're listening to all channels (within your database).

* "UNLISTEN *" wipes away all previous listen state, and sets up a state where you're listening to no channels (which is the same as it does now).

* "LISTEN foo" adds "foo" to what you are listening to, with no effect if you already were listening to foo (whether it was a virtual or explicit listen).

* "UNLISTEN foo" removes "foo" from what you are listening to, with no effect if you already weren't listening to foo.

This is just about the same as the current behavior, and it makes "LISTEN *" act the same as though you had somehow explicitly listed every possible channel. Which I think is a lot cleaner than conceptualizing it as an independent gating behavior, as well as more useful because it'll permit "all but" behavior.

The implementation of this could be something like

    struct {
        bool all;       /* true if listening to all */
        List *plus;     /* channels explicitly listened */
        List *minus;    /* channels explicitly unlistened */
    } ListenChannels;

with the proviso that "plus" must be empty if "all" is true, while "minus" must be empty if "all" is false. The two lists are always empty right after LISTEN * or UNLISTEN *, but could be manipulated by subsequent channel-specific LISTEN/UNLISTEN.

(Since only one list would be in use at a time, you could alternatively combine "plus" and "minus" into a single list of exceptions to the all/none state. I suspect that would be confusingly error-prone to code; but perhaps it would turn out elegantly.)

One other thing that needs to be thought about in any case is what the pg_listening_channels() function ought to return in these newly-possible states.

regards, tom lane
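
The all/plus/minus proposal above can be exercised with a self-contained sketch. It keeps the field names from the message but replaces the backend `List` type with a fixed-size array, so `NameList`, the helper functions, and the size limits are assumptions made purely for illustration. The asserts encode the stated proviso that "plus" is empty when "all" is true and "minus" is empty when "all" is false.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_CHANNELS 16     /* arbitrary limit for this sketch */
#define MAX_NAME 64         /* stands in for NAMEDATALEN */

/* Stand-in for the backend List type; hypothetical. */
typedef struct
{
    int  count;
    char names[MAX_CHANNELS][MAX_NAME];
} NameList;

typedef struct
{
    bool     all;       /* true if listening to all */
    NameList plus;      /* channels explicitly listened */
    NameList minus;     /* channels explicitly unlistened */
} ListenChannels;

static int list_find(const NameList *l, const char *name)
{
    for (int i = 0; i < l->count; i++)
        if (strcmp(l->names[i], name) == 0)
            return i;
    return -1;
}

static void list_add(NameList *l, const char *name)
{
    if (list_find(l, name) >= 0)
        return;                         /* already present: no-op */
    assert(l->count < MAX_CHANNELS);
    assert(strlen(name) < MAX_NAME);
    strcpy(l->names[l->count++], name);
}

static void list_remove(NameList *l, const char *name)
{
    int i = list_find(l, name);

    if (i < 0)
        return;                         /* not present: no-op */
    if (i != l->count - 1)
        strcpy(l->names[i], l->names[l->count - 1]);
    l->count--;
}

/* "plus" must be empty if "all" is true; "minus" must be empty otherwise. */
static void check_invariant(const ListenChannels *lc)
{
    assert(!lc->all || lc->plus.count == 0);
    assert(lc->all || lc->minus.count == 0);
}

void listen_all(ListenChannels *lc)         /* LISTEN * */
{
    lc->all = true;
    lc->plus.count = lc->minus.count = 0;
}

void unlisten_all(ListenChannels *lc)       /* UNLISTEN * */
{
    lc->all = false;
    lc->plus.count = lc->minus.count = 0;
}

void listen_channel(ListenChannels *lc, const char *name)
{
    if (lc->all)
        list_remove(&lc->minus, name);  /* cancel an earlier exception */
    else
        list_add(&lc->plus, name);
    check_invariant(lc);
}

void unlisten_channel(ListenChannels *lc, const char *name)
{
    if (lc->all)
        list_add(&lc->minus, name);     /* "all but name" */
    else
        list_remove(&lc->plus, name);
    check_invariant(lc);
}

bool is_listening(const ListenChannels *lc, const char *name)
{
    return lc->all ? list_find(&lc->minus, name) < 0
                   : list_find(&lc->plus, name) >= 0;
}
```

Running the four test sequences from earlier in the thread through this model gives: not listening on foo after either UNLISTEN *, and not listening on foo (but still listening on everything else) after LISTEN * followed by UNLISTEN foo, with or without an intervening LISTEN foo.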
On Friday, December 20, 2024, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Trey Boudreau <trey@treysoft.com> writes:
> * "Listen to all but X" seems like a reasonable desire.
This I concur with, and would add: let me name my channels accounting.payables, accounting.receivables, sales.leads; and let me listen or ignore all accounting/sales channel names.
But staying within the existing “deny default, permissive grants only” design to meet this specific goal seems like a reasonable incremental step to accept. Let others wanting to work on a more expansive capability change bring those patches forth.
As for exposing this to the user, this allow-all “channel” would be presented as any other normal channel. The reader would need to know about the special meaning of whatever label we end up using. IOW, the wildcard is the label and no attempt to tie real in-use channel names to it should or even could be attempted.
David J.
"David G. Johnston" <david.g.johnston@gmail.com> writes: > On Friday, December 20, 2024, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> * "Listen to all but X" seems like a reasonable desire. > This I concur with, and would add: let me name my channels > accounting.payables, accounting.receivables, sales.leads; and let me listen > or ignore all accounting/sales channel names. Hmm. That reminds me that there was recently a proposal to allow LISTEN/UNLISTEN with pattern arguments. (It wasn't anything you'd expect like regex patterns or LIKE patterns, but some off-the-wall syntax, which I doubt we'd accept in that form. But clearly there's some desire for that out there.) While I don't say we need to implement that as part of this, it'd be a good idea to anticipate that that will happen. And that kind of blows a hole in my idea, because mine was predicated on the assumption that you could unambiguously match UNLISTENs against LISTENs. A patterned UNLISTEN might revoke a superset or subset of previous LISTENs, and I'm not sure you could readily tell which. I think we can still hold to the idea that LISTEN * or UNLISTEN * cancels all previous requests, but it's feeling like we might have to accumulate subsequent requests without trying to make contradictory ones cancel out. Is it okay if the behavior is explicitly dependent on the order of those requests, more or less "last match wins"? If not, how do we avoid that? > As for exposing this to the user, this allow-all “channel” would be > presented as any other normal channel. The reader would need to know about > the special meaning of whatever label we end up using. IOW, the wildcard is > the label and no attempt to tie real in-use channel names to it should or > even could be attempted. Don't think that quite flies. 
We might have to regurgitate the state explicitly:

    LISTEN *
    UNLISTEN foo.*
    LISTEN foo.bar.*

showing that we're listening to channels foo.bar.*, but not other channels beginning "foo", and also to all channels not beginning "foo".

regards, tom lane
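
One way to picture the "accumulate requests, last match wins" idea floated above is an ordered rule list that is scanned from newest to oldest. The sketch below is an assumption-laden illustration: the thread has not settled on any pattern language, so a trailing `*` is treated here as a plain prefix wildcard, and the `Rule`/`RuleList` types and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_RULES 32        /* arbitrary limit for this sketch */
#define MAX_NAME 64         /* stands in for NAMEDATALEN */

typedef struct
{
    bool listen;                /* true for LISTEN, false for UNLISTEN */
    char pattern[MAX_NAME];
} Rule;

typedef struct
{
    int  count;
    Rule rules[MAX_RULES];      /* oldest request first */
} RuleList;

/*
 * Assumed pattern language: a trailing '*' matches any suffix, so "*"
 * matches everything and "foo.*" matches names starting with "foo.".
 * Anything else must match exactly.
 */
static bool pattern_matches(const char *pattern, const char *name)
{
    size_t len = strlen(pattern);

    if (len > 0 && pattern[len - 1] == '*')
        return strncmp(pattern, name, len - 1) == 0;
    return strcmp(pattern, name) == 0;
}

/* Record a LISTEN (listen = true) or UNLISTEN (listen = false) request. */
void add_request(RuleList *rl, bool listen, const char *pattern)
{
    if (strcmp(pattern, "*") == 0)
    {
        rl->count = 0;          /* LISTEN * / UNLISTEN * cancel everything */
        if (!listen)
            return;             /* an empty list already means "nothing" */
    }
    assert(rl->count < MAX_RULES);
    assert(strlen(pattern) < MAX_NAME);
    rl->rules[rl->count].listen = listen;
    strcpy(rl->rules[rl->count].pattern, pattern);
    rl->count++;
}

/* The most recent matching request decides; no match means not listening. */
bool is_listening(const RuleList *rl, const char *name)
{
    for (int i = rl->count - 1; i >= 0; i--)
        if (pattern_matches(rl->rules[i].pattern, name))
            return rl->rules[i].listen;
    return false;
}
```

The rule array doubles as the state that would have to be reported back to the user. Note that every lookup may walk the whole list, so this form does not by itself address the scaling concern that started the thread.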
> On 20 Dec 2024, at 23:07, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> ..it makes "LISTEN *" act the same as though you had somehow explicitly listed
> every possible channel.

When thinking about it while reading this thread, this is what I came up with as well. Since the current workings of LISTEN are so well established, I can't see how we could make this anything but a natural extension of the current behavior.

-- Daniel Gustafsson
On 20/12/2024 23:45, Tom Lane wrote:
> Don't think that quite flies. We might have to regurgitate the
> state explicitly:
>
> LISTEN *
> UNLISTEN foo.*
> LISTEN foo.bar.*
>
> showing that we're listening to channels foo.bar.*, but not other
> channels beginning "foo", and also to all channels not beginning
> "foo".

Could I perhaps propose a sort of wildmat[1] syntax?

The above sequence could be expressed simply as:

    LISTEN *,!foo.*,foo.bar.*

I would like this in psql's backslash commands, too.

[1] https://en.wikipedia.org/wiki/Wildmat

-- Vik Fearing
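
For illustration, a wildmat-style list such as the one suggested above can be split into the equivalent sequence of LISTEN/UNLISTEN requests with a short parser. This is a sketch only: the `Request` type and `parse_wildmat` are hypothetical names, and the handling of empty entries (rejected) is an assumption, since the message does not specify it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_NAME 64         /* stands in for NAMEDATALEN */

typedef struct
{
    bool listen;                /* false for a "!"-prefixed entry */
    char pattern[MAX_NAME];
} Request;

/*
 * Split a comma-separated spec into at most "max" requests, in order.
 * Returns the number of requests, or -1 on an empty or oversized entry
 * or when there are more entries than "max".
 */
int parse_wildmat(const char *spec, Request *out, int max)
{
    int n = 0;

    assert(spec != NULL && out != NULL);
    while (*spec != '\0')
    {
        size_t      len = strcspn(spec, ",");   /* length of this entry */
        bool        negate = (len > 0 && spec[0] == '!');
        const char *pat = negate ? spec + 1 : spec;
        size_t      patlen = negate ? len - 1 : len;

        if (patlen == 0 || patlen >= MAX_NAME || n >= max)
            return -1;
        memcpy(out[n].pattern, pat, patlen);
        out[n].pattern[patlen] = '\0';
        out[n].listen = !negate;
        n++;

        spec += len;
        if (*spec == ',')
            spec++;                             /* skip the separator */
    }
    return n;
}
```

Parsing `*,!foo.*,foo.bar.*` produces three requests: listen `*`, unlisten `foo.*`, listen `foo.bar.*`, which is the same ordered sequence as the three separate commands quoted in the message.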
Vik Fearing <vik@postgresfriends.org> writes:
> Could I perhaps propose a sort of wildmat[1] syntax?
> The above sequence could be expressed simply as:
> LISTEN *,!foo.*,foo.bar.*

That doesn't absolve you from having to say what happens if the user then issues another "LISTEN zed" or "UNLISTEN foo.bar.baz" command. We can't break the existing behavior that "LISTEN foo" followed by "LISTEN bar" results in listening to both channels. So on the whole this seems like it just adds complexity without removing any. I'm inclined to limit things to one pattern per LISTEN/UNLISTEN command, with more complex behaviors reached by issuing a sequence of commands.

regards, tom lane
On 21/12/2024 05:23, Tom Lane wrote:
> Vik Fearing <vik@postgresfriends.org> writes:
>> Could I perhaps propose a sort of wildmat[1] syntax?
>> The above sequence could be expressed simply as:
>> LISTEN *,!foo.*,foo.bar.*
> That doesn't absolve you from having to say what happens if the
> user then issues another "LISTEN zed" or "UNLISTEN foo.bar.baz"
> command. We can't break the existing behavior that "LISTEN foo"
> followed by "LISTEN bar" results in listening to both channels.
> So on the whole this seems like it just adds complexity without
> removing any. I'm inclined to limit things to one pattern per
> LISTEN/UNLISTEN command, with more complex behaviors reached
> by issuing a sequence of commands.

Fair enough.

-- Vik Fearing
> On Dec 20, 2024, at 4:45 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> "David G. Johnston" <david.g.johnston@gmail.com> writes:
>> On Friday, December 20, 2024, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> * "Listen to all but X" seems like a reasonable desire.
>
>> This I concur with, and would add: let me name my channels
>> accounting.payables, accounting.receivables, sales.leads; and let me listen
>> or ignore all accounting/sales channel names.
>
> Hmm. That reminds me that there was recently a proposal to allow
> LISTEN/UNLISTEN with pattern arguments. (It wasn't anything you'd
> expect like regex patterns or LIKE patterns, but some off-the-wall
> syntax, which I doubt we'd accept in that form. But clearly there's
> some desire for that out there.)
>

I dug into the archives prior to starting this discussion. If folks really want this then someone should probably promote the ‘ltree’ data type from contrib to built-in and reuse the matching code. NOTIFY, LISTEN, and UNLISTEN all use ‘ColId’ in the grammar, limiting patterns to NAMEDATALEN, and that probably needs to change. I didn’t propose it because it seemed like too big of a lift for a newbie project.

> While I don't say we need to implement that as part of this,
> it'd be a good idea to anticipate that that will happen. And
> that kind of blows a hole in my idea, because mine was predicated on
> the assumption that you could unambiguously match UNLISTENs against
> LISTENs. A patterned UNLISTEN might revoke a superset or subset
> of previous LISTENs, and I'm not sure you could readily tell which.
>

A version of LISTEN/UNLISTEN that accepts real patterns probably wants a new keyword, like LISTEN_LTREE. If someone uses the new keyword then they explicitly opt-out of non-pattern searches, perhaps?

> I think we can still hold to the idea that LISTEN * or UNLISTEN *
> cancels all previous requests, but it's feeling like we might
> have to accumulate subsequent requests without trying to make
> contradictory ones cancel out. Is it okay if the behavior is
> explicitly dependent on the order of those requests, more or
> less "last match wins"? If not, how do we avoid that?

I’d like a solution that doesn’t require walking the entire exception list. From your earlier email I started sketching up something based on simplehash.h, but that doesn’t lend itself to any sort of pattern matching.

I don’t think you can go too far down the road of resolving pattern matching conflicts until we settle on the pattern matching technique. It feels like it will devolve to dynamically assembling some kind of unified regex tree from the various include/exclude patterns. I’d want to do a pretty serious literature search to see if someone has already solved the problem.

Can/Should we stick to something simpler for now?

-- Trey
On Dec 20, 2024, at 4:07 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Hmph. After thinking about it a bit I have a different idea
> (and I see David has yet a third one). So maybe this is more
> contentious than it seems. But at any rate, I have two
> fundamental thoughts:
>
> * "Listen to all but X" seems like a reasonable desire.
>
> * The existing implementation already has the principle that
> you can't listen to a channel more than once; that is,
> LISTEN foo;
> LISTEN foo; -- this is a no-op, not a duplicate subscription
>
> Therefore I propose:
>
> * "LISTEN *" wipes away all previous listen state, and
> sets up a state where you're listening to all channels
> (within your database).
>
> * "UNLISTEN *" wipes away all previous listen state, and
> sets up a state where you're listening to no channels
> (which is the same as it does now).
>
> * "LISTEN foo" adds "foo" to what you are listening to,
> with no effect if you already were listening to foo
> (whether it was a virtual or explicit listen).
>
> * "UNLISTEN foo" removes "foo" from what you are listening to,
> with no effect if you already weren't listening to foo.
I have an implementation of this that replaces List with a simplehash.h
variant, merging 'plus/minus' as ‘exceptions’.
> One other thing that needs to be thought about in any case
> is what the pg_listening_channels() function ought to return
> in these newly-possible states.

My previous cut at this replaced the list with ‘*’, but since we now
allow exceptions, how about preceding the list with ‘*’ in the
want-all case, followed by the list of exceptions?
In another branch of this discussion covering patterns I mentioned
building a tree of regular expressions. If we go with the notion of
‘want-all/want-none, with exceptions’ then we could introduce a
function like ‘pg_listens_use_regexes(bool)’. When true we’d
build a pre-parsed regex from the exception list by encapsulating
the patterns in something like ‘(^’<pattern>‘$)’ and aggregating with ‘|’.
We could alternatively have ‘pg_listen_pattern(style)’, with style
choices of IDENT (current behavior), REGEX, LTREE, LIKE, etc.
So long as we treated all of the exceptions as the same type it seems
pretty sane. Allowing mixing would take lots of work.
-- Trey
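
The idea in the message above of wrapping each exception pattern as ‘(^’<pattern>‘$)’ and joining them with ‘|’ can be tried out with the POSIX regex API. This is a sketch under stated assumptions: PostgreSQL has its own regex engine, so `<regex.h>` is used here only as a readily available stand-in, and the `PatternListen` type, the function names, and the 1024-byte buffer are hypothetical.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Build "(^p1$)|(^p2$)|..." into buf; returns false if it does not fit. */
bool build_alternation(const char *const *patterns, int n,
                       char *buf, size_t buflen)
{
    size_t used = 0;

    assert(buflen > 0);
    buf[0] = '\0';
    for (int i = 0; i < n; i++)
    {
        int w = snprintf(buf + used, buflen - used, "%s(^%s$)",
                         i > 0 ? "|" : "", patterns[i]);

        if (w < 0 || (size_t) w >= buflen - used)
            return false;               /* output would be truncated */
        used += (size_t) w;
    }
    return true;
}

/* Hypothetical "want-all / want-none, with exceptions" state. */
typedef struct
{
    bool    want_all;           /* base state: all channels or none */
    bool    has_exceptions;     /* true once "exceptions" is compiled */
    regex_t exceptions;         /* alternation of all exception patterns */
} PatternListen;

bool pattern_listen_init(PatternListen *pl, bool want_all,
                         const char *const *patterns, int n)
{
    char buf[1024];

    pl->want_all = want_all;
    pl->has_exceptions = false;
    if (n == 0)
        return true;
    if (!build_alternation(patterns, n, buf, sizeof buf))
        return false;
    if (regcomp(&pl->exceptions, buf, REG_EXTENDED | REG_NOSUB) != 0)
        return false;
    pl->has_exceptions = true;
    return true;
}

/* An exception flips the base state for the channels it matches. */
bool pattern_is_listening(const PatternListen *pl, const char *channel)
{
    bool excepted = pl->has_exceptions &&
        regexec(&pl->exceptions, channel, 0, NULL, 0) == 0;

    return pl->want_all ? !excepted : excepted;
}

void pattern_listen_free(PatternListen *pl)
{
    if (pl->has_exceptions)
        regfree(&pl->exceptions);
    pl->has_exceptions = false;
}
```

With a single compiled alternation, each lookup is one `regexec` call regardless of how many exceptions exist, at the price of recompiling whenever the exception list changes. Treating all exceptions as one pattern type, as the message suggests, is what makes the single alternation possible.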