Discussion on a LISTEN-ALL syntax

Started by Trey Boudreau about 1 year ago, 15 messages
#1 Trey Boudreau
trey@treysoft.com

Howdy all,

NOTE: Grey-beard coder, pgsql newbie. All info/tips/suggestions welcome!

I have a use-case where I’d like to LISTEN for all NOTIFY channels. Right now I simply
issue a LISTEN for every channel name of interest, but in production the channels will
number in the low thousands. The current implementation uses a linked list, and a linear
probe through the list of desired channels which will always return true becomes quite
expensive at this scale.

I have a work-around available by creating the “ALL” channel and making the payload
include the actual channel name, but this has a few drawbacks:
* it does not play nice with clients that actually want a small subset of channels;
* it requires code modification at every NOTIFY;
* it requires extra code on the client side.

The work-around subjects the developer (me :-) to significant risk of foot-gun disease,
so I'd like to propose a 'LISTEN *' equivalent to 'UNLISTEN *'.

The implementation in src/backend/commands/async.c seems straightforward enough, but it
feels prudent to select a syntax that doesn't make some kind of actual pattern matching
syntactically ugly in the future. Choosing 'LISTEN *' has a nice symmetry with 'UNLISTEN
*', but I don't have enough SQL chops to know if it would cause problems.

If anyone has a better work-around, please speak up! If not, and we can come to some
resolution on a future-resistant syntax, I'd happily start working up a patch set.

Thanks,
-- Trey Boudreau

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Trey Boudreau (#1)
Re: Discussion on a LISTEN-ALL syntax

Trey Boudreau <trey@treysoft.com> writes:

so I'd like to propose a 'LISTEN *' equivalent to 'UNLISTEN *'.

Seems reasonable in the abstract, and given the UNLISTEN * precedent
it's hard to quibble with that syntax choice. I think what actually
needs discussing are the semantics, specifically how this'd interact
with other LISTEN/UNLISTEN actions. Explain what you think should
be the behavior after:

LISTEN foo;
LISTEN *;
UNLISTEN *;
-- are we still listening on foo?

LISTEN *;
LISTEN foo;
UNLISTEN *;
-- how about now?

LISTEN *;
UNLISTEN foo;
-- how about now?

LISTEN *;
LISTEN foo;
UNLISTEN foo;
-- does that make a difference?

I don't have any strong preferences about this, but we ought to
have a clear idea of the behavior we want before we start coding.

regards, tom lane

#3 Trey Boudreau
trey@treysoft.com
In reply to: Tom Lane (#2)
Re: Discussion on a LISTEN-ALL syntax

On Dec 20, 2024, at 2:58 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Seems reasonable in the abstract, and given the UNLISTEN * precedent
it's hard to quibble with that syntax choice. I think what actually
needs discussing are the semantics, specifically how this'd interact
with other LISTEN/UNLISTEN actions.

My first pass at the documentation looks like this:

<para>
The special wildcard <literal>*</literal> cancels all listener
registrations for the current session and replaces them with a
virtual registration that matches all channels. Further
<command>LISTEN</command> and <command>UNLISTEN <replaceable
class="parameter">channel</replaceable></command> commands will
be ignored until the session sees the <command>UNLISTEN *</command>
command.
</para>

Explain what you think should
be the behavior after:

LISTEN foo;
LISTEN *;
UNLISTEN *;
-- are we still listening on foo?

No, as the ‘LISTEN *’ wipes existing registrations.

LISTEN *;
LISTEN foo;
UNLISTEN *;
-- how about now?

Not listening on ‘foo’ or anything else.

LISTEN *;
UNLISTEN foo;
-- how about now?

‘UNLISTEN foo’ ignored.

LISTEN *;
LISTEN foo;
UNLISTEN foo;
-- does that make a difference?

‘LISTEN foo’ and ‘UNLISTEN foo’ ignored, leaving only the wildcard.

I don't have any strong preferences about this, but we ought to
have a clear idea of the behavior we want before we start coding.

These semantics made sense to me, but I have limited experience and
a very specific use case in mind. Changing the behavior of ‘UNLISTEN *’
feels extremely impolite, and if we leave that alone I don’t see using
the ‘LISTEN *’ syntax with behavior that leaves other LISTENs in place.

We could have a different set of keywords, like LISTEN_ALL/UNLISTEN_ALL
that doesn’t interfere with the existing behavior.
-- Trey

#4 David G. Johnston
david.g.johnston@gmail.com
In reply to: Tom Lane (#2)
Re: Discussion on a LISTEN-ALL syntax

On Fri, Dec 20, 2024 at 1:58 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Trey Boudreau <trey@treysoft.com> writes:

so I'd like to propose a 'LISTEN *' equivalent to 'UNLISTEN *'.

Seems reasonable in the abstract, and given the UNLISTEN * precedent
it's hard to quibble with that syntax choice. I think what actually
needs discussing are the semantics, specifically how this'd interact
with other LISTEN/UNLISTEN actions. Explain what you think should
be the behavior after:

Answers premised on the framing explained below:

LISTEN foo;

LISTEN *;
UNLISTEN *;
-- are we still listening on foo?

Yes; the channels are orthogonal and thus order doesn't matter.

LISTEN *;
LISTEN foo;
UNLISTEN *;
-- how about now?

Yes

LISTEN *;
UNLISTEN foo;
-- how about now?

The unlisten was a no-op since listen foo was not issued; * receives
everything, always.

LISTEN *;
LISTEN foo;
UNLISTEN foo;
-- does that make a difference?

If any notify foo happened in between listen foo and unlisten foo the
session would receive the notify message twice - once implicitly via * and
once explicitly via foo.
Alternatively, the server could see that "foo" is subscribed to for PID
listener, send the message and then skip over looking for a * subscription
for PID listener. Basically document that we won't send duplicates if both
listen * and listen foo are present.

I don't have any strong preferences about this, but we ought to
have a clear idea of the behavior we want before we start coding.

I'm inclined to make this clearly distinct from the semantics of
listen/notify. Both in command form, what is affected, and the message.

Something like:
MONITOR NOTIFICATION QUEUE;
UNMONITOR NOTIFICATION QUEUE;
Asynchronous notification "foo" [with payload ...] sent by server process
with PID nnn.

If you also LISTEN foo you would also receive:
Asynchronous notification "foo" [with payload ...] received from server
process with PID nnn.

Unlisten undoes Listen
Unmonitor undoes Monitor
Upon session disconnect both Unlisten * and Unmonitor are executed.
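
One way to read the MONITOR/UNMONITOR idea above is as a flag kept entirely apart from the LISTEN set. The following is a minimal C sketch under that assumption; SessionState, messages_for, and the other names are hypothetical and are not taken from async.c. The dedup parameter models the alternative of skipping the wildcard lookup when an explicit listen already matched.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_CHANNELS 16   /* illustrative fixed capacity */

typedef struct {
    bool        monitoring;              /* set by MONITOR, cleared by UNMONITOR */
    const char *listened[MAX_CHANNELS];  /* channels added by LISTEN */
    int         nlistened;
} SessionState;

static bool listens_to(const SessionState *s, const char *channel)
{
    for (int i = 0; i < s->nlistened; i++)
        if (strcmp(s->listened[i], channel) == 0)
            return true;
    return false;
}

static void listen_channel(SessionState *s, const char *channel)
{
    if (listens_to(s, channel))
        return;                          /* repeated LISTEN is a no-op */
    assert(s->nlistened < MAX_CHANNELS);
    s->listened[s->nlistened++] = channel;
}

static void unlisten_channel(SessionState *s, const char *channel)
{
    for (int i = 0; i < s->nlistened; i++)
        if (strcmp(s->listened[i], channel) == 0) {
            /* fill the hole with the last entry */
            s->listened[i] = s->listened[--s->nlistened];
            return;
        }
}

/* UNLISTEN * clears explicit listens only; the monitor flag is untouched. */
static void unlisten_all(SessionState *s) { s->nlistened = 0; }

static void monitor(SessionState *s)   { s->monitoring = true; }
static void unmonitor(SessionState *s) { s->monitoring = false; }

/* Number of messages this session receives for one NOTIFY on `channel`. */
static int messages_for(const SessionState *s, const char *channel, bool dedup)
{
    bool explicit_hit = listens_to(s, channel);

    if (dedup && explicit_hit)
        return 1;                        /* explicit match: skip the monitor path */
    return (explicit_hit ? 1 : 0) + (s->monitoring ? 1 : 0);
}
```

With both MONITOR and LISTEN foo active, a NOTIFY on foo yields two messages without deduplication and one with it; any other channel yields one message through the monitor path alone.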

If we must shoehorn this into the existing syntax and messages I'd still
want to say that * is simply a special channel name that the system
recognizes and sends all notify messages to. There is no way to limit
which messages get sent to you via unlisten and if you also listen to the
channel foo explicitly you end up receiving multiple messages.
(Alternatively, send it just to foo and have the server not look for a *
listen for that specific session.)

Adding a "do not send" listing (or option) to the implementation doesn't
seem beneficial enough to deal with, even though it would be the only way
that "Listen *; Unlisten foo;" could keep foo messages from being sent to
the * subscribing client. In short, use a "deny (do not send) all" base
posture with permit-only policies built on top of it. Listen * is the
permit-all policy.

David J.

#5 David G. Johnston
david.g.johnston@gmail.com
In reply to: Trey Boudreau (#3)
Re: Discussion on a LISTEN-ALL syntax

On Fri, Dec 20, 2024 at 2:42 PM Trey Boudreau <trey@treysoft.com> wrote:

On Dec 20, 2024, at 2:58 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Seems reasonable in the abstract, and given the UNLISTEN * precedent
it's hard to quibble with that syntax choice. I think what actually
needs discussing are the semantics, specifically how this'd interact
with other LISTEN/UNLISTEN actions.

My first pass at the documentation looks like this:

<para>
The special wildcard <literal>*</literal> cancels all listener
registrations for the current session and replaces them with a
virtual registration that matches all channels. Further
<command>LISTEN</command> and <command>UNLISTEN <replaceable
class="parameter">channel</replaceable></command> commands will
be ignored until the session sees the <command>UNLISTEN *</command>
command.
</para>

I just sent my thoughts here as well. The choice to "cancel all listener
registrations" seems unintuitive and unnecessary - so long as we either
document or handle deduplication internally.

As I noted in my email, * is a permit-all policy in a "deny by default"
system. Such a system is allowed to have other more targeted "allow"
policies existing at the same time. If the permit-all policy gets removed
then those individual allow policies immediately become useful again. If
you want to remove those targeted allow policies, execute Unlisten *
before executing Listen *.

I dislike the non-symmetric meaning of * in the command sequence above but
it likely is better than inventing a whole new syntax.

David J.

#6 David G. Johnston
david.g.johnston@gmail.com
In reply to: Trey Boudreau (#3)
Re: Discussion on a LISTEN-ALL syntax

On Fri, Dec 20, 2024 at 2:42 PM Trey Boudreau <trey@treysoft.com> wrote:

We could have a different set of keywords, like LISTEN_ALL/UNLISTEN_ALL
that doesn’t interfere with the existing behavior.

I think we will need something along these lines. We've given * a meaning
in UNLISTEN * that doesn't match what this proposal wants to accomplish.

I suggested using monitor/unmonitor but I suppose any unquoted symbol or
keyword that is invalid as a channel name would work within the
Listen/Unlisten syntax.

Otherwise I misspoke in my previous design, since regardless of whether
Listen * unregisters existing channels or not, Unlisten * will remove
everything and leave the session back at nothing. In which case you might
as well just remove the redundant channel listeners.

David J.

#7 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Trey Boudreau (#3)
Re: Discussion on a LISTEN-ALL syntax

Trey Boudreau <trey@treysoft.com> writes:

My first pass at the documentation looks like this:

<para>
The special wildcard <literal>*</literal> cancels all listener
registrations for the current session and replaces them with a
virtual registration that matches all channels. Further
<command>LISTEN</command> and <command>UNLISTEN <replaceable
class="parameter">channel</replaceable></command> commands will
be ignored until the session sees the <command>UNLISTEN *</command>
command.
</para>

Hmph. After thinking about it a bit I have a different idea
(and I see David has yet a third one). So maybe this is more
contentious than it seems. But at any rate, I have two
fundamental thoughts:

* "Listen to all but X" seems like a reasonable desire.

* The existing implementation already has the principle that
you can't listen to a channel more than once; that is,
LISTEN foo;
LISTEN foo; -- this is a no-op, not a duplicate subscription

Therefore I propose:

* "LISTEN *" wipes away all previous listen state, and
sets up a state where you're listening to all channels
(within your database).

* "UNLISTEN *" wipes away all previous listen state, and
sets up a state where you're listening to no channels
(which is the same as it does now).

* "LISTEN foo" adds "foo" to what you are listening to,
with no effect if you already were listening to foo
(whether it was a virtual or explicit listen).

* "UNLISTEN foo" removes "foo" from what you are listening to,
with no effect if you already weren't listening to foo.

This is just about the same as the current behavior, and it makes
"LISTEN *" act the same as though you had somehow explicitly listed
every possible channel. Which I think is a lot cleaner than
conceptualizing it as an independent gating behavior, as well
as more useful because it'll permit "all but" behavior.

The implementation of this could be something like

struct {
bool all; /* true if listening to all */
List *plus; /* channels explicitly listened */
List *minus; /* channels explicitly unlistened */
} ListenChannels;

with the proviso that "plus" must be empty if "all" is true,
while "minus" must be empty if "all" is false. The two lists
are always empty right after LISTEN * or UNLISTEN *, but could
be manipulated by subsequent channel-specific LISTEN/UNLISTEN.

(Since only one list would be in use at a time, you could
alternatively combine "plus" and "minus" into a single list
of exceptions to the all/none state. I suspect that would
be confusingly error-prone to code; but perhaps it would turn
out elegantly.)
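
The four rules and the two-list structure above can be sketched as a small self-contained C model. This is only an illustration under stated assumptions: ChannelList is a hypothetical fixed-size stand-in for the backend's List type, and none of the function names come from async.c. The asserts encode the proviso that "plus" is empty while "all" is true and "minus" is empty while "all" is false.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_CHANNELS 16   /* illustrative fixed capacity */

/* Hypothetical stand-in for List: a small array of channel names. */
typedef struct {
    const char *names[MAX_CHANNELS];
    int         count;
} ChannelList;

typedef struct {
    bool        all;    /* true if listening to all */
    ChannelList plus;   /* channels explicitly listened (used when !all) */
    ChannelList minus;  /* channels explicitly unlistened (used when all) */
} ListenState;

static int list_find(const ChannelList *l, const char *name)
{
    for (int i = 0; i < l->count; i++)
        if (strcmp(l->names[i], name) == 0)
            return i;
    return -1;
}

static void list_add(ChannelList *l, const char *name)
{
    if (list_find(l, name) >= 0)
        return;                              /* already present: no-op */
    assert(l->count < MAX_CHANNELS);
    l->names[l->count++] = name;
}

static void list_remove(ChannelList *l, const char *name)
{
    int i = list_find(l, name);

    if (i >= 0)
        l->names[i] = l->names[--l->count];  /* fill the hole with the last entry */
}

/* LISTEN * and UNLISTEN * both wipe all previous listen state. */
static void listen_all(ListenState *s)
{
    s->all = true;
    s->plus.count = s->minus.count = 0;
}

static void unlisten_all(ListenState *s)
{
    s->all = false;
    s->plus.count = s->minus.count = 0;
}

static void listen_channel(ListenState *s, const char *name)
{
    if (s->all)
        list_remove(&s->minus, name);        /* cancel an earlier "all but" */
    else
        list_add(&s->plus, name);
    assert(!(s->all && s->plus.count > 0));
}

static void unlisten_channel(ListenState *s, const char *name)
{
    if (s->all)
        list_add(&s->minus, name);           /* "listen to all but X" */
    else
        list_remove(&s->plus, name);
    assert(!(!s->all && s->minus.count > 0));
}

static bool is_listening(const ListenState *s, const char *name)
{
    return s->all ? list_find(&s->minus, name) < 0
                  : list_find(&s->plus, name) >= 0;
}
```

Run against the four test sequences from earlier in the thread, this model gives: not listening on foo after either sequence ending in UNLISTEN *, and "everything except foo" after both LISTEN *; UNLISTEN foo and LISTEN *; LISTEN foo; UNLISTEN foo.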

One other thing that needs to be thought about in any case
is what the pg_listening_channels() function ought to return
in these newly-possible states.

regards, tom lane

#8 David G. Johnston
david.g.johnston@gmail.com
In reply to: Tom Lane (#7)
Re: Discussion on a LISTEN-ALL syntax

On Friday, December 20, 2024, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Trey Boudreau <trey@treysoft.com> writes:

* "Listen to all but X" seems like a reasonable desire.

This I concur with, and would add: let me name my channels
accounting.payables, accounting.receivables, sales.leads; and let me listen
or ignore all accounting/sales channel names.

But staying within the existing “deny default, permissive grants only”
design to meet this specific goal seems like a reasonable incremental step
to accept. Let others wanting to work on a more expansive capability
change bring those patches forth.

As for exposing this to the user, this allow-all “channel” would be
presented as any other normal channel. The reader would need to know about
the special meaning of whatever label we end up using. IOW, the wildcard is
the label and no attempt to tie real in-use channel names to it should or
even could be attempted.

David J.

#9 Tom Lane
tgl@sss.pgh.pa.us
In reply to: David G. Johnston (#8)
Re: Discussion on a LISTEN-ALL syntax

"David G. Johnston" <david.g.johnston@gmail.com> writes:

On Friday, December 20, 2024, Tom Lane <tgl@sss.pgh.pa.us> wrote:

* "Listen to all but X" seems like a reasonable desire.

This I concur with, and would add: let me name my channels
accounting.payables, accounting.receivables, sales.leads; and let me listen
or ignore all accounting/sales channel names.

Hmm. That reminds me that there was recently a proposal to allow
LISTEN/UNLISTEN with pattern arguments. (It wasn't anything you'd
expect like regex patterns or LIKE patterns, but some off-the-wall
syntax, which I doubt we'd accept in that form. But clearly there's
some desire for that out there.)

While I don't say we need to implement that as part of this,
it'd be a good idea to anticipate that that will happen. And
that kind of blows a hole in my idea, because mine was predicated on
the assumption that you could unambiguously match UNLISTENs against
LISTENs. A patterned UNLISTEN might revoke a superset or subset
of previous LISTENs, and I'm not sure you could readily tell which.

I think we can still hold to the idea that LISTEN * or UNLISTEN *
cancels all previous requests, but it's feeling like we might
have to accumulate subsequent requests without trying to make
contradictory ones cancel out. Is it okay if the behavior is
explicitly dependent on the order of those requests, more or
less "last match wins"? If not, how do we avoid that?

As for exposing this to the user, this allow-all “channel” would be
presented as any other normal channel. The reader would need to know about
the special meaning of whatever label we end up using. IOW, the wildcard is
the label and no attempt to tie real in-use channel names to it should or
even could be attempted.

Don't think that quite flies. We might have to regurgitate the
state explicitly:

LISTEN *
UNLISTEN foo.*
LISTEN foo.bar.*

showing that we're listening to channels foo.bar.*, but not other
channels beginning "foo", and also to all channels not beginning
"foo".
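
A "last match wins" evaluation over an accumulated request list could look roughly like the C sketch below. The pattern syntax was not settled in this thread, so the use of POSIX fnmatch() glob matching is purely an assumption for illustration, as are the Request and RequestLog names. Scanning from the newest request backwards lets the lookup stop at the first match.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stdbool.h>

#define MAX_REQUESTS 32   /* illustrative fixed capacity */

typedef struct {
    bool        listen;   /* true for LISTEN, false for UNLISTEN */
    const char *pattern;  /* glob pattern (assumed syntax) */
} Request;

typedef struct {
    Request requests[MAX_REQUESTS];
    int     count;
} RequestLog;

/* Append a LISTEN or UNLISTEN request to the session's log. */
static void record(RequestLog *log, bool listen, const char *pattern)
{
    /* A bare "*" cancels all previous requests. */
    if (pattern[0] == '*' && pattern[1] == '\0')
        log->count = 0;

    assert(log->count < MAX_REQUESTS);
    log->requests[log->count].listen = listen;
    log->requests[log->count].pattern = pattern;
    log->count++;
}

/* The newest request whose pattern matches the channel decides. */
static bool is_listening(const RequestLog *log, const char *channel)
{
    for (int i = log->count - 1; i >= 0; i--)
        if (fnmatch(log->requests[i].pattern, channel, 0) == 0)
            return log->requests[i].listen;
    return false;         /* nothing matched: deny by default */
}
```

After recording LISTEN *, UNLISTEN foo.*, and LISTEN foo.bar.* in that order, the sketch reports foo.bar.baz and sales.leads as listened to, and foo.qux as not listened to, which is the state described above.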

regards, tom lane

#10 Daniel Gustafsson
daniel@yesql.se
In reply to: Tom Lane (#7)
Re: Discussion on a LISTEN-ALL syntax

On 20 Dec 2024, at 23:07, Tom Lane <tgl@sss.pgh.pa.us> wrote:

..it makes "LISTEN *" act the same as though you had somehow explicitly listed
every possible channel.

When thinking about it while reading this thread, this is what I came up with
as well. Since the current workings of LISTEN are so well established I can't
see how we could make this anything but a natural extension of the current.

--
Daniel Gustafsson

#11 Vik Fearing
vik@postgresfriends.org
In reply to: Tom Lane (#9)
Re: Discussion on a LISTEN-ALL syntax

On 20/12/2024 23:45, Tom Lane wrote:

Don't think that quite flies. We might have to regurgitate the
state explicitly:

LISTEN *
UNLISTEN foo.*
LISTEN foo.bar.*

showing that we're listening to channels foo.bar.*, but not other
channels beginning "foo", and also to all channels not beginning
"foo".

Could I perhaps propose a sort of wildmat[1] syntax?

The above sequence could be expressed simply as:

    LISTEN *,!foo.*,foo.bar.*

I would like this in psql's backslash commands, too.

[1]: https://en.wikipedia.org/wiki/Wildmat

--

Vik Fearing

#12 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Vik Fearing (#11)
Re: Discussion on a LISTEN-ALL syntax

Vik Fearing <vik@postgresfriends.org> writes:

Could I perhaps propose a sort of wildmat[1] syntax?
The above sequence could be expressed simply as:
    LISTEN *,!foo.*,foo.bar.*

That doesn't absolve you from having to say what happens if the
user then issues another "LISTEN zed" or "UNLISTEN foo.bar.baz"
command. We can't break the existing behavior that "LISTEN foo"
followed by "LISTEN bar" results in listening to both channels.
So on the whole this seems like it just adds complexity without
removing any. I'm inclined to limit things to one pattern per
LISTEN/UNLISTEN command, with more complex behaviors reached
by issuing a sequence of commands.
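
Under the one-pattern-per-command position, a wildmat-style list amounts to shorthand for a sequence of single-pattern commands. A hedged C sketch of that expansion follows; the Command type, the split_wildmat name, and the size limits are all hypothetical, and the sketch says nothing about how the resulting commands interact with earlier listen state.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_ITEMS   8     /* illustrative limits */
#define MAX_PATTERN 64

typedef struct {
    bool listen;                 /* false when the item began with '!' */
    char pattern[MAX_PATTERN];
} Command;

/*
 * Split a comma-separated list such as "*,!foo.*,foo.bar.*" into
 * single-pattern LISTEN/UNLISTEN commands, in order.
 * Returns the number of commands, or -1 on an empty item, an
 * over-long pattern, or too many items.
 */
static int split_wildmat(const char *spec, Command *out, int max)
{
    int n = 0;

    assert(spec != NULL && out != NULL);
    while (*spec) {
        size_t      len = strcspn(spec, ",");    /* length of this item */
        bool        negated = (spec[0] == '!');
        const char *start = spec + (negated ? 1 : 0);
        size_t      plen = len - (negated ? 1 : 0);

        if (plen == 0 || plen >= MAX_PATTERN || n >= max)
            return -1;

        out[n].listen = !negated;
        memcpy(out[n].pattern, start, plen);
        out[n].pattern[plen] = '\0';
        n++;

        spec += len;
        if (*spec == ',')
            spec++;                              /* step over the separator */
    }
    return n;
}
```

For the list proposed above, the sketch yields three commands in order: LISTEN with "*", UNLISTEN with "foo.*", and LISTEN with "foo.bar.*".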

regards, tom lane

#13 Vik Fearing
vik@postgresfriends.org
In reply to: Tom Lane (#12)
Re: Discussion on a LISTEN-ALL syntax

On 21/12/2024 05:23, Tom Lane wrote:

Vik Fearing <vik@postgresfriends.org> writes:

Could I perhaps propose a sort of wildmat[1] syntax?
The above sequence could be expressed simply as:
    LISTEN *,!foo.*,foo.bar.*

That doesn't absolve you from having to say what happens if the
user then issues another "LISTEN zed" or "UNLISTEN foo.bar.baz"
command. We can't break the existing behavior that "LISTEN foo"
followed by "LISTEN bar" results in listening to both channels.
So on the whole this seems like it just adds complexity without
removing any. I'm inclined to limit things to one pattern per
LISTEN/UNLISTEN command, with more complex behaviors reached
by issuing a sequence of commands.

Fair enough.

--

Vik Fearing

#14 Trey Boudreau
trey@treysoft.com
In reply to: Tom Lane (#9)
Re: Discussion on a LISTEN-ALL syntax

On Dec 20, 2024, at 4:45 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

"David G. Johnston" <david.g.johnston@gmail.com> writes:

On Friday, December 20, 2024, Tom Lane <tgl@sss.pgh.pa.us> wrote:

* "Listen to all but X" seems like a reasonable desire.

This I concur with, and would add: let me name my channels
accounting.payables, accounting.receivables, sales.leads; and let me listen
or ignore all accounting/sales channel names.

Hmm. That reminds me that there was recently a proposal to allow
LISTEN/UNLISTEN with pattern arguments. (It wasn't anything you'd
expect like regex patterns or LIKE patterns, but some off-the-wall
syntax, which I doubt we'd accept in that form. But clearly there's
some desire for that out there.)

I dug into the archives prior to starting this discussion. If folks really want
this then someone should probably promote the ‘ltree’ data type from contrib
to built-in and reuse the matching code. NOTIFY, LISTEN, and UNLISTEN
all use ‘ColId’ in the grammar, limiting patterns to NAMEDATALEN, and that
probably needs to change. I didn’t propose it because it seemed like too big
of a lift for a newbie project.

While I don't say we need to implement that as part of this,
it'd be a good idea to anticipate that that will happen. And
that kind of blows a hole in my idea, because mine was predicated on
the assumption that you could unambiguously match UNLISTENs against
LISTENs. A patterned UNLISTEN might revoke a superset or subset
of previous LISTENs, and I'm not sure you could readily tell which.

A version of LISTEN/UNLISTEN that accepts real patterns probably
wants a new keyword, like LISTEN_LTREE. If someone uses the new keyword
then they explicitly opt out of non-pattern searches, perhaps?

I think we can still hold to the idea that LISTEN * or UNLISTEN *
cancels all previous requests, but it's feeling like we might
have to accumulate subsequent requests without trying to make
contradictory ones cancel out. Is it okay if the behavior is
explicitly dependent on the order of those requests, more or
less "last match wins"? If not, how do we avoid that?

I’d like a solution that doesn’t require walking the entire exception list. From
your earlier email I started sketching up something based on simplehash.h,
but that doesn’t lend itself to any sort of pattern matching. I don’t think you
can go too far down the road of resolving pattern matching conflicts until
we settle on the pattern matching technique. It feels like it will devolve to
dynamically assembling some kind of unified regex tree from the various
include/exclude patterns. I’d want to do a pretty serious literature search
to see if someone has already solved the problem.

Can/Should we stick to something simpler for now?
-- Trey

#15 Trey Boudreau
trey@treysoft.com
In reply to: Tom Lane (#7)
Re: Discussion on a LISTEN-ALL syntax

On Dec 20, 2024, at 4:07 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Hmph. After thinking about it a bit I have a different idea
(and I see David has yet a third one). So maybe this is more
contentious than it seems. But at any rate, I have two
fundamental thoughts:

* "Listen to all but X" seems like a reasonable desire.

* The existing implementation already has the principle that
you can't listen to a channel more than once; that is,
LISTEN foo;
LISTEN foo; -- this is a no-op, not a duplicate subscription

Therefore I propose:

* "LISTEN *" wipes away all previous listen state, and
sets up a state where you're listening to all channels
(within your database).

* "UNLISTEN *" wipes away all previous listen state, and
sets up a state where you're listening to no channels
(which is the same as it does now).

* "LISTEN foo" adds "foo" to what you are listening to,
with no effect if you already were listening to foo
(whether it was a virtual or explicit listen).

* "UNLISTEN foo" removes "foo" from what you are listening to,
with no effect if you already weren't listening to foo.

I have an implementation of this that replaces List with a simplehash.h
variant, merging 'plus/minus' as ‘exceptions’.
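
The merged "exceptions" idea can be illustrated with a want-all/want-none flag plus a single hashed set of exceptions. The C sketch below is not the implementation mentioned above: simplehash.h is internal to PostgreSQL, so a tiny fixed-size open-addressing table stands in for it, and every name here is hypothetical. The point is that a membership test no longer walks a list of every channel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TABLE_SIZE 64     /* power of two; illustrative, never resized */

static const char TOMBSTONE[] = "";  /* address marks a deleted slot */

typedef struct {
    bool        all;                 /* want-all (true) or want-none (false) */
    const char *slots[TABLE_SIZE];   /* exceptions to that base state */
} ListenState;

static uint32_t hash_name(const char *s)
{
    uint32_t h = 2166136261u;        /* FNV-1a */

    while (*s) {
        h ^= (unsigned char) *s++;
        h *= 16777619u;
    }
    return h;
}

/* Returns the slot holding `name`, or -1 if it is not an exception. */
static int find_slot(const ListenState *st, const char *name)
{
    uint32_t i = hash_name(name) & (TABLE_SIZE - 1);

    for (int probes = 0; probes < TABLE_SIZE; probes++) {
        if (st->slots[i] == NULL)
            return -1;               /* empty slot ends the probe run */
        if (st->slots[i] != TOMBSTONE && strcmp(st->slots[i], name) == 0)
            return (int) i;
        i = (i + 1) & (TABLE_SIZE - 1);
    }
    return -1;
}

static void add_exception(ListenState *st, const char *name)
{
    uint32_t i = hash_name(name) & (TABLE_SIZE - 1);

    if (find_slot(st, name) >= 0)
        return;                      /* already present */
    for (int probes = 0; probes < TABLE_SIZE; probes++) {
        if (st->slots[i] == NULL || st->slots[i] == TOMBSTONE) {
            st->slots[i] = name;
            return;
        }
        i = (i + 1) & (TABLE_SIZE - 1);
    }
    assert(0 && "exception table full");
}

static void remove_exception(ListenState *st, const char *name)
{
    int i = find_slot(st, name);

    if (i >= 0)
        st->slots[i] = TOMBSTONE;
}

/* LISTEN * is reset_state(st, true); UNLISTEN * is reset_state(st, false). */
static void reset_state(ListenState *st, bool all)
{
    st->all = all;
    for (int i = 0; i < TABLE_SIZE; i++)
        st->slots[i] = NULL;
}

static void listen_channel(ListenState *st, const char *name)
{
    if (st->all)
        remove_exception(st, name);
    else
        add_exception(st, name);
}

static void unlisten_channel(ListenState *st, const char *name)
{
    if (st->all)
        add_exception(st, name);
    else
        remove_exception(st, name);
}

/* Listening exactly when exception membership differs from the base state. */
static bool is_listening(const ListenState *st, const char *name)
{
    return (find_slot(st, name) >= 0) != st->all;
}
```

Because only one kind of exception exists at a time, a single table is enough: in the want-all state it holds unlistened channels, and in the want-none state it holds listened ones.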

One other thing that needs to be thought about in any case
is what the pg_listening_channels() function ought to return
in these newly-possible states.

My previous cut at this replaced the list with ‘*’, but since we now
allow exceptions, how about preceding the list with ‘*’ in the
want-all case, following with the list of exceptions?

In another branch of this discussion covering patterns I mentioned
building a tree of regular expressions. If we go with the notion of
‘want-all/want-none, with exceptions’ then we could introduce a
function like ‘pg_listens_use_regexes(bool)’. When true we’d
build a pre-parsed regex from the exception list by encapsulating
the patterns in something like ‘(^’<pattern>‘$)’ and aggregating with ‘|’.
We could alternatively have ‘pg_listen_pattern(style)’, with style
choices of IDENT (current behavior), REGEX, LTREE, LIKE, etc.
So long as we treated all of the exceptions as the same type it seems
pretty sane. Allowing mixing would take lots of work.
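
The aggregation step described above, wrapping each pattern as (^pattern$) and joining with |, can be sketched in C with the POSIX regex API. This is only an illustration: it assumes POSIX extended regular expressions rather than PostgreSQL's own regex engine, and the function names and the 1024-byte buffer are hypothetical.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Join patterns as "(^p1$)|(^p2$)|..." into buf.
 * Returns false if the result would not fit in buflen bytes.
 */
static bool build_alternation(const char *const *patterns, int n,
                              char *buf, size_t buflen)
{
    size_t used = 0;

    assert(n > 0 && buflen > 0);
    buf[0] = '\0';
    for (int i = 0; i < n; i++) {
        int w = snprintf(buf + used, buflen - used, "%s(^%s$)",
                         i > 0 ? "|" : "", patterns[i]);

        if (w < 0 || (size_t) w >= buflen - used)
            return false;            /* output was truncated */
        used += (size_t) w;
    }
    return true;
}

/* Compile the whole exception list once; false on overflow or bad syntax. */
static bool compile_exceptions(regex_t *re, const char *const *patterns, int n)
{
    char buf[1024];

    if (!build_alternation(patterns, n, buf, sizeof buf))
        return false;
    return regcomp(re, buf, REG_EXTENDED | REG_NOSUB) == 0;
}

/* True if the channel name matches any pattern in the exception list. */
static bool matches_exception(const regex_t *re, const char *channel)
{
    return regexec(re, channel, 0, NULL, 0) == 0;
}
```

A caller would compile once per change to the exception list, test each incoming channel name with matches_exception(), and release the compiled pattern with regfree() when the list changes again.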

-- Trey