listen/notify argument (old topic revisited)

Started by Jeff Davis · almost 24 years ago · 34 messages · pgsql-hackers
#1Jeff Davis
pgsql@j-davis.com

A while ago, I started a small discussion about passing arguments to a NOTIFY
so that the listening backend could get more information about the event.

There wasn't exactly a consensus from what I understand, but the last thing I
remember is that someone intended to speed up the notification process by
storing the events in shared memory segments (IIRC this was Tom's idea). That
would create a remote possibility of a spurious notification, but the idea is
that the listening application can check the status and determine what
happened.

I looked at the TODO, but I couldn't find anything, nor could I find anything
in the docs.

Is someone still interested in implementing this feature? Are there still
people who disagree with the above implementation strategy?

Regards,
Jeff
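[Editorial aside: the "spurious notification is harmless" pattern Jeff describes — the listener treats any notification purely as a hint and re-checks the authoritative state itself — can be sketched in a few lines. This is illustrative plain Python, not PostgreSQL code; the event store and all names are hypothetical stand-ins.]

```python
# Minimal sketch of the pattern: notifications are only hints, so a
# spurious one is harmless -- the listener always re-checks the real
# event store to determine what (if anything) actually happened.

events = []          # stands in for a real events table
last_seen = 0        # highest event id this listener has processed

def notify(payload):
    """Producer side: record an event and (conceptually) signal listeners."""
    events.append((len(events) + 1, payload))

def on_notification():
    """Listener side: called on every notification, spurious or not."""
    global last_seen
    new = [e for e in events if e[0] > last_seen]   # re-check real state
    for event_id, _ in new:
        last_seen = event_id
    return [payload for _, payload in new]          # what actually happened

notify("row inserted")
print(on_notification())   # ['row inserted']
print(on_notification())   # []  -- a spurious wakeup finds nothing, cheaply
```

The point is that correctness never depends on the notification itself, so the transport is free to over-deliver.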

#2Neil Conway
neilc@samurai.com
In reply to: Jeff Davis (#1)
Re: listen/notify argument (old topic revisited)

On Tue, Jul 02, 2002 at 02:37:19AM -0700, Jeff Davis wrote:

A while ago, I started a small discussion about passing arguments to a NOTIFY
so that the listening backend could get more information about the event.

Funny, I was just about to post to -hackers about this.

There wasn't exactly a consensus from what I understand, but the last thing I
remember is that someone intended to speed up the notification process by
storing the events in shared memory segments (IIRC this was Tom's idea). That
would create a remote possibility of a spurious notification, but the idea is
that the listening application can check the status and determine what
happened.

Yes, that was Tom Lane. IMHO, we need to replace the existing
pg_listener scheme with an improved model if we want to make any
significant improvements to asynchronous notifications. In summary,
the two designs that have been suggested are:

pg_notify: a new system catalog, stores notifications only --
pg_listener stores only listening backends.

shmem: all notifications are done via shared memory and not stored
in system catalogs at all, in a manner similar to the cache
invalidation code that already exists. This avoids the MVCC-induced
performance problem with storing notifications in system catalogs,
but can lead to spurious notifications -- the statically sized
buffer in which notifications are stored can overflow. Applications
will be able to differentiate between overflow-induced and regular
messages.

Is someone still interested in implementing this feature? Are there still
people who disagree with the above implementation strategy?

Some people objected to shmem at the time; personally, I'm not really
sure which design is best. Any comments from -hackers?

If there's a consensus on which route to take, I'll probably implement
the preferred design for 7.3. However, I think that a proper
implementation of notify messages will need an FE/BE protocol change,
so that will need to wait for 7.4.

Cheers,

Neil

--
Neil Conway <neilconway@rogers.com>
PGP Key ID: DB3C29FC

#3Bruce Momjian
bruce@momjian.us
In reply to: Jeff Davis (#1)
Re: listen/notify argument (old topic revisited)

Jeff Davis wrote:

A while ago, I started a small discussion about passing arguments to a NOTIFY
so that the listening backend could get more information about the event.

There wasn't exactly a consensus from what I understand, but the last thing I
remember is that someone intended to speed up the notification process by
storing the events in shared memory segments (IIRC this was Tom's idea). That
would create a remote possibility of a spurious notification, but the idea is
that the listening application can check the status and determine what
happened.

I don't see a huge value to using shared memory. Once we get
auto-vacuum, pg_listener will be fine, and shared memory like SI is just
too hard to get working reliably because of all the backends
reading/writing in there. We have tables that have the proper sharing
semantics; I think we should use those and hope we get autovacuum soon.

As far as the message, perhaps passing the oid of the pg_listener row to
the backend would help, and then the backend can look up any message for
that oid in pg_listener.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#3)
Re: listen/notify argument (old topic revisited)

Bruce Momjian <pgman@candle.pha.pa.us> writes:

I don't see a huge value to using shared memory. Once we get
auto-vacuum, pg_listener will be fine,

No it won't. The performance of notify is *always* going to suck
as long as it depends on going through a table. This is particularly
true given the lack of any effective way to index pg_listener; the
more notifications you feed through, the more dead rows there are
with the same key...

and shared memory like SI is just
too hard to get working reliably because of all the backends
reading/writing in there.

A curious statement considering that PG depends critically on SI
working. This is a solved problem.

regards, tom lane

#5Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#4)
Re: listen/notify argument (old topic revisited)

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

I don't see a huge value to using shared memory. Once we get
auto-vacuum, pg_listener will be fine,

No it won't. The performance of notify is *always* going to suck
as long as it depends on going through a table. This is particularly
true given the lack of any effective way to index pg_listener; the
more notifications you feed through, the more dead rows there are
with the same key...

Why can't we do efficient indexing, or clear out the table? I don't
remember.

and shared memory like SI is just
too hard to get working reliably because of all the backends
reading/writing in there.

A curious statement considering that PG depends critically on SI
working. This is a solved problem.

My point is that SI was buggy for years until we found all the bugs, so
yea, it is a solved problem, but solved with difficulty.

Do we want to add another SI-type capability that could be as difficult
to get working properly, or will the notify piggyback on the existing SI
code? If the latter, that would be fine with me, but we still have the
overflow queue problem.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#5)
Re: listen/notify argument (old topic revisited)

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Why can't we do efficient indexing, or clear out the table? I don't
remember.

I don't recall either, but I do recall that we tried to index it and
backed out the changes. In any case, a table on disk is just plain
not the right medium for transitory-by-design notification messages.

A curious statement considering that PG depends critically on SI
working. This is a solved problem.

My point is that SI was buggy for years until we found all the bugs, so
yea, it is a solved problem, but solved with difficulty.

The SI message mechanism itself was not the source of bugs, as I recall
it (although certainly the code was incomprehensible in the extreme;
the original programmer had absolutely no grasp of readable coding style
IMHO). The problem was failure to properly design the interactions with
relcache and catcache, which are pretty complex in their own right.
An SI-like NOTIFY mechanism wouldn't have those issues.

regards, tom lane

#7Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#6)
Re: listen/notify argument (old topic revisited)

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Why can't we do efficient indexing, or clear out the table? I don't
remember.

I don't recall either, but I do recall that we tried to index it and
backed out the changes. In any case, a table on disk is just plain
not the right medium for transitory-by-design notification messages.

OK, I can help here. I added an index on pg_listener so lookups would
go faster in the backend, but inserts/updates into the table also
require index additions, and your feeling was that the table was small
and we would be better without the index and just sequentially scanning
the table. I can easily add the index and make sure it is used properly
if you are now concerned about table access time.

I think your issue was that it is only looked up once, and only updated
once, so there wasn't much sense in having that index maintenance
overhead, i.e. you only used the index once per row.

(I remember the item being on TODO for quite a while when we discussed
this.)

Of course, a shared memory system probably is going to either do it
sequentially or have its own index issues, so I don't see a huge
advantage to going to shared memory, and I do see extra code and a queue
limit.

A curious statement considering that PG depends critically on SI
working. This is a solved problem.

My point is that SI was buggy for years until we found all the bugs, so
yea, it is a solved problem, but solved with difficulty.

The SI message mechanism itself was not the source of bugs, as I recall
it (although certainly the code was incomprehensible in the extreme;
the original programmer had absolutely no grasp of readable coding style
IMHO). The problem was failure to properly design the interactions with
relcache and catcache, which are pretty complex in their own right.
An SI-like NOTIFY mechanism wouldn't have those issues.

Oh, OK, interesting. So _that_ was the issue there.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#7)
Re: listen/notify argument (old topic revisited)

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Of course, a shared memory system probably is going to either do it
sequentially or have its own index issues, so I don't see a huge
advantage to going to shared memory, and I do see extra code and a queue
limit.

Disk I/O vs. no disk I/O isn't a huge advantage? Come now.

A shared memory system would use sequential (well, actually
circular-buffer) access, which is *exactly* what you want given
the inherently sequential nature of the messages. The reason that
table storage hurts is that we are forced to do searches, which we
could eliminate if we had control of the storage ordering. Again,
it comes down to the fact that tables don't provide the right
abstraction for this purpose.

The "extra code" argument doesn't impress me either; async.c is
currently 900 lines, about 2.5 times the size of sinvaladt.c which is
the guts of SI message passing. I think it's a good bet that a SI-like
notify module would be much smaller than async.c is now; it's certainly
unlikely to be significantly larger.

The queue limit problem is a valid argument, but it's the only valid
complaint IMHO; and it seems a reasonable tradeoff to make for the
other advantages.

regards, tom lane

#9Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#8)
Re: listen/notify argument (old topic revisited)

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Of course, a shared memory system probably is going to either do it
sequentially or have its own index issues, so I don't see a huge
advantage to going to shared memory, and I do see extra code and a queue
limit.

Disk I/O vs. no disk I/O isn't a huge advantage? Come now.

My assumption is that it throws to disk as backing store, which seems
better to me than dropping the notifies. Is disk i/o a real performance
penalty for notify, and is performance a huge issue for notify anyway,
assuming autovacuum?

A shared memory system would use sequential (well, actually
circular-buffer) access, which is *exactly* what you want given
the inherently sequential nature of the messages. The reason that
table storage hurts is that we are forced to do searches, which we
could eliminate if we had control of the storage ordering. Again,
it comes down to the fact that tables don't provide the right
abstraction for this purpose.

To me, it just seems like going to shared memory is taking our existing
table structure and moving it to memory. Yea, there is no tuple header,
and yea we can make a circular list, but we can't index the thing, so is
spinning around a circular list any better than a sequential scan of a
table? Yea, we can delete stuff better, but autovacuum would help with
that. It just seems like we are reinventing the wheel.

Are there other uses for this? Can we make use of RAM-only tables?

The "extra code" argument doesn't impress me either; async.c is
currently 900 lines, about 2.5 times the size of sinvaladt.c which is
the guts of SI message passing. I think it's a good bet that a SI-like
notify module would be much smaller than async.c is now; it's certainly
unlikely to be significantly larger.

The queue limit problem is a valid argument, but it's the only valid
complaint IMHO; and it seems a reasonable tradeoff to make for the
other advantages.

I am just not excited about it. What do others think?

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#9)
Re: listen/notify argument (old topic revisited)

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Is disk i/o a real performance
penalty for notify, and is performance a huge issue for notify anyway,

Yes, and yes. I have used NOTIFY in production applications, and I know
that performance is an issue.

The queue limit problem is a valid argument, but it's the only valid
complaint IMHO; and it seems a reasonable tradeoff to make for the
other advantages.

BTW, it occurs to me that as long as we make this an independent message
buffer used only for NOTIFY (and *not* try to merge it with SI), we
don't have to put up with overrun-reset behavior. The overrun reset
approach is useful for SI because there are only limited times when
we are prepared to handle SI notification in the backend work cycle.
However, I think a self-contained NOTIFY mechanism could be much more
flexible about when it will remove messages from the shared buffer.
Consider this:

1. To send NOTIFY: grab write lock on shared-memory circular buffer.
If enough space, insert message, release lock, send signal, done.
If not enough space, release lock, send signal, sleep some small
amount of time, and then try again. (Hard failure would occur only
if the proposed message size exceeds the buffer size; as long as we
make the buffer size a parameter, this is the DBA's fault not ours.)

2. On receipt of signal: grab read lock on shared-memory circular
buffer, copy all data up to write pointer into private memory,
advance my (per-process) read pointer, release lock. This would be
safe to do pretty much anywhere we're allowed to malloc more space,
so it could be done say at the same points where we check for cancel
interrupts. Therefore, the expected time before the shared buffer
is emptied after a signal is pretty small.

In this design, if someone sits in a transaction for a long time,
there is no risk of shared memory overflow; that backend's private
memory for not-yet-reported NOTIFYs could grow large, but that's
his problem. (We could avoid unnecessary growth by not storing
messages that don't correspond to active LISTENs for that backend.
Indeed, a backend with no active LISTENs could be left out of the
circular buffer participation list altogether.)

We'd need to separate this processing from the processing that's used to
force SI queue reading (dz's old patch), so we'd need one more signal
code than we use now. But we do have SIGUSR1 available.

regards, tom lane
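[Editorial aside: Tom's two-step scheme can be sketched as follows — one fixed-size circular buffer, a writer that waits instead of overwriting when the buffer is full, and per-backend read pointers that let each reader drain everything up to the write pointer into its own private memory. This is an illustrative Python model, not PostgreSQL internals; locking, sizes, and names are simplified stand-ins.]

```python
# Sketch of the proposed NOTIFY buffer: send() is step 1 (insert or
# wait-and-retry when full), drain() is step 2 (copy everything up to
# the write pointer, then advance this reader's pointer).

import threading

BUF_SIZE = 4                      # deliberately tiny; the real size is a knob

class NotifyBuffer:
    def __init__(self, n_readers):
        self.buf = [None] * BUF_SIZE
        self.write_pos = 0                      # total messages ever written
        self.read_pos = [0] * n_readers         # per-backend read pointers
        self.lock = threading.Condition()

    def send(self, msg):
        """Step 1: insert under lock; if no space, sleep and retry."""
        with self.lock:
            # Full when the slowest reader would be lapped.
            while self.write_pos - min(self.read_pos) >= BUF_SIZE:
                self.lock.wait()
            self.buf[self.write_pos % BUF_SIZE] = msg
            self.write_pos += 1
            self.lock.notify_all()              # "send signal" to readers

    def drain(self, reader):
        """Step 2: copy all data up to the write pointer into private memory."""
        with self.lock:
            msgs = [self.buf[i % BUF_SIZE]
                    for i in range(self.read_pos[reader], self.write_pos)]
            self.read_pos[reader] = self.write_pos
            self.lock.notify_all()              # wake a writer that was waiting
            return msgs
```

Because drain() copies into private memory before returning, a backend stuck in a long transaction only grows its own buffer, exactly as the message argues.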

#11Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#10)
Re: listen/notify argument (old topic revisited)

Let me tell you what would be really interesting. If we didn't report
the pid of the notifying process and we didn't allow arbitrary strings
for notify (just pg_class relation names), we could just add a counter
to pg_class that is updated for every notify. If a backend is
listening, it remembers the counter at listen time, and on every commit
checks the pg_class counter to see if it has incremented. That way,
there is no queue, no shared memory, and there is no scanning. You just
pull up the cache entry for pg_class and look at the counter.

One problem is that pg_class would be updated more frequently. Anyway,
just an idea.
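[Editorial aside: Bruce's counter idea reduces to a snapshot-and-compare, which a toy model makes concrete. This is illustrative Python, not actual catalog code; the dictionary stands in for a hypothetical counter column in pg_class.]

```python
# Toy model of the counter scheme: NOTIFY bumps a per-relation counter,
# LISTEN snapshots it, and a commit-time check just compares -- no queue,
# no shared memory, no scanning.

counters = {"mytable": 0}         # stands in for a counter column in pg_class

def notify(rel):
    counters[rel] += 1            # NOTIFY just increments the counter

def listen(rel):
    return counters[rel]          # remember the counter at LISTEN time

def check(rel, snapshot):
    """At commit: has anyone notified since we started listening?"""
    return counters[rel] > snapshot

snap = listen("mytable")
notify("mytable")
print(check("mytable", snap))     # True -- at least one notify occurred
```

Note the scheme collapses multiple notifies into one "something happened" bit, which is why it only works without payloads or sender pids, as the message says.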

---------------------------------------------------------------------------

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Is disk i/o a real performance
penalty for notify, and is performance a huge issue for notify anyway,

Yes, and yes. I have used NOTIFY in production applications, and I know
that performance is an issue.

The queue limit problem is a valid argument, but it's the only valid
complaint IMHO; and it seems a reasonable tradeoff to make for the
other advantages.

BTW, it occurs to me that as long as we make this an independent message
buffer used only for NOTIFY (and *not* try to merge it with SI), we
don't have to put up with overrun-reset behavior. The overrun reset
approach is useful for SI because there are only limited times when
we are prepared to handle SI notification in the backend work cycle.
However, I think a self-contained NOTIFY mechanism could be much more
flexible about when it will remove messages from the shared buffer.
Consider this:

1. To send NOTIFY: grab write lock on shared-memory circular buffer.
If enough space, insert message, release lock, send signal, done.
If not enough space, release lock, send signal, sleep some small
amount of time, and then try again. (Hard failure would occur only
if the proposed message size exceeds the buffer size; as long as we
make the buffer size a parameter, this is the DBA's fault not ours.)

2. On receipt of signal: grab read lock on shared-memory circular
buffer, copy all data up to write pointer into private memory,
advance my (per-process) read pointer, release lock. This would be
safe to do pretty much anywhere we're allowed to malloc more space,
so it could be done say at the same points where we check for cancel
interrupts. Therefore, the expected time before the shared buffer
is emptied after a signal is pretty small.

In this design, if someone sits in a transaction for a long time,
there is no risk of shared memory overflow; that backend's private
memory for not-yet-reported NOTIFYs could grow large, but that's
his problem. (We could avoid unnecessary growth by not storing
messages that don't correspond to active LISTENs for that backend.
Indeed, a backend with no active LISTENs could be left out of the
circular buffer participation list altogether.)

We'd need to separate this processing from the processing that's used to
force SI queue reading (dz's old patch), so we'd need one more signal
code than we use now. But we do have SIGUSR1 available.

regards, tom lane

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#12Jeff Davis
pgsql@j-davis.com
In reply to: Bruce Momjian (#11)
Re: listen/notify argument (old topic revisited)

On Tuesday 02 July 2002 06:03 pm, Bruce Momjian wrote:

Let me tell you what would be really interesting. If we didn't report
the pid of the notifying process and we didn't allow arbitrary strings
for notify (just pg_class relation names), we could just add a counter
to pg_class that is updated for every notify. If a backend is
listening, it remembers the counter at listen time, and on every commit
checks the pg_class counter to see if it has incremented. That way,
there is no queue, no shared memory, and there is no scanning. You just
pull up the cache entry for pg_class and look at the counter.

One problem is that pg_class would be updated more frequently. Anyway,
just an idea.

I think that currently a lot of people use select() (after all, it's mentioned
in the docs) in the frontend to determine when a notify comes into a
listening backend. If the backend only checks on commit, and the backend is
largely idle except for notify processing, might it be a while before the
frontend realizes that a notify was sent?

Regards,
Jeff

---------------------------------------------------------------------------

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Is disk i/o a real performance
penalty for notify, and is performance a huge issue for notify anyway,

Yes, and yes. I have used NOTIFY in production applications, and I know
that performance is an issue.

The queue limit problem is a valid argument, but it's the only valid
complaint IMHO; and it seems a reasonable tradeoff to make for the
other advantages.

BTW, it occurs to me that as long as we make this an independent message
buffer used only for NOTIFY (and *not* try to merge it with SI), we
don't have to put up with overrun-reset behavior. The overrun reset
approach is useful for SI because there are only limited times when
we are prepared to handle SI notification in the backend work cycle.
However, I think a self-contained NOTIFY mechanism could be much more
flexible about when it will remove messages from the shared buffer.
Consider this:

1. To send NOTIFY: grab write lock on shared-memory circular buffer.
If enough space, insert message, release lock, send signal, done.
If not enough space, release lock, send signal, sleep some small
amount of time, and then try again. (Hard failure would occur only
if the proposed message size exceeds the buffer size; as long as we
make the buffer size a parameter, this is the DBA's fault not ours.)

2. On receipt of signal: grab read lock on shared-memory circular
buffer, copy all data up to write pointer into private memory,
advance my (per-process) read pointer, release lock. This would be
safe to do pretty much anywhere we're allowed to malloc more space,
so it could be done say at the same points where we check for cancel
interrupts. Therefore, the expected time before the shared buffer
is emptied after a signal is pretty small.

In this design, if someone sits in a transaction for a long time,
there is no risk of shared memory overflow; that backend's private
memory for not-yet-reported NOTIFYs could grow large, but that's
his problem. (We could avoid unnecessary growth by not storing
messages that don't correspond to active LISTENs for that backend.
Indeed, a backend with no active LISTENs could be left out of the
circular buffer participation list altogether.)

We'd need to separate this processing from the processing that's used to
force SI queue reading (dz's old patch), so we'd need one more signal
code than we use now. But we do have SIGUSR1 available.

regards, tom lane

#13Bruce Momjian
bruce@momjian.us
In reply to: Jeff Davis (#12)
Re: listen/notify argument (old topic revisited)

Jeff Davis wrote:

On Tuesday 02 July 2002 06:03 pm, Bruce Momjian wrote:

Let me tell you what would be really interesting. If we didn't report
the pid of the notifying process and we didn't allow arbitrary strings
for notify (just pg_class relation names), we could just add a counter
to pg_class that is updated for every notify. If a backend is
listening, it remembers the counter at listen time, and on every commit
checks the pg_class counter to see if it has incremented. That way,
there is no queue, no shared memory, and there is no scanning. You just
pull up the cache entry for pg_class and look at the counter.

One problem is that pg_class would be updated more frequently. Anyway,
just an idea.

I think that currently a lot of people use select() (after all, it's mentioned
in the docs) in the frontend to determine when a notify comes into a
listening backend. If the backend only checks on commit, and the backend is
largely idle except for notify processing, might it be a while before the
frontend realizes that a notify was sent?

I meant for it to check exactly when it does now: when a query completes.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#14Christopher Kings-Lynne
chriskl@familyhealth.com.au
In reply to: Bruce Momjian (#7)
Re: listen/notify argument (old topic revisited)

Of course, a shared memory system probably is going to either do it
sequentially or have its own index issues, so I don't see a huge
advantage to going to shared memory, and I do see extra code and a queue
limit.

Is a shared memory implementation going to play silly buggers with the Win32
port?

Chris

#15Hannu Krosing
hannu@tm.ee
In reply to: Christopher Kings-Lynne (#14)
Re: listen/notify argument (old topic revisited)

On Wed, 2002-07-03 at 08:20, Christopher Kings-Lynne wrote:

Of course, a shared memory system probably is going to either do it
sequentially or have its own index issues, so I don't see a huge
advantage to going to shared memory, and I do see extra code and a queue
limit.

Is a shared memory implementation going to play silly buggers with the Win32
port?

Perhaps this is a good place to introduce anonymous mmap?

Is there a way to grow anonymous mmap on demand?

----------------
Hannu

#16Hannu Krosing
hannu@tm.ee
In reply to: Tom Lane (#10)
Re: listen/notify argument (old topic revisited)

On Tue, 2002-07-02 at 23:35, Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Is disk i/o a real performance
penalty for notify, and is performance a huge issue for notify anyway,

Yes, and yes. I have used NOTIFY in production applications, and I know
that performance is an issue.

The queue limit problem is a valid argument, but it's the only valid
complaint IMHO; and it seems a reasonable tradeoff to make for the
other advantages.

BTW, it occurs to me that as long as we make this an independent message
buffer used only for NOTIFY (and *not* try to merge it with SI), we
don't have to put up with overrun-reset behavior. The overrun reset
approach is useful for SI because there are only limited times when
we are prepared to handle SI notification in the backend work cycle.
However, I think a self-contained NOTIFY mechanism could be much more
flexible about when it will remove messages from the shared buffer.
Consider this:

1. To send NOTIFY: grab write lock on shared-memory circular buffer.

Are you planning to have one circular buffer per listening backend?

Would that not be a waste of space for a large number of backends with long
notify arguments?

--------------
Hannu

#17Rod Taylor
rbt@rbt.ca
In reply to: Bruce Momjian (#9)
Re: listen/notify argument (old topic revisited)

On Tue, 2002-07-02 at 17:12, Bruce Momjian wrote:

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Of course, a shared memory system probably is going to either do it
sequentially or have its own index issues, so I don't see a huge
advantage to going to shared memory, and I do see extra code and a queue
limit.

Disk I/O vs. no disk I/O isn't a huge advantage? Come now.

My assumption is that it throws to disk as backing store, which seems
better to me than dropping the notifies. Is disk i/o a real performance
penalty for notify, and is performance a huge issue for notify anyway,
assuming autovacuum?

For me, performance would be one of the only concerns. Currently I use
two methods of finding changes, one is NOTIFY which directs frontends to
reload various sections of data, the second is a table which holds a
QUEUE of actions to be completed (which must be tracked, logged and
completed).

If performance wasn't a concern, I'd simply use more RULES which insert
requests into my queue table.

#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: Christopher Kings-Lynne (#14)
Re: listen/notify argument (old topic revisited)

"Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes:

Is a shared memory implementation going to play silly buggers with the Win32
port?

No. Certainly no more so than shared disk buffers or the SI message
facility, both of which are *not* optional.

regards, tom lane

#19Tom Lane
tgl@sss.pgh.pa.us
In reply to: Hannu Krosing (#15)
Re: listen/notify argument (old topic revisited)

Hannu Krosing <hannu@tm.ee> writes:

Perhaps this is a good place to introduce anonymous mmap?

I don't think so; it just adds a portability variable without buying
us anything.

Is there a way to grow anonymous mmap on demand?

Nope. Not portably, anyway. For instance, the HPUX man page for mmap
sayeth:

If the size of the mapped file changes after the call to mmap(), the
effect of references to portions of the mapped region that correspond
to added or removed portions of the file is unspecified.

Dynamically re-mmapping after enlarging the file might work, but there
are all sorts of interesting constraints on that too; it looks like
you'd have to somehow synchronize things so that all the backends do it
at the exact same time.

On the whole I see no advantage to be gained here, compared to the
implementation I sketched earlier with a fixed-size shared buffer and
enlargeable internal buffers in backends.

regards, tom lane

#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: Hannu Krosing (#16)
Re: listen/notify argument (old topic revisited)

Hannu Krosing <hannu@tm.ee> writes:

Are you planning to have one circular buffer per listening backend?

No; one circular buffer, period.

Each backend would also internally buffer notifies that it hadn't yet
delivered to its client --- but since the time until delivery could vary
drastically across clients, I think that's reasonable. I'd expect
clients that are using LISTEN to avoid doing long-running transactions,
so under normal circumstances the internal buffer should not grow very
large.

regards, tom lane

#21Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#9)
#22Neil Conway
neilc@samurai.com
In reply to: Tom Lane (#10)
#23Hannu Krosing
hannu@tm.ee
In reply to: Tom Lane (#20)
#24Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#9)
#25Tom Lane
tgl@sss.pgh.pa.us
In reply to: Neil Conway (#22)
#26Hannu Krosing
hannu@tm.ee
In reply to: Tom Lane (#21)
#27Hannu Krosing
hannu@tm.ee
In reply to: Tom Lane (#21)
#28Hannu Krosing
hannu@tm.ee
In reply to: Bruce Momjian (#9)
#29Hannu Krosing
hannu@tm.ee
In reply to: Tom Lane (#24)
#30Tom Lane
tgl@sss.pgh.pa.us
In reply to: Hannu Krosing (#29)
#31Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#30)
#32Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#31)
#33Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#32)
#34Tom Lane
tgl@sss.pgh.pa.us
In reply to: Hannu Krosing (#28)