NOTIFY queue is at 66% and climbing...

Started by Jeff Rossover 4 years ago3 messagesgeneral
Jump to latest
#1Jeff Ross
jross@openvistas.net

Hi all,

On 10.15 I'm getting the following on a logically replicated server.

From the CSV logs:

2021-10-13 18:49:39.792 EDT,,,213601,,6143c257.34261,64243,,2021-09-16
18:16:55 EDT,4/3914851,60709901,WARNING,01000,"NOTIFY queue is 66%
full",,,,,,,,,""
2021-10-13 18:49:46.058 EDT,,,213601,,6143c257.34261,64244,,2021-09-16
18:16:55 EDT,4/3914855,60709905,WARNING,01000,"NOTIFY queue is 66%
full",,,,,,,,,""
2021-10-13 18:49:51.934 EDT,,,213601,,6143c257.34261,64245,,2021-09-16
18:16:55 EDT,4/3914862,60709915,WARNING,01000,"NOTIFY queue is 66%
full",,,,,,,,,""
2021-10-13 18:50:00.516 EDT,,,213601,,6143c257.34261,64246,,2021-09-16
18:16:55 EDT,4/3914864,60709917,WARNING,01000,"NOTIFY queue is 66%
full",,,,,,,,,""
2021-10-13 18:50:08.003 EDT,,,213601,,6143c257.34261,64247,,2021-09-16
18:16:55 EDT,4/3914871,60709926,WARNING,01000,"NOTIFY queue is 66%
full",,,,,,,,,""

The very fine docs for 10 say

"There is a queue that holds notifications that have been sent but not
yet processed by all listening sessions. If this queue becomes full,
transactions calling |NOTIFY| will fail at commit. The queue is quite
large (8GB in a standard installation) and should be sufficiently sized
for almost every use case. However, no cleanup can take place if a
session executes |LISTEN| and then enters a transaction for a very long
time. Once the queue is half full you will see warnings in the log file
pointing you to the session that is preventing cleanup. In this case you
should make sure that this session ends its current transaction so that
cleanup can proceed."

We do have a client that has access to this logically replicated server
and I have no idea what they have done regards LISTEN/NOTIFY, and my
query to them about this has not yielded a response.

In the CSV logs above what part points to "the session that is
preventing cleanup" so that I can kill it?  pg_stat_activity yields no
clues.

Thanks,

Jeff

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jeff Ross (#1)
Re: NOTIFY queue is at 66% and climbing...

Jeff Ross <jross@openvistas.net> writes:

On 10.15 I'm getting the following on a logically replicated server.

2021-10-13 18:49:39.792 EDT,,,213601,,6143c257.34261,64243,,2021-09-16
18:16:55 EDT,4/3914851,60709901,WARNING,01000,"NOTIFY queue is 66%
full",,,,,,,,,""

In the CSV logs above what part points to "the session that is
preventing cleanup" so that I can kill it?

Normally there's a DETAIL entry citing the session's PID. Looking
at the code, the reason for the lack of any such entry must be that
there is no session whose current notify queue position exactly
matches the supposed global minimum position. This corresponds to
a known bug that was fixed in 10.16, so I'd suggest upgrading.
As a temporary workaround you could restart that server, but
likely the problem would recur after awhile.

regards, tom lane

#3Jeff Ross
jross@openvistas.net
In reply to: Tom Lane (#2)
Re: NOTIFY queue is at 66% and climbing...

On 10/13/21 5:50 PM, Tom Lane wrote:

Jeff Ross <jross@openvistas.net> writes:

On 10.15 I'm getting the following on a logically replicated server.
2021-10-13 18:49:39.792 EDT,,,213601,,6143c257.34261,64243,,2021-09-16
18:16:55 EDT,4/3914851,60709901,WARNING,01000,"NOTIFY queue is 66%
full",,,,,,,,,""
In the CSV logs above what part points to "the session that is
preventing cleanup" so that I can kill it?

Normally there's a DETAIL entry citing the session's PID. Looking
at the code, the reason for the lack of any such entry must be that
there is no session whose current notify queue position exactly
matches the supposed global minimum position. This corresponds to
a known bug that was fixed in 10.16, so I'd suggest upgrading.
As a temporary workaround you could restart that server, but
likely the problem would recur after awhile.

regards, tom lane

Thanks as always, Tom.  I'll schedule the upgrade to 10.18 and restart
for this weekend.

Jeff