How is this possible: "publication does not exist"?
The publication exists but for some reason the function can't find it
SELECT * FROM pg_logical_slot_get_binary_changes('debezium', NULL,
NULL,'proto_version','1','publication_names','dbz_publication');
ERROR: publication "dbz_publication" does not exist
CONTEXT: slot "debezium", output plugin "pgoutput", in the change
callback, associated LSN 0/307D8E8
postgres=# select * from pg_publication;
     pubname     | pubowner | puballtables | pubinsert | pubupdate | pubdelete | pubtruncate
-----------------+----------+--------------+-----------+-----------+-----------+-------------
 dbz_publication |       10 | t            | t         | t         | t         | t
(1 row)
postgres=# SELECT * FROM pg_logical_slot_get_binary_changes('debezium',
NULL, NULL,'proto_version','1','publication_names','dbz_publication');
ERROR: publication "dbz_publication" does not exist
CONTEXT: slot "debezium", output plugin "pgoutput", in the change
callback, associated LSN 0/307D8E8
Dave Cramer
On Thu, 19 Dec 2019 at 11:59, Dave Cramer <davecramer@gmail.com> wrote:
The publication exists but for some reason the function can't find it
[...]
It seems that if you drop the publication on an existing slot it needs to
be recreated. Is this expected behaviour?
drop publication dbz_publication ;
DROP PUBLICATION
postgres=# create publication dbz_publication for all tables;
CREATE PUBLICATION
postgres=# SELECT * FROM pg_logical_slot_get_binary_changes('debezium',
NULL, NULL,'proto_version','1','publication_names','dbz_publication');
ERROR: publication "dbz_publication" does not exist
CONTEXT: slot "debezium", output plugin "pgoutput", in the change
callback, associated LSN 0/4324180
Dave Cramer
On 2019-12-19 19:15, Dave Cramer wrote:
It seems that if you drop the publication on an existing slot it needs
to be recreated. Is this expected behaviour?
A publication is not associated with a slot. Only a subscription is
associated with a slot.
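For example, the slot name is recorded on the subscriber's pg_subscription
row, not on the publication (a quick check, assuming a subscription has
already been created):

select subname, subslotname, subpublications from pg_subscription;
select pubname from pg_publication;   -- no slot column here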
drop publication dbz_publication ;
DROP PUBLICATION
postgres=# create publication dbz_publication for all tables;
CREATE PUBLICATION
postgres=# SELECT * FROM pg_logical_slot_get_binary_changes('debezium',
NULL, NULL,'proto_version','1','publication_names','dbz_publication');
ERROR: publication "dbz_publication" does not exist
CONTEXT: slot "debezium", output plugin "pgoutput", in the change
callback, associated LSN 0/4324180
This must be something particular to Debezium.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Dec 19, 2019 at 07:19:56PM +0100, Peter Eisentraut wrote:
On 2019-12-19 19:15, Dave Cramer wrote:
It seems that if you drop the publication on an existing slot it
needs to be recreated. Is this expected behaviour?

A publication is not associated with a slot. Only a subscription is
associated with a slot.

drop publication dbz_publication ;
DROP PUBLICATION
postgres=# create publication dbz_publication for all tables;
CREATE PUBLICATION
postgres=# SELECT * FROM
pg_logical_slot_get_binary_changes('debezium', NULL,
NULL,'proto_version','1','publication_names','dbz_publication');
ERROR: publication "dbz_publication" does not exist
CONTEXT: slot "debezium", output plugin "pgoutput", in the change
callback, associated LSN 0/4324180

This must be something particular to Debezium.
Yeah, I don't see this error message anywhere in our sources on 11 or
12, so perhaps debezium does something funny? It's not clear to me
where, though, as this suggests it uses the pgoutput module.
While trying to reproduce this I however ran into a related issue with
pgoutput/pg_logical_slot_get_binary_changes. If you call the function
repeatedly (~10x) you'll get an error like this:
FATAL: out of relcache_callback_list slots
CONTEXT: slot "slot", output plugin "pgoutput", in the startup callback
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The reason is very simple - each call executes pgoutput_startup, which
does CacheRegisterRelcacheCallback in init_rel_sync_cache. And we do
this on each pg_logical_slot_get_binary_changes() call and never remove
the callbacks, so we simply run out of MAX_RELCACHE_CALLBACKS slots.
Not sure if this is a known issue/behavior, but it seems a bit annoying
and possibly related to the issue reported by Dave.
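For reference, a minimal sequence that hits the limit (the slot and
publication names here are just illustrative):

SELECT pg_create_logical_replication_slot('repro_slot', 'pgoutput');
CREATE PUBLICATION repro_pub FOR ALL TABLES;
-- each call registers another relcache callback via pgoutput_startup ->
-- init_rel_sync_cache and never unregisters it, so after roughly a dozen
-- calls in the same session the backend dies with
--   FATAL: out of relcache_callback_list slots
SELECT count(*)
FROM pg_logical_slot_get_binary_changes('repro_slot', NULL, NULL,
  'proto_version', '1', 'publication_names', 'repro_pub');
-- re-run the last statement ~10 times (e.g. with \watch 1 in psql)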
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, 19 Dec 2019 19:19:56 +0100
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
On 2019-12-19 19:15, Dave Cramer wrote:
It seems that if you drop the publication on an existing slot it needs
to be recreated. Is this expected behaviour?

A publication is not associated with a slot. Only a subscription is
associated with a slot.
Couldn't it be the same scenario I reported back in October? See:
Subject: Logical replication dead but synching
Date: Thu, 10 Oct 2019 11:57:52 +0200
I've been running into a similar issue and am a little puzzled by it,
especially since it survives restarts.
On Fri, Dec 20, 2019 at 2:39 AM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
Yeah, I don't see this error message anywhere in our sources on 11 or
12, so perhaps debezium does something funny? It's not clear to me
where, though, as this suggests it uses the pgoutput module.
The error message comes from GetPublicationByName and the context is
added by output_plugin_error_callback in logical.c. Stack trace of
where the error occurs below.
# SELECT * FROM pg_publication;
    pubname    | pubowner | puballtables | pubinsert | pubupdate | pubdelete | pubtruncate
---------------+----------+--------------+-----------+-----------+-----------+-------------
 migration_pub |       10 | f            | t         | t         | t         | t
(1 row)
# SELECT * FROM pg_replication_slots ;
   slot_name    |  plugin  | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn
----------------+----------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------
 migration_slot | pgoutput | logical   |  13121 | postgres | f         | f      |            |      |        17153 | 0/CDFC840   | 0/CDFC878
(1 row)
# SELECT * FROM pg_logical_slot_get_binary_changes('migration_slot',
NULL, NULL,'proto_version','1','publication_names','migration_pub');
ERROR: publication "migration_pub" does not exist
CONTEXT: slot "migration_slot", output plugin "pgoutput", in the
change callback, associated LSN 0/CDFC878
#0 errstart (elevel=elevel@entry=20,
filename=filename@entry=0x5581958a6c70 "pg_publication.c",
lineno=lineno@entry=401,
funcname=funcname@entry=0x5581958a6ea0 <__func__.24991>
"GetPublicationByName", domain=domain@entry=0x0) at elog.c:251
#1 0x00005581954771e5 in GetPublicationByName (pubname=0x558196d107a0
"migration_pub", missing_ok=missing_ok@entry=false) at
pg_publication.c:401
#2 0x00007f77ba1cd704 in LoadPublications (pubnames=<optimized out>)
at pgoutput.c:467
#3 0x00007f77ba1cd7e3 in get_rel_sync_entry
(data=data@entry=0x558196cedee8, relid=<optimized out>) at
pgoutput.c:559
#4 0x00007f77ba1cdb52 in pgoutput_change (ctx=0x558196d7b4f8,
txn=<optimized out>, relation=0x7f77ba1e67c8, change=0x558196cdbab8)
at pgoutput.c:315
#5 0x000055819566a2e6 in change_cb_wrapper (cache=<optimized out>,
txn=<optimized out>, relation=<optimized out>, change=<optimized out>)
at logical.c:747
#6 0x0000558195675785 in ReorderBufferCommit (rb=0x558196d35d38,
xid=xid@entry=17153, commit_lsn=215994160, end_lsn=<optimized out>,
commit_time=commit_time@entry=662061745906576,
origin_id=origin_id@entry=0, origin_lsn=0) at reorderbuffer.c:1592
#7 0x0000558195667407 in DecodeCommit (ctx=ctx@entry=0x558196d7b4f8,
buf=buf@entry=0x7ffd61faae60, parsed=parsed@entry=0x7ffd61faacf0,
xid=17153) at decode.c:641
#8 0x00005581956675a0 in DecodeXactOp (ctx=0x558196d7b4f8,
buf=buf@entry=0x7ffd61faae60) at decode.c:249
#9 0x00005581956684cb in LogicalDecodingProcessRecord
(ctx=ctx@entry=0x558196d7b4f8, record=<optimized out>) at decode.c:117
#10 0x000055819566c108 in pg_logical_slot_get_changes_guts
(fcinfo=0x7ffd61fab120, confirm=confirm@entry=true,
binary=binary@entry=true) at logicalfuncs.c:309
#11 0x000055819566c35d in pg_logical_slot_get_binary_changes
(fcinfo=<optimized out>) at logicalfuncs.c:391
cheers,
Marco
On 12/24/20 12:50 PM, Marco Slot wrote:
I've been running into a similar issue and am a little puzzled by it,
especially since it survives restarts.
Interesting. Which PostgreSQL version are you using? Any idea how to
reproduce it? Were there any failures right before the issue appeared?
I wonder if this might be a case of index corruption. Can you try
forcing an index scan on pg_publication?
SET enable_seqscan = false;
SET enable_bitmapscan = off;
SELECT * FROM pg_publication WHERE pubname = 'migration_pub';
Also, it might be helpful to know why get_rel_sync_entry ended up
calling LoadPublications - did we just create the entry?
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 20 Dec 2019, at 06:39, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
While trying to reproduce this I however ran into a related issue with
pgoutput/pg_logical_slot_get_binary_changes. If you call the function
repeatedly (~10x) you'll get an error like this:

FATAL: out of relcache_callback_list slots
CONTEXT: slot "slot", output plugin "pgoutput", in the startup callback
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.

The reason is very simple - each call executes pgoutput_startup, which
does CacheRegisterRelcacheCallback in init_rel_sync_cache. And we do
this on each pg_logical_slot_get_binary_changes() call and never remove
the callbacks, so we simply run out of MAX_RELCACHE_CALLBACKS slots.

Not sure if this is a known issue/behavior, but it seems a bit annoying
and possibly related to the issue reported by Dave.
Sorry for bumping an old thread.
I was involved in troubleshooting logical replication recently, and found that it sometimes has really unhelpful error reporting.
While the source of the problem was apparently a low max_replication_slots setting, users were getting only this error about relcache_callback_list.
Maybe we could fix this particular error by deduplicating callbacks?
Best regards, Andrey Borodin.
Attachments:
0001-Avoid-duplication-in-relcache-and-syscache-callbacks.patch
From 5ee9ae963e3a6143975f543761f51b1bc0dc5de8 Mon Sep 17 00:00:00 2001
From: Andrey Borodin <amborodin@acm.org>
Date: Fri, 23 Jul 2021 16:00:31 +0500
Subject: [PATCH] Avoid duplication in relcache and syscache callbacks
---
src/backend/utils/cache/inval.c | 29 +++++++++++++++++++++++++++++
1 file changed, 29 insertions(+)
diff --git a/src/backend/utils/cache/inval.c b/src/backend/utils/cache/inval.c
index dcfd9e8389..692ec8f04f 100644
--- a/src/backend/utils/cache/inval.c
+++ b/src/backend/utils/cache/inval.c
@@ -1430,17 +1430,32 @@ CacheInvalidateRelmap(Oid databaseId)
* Yes, there's a possibility of a false match to zero, but it doesn't seem
* worth troubling over, especially since most of the current callees just
* flush all cached state anyway.
+ * Each callback/arg pair will be called only once, even if it was registered
+ * many times.
*/
void
CacheRegisterSyscacheCallback(int cacheid,
SyscacheCallbackFunction func,
Datum arg)
{
+ int i;
if (cacheid < 0 || cacheid >= SysCacheSize)
elog(FATAL, "invalid cache ID: %d", cacheid);
if (syscache_callback_count >= MAX_SYSCACHE_CALLBACKS)
elog(FATAL, "out of syscache_callback_list slots");
+ for (i = 0; i < syscache_callback_count; i++)
+ {
+ if (syscache_callback_list[i].id == cacheid &&
+ syscache_callback_list[i].function == func &&
+ syscache_callback_list[i].arg == arg)
+ {
+ elog(DEBUG1, "Duplicate registartion of syscache callback");
+ /* Already registered */
+ return;
+ }
+ }
+
if (syscache_callback_links[cacheid] == 0)
{
/* first callback for this cache */
@@ -1472,14 +1487,28 @@ CacheRegisterSyscacheCallback(int cacheid,
*
* NOTE: InvalidOid will be passed if a cache reset request is received.
* In this case the called routines should flush all cached state.
+ * Each callback/arg pair will be called only once, even if it was registered
+ * many times.
*/
void
CacheRegisterRelcacheCallback(RelcacheCallbackFunction func,
Datum arg)
{
+ int i;
if (relcache_callback_count >= MAX_RELCACHE_CALLBACKS)
elog(FATAL, "out of relcache_callback_list slots");
+ for (i = 0; i < relcache_callback_count; i++)
+ {
+ if (relcache_callback_list[i].function == func &&
+ relcache_callback_list[i].arg == arg)
+ {
+ /* Already registered */
+ elog(DEBUG1, "Duplicate registartion of relcache callback");
+ return;
+ }
+ }
+
relcache_callback_list[relcache_callback_count].function = func;
relcache_callback_list[relcache_callback_count].arg = arg;
--
2.24.3 (Apple Git-128)
Reviving this thread
2021-08-10 19:05:09.096 UTC [3738] LOG: logical replication apply
worker for subscription "sub_mycluster_alltables" has started
2021-08-10 19:05:09.107 UTC [3739] LOG: logical replication table
synchronization worker for subscription "sub_mycluster_alltables",
table "t_random" has started
2021-08-10 19:05:12.222 UTC [3739] LOG: logical replication table
synchronization worker for subscription "sub_mycluster_alltables",
table "t_random" has finished
2021-08-10 19:05:14.806 UTC [3738] ERROR: could not receive data from
WAL stream: ERROR: publication "sub_mycluster_alltables" does not
exist
CONTEXT: slot "sub_mycluster_alltables", output plugin
"pgoutput", in the change callback, associated LSN 0/4015DF0
2021-08-10 19:05:14.811 UTC [175] LOG: background worker "logical
replication worker" (PID 3738) exited with exit code 1
select * from pg_publication;
-[ RECORD 1 ]+------------------------
oid | 16415
pubname | sub_mycluster_alltables
pubowner | 10
puballtables | t
pubinsert | t
pubupdate | t
pubdelete | t
pubtruncate | t
select * from pg_replication_slots;
-[ RECORD 1 ]-------+--------------------------------
slot_name | mycluster_cjvq_68cf55677c_6vgcf
plugin |
slot_type | physical
datoid |
database |
temporary | f
active | t
active_pid | 433
xmin |
catalog_xmin |
restart_lsn | 0/D000000
confirmed_flush_lsn |
-[ RECORD 2 ]-------+--------------------------------
slot_name | sub_mycluster_alltables
plugin | pgoutput
slot_type | logical
datoid | 16395
database | mycluster
temporary | f
active | t
active_pid | 8799
xmin |
catalog_xmin | 500
restart_lsn | 0/40011C0
confirmed_flush_lsn | 0/40011C0
I'm at a loss as to where to even look at this point.
Dave
On Wed, Aug 11, 2021 at 1:18 AM Dave Cramer <davecramer@gmail.com> wrote:
Reviving this thread
2021-08-10 19:05:09.096 UTC [3738] LOG: logical replication apply worker for subscription "sub_mycluster_alltables" has started
2021-08-10 19:05:09.107 UTC [3739] LOG: logical replication table synchronization worker for subscription "sub_mycluster_alltables", table "t_random" has started
2021-08-10 19:05:12.222 UTC [3739] LOG: logical replication table synchronization worker for subscription "sub_mycluster_alltables", table "t_random" has finished
2021-08-10 19:05:14.806 UTC [3738] ERROR: could not receive data from WAL stream: ERROR: publication "sub_mycluster_alltables" does not exist
CONTEXT: slot "sub_mycluster_alltables", output plugin "pgoutput", in the change callback, associated LSN 0/4015DF0
2021-08-10 19:05:14.811 UTC [175] LOG: background worker "logical replication worker" (PID 3738) exited with exit code 1

select * from pg_publication;
-[ RECORD 1 ]+------------------------
oid | 16415
pubname | sub_mycluster_alltables
pubowner | 10
puballtables | t
pubinsert | t
pubupdate | t
pubdelete | t
pubtruncate | t
By any chance, did you drop and recreate this publication as
mentioned in your first email? If so, I think this can happen because
of our use of historical snapshots to consult system catalogs.
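For instance (names here are illustrative), a change written before the
publication existed should fail exactly like this when decoded, because the
catalog lookup uses the snapshot as of that change, even though
pg_publication shows the row now:

select pg_create_logical_replication_slot('snap_slot', 'pgoutput');
create table t1 (a int);
insert into t1 values (1);   -- change recorded before the publication exists
create publication snap_pub for all tables;
select * from pg_logical_slot_get_binary_changes('snap_slot', NULL, NULL,
  'proto_version', '1', 'publication_names', 'snap_pub');
-- ERROR:  publication "snap_pub" does not exist
-- CONTEXT: slot "snap_slot", output plugin "pgoutput", in the change callback, ...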
--
With Regards,
Amit Kapila.
On Wed, 11 Aug 2021 at 07:24, Amit Kapila <amit.kapila16@gmail.com> wrote:
[...]
By any chance, did you drop and recreate this publication as
mentioned in your first email? If so, I think this can happen because
of our use of historical snapshots to consult system catalogs.
In this case, no.
I am suspecting this error comes from pgoutput though.
Dave
On Wed, Aug 11, 2021 at 4:57 PM Dave Cramer <davecramer@gmail.com> wrote:
[...]

In this case, no.
I am suspecting this error comes from pgoutput though.
I think it is, and the context is generated via
output_plugin_error_callback. Is this reproducible for you and, if so,
can you share a test case or some steps to reproduce this? Does this
work and then suddenly start giving errors, or does it happen the very
first time you try to set up the publication/subscription? I think some
more details are required about your setup and steps to analyze this
problem. You might want to check the publication-side logs, but I'm not
sure you'll get any better clue there.
--
With Regards,
Amit Kapila.
On Wed, 11 Aug 2021 at 07:37, Amit Kapila <amit.kapila16@gmail.com> wrote:
[...]

I think it is, and the context is generated via
output_plugin_error_callback. Is this reproducible for you and, if so,
can you share a test case or some steps to reproduce this? [...]
In this case I am the messenger. I will try to get a repeatable test case.
Dave
On Wed, Aug 11, 2021 at 1:37 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I think it is, and the context is generated via
output_plugin_error_callback. Is this reproducible for you and, if so,
can you share a test case or some steps to reproduce this? Does this
work and then suddenly start giving errors, or does it happen the very
first time you try to set up the publication/subscription? I think some
more details are required about your setup and steps to analyze this
problem. You might want to check the publication-side logs, but I'm not
sure you'll get any better clue there.
This seems to regularly reproduce the issue on PostgreSQL 14.4:
drop subscription if exists local_sub;
drop publication if exists local_pub;
drop table if exists local;
select pg_create_logical_replication_slot('test','pgoutput');
create table local (x int, y int);
insert into local values (1,2);
create publication local_pub for table local;
create subscription local_sub connection 'host=localhost port=5432'
publication local_pub with (create_slot = false, slot_name = 'test',
copy_data = false);
The log on the publisher then repeatedly shows:
2022-08-04 10:46:56.140 CEST [12785] ERROR: publication "local_pub"
does not exist
2022-08-04 10:46:56.140 CEST [12785] CONTEXT: slot "test", output
plugin "pgoutput", in the change callback, associated LSN 8/6C01A270
2022-08-04 10:46:56.140 CEST [12785] STATEMENT: START_REPLICATION
SLOT "test" LOGICAL 0/0 (proto_version '2', publication_names
'"local_pub"')
(fails in the same way when setting up the subscription on a different node)
The local_pub does appear in pg_publication, but it seems a bit like
the change_cb is using an old snapshot when reading from the catalog
in GetPublicationByName.
cheers,
Marco
At Thu, 4 Aug 2022 10:56:45 +0200, Marco Slot <marco.slot@gmail.com> wrote in
[...]
The local_pub does appear in pg_publication, but it seems a bit like
the change_cb is using an old snapshot when reading from the catalog
in GetPublicationByName.
Yes. So the slot should be created *after* the publication is
created. A discussion [1] was held on allowing missing publications
in such a case, but it has not reached a conclusion.
[1]: /messages/by-id/CAA4eK1LwQAEPJMTwVe3UYODeNMkK2QHf-WZF5aXp5ZcjDRcrUA@mail.gmail.com
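For example, Marco's reproduction should stop failing if only the ordering
is changed so that the publication exists before the slot is created:

create table local (x int, y int);
create publication local_pub for table local;
select pg_create_logical_replication_slot('test','pgoutput');
insert into local values (1,2);
create subscription local_sub connection 'host=localhost port=5432'
publication local_pub with (create_slot = false, slot_name = 'test',
copy_data = false);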
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center