create subscription with (origin = none, copy_data = on)
Hi, hackers!
I am looking at subscription creation command:
CREATE SUBSCRIPTION sub CONNECTION '...' PUBLICATION pub WITH (origin =
none, copy_data = on);
For now we log a warning if the publisher has subscribed to the same
table from some other publisher.
However, in case of publication with publish_via_partition_root option,
we will not raise such warinigs
because SQL command in check_publications_origin() checks only directly
published tables.
For example:
CREATE TABLE t(id int) PARTITION BY RANGE(id);
CREATE TABLE part1 PARTITION OF t FOR VALUES FROM (0) TO (5);
CREATE TABLE part2 PARTITION OF t FOR VALUES FROM (5) TO (10);
-- subscribe to part2
CREATE SUBSCRIPTION sub_part2 CONNECTION '...' PUBLICATION pub_part2;
CREATE PUBLICATION pub_t FOR TABLE t;
CREATE PUBLICATION pub_t_via_root FOR TABLE t WITH
(publish_via_partition_root);
and now this command will raise a warning:
CREATE SUBSCRIPTION sub1 CONNECTION '...' PUBLICATION pub_t WITH (origin
= none, copy_data = on);
but not this:
CREATE SUBSCRIPTION sub1 CONNECTION '...' PUBLICATION pub_t_via_root
WITH (origin = none, copy_data = on);
We also do not take into account cases of foreign partitions:
CREATE TABLE t(id int) PARTITION BY RANGE(id);
CREATE TABLE part1 PARTITION OF t FOR VALUES FROM (0) TO (5);
CREATE FOREIGN TABLE part2 PARTITION OF t FOR VALUES FROM (5) TO (10)
SERVER fdw_server;
CREATE PUBLICATION pub_t FOR TABLE t;
Maybe we should raise WARNING (or even ERROR) in such cases?
I would also note that the (origin = none) will work as expected, but in
case of (origin = any)
it will lead to inappropriate behavior - we will perform an initial sync
of "t", but we unable to
replicate further updates for "part2".
I have attached patch, demonstrating this problems
Attachments:
subscription-origin.patchtext/x-patch; charset=UTF-8; name=subscription-origin.patchDownload
diff --git a/src/test/subscription/Makefile b/src/test/subscription/Makefile
index ce1ca43009..e5e9bb469c 100644
--- a/src/test/subscription/Makefile
+++ b/src/test/subscription/Makefile
@@ -13,7 +13,7 @@ subdir = src/test/subscription
top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
-EXTRA_INSTALL = contrib/hstore
+EXTRA_INSTALL = contrib/hstore contrib/postgres_fdw
export with_icu
diff --git a/src/test/subscription/t/030_origin.pl b/src/test/subscription/t/030_origin.pl
index adfae1a56e..d310e79ba8 100644
--- a/src/test/subscription/t/030_origin.pl
+++ b/src/test/subscription/t/030_origin.pl
@@ -255,6 +255,134 @@ $node_A->safe_psql('postgres', "DROP TABLE tab_new");
$node_B->safe_psql('postgres', "DROP TABLE tab_new");
$node_A->safe_psql('postgres', "DROP SUBSCRIPTION $subname_AB2");
+###############################################################################
+# Setting the origin option to NONE must raise WARNING if we subscribe to a
+# partitioned table, if this table contains any remotely originated data, even
+# if a publication was created with the publish_via_partition_root option.
+#
+# node_A node_C
+# _____________________ __________
+# | ts.t |---------------->| ts.t |
+# |_____________________| |__________|
+# | ts.part1 | ts.part2 |
+# |__________|__________|
+# node_B ^
+# __________ |
+# | ts.part2 |----------------------
+# |__________|
+#
+###############################################################################
+$node_B->safe_psql('postgres', qq(
+CREATE SCHEMA ts;
+CREATE TABLE ts.part2(id int);
+CREATE PUBLICATION pub_b FOR TABLE ts.part2;
+));
+$node_A->safe_psql('postgres', qq(
+CREATE SCHEMA ts;
+CREATE TABLE ts.t(id int) PARTITION BY RANGE(id);
+CREATE TABLE ts.part1 PARTITION OF ts.t FOR VALUES FROM (0) TO (5);
+CREATE TABLE ts.part2 PARTITION OF ts.t FOR VALUES FROM (5) TO (10);
+));
+$node_A->safe_psql('postgres',"CREATE SUBSCRIPTION sub_b CONNECTION '$node_B_connstr' PUBLICATION pub_b;");
+$node_A->safe_psql('postgres',"
+CREATE PUBLICATION pub_a FOR TABLE ts.t;
+CREATE PUBLICATION pub_a_via_root FOR TABLE ts.t WITH (publish_via_partition_root);
+CREATE PUBLICATION pub_a_schema FOR TABLES IN SCHEMA ts WITH (publish_via_partition_root);
+");
+
+$node_C->safe_psql('postgres',"
+CREATE SCHEMA ts;
+CREATE TABLE ts.t(id int) PARTITION BY RANGE(id);
+CREATE TABLE ts.part1 PARTITION OF ts.t FOR VALUES FROM (0) TO (5);
+CREATE TABLE ts.part2 PARTITION OF ts.t FOR VALUES FROM (5) TO (10);");
+
+my $pub;
+# in this case, only "pub_a" will raise WARNING. However, all of them must do it.
+my @publications = ("pub_a", "pub_a_via_root", "pub_a_schema");
+foreach $pub (@publications)
+{
+ note("check warning for publication $pub");
+ ($result, $stdout, $stderr) = $node_C->psql('postgres',"
+ CREATE SUBSCRIPTION sub_c CONNECTION '$node_A_connstr' PUBLICATION $pub WITH (origin = none, copy_data = on);
+ ");
+
+ like(
+ $stderr,
+ qr/WARNING: ( [A-Z0-9]+:)? subscription "sub_c" requested copy_data with origin = NONE but might copy data that had a different origin/,
+ "subscribe to $pub must raise WARNING"
+ );
+ $node_C->psql('postgres',"DROP SUBSCRIPTION IF EXISTS sub_c");
+}
+$node_A->safe_psql('postgres',qq(
+DROP SCHEMA ts CASCADE;
+DROP PUBLICATION pub_a;
+DROP PUBLICATION pub_a_via_root;
+DROP PUBLICATION pub_a_schema;
+));
+$node_B->safe_psql('postgres',"DROP SCHEMA ts CASCADE");
+$node_C->safe_psql('postgres',"DROP SCHEMA ts CASCADE");
+
+###############################################################################
+# Also, we must check foreign partitions if the origin option is set to NONE.
+# Maybe we should raise ERROR in these cases for any origin value?
+###############################################################################
+$node_A->safe_psql('postgres',"CREATE EXTENSION postgres_fdw");
+$node_B->safe_psql('postgres',"CREATE TABLE part22(id int);");
+my $fdwconnstr = "dbname 'postgres', host '" . $node_B->host . "', port '" . $node_B->port . "'";
+$node_A->safe_psql('postgres',qq(
+CREATE SERVER fdw FOREIGN DATA WRAPPER postgres_fdw OPTIONS ($fdwconnstr);
+CREATE USER MAPPING FOR CURRENT_USER SERVER fdw;
+));
+
+# Create a partitioned table:
+# _______________________________________
+# | t |
+# |_______________________________________|
+# | part1 | part2 |
+# |_______|_______________________________|
+# | part21 | part22 (foreign) |
+# |____________|__________________|
+$node_A->safe_psql('postgres',qq(
+CREATE TABLE t(id int) PARTITION BY RANGE(id);
+CREATE TABLE part1 PARTITION OF t FOR VALUES FROM (0) TO (5);
+CREATE TABLE part2 PARTITION OF t FOR VALUES FROM (5) TO (15) PARTITION BY RANGE(id);
+CREATE TABLE part21 PARTITION OF part2 FOR VALUES FROM (5) TO (10);
+CREATE FOREIGN TABLE part22 PARTITION OF part2 FOR VALUES FROM (10) TO (15) SERVER fdw;
+INSERT INTO t SELECT id FROM generate_series(0,14) id;
+));
+$node_C->safe_psql('postgres',qq(
+CREATE TABLE t(id int) PARTITION BY RANGE(id);
+CREATE TABLE part1 PARTITION OF t FOR VALUES FROM (0) TO (5);
+CREATE TABLE part2 PARTITION OF t FOR VALUES FROM (5) TO (15) PARTITION BY RANGE(id);
+CREATE TABLE part21 PARTITION OF part2 FOR VALUES FROM (5) TO (10);
+CREATE TABLE part22 PARTITION OF part2 FOR VALUES FROM (10) TO (15);
+));
+
+$node_A->safe_psql('postgres',qq(
+CREATE PUBLICATION fdwpub FOR TABLE t;
+CREATE PUBLICATION fdwpub_via_root FOR TABLE t WITH (publish_via_partition_root);
+));
+
+# Here, both commands must raise WARNING.
+@publications = ("fdwpub", "fdwpub_via_root");
+foreach $pub (@publications)
+{
+ note("check warning for publication $pub");
+ ($result, $stdout, $stderr) = $node_C->psql('postgres',"
+ CREATE SUBSCRIPTION sub_c CONNECTION '$node_A_connstr' PUBLICATION $pub WITH (origin = none, copy_data = on);
+ ");
+
+ like(
+ $stderr,
+ qr/WARNING: ( [A-Z0-9]+:)? subscription "sub_c" requested copy_data with origin = NONE but might copy data that had a different origin/,
+ "subscribe to $pub must raise WARNING"
+ );
+ $node_C->psql('postgres',"DROP SUBSCRIPTION IF EXISTS sub_c");
+}
+$node_A->safe_psql('postgres',"DROP TABLE t CASCADE");
+$node_B->safe_psql('postgres',"DROP TABLE part22");
+$node_C->safe_psql('postgres',"DROP TABLE t CASCADE");
+
# shutdown
$node_B->stop('fast');
$node_A->stop('fast');
On Fri, 17 Jan 2025 at 14:00, Sergey Tatarintsev
<s.tatarintsev@postgrespro.ru> wrote:
Hi, hackers!
I am looking at subscription creation command:
CREATE SUBSCRIPTION sub CONNECTION '...' PUBLICATION pub WITH (origin =
none, copy_data = on);For now we log a warning if the publisher has subscribed to the same
table from some other publisher.
However, in case of publication with publish_via_partition_root option,
we will not raise such warinigs
because SQL command in check_publications_origin() checks only directly
published tables.
Yes, I agree that we are checking only the directly published tables
which is why there is no warning in this case. I'm working on a fix to
change the check_publications_origin to check accordingly.
For example:
CREATE TABLE t(id int) PARTITION BY RANGE(id);
CREATE TABLE part1 PARTITION OF t FOR VALUES FROM (0) TO (5);
CREATE TABLE part2 PARTITION OF t FOR VALUES FROM (5) TO (10);
-- subscribe to part2
CREATE SUBSCRIPTION sub_part2 CONNECTION '...' PUBLICATION pub_part2;
CREATE PUBLICATION pub_t FOR TABLE t;
CREATE PUBLICATION pub_t_via_root FOR TABLE t WITH
(publish_via_partition_root);and now this command will raise a warning:
CREATE SUBSCRIPTION sub1 CONNECTION '...' PUBLICATION pub_t WITH (origin
= none, copy_data = on);but not this:
CREATE SUBSCRIPTION sub1 CONNECTION '...' PUBLICATION pub_t_via_root
WITH (origin = none, copy_data = on);We also do not take into account cases of foreign partitions:
CREATE TABLE t(id int) PARTITION BY RANGE(id);
CREATE TABLE part1 PARTITION OF t FOR VALUES FROM (0) TO (5);
CREATE FOREIGN TABLE part2 PARTITION OF t FOR VALUES FROM (5) TO (10)
SERVER fdw_server;
CREATE PUBLICATION pub_t FOR TABLE t;Maybe we should raise WARNING (or even ERROR) in such cases?
Currently we do not support replication of foreign tables. This is
mentioned in logical replication restriction sections at [1]https://www.postgresql.org/docs/devel/logical-replication-restrictions.html.
I would also note that the (origin = none) will work as expected, but in
case of (origin = any)
it will lead to inappropriate behavior - we will perform an initial sync
of "t", but we unable to
replicate further updates for "part2".
I noticed the same behavior with both origins as none and any. i.e
initial sync is ok and then replication of foreign table part2 will
not work which is because of the above restriction that I mentioned.
Just to be sure that I'm not checking a different scenario, could you
share the test for this case.
[1]: https://www.postgresql.org/docs/devel/logical-replication-restrictions.html
Regards,
Vignesh
17.01.2025 23:00, vignesh C пишет:
On Fri, 17 Jan 2025 at 14:00, Sergey Tatarintsev
<s.tatarintsev@postgrespro.ru> wrote:Hi, hackers!
I am looking at subscription creation command:
CREATE SUBSCRIPTION sub CONNECTION '...' PUBLICATION pub WITH (origin =
none, copy_data = on);For now we log a warning if the publisher has subscribed to the same
table from some other publisher.
However, in case of publication with publish_via_partition_root option,
we will not raise such warinigs
because SQL command in check_publications_origin() checks only directly
published tables.Yes, I agree that we are checking only the directly published tables
which is why there is no warning in this case. I'm working on a fix to
change the check_publications_origin to check accordingly.
seems promising. I would like to see this patch
For example:
CREATE TABLE t(id int) PARTITION BY RANGE(id);
CREATE TABLE part1 PARTITION OF t FOR VALUES FROM (0) TO (5);
CREATE TABLE part2 PARTITION OF t FOR VALUES FROM (5) TO (10);
-- subscribe to part2
CREATE SUBSCRIPTION sub_part2 CONNECTION '...' PUBLICATION pub_part2;
CREATE PUBLICATION pub_t FOR TABLE t;
CREATE PUBLICATION pub_t_via_root FOR TABLE t WITH
(publish_via_partition_root);and now this command will raise a warning:
CREATE SUBSCRIPTION sub1 CONNECTION '...' PUBLICATION pub_t WITH (origin
= none, copy_data = on);but not this:
CREATE SUBSCRIPTION sub1 CONNECTION '...' PUBLICATION pub_t_via_root
WITH (origin = none, copy_data = on);We also do not take into account cases of foreign partitions:
CREATE TABLE t(id int) PARTITION BY RANGE(id);
CREATE TABLE part1 PARTITION OF t FOR VALUES FROM (0) TO (5);
CREATE FOREIGN TABLE part2 PARTITION OF t FOR VALUES FROM (5) TO (10)
SERVER fdw_server;
CREATE PUBLICATION pub_t FOR TABLE t;Maybe we should raise WARNING (or even ERROR) in such cases?
Currently we do not support replication of foreign tables. This is
mentioned in logical replication restriction sections at [1].
Yes of course, but we must raise an ERROR in such cases
I would also note that the (origin = none) will work as expected, but in
case of (origin = any)
it will lead to inappropriate behavior - we will perform an initial sync
of "t", but we unable to
replicate further updates for "part2".I noticed the same behavior with both origins as none and any. i.e
initial sync is ok and then replication of foreign table part2 will
not work which is because of the above restriction that I mentioned.
Just to be sure that I'm not checking a different scenario, could you
share the test for this case.
That's right, but i think we must tell user about inappropriate usage.
check_publication_add_relation() checks only publication creation for
foreign tables directly, but not partitioned tables structure.
My test case just shows that we can try to replicate partitioned table
with foreign partitions, but I think we should disallow such cases.
I would also like to show an interesting subscription creation scenario
that I found:
1. subscriber: calls check_publications_origin()
2. publisher: executes the create/attach foreign partition command
3. the subscriber is sure he checked the origin and performing COPY t TO
STDOUT
i.e. between the check and the start of copying the publication has changed
the problem is that check_publications_origin() and COPY t TO STDOUT are
performed in different transactions
Show quoted text
[1] - https://www.postgresql.org/docs/devel/logical-replication-restrictions.html
Regards,
Vignesh
On Fri, 17 Jan 2025 at 21:30, vignesh C <vignesh21@gmail.com> wrote:
On Fri, 17 Jan 2025 at 14:00, Sergey Tatarintsev
<s.tatarintsev@postgrespro.ru> wrote:Hi, hackers!
I am looking at subscription creation command:
CREATE SUBSCRIPTION sub CONNECTION '...' PUBLICATION pub WITH (origin =
none, copy_data = on);For now we log a warning if the publisher has subscribed to the same
table from some other publisher.
However, in case of publication with publish_via_partition_root option,
we will not raise such warinigs
because SQL command in check_publications_origin() checks only directly
published tables.Yes, I agree that we are checking only the directly published tables
which is why there is no warning in this case. I'm working on a fix to
change the check_publications_origin to check accordingly.
Attached patch has the fix for this issue which includes the partition
tables also for the publication now and throws a warning
appropriately.
Regards,
Vignesh
Attachments:
0001-Fix-origin-warning-not-thrown-for-publications-on-pa.patchtext/x-patch; charset=US-ASCII; name=0001-Fix-origin-warning-not-thrown-for-publications-on-pa.patchDownload
From 514656829ac637d494cd431518986cbea0fe6bc3 Mon Sep 17 00:00:00 2001
From: Vignesh <vignesh21@gmail.com>
Date: Sat, 18 Jan 2025 10:19:12 +0530
Subject: [PATCH] Fix origin warning not thrown for publications on partition
tables
When checking if a publisher had subscribed to the same table from a different
publisher, the check only considered tables directly specified for the
publication. It did not account for cases where the publication was present on
partition tables as well. This has been fixed by including all partition tables
associated with the publication in the check.
---
src/backend/catalog/pg_publication.c | 39 +++++++++++++++++++++----
src/backend/commands/subscriptioncmds.c | 2 +-
src/include/catalog/pg_proc.dat | 9 ++++++
3 files changed, 44 insertions(+), 6 deletions(-)
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index b89098f5e9..bfdd0c7cc5 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -1089,12 +1089,16 @@ GetPublicationByName(const char *pubname, bool missing_ok)
}
/*
- * Get information of the tables in the given publication array.
+ * Helper function for SQL callables: pg_get_publication_tables and
+ * pg_get_publication_tables_with_partitions.
*
- * Returns pubid, relid, column list, row filter for each table.
+ * If allparttables is true, retrieves tables including all the partitions
+ * for the publication.
+ * If allparttables is false, retrieves tables based on the publication's
+ * pubviaroot option.
*/
-Datum
-pg_get_publication_tables(PG_FUNCTION_ARGS)
+static Datum
+pg_get_publication_tables_internal(FunctionCallInfo fcinfo, bool allparttables)
{
#define NUM_PUBLICATION_TABLES_ELEM 4
FuncCallContext *funcctx;
@@ -1147,10 +1151,12 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
*schemarelids;
relids = GetPublicationRelations(pub_elem->oid,
+ allparttables ? PUBLICATION_PART_ALL :
pub_elem->pubviaroot ?
PUBLICATION_PART_ROOT :
PUBLICATION_PART_LEAF);
schemarelids = GetAllSchemaPublicationRelations(pub_elem->oid,
+ allparttables ? PUBLICATION_PART_ALL :
pub_elem->pubviaroot ?
PUBLICATION_PART_ROOT :
PUBLICATION_PART_LEAF);
@@ -1187,7 +1193,7 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
* data of the child table to be double-published on the subscriber
* side.
*/
- if (viaroot)
+ if (!allparttables && viaroot)
filter_partitions(table_infos);
/* Construct a tuple descriptor for the result rows. */
@@ -1298,3 +1304,26 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
SRF_RETURN_DONE(funcctx);
}
+
+/*
+ * Get information of the tables in the given publication array.
+ *
+ * Returns pubid, relid, column list, row filter for each table.
+ */
+Datum
+pg_get_publication_tables(PG_FUNCTION_ARGS)
+{
+ return pg_get_publication_tables_internal(fcinfo, false);
+}
+
+/*
+ * Get information of the tables (including all the all partitions) in the
+ * given publication array.
+ *
+ * Returns pubid, relid, column list, row filter for each table.
+ */
+Datum
+pg_get_publication_tables_with_partitions(PG_FUNCTION_ARGS)
+{
+ return pg_get_publication_tables_internal(fcinfo, true);
+}
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 2d8a71ca1e..403b4fc918 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -2116,7 +2116,7 @@ check_publications_origin(WalReceiverConn *wrconn, List *publications,
appendStringInfoString(&cmd,
"SELECT DISTINCT P.pubname AS pubname\n"
"FROM pg_publication P,\n"
- " LATERAL pg_get_publication_tables(P.pubname) GPT\n"
+ " LATERAL pg_get_publication_tables_with_partitions(P.pubname) GPT\n"
" JOIN pg_subscription_rel PS ON (GPT.relid = PS.srrelid),\n"
" pg_class C JOIN pg_namespace N ON (N.oid = C.relnamespace)\n"
"WHERE C.oid = GPT.relid AND P.pubname IN (");
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 18560755d2..349c6330b2 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -12128,6 +12128,15 @@
proargmodes => '{v,o,o,o,o}',
proargnames => '{pubname,pubid,relid,attrs,qual}',
prosrc => 'pg_get_publication_tables' },
+{ oid => '8051',
+ descr => 'get information of the tables(including all partitions) that are part of the specified publications',
+ proname => 'pg_get_publication_tables_with_partitions', prorows => '1000',
+ provariadic => 'text', proretset => 't', provolatile => 's',
+ prorettype => 'record', proargtypes => '_text',
+ proallargtypes => '{_text,oid,oid,int2vector,pg_node_tree}',
+ proargmodes => '{v,o,o,o,o}',
+ proargnames => '{pubname,pubid,relid,attrs,qual}',
+ prosrc => 'pg_get_publication_tables_with_partitions' },
{ oid => '6121',
descr => 'returns whether a relation can be part of a publication',
proname => 'pg_relation_is_publishable', provolatile => 's',
--
2.43.0
18.01.2025 12:01, vignesh C пишет:
On Fri, 17 Jan 2025 at 21:30, vignesh C<vignesh21@gmail.com> wrote:
On Fri, 17 Jan 2025 at 14:00, Sergey Tatarintsev
<s.tatarintsev@postgrespro.ru> wrote:Hi, hackers!
I am looking at subscription creation command:
CREATE SUBSCRIPTION sub CONNECTION '...' PUBLICATION pub WITH (origin =
none, copy_data = on);For now we log a warning if the publisher has subscribed to the same
table from some other publisher.
However, in case of publication with publish_via_partition_root option,
we will not raise such warinigs
because SQL command in check_publications_origin() checks only directly
published tables.Yes, I agree that we are checking only the directly published tables
which is why there is no warning in this case. I'm working on a fix to
change the check_publications_origin to check accordingly.Attached patch has the fix for this issue which includes the partition
tables also for the publication now and throws a warning
appropriately.Regards,
Vignesh
Thanks for patch!
I think we must take into account whole inheritance tree of partitioned
table.
For example:
node_A:
CREATE TABLE t(id int);
CREATE PUBLICATION pub_b FOR TABLE t;
node_A:
CREATE TABLE t(id int) PARTITION BY RANGE(id);
CREATE TABLE part PARTITION OF t FOR VALUES FROM (0) TO (10) PARTITION
BY RANGE(id);
CREATE TABLE subpart PARTITION OF part FOR VALUES FROM (0) TO (5);
CREATE SUBSCRIPTION sub_c CONNECTION '$node_B_connstr' PUBLICATION pub_b;
CREATE PUBLICATION pub_t FOR TABLE t WITH (publish_via_partition_root);
CREATE PUBLICATION pub_part FOR TABLE part WITH
(publish_via_partition_root);
node_C:
-- this command will raise a warning CREATE SUBSCRIPTION sub_t
CONNECTION '$node_A_connstr' PUBLICATION pub_t WITH (origin = none,
copy_data = on);
DROP SUBSCRIPTION IF EXISTS sub_t;
-- here we got silence, but "part" is in tree of upper level replicated
table
CREATE SUBSCRIPTION sub_part CONNECTION '$node_A_connstr' PUBLICATION
pub_part WITH (origin = none, copy_data = on);
DROP SUBSCRIPTION IF EXISTS sub_part;
I think that for each partition/partitioned table in the publication we
can use something like
select relid from pg_partition_tree('part'::regclass)
union
select relid from pg_partition_ancestors('part'::regclass);
In this case we don't care about publish_via_partition_root option,
because we already check all inheritance tree, and there is no need to
change pg_class
What are you thinking about it?
On Sat, Jan 18, 2025 at 10:31 AM vignesh C <vignesh21@gmail.com> wrote:
Attached patch has the fix for this issue which includes the partition
tables also for the publication now and throws a warning
appropriately.
The corresponding query (see "To find which tables might potentially
include non-local origins .." on [1]https://www.postgresql.org/docs/devel/sql-createsubscription.html#SQL-CREATESUBSCRIPTION-NOTES) on the create_subscription doc
page.
*
@@ -1147,10 +1151,12 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
*schemarelids;
relids = GetPublicationRelations(pub_elem->oid,
+ allparttables ? PUBLICATION_PART_ALL :
pub_elem->pubviaroot ?
PUBLICATION_PART_ROOT :
PUBLICATION_PART_LEAF);
schemarelids = GetAllSchemaPublicationRelations(pub_elem->oid,
+ allparttables ? PUBLICATION_PART_ALL :
pub_elem->pubviaroot ?
PUBLICATION_PART_ROOT :
PUBLICATION_PART_LEAF);
Don't we need to add similar handling FOR ALL TABLES case? If not, why?
BTW, the proposed fix is not backpatcheable as it changes the catalog
which requires catversion bump. However, as this is a WARNING case, if
we can't find a fix that can't be backpatched, we can fix it in
HEAD-only.
[1]: https://www.postgresql.org/docs/devel/sql-createsubscription.html#SQL-CREATESUBSCRIPTION-NOTES
--
With Regards,
Amit Kapila.
On Sat, 18 Jan 2025 at 14:29, Sergey Tatarintsev
<s.tatarintsev@postgrespro.ru> wrote:
I think we must take into account whole inheritance tree of partitioned table.
For example:
node_A:
CREATE TABLE t(id int);
CREATE PUBLICATION pub_b FOR TABLE t;node_A:
CREATE TABLE t(id int) PARTITION BY RANGE(id);
CREATE TABLE part PARTITION OF t FOR VALUES FROM (0) TO (10) PARTITION BY RANGE(id);
CREATE TABLE subpart PARTITION OF part FOR VALUES FROM (0) TO (5);
CREATE SUBSCRIPTION sub_c CONNECTION '$node_B_connstr' PUBLICATION pub_b;
CREATE PUBLICATION pub_t FOR TABLE t WITH (publish_via_partition_root);
CREATE PUBLICATION pub_part FOR TABLE part WITH (publish_via_partition_root);node_C:
-- this command will raise a warning CREATE SUBSCRIPTION sub_t CONNECTION '$node_A_connstr' PUBLICATION pub_t WITH (origin = none, copy_data = on);
DROP SUBSCRIPTION IF EXISTS sub_t;
-- here we got silence, but "part" is in tree of upper level replicated table
CREATE SUBSCRIPTION sub_part CONNECTION '$node_A_connstr' PUBLICATION pub_part WITH (origin = none, copy_data = on);
DROP SUBSCRIPTION IF EXISTS sub_part;I think that for each partition/partitioned table in the publication we can use something like
select relid from pg_partition_tree('part'::regclass)
union
select relid from pg_partition_ancestors('part'::regclass);In this case we don't care about publish_via_partition_root option, because we already check all inheritance tree, and there is no need to change pg_class
What are you thinking about it?
Yes, we should include the ancestors of the table to handle the
scenario you mentioned. The attached patch has the changes which
includes the ancestors also while getting the tables for the table
publication and schema publication. And in case of all tables
publication, get all the tables.
Thoughts?
Regards,
Vignesh
Attachments:
v2-0001-Fix-origin-warning-not-thrown-for-publications-on.patchtext/x-patch; charset=US-ASCII; name=v2-0001-Fix-origin-warning-not-thrown-for-publications-on.patchDownload
From f92d7daea28f46d687beb1ff83a2eb229837fe17 Mon Sep 17 00:00:00 2001
From: Vignesh <vignesh21@gmail.com>
Date: Sat, 18 Jan 2025 10:19:12 +0530
Subject: [PATCH v2 1/2] Fix origin warning not thrown for publications on
partition tables
When checking if a publisher had subscribed to the same table from a different
publisher, the check only considered tables directly specified for the
publication. It did not account for cases where the publication was present on
partition tables as well. This has been fixed by including all partition tables
associated with the publication in the check.
---
src/backend/catalog/pg_publication.c | 39 +++++++++++++++++++++----
src/backend/commands/subscriptioncmds.c | 2 +-
src/include/catalog/pg_proc.dat | 9 ++++++
3 files changed, 44 insertions(+), 6 deletions(-)
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index b89098f5e9..bfdd0c7cc5 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -1089,12 +1089,16 @@ GetPublicationByName(const char *pubname, bool missing_ok)
}
/*
- * Get information of the tables in the given publication array.
+ * Helper function for SQL callables: pg_get_publication_tables and
+ * pg_get_publication_tables_with_partitions.
*
- * Returns pubid, relid, column list, row filter for each table.
+ * If allparttables is true, retrieves tables including all the partitions
+ * for the publication.
+ * If allparttables is false, retrieves tables based on the publication's
+ * pubviaroot option.
*/
-Datum
-pg_get_publication_tables(PG_FUNCTION_ARGS)
+static Datum
+pg_get_publication_tables_internal(FunctionCallInfo fcinfo, bool allparttables)
{
#define NUM_PUBLICATION_TABLES_ELEM 4
FuncCallContext *funcctx;
@@ -1147,10 +1151,12 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
*schemarelids;
relids = GetPublicationRelations(pub_elem->oid,
+ allparttables ? PUBLICATION_PART_ALL :
pub_elem->pubviaroot ?
PUBLICATION_PART_ROOT :
PUBLICATION_PART_LEAF);
schemarelids = GetAllSchemaPublicationRelations(pub_elem->oid,
+ allparttables ? PUBLICATION_PART_ALL :
pub_elem->pubviaroot ?
PUBLICATION_PART_ROOT :
PUBLICATION_PART_LEAF);
@@ -1187,7 +1193,7 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
* data of the child table to be double-published on the subscriber
* side.
*/
- if (viaroot)
+ if (!allparttables && viaroot)
filter_partitions(table_infos);
/* Construct a tuple descriptor for the result rows. */
@@ -1298,3 +1304,26 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
SRF_RETURN_DONE(funcctx);
}
+
+/*
+ * Get information of the tables in the given publication array.
+ *
+ * Returns pubid, relid, column list, row filter for each table.
+ */
+Datum
+pg_get_publication_tables(PG_FUNCTION_ARGS)
+{
+ return pg_get_publication_tables_internal(fcinfo, false);
+}
+
+/*
+ * Get information of the tables (including all the all partitions) in the
+ * given publication array.
+ *
+ * Returns pubid, relid, column list, row filter for each table.
+ */
+Datum
+pg_get_publication_tables_with_partitions(PG_FUNCTION_ARGS)
+{
+ return pg_get_publication_tables_internal(fcinfo, true);
+}
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 2d8a71ca1e..403b4fc918 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -2116,7 +2116,7 @@ check_publications_origin(WalReceiverConn *wrconn, List *publications,
appendStringInfoString(&cmd,
"SELECT DISTINCT P.pubname AS pubname\n"
"FROM pg_publication P,\n"
- " LATERAL pg_get_publication_tables(P.pubname) GPT\n"
+ " LATERAL pg_get_publication_tables_with_partitions(P.pubname) GPT\n"
" JOIN pg_subscription_rel PS ON (GPT.relid = PS.srrelid),\n"
" pg_class C JOIN pg_namespace N ON (N.oid = C.relnamespace)\n"
"WHERE C.oid = GPT.relid AND P.pubname IN (");
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 18560755d2..349c6330b2 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -12128,6 +12128,15 @@
proargmodes => '{v,o,o,o,o}',
proargnames => '{pubname,pubid,relid,attrs,qual}',
prosrc => 'pg_get_publication_tables' },
+{ oid => '8051',
+ descr => 'get information of the tables(including all partitions) that are part of the specified publications',
+ proname => 'pg_get_publication_tables_with_partitions', prorows => '1000',
+ provariadic => 'text', proretset => 't', provolatile => 's',
+ prorettype => 'record', proargtypes => '_text',
+ proallargtypes => '{_text,oid,oid,int2vector,pg_node_tree}',
+ proargmodes => '{v,o,o,o,o}',
+ proargnames => '{pubname,pubid,relid,attrs,qual}',
+ prosrc => 'pg_get_publication_tables_with_partitions' },
{ oid => '6121',
descr => 'returns whether a relation can be part of a publication',
proname => 'pg_relation_is_publishable', provolatile => 's',
--
2.43.0
v2-0002-Fix-origin-warning-not-thrown-for-publications-on.patchtext/x-patch; charset=US-ASCII; name=v2-0002-Fix-origin-warning-not-thrown-for-publications-on.patchDownload
From 2012cccb9e4638210def7f01ceb511a2f51a7f97 Mon Sep 17 00:00:00 2001
From: Vignesh <vignesh21@gmail.com>
Date: Mon, 20 Jan 2025 17:42:37 +0530
Subject: [PATCH v2 2/2] Fix origin warning not thrown for publications on
partition tables
When checking if a publisher had subscribed to the same table from a different
publisher, the check only considered tables directly specified for the
publication. It did not account for cases where the publication was present on
ancestor tables as well. This has been fixed by including all the
ancestor tables associated with the publication in the check.
---
src/backend/catalog/pg_publication.c | 69 ++++++++++++++++---------
src/backend/commands/publicationcmds.c | 15 +++---
src/backend/commands/subscriptioncmds.c | 2 +-
src/include/catalog/pg_proc.dat | 4 +-
src/include/catalog/pg_publication.h | 11 ++--
5 files changed, 65 insertions(+), 36 deletions(-)
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index bfdd0c7cc5..a91400ff6c 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -720,7 +720,8 @@ publication_add_schema(Oid pubid, Oid schemaid, bool if_not_exists)
* partitions.
*/
schemaRels = GetSchemaPublicationRelations(schemaid,
- PUBLICATION_PART_ALL);
+ PUBLICATION_PART_ALL,
+ false);
InvalidatePublicationRels(schemaRels);
return myself;
@@ -755,9 +756,12 @@ GetRelationPublications(Oid relid)
*
* This should only be used FOR TABLE publications, the FOR ALL TABLES
* should use GetAllTablesPublicationRelations().
+ * If include_ancestors is true include all the ancestors for the partitioned
+ * table which could insert data to this partitioned table.
*/
List *
-GetPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt)
+GetPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt,
+ bool include_ancestors)
{
List *result;
Relation pubrelsrel;
@@ -784,6 +788,11 @@ GetPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt)
pubrel = (Form_pg_publication_rel) GETSTRUCT(tup);
result = GetPubPartitionOptionRelations(result, pub_partopt,
pubrel->prrelid);
+
+ if (get_rel_relkind(pubrel->prrelid) == RELKIND_PARTITIONED_TABLE &&
+ include_ancestors)
+ result = list_concat_unique_oid(result,
+ get_partition_ancestors(pubrel->prrelid));
}
systable_endscan(scan);
@@ -838,10 +847,11 @@ GetAllTablesPublications(void)
*
* If the publication publishes partition changes via their respective root
* partitioned tables, we must exclude partitions in favor of including the
- * root partitioned tables.
+ * root partitioned tables. If allrelatedrels is true get all the publishable
+ * tables i.e. include both partition and partitioned tables also.
*/
List *
-GetAllTablesPublicationRelations(bool pubviaroot)
+GetAllTablesPublicationRelations(bool pubviaroot, bool allrelatedrels)
{
Relation classRel;
ScanKeyData key[1];
@@ -864,13 +874,13 @@ GetAllTablesPublicationRelations(bool pubviaroot)
Oid relid = relForm->oid;
if (is_publishable_class(relid, relForm) &&
- !(relForm->relispartition && pubviaroot))
+ (allrelatedrels || !(relForm->relispartition && pubviaroot)))
result = lappend_oid(result, relid);
}
table_endscan(scan);
- if (pubviaroot)
+ if (allrelatedrels || pubviaroot)
{
ScanKeyInit(&key[0],
Anum_pg_class_relkind,
@@ -964,9 +974,12 @@ GetSchemaPublications(Oid schemaid)
/*
* Get the list of publishable relation oids for a specified schema.
+ * If include_ancestors is true include all the ancestors for the partitioned
+ * table which could insert data to this partitioned table.
*/
List *
-GetSchemaPublicationRelations(Oid schemaid, PublicationPartOpt pub_partopt)
+GetSchemaPublicationRelations(Oid schemaid, PublicationPartOpt pub_partopt,
+ bool include_ancestors)
{
Relation classRel;
ScanKeyData key[1];
@@ -1009,6 +1022,11 @@ GetSchemaPublicationRelations(Oid schemaid, PublicationPartOpt pub_partopt)
partitionrels = GetPubPartitionOptionRelations(partitionrels,
pub_partopt,
relForm->oid);
+
+ if (include_ancestors)
+ partitionrels = list_concat_unique_oid(partitionrels,
+ get_partition_ancestors(relForm->oid));
+
result = list_concat_unique_oid(result, partitionrels);
}
}
@@ -1023,7 +1041,8 @@ GetSchemaPublicationRelations(Oid schemaid, PublicationPartOpt pub_partopt)
* publication.
*/
List *
-GetAllSchemaPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt)
+GetAllSchemaPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt,
+ bool include_ancestors)
{
List *result = NIL;
List *pubschemalist = GetPublicationSchemas(pubid);
@@ -1034,7 +1053,8 @@ GetAllSchemaPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt)
Oid schemaid = lfirst_oid(cell);
List *schemaRels = NIL;
- schemaRels = GetSchemaPublicationRelations(schemaid, pub_partopt);
+ schemaRels = GetSchemaPublicationRelations(schemaid, pub_partopt,
+ include_ancestors);
result = list_concat(result, schemaRels);
}
@@ -1090,15 +1110,15 @@ GetPublicationByName(const char *pubname, bool missing_ok)
/*
* Helper function for SQL callables: pg_get_publication_tables and
- * pg_get_publication_tables_with_partitions.
+ * pg_get_publication_all_related_tables.
*
- * If allparttables is true, retrieves tables including all the partitions
- * for the publication.
- * If allparttables is false, retrieves tables based on the publication's
+ * If allrelatedrels is true, retrieves tables including all the partitions and
+ * the ancestors for the publication.
+ * If allrelatedrels is false, retrieves tables based on the publication's
* pubviaroot option.
*/
static Datum
-pg_get_publication_tables_internal(FunctionCallInfo fcinfo, bool allparttables)
+pg_get_publication_tables_internal(FunctionCallInfo fcinfo, bool allrelatedrels)
{
#define NUM_PUBLICATION_TABLES_ELEM 4
FuncCallContext *funcctx;
@@ -1144,22 +1164,25 @@ pg_get_publication_tables_internal(FunctionCallInfo fcinfo, bool allparttables)
* those. Otherwise, get the partitioned table itself.
*/
if (pub_elem->alltables)
- pub_elem_tables = GetAllTablesPublicationRelations(pub_elem->pubviaroot);
+ pub_elem_tables = GetAllTablesPublicationRelations(pub_elem->pubviaroot,
+ allrelatedrels);
else
{
List *relids,
*schemarelids;
relids = GetPublicationRelations(pub_elem->oid,
- allparttables ? PUBLICATION_PART_ALL :
+ allrelatedrels ? PUBLICATION_PART_ALL :
pub_elem->pubviaroot ?
PUBLICATION_PART_ROOT :
- PUBLICATION_PART_LEAF);
+ PUBLICATION_PART_LEAF,
+ allrelatedrels);
schemarelids = GetAllSchemaPublicationRelations(pub_elem->oid,
- allparttables ? PUBLICATION_PART_ALL :
+ allrelatedrels ? PUBLICATION_PART_ALL :
pub_elem->pubviaroot ?
PUBLICATION_PART_ROOT :
- PUBLICATION_PART_LEAF);
+ PUBLICATION_PART_LEAF,
+ allrelatedrels);
pub_elem_tables = list_concat_unique_oid(relids, schemarelids);
}
@@ -1193,7 +1216,7 @@ pg_get_publication_tables_internal(FunctionCallInfo fcinfo, bool allparttables)
* data of the child table to be double-published on the subscriber
* side.
*/
- if (!allparttables && viaroot)
+ if (!allrelatedrels && viaroot)
filter_partitions(table_infos);
/* Construct a tuple descriptor for the result rows. */
@@ -1317,13 +1340,13 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
}
/*
- * Get information of the tables (including all the all partitions) in the
- * given publication array.
+ * Get information of the tables (including all the partitions and the
+ * ancestors) in the given publication array.
*
* Returns pubid, relid, column list, row filter for each table.
*/
Datum
-pg_get_publication_tables_with_partitions(PG_FUNCTION_ARGS)
+pg_get_publication_all_related_tables(PG_FUNCTION_ARGS)
{
return pg_get_publication_tables_internal(fcinfo, true);
}
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 35747b3df5..a3bec7023c 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -957,7 +957,7 @@ AlterPublicationOptions(ParseState *pstate, AlterPublicationStmt *stmt,
AccessShareLock);
root_relids = GetPublicationRelations(pubform->oid,
- PUBLICATION_PART_ROOT);
+ PUBLICATION_PART_ROOT, false);
foreach(lc, root_relids)
{
@@ -1077,7 +1077,7 @@ AlterPublicationOptions(ParseState *pstate, AlterPublicationStmt *stmt,
*/
if (root_relids == NIL)
relids = GetPublicationRelations(pubform->oid,
- PUBLICATION_PART_ALL);
+ PUBLICATION_PART_ALL, false);
else
{
/*
@@ -1091,7 +1091,8 @@ AlterPublicationOptions(ParseState *pstate, AlterPublicationStmt *stmt,
}
schemarelids = GetAllSchemaPublicationRelations(pubform->oid,
- PUBLICATION_PART_ALL);
+ PUBLICATION_PART_ALL,
+ false);
relids = list_concat_unique_oid(relids, schemarelids);
InvalidatePublicationRels(relids);
@@ -1163,7 +1164,8 @@ AlterPublicationTables(AlterPublicationStmt *stmt, HeapTuple tup,
else /* AP_SetObjects */
{
List *oldrelids = GetPublicationRelations(pubid,
- PUBLICATION_PART_ROOT);
+ PUBLICATION_PART_ROOT,
+ false);
List *delrels = NIL;
ListCell *oldlc;
@@ -1314,7 +1316,8 @@ AlterPublicationSchemas(AlterPublicationStmt *stmt,
ListCell *lc;
List *reloids;
- reloids = GetPublicationRelations(pubform->oid, PUBLICATION_PART_ROOT);
+ reloids = GetPublicationRelations(pubform->oid, PUBLICATION_PART_ROOT,
+ false);
foreach(lc, reloids)
{
@@ -1575,7 +1578,7 @@ RemovePublicationSchemaById(Oid psoid)
* partitions.
*/
schemaRels = GetSchemaPublicationRelations(pubsch->pnnspid,
- PUBLICATION_PART_ALL);
+ PUBLICATION_PART_ALL, false);
InvalidatePublicationRels(schemaRels);
CatalogTupleDelete(rel, &tup->t_self);
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 403b4fc918..843de0ccf6 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -2116,7 +2116,7 @@ check_publications_origin(WalReceiverConn *wrconn, List *publications,
appendStringInfoString(&cmd,
"SELECT DISTINCT P.pubname AS pubname\n"
"FROM pg_publication P,\n"
- " LATERAL pg_get_publication_tables_with_partitions(P.pubname) GPT\n"
+ " LATERAL pg_get_publication_all_related_tables(P.pubname) GPT\n"
" JOIN pg_subscription_rel PS ON (GPT.relid = PS.srrelid),\n"
" pg_class C JOIN pg_namespace N ON (N.oid = C.relnamespace)\n"
"WHERE C.oid = GPT.relid AND P.pubname IN (");
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 349c6330b2..083ceee4e5 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -12130,13 +12130,13 @@
prosrc => 'pg_get_publication_tables' },
{ oid => '8051',
descr => 'get information of the tables(including all partitions) that are part of the specified publications',
- proname => 'pg_get_publication_tables_with_partitions', prorows => '1000',
+ proname => 'pg_get_publication_all_related_tables', prorows => '1000',
provariadic => 'text', proretset => 't', provolatile => 's',
prorettype => 'record', proargtypes => '_text',
proallargtypes => '{_text,oid,oid,int2vector,pg_node_tree}',
proargmodes => '{v,o,o,o,o}',
proargnames => '{pubname,pubid,relid,attrs,qual}',
- prosrc => 'pg_get_publication_tables_with_partitions' },
+ prosrc => 'pg_get_publication_all_related_tables' },
{ oid => '6121',
descr => 'returns whether a relation can be part of a publication',
proname => 'pg_relation_is_publishable', provolatile => 's',
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 3c2ae2a960..3ce7e6e816 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -144,15 +144,18 @@ typedef enum PublicationPartOpt
PUBLICATION_PART_ALL,
} PublicationPartOpt;
-extern List *GetPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt);
+extern List *GetPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt,
+ bool include_ancestors);
extern List *GetAllTablesPublications(void);
-extern List *GetAllTablesPublicationRelations(bool pubviaroot);
+extern List *GetAllTablesPublicationRelations(bool pubviaroot, bool allrelatedrels);
extern List *GetPublicationSchemas(Oid pubid);
extern List *GetSchemaPublications(Oid schemaid);
extern List *GetSchemaPublicationRelations(Oid schemaid,
- PublicationPartOpt pub_partopt);
+ PublicationPartOpt pub_partopt,
+ bool include_ancestors);
extern List *GetAllSchemaPublicationRelations(Oid pubid,
- PublicationPartOpt pub_partopt);
+ PublicationPartOpt pub_partopt,
+ bool include_ancestors);
extern List *GetPubPartitionOptionRelations(List *result,
PublicationPartOpt pub_partopt,
Oid relid);
--
2.43.0
On Mon, 20 Jan 2025 at 17:31, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Sat, Jan 18, 2025 at 10:31 AM vignesh C <vignesh21@gmail.com> wrote:
Attached patch has the fix for this issue which includes the partition
tables also for the publication now and throws a warning
appropriately.The corresponding query (see "To find which tables might potentially
include non-local origins .." on [1]) on the create_subscription doc
page.
Modified this too
*
@@ -1147,10 +1151,12 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
*schemarelids;relids = GetPublicationRelations(pub_elem->oid,
+ allparttables ? PUBLICATION_PART_ALL :
pub_elem->pubviaroot ?
PUBLICATION_PART_ROOT :
PUBLICATION_PART_LEAF);
schemarelids = GetAllSchemaPublicationRelations(pub_elem->oid,
+ allparttables ? PUBLICATION_PART_ALL :
pub_elem->pubviaroot ?
PUBLICATION_PART_ROOT :
PUBLICATION_PART_LEAF);Don't we need to add similar handling FOR ALL TABLES case? If not, why?
Yes, it is required. Modified
BTW, the proposed fix is not backpatcheable as it changes the catalog
which requires catversion bump. However, as this is a WARNING case, if
we can't find a fix that can't be backpatched, we can fix it in
HEAD-only.
I could not find a way to fix the back version without changing the
catalog version.
The attached v3 version has the changes for the same.
Regards,
Vignesh
Attachments:
v3-0001-Fix-origin-warning-not-thrown-for-publications-on.patchtext/x-patch; charset=US-ASCII; name=v3-0001-Fix-origin-warning-not-thrown-for-publications-on.patchDownload
From 6ba6ff02154f72eae1d1fe52bb0e853311abffe0 Mon Sep 17 00:00:00 2001
From: Vignesh <vignesh21@gmail.com>
Date: Sat, 18 Jan 2025 10:19:12 +0530
Subject: [PATCH v3 1/2] Fix origin warning not thrown for publications on
partition tables
When checking if a publisher had subscribed to the same table from a different
publisher, the check only considered tables directly specified for the
publication. It did not account for cases where the publication was present on
partition tables as well. This has been fixed by including all partition tables
associated with the publication in the check.
---
doc/src/sgml/ref/create_subscription.sgml | 12 +++---
src/backend/catalog/pg_publication.c | 51 ++++++++++++++++++-----
src/backend/commands/subscriptioncmds.c | 2 +-
src/include/catalog/pg_proc.dat | 9 ++++
src/include/catalog/pg_publication.h | 2 +-
5 files changed, 58 insertions(+), 18 deletions(-)
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 6cf7d4f9a1..37b1b1cbff 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -533,14 +533,14 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
other subscriptions created on the publisher) try this SQL query:
<programlisting>
# substitute <pub-names> below with your publication name(s) to be queried
-SELECT DISTINCT PT.schemaname, PT.tablename
-FROM pg_publication_tables PT,
- pg_subscription_rel PS
+SELECT DISTINCT N.nspname, C.relname
+FROM pg_publication P,
+ LATERAL pg_get_publication_tables_with_partitions(P.pubname) GPT
+ JOIN pg_subscription_rel PS ON (GPT.relid = PS.srrelid)
JOIN pg_class C ON (C.oid = PS.srrelid)
JOIN pg_namespace N ON (N.oid = C.relnamespace)
-WHERE N.nspname = PT.schemaname AND
- C.relname = PT.tablename AND
- PT.pubname IN (<pub-names>);
+WHERE C.oid = GPT.relid AND
+ P.pubname IN (<pub-names>);
</programlisting></para>
</refsect1>
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index b89098f5e9..77fcf14164 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -838,10 +838,11 @@ GetAllTablesPublications(void)
*
* If the publication publishes partition changes via their respective root
* partitioned tables, we must exclude partitions in favor of including the
- * root partitioned tables.
+ * root partitioned tables. If allrelatedrels is true get all the publishable
+ * tables i.e. include both partition and partitioned tables also.
*/
List *
-GetAllTablesPublicationRelations(bool pubviaroot)
+GetAllTablesPublicationRelations(bool pubviaroot, bool allrelatedrels)
{
Relation classRel;
ScanKeyData key[1];
@@ -864,13 +865,13 @@ GetAllTablesPublicationRelations(bool pubviaroot)
Oid relid = relForm->oid;
if (is_publishable_class(relid, relForm) &&
- !(relForm->relispartition && pubviaroot))
+ (allrelatedrels || !(relForm->relispartition && pubviaroot)))
result = lappend_oid(result, relid);
}
table_endscan(scan);
- if (pubviaroot)
+ if (allrelatedrels || pubviaroot)
{
ScanKeyInit(&key[0],
Anum_pg_class_relkind,
@@ -1089,12 +1090,16 @@ GetPublicationByName(const char *pubname, bool missing_ok)
}
/*
- * Get information of the tables in the given publication array.
+ * Helper function for SQL callables: pg_get_publication_tables and
+ * pg_get_publication_tables_with_partitions.
*
- * Returns pubid, relid, column list, row filter for each table.
+ * If allparttables is true, retrieves tables including all the partitions
+ * for the publication.
+ * If allparttables is false, retrieves tables based on the publication's
+ * pubviaroot option.
*/
-Datum
-pg_get_publication_tables(PG_FUNCTION_ARGS)
+static Datum
+pg_get_publication_tables_internal(FunctionCallInfo fcinfo, bool allparttables)
{
#define NUM_PUBLICATION_TABLES_ELEM 4
FuncCallContext *funcctx;
@@ -1140,17 +1145,20 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
* those. Otherwise, get the partitioned table itself.
*/
if (pub_elem->alltables)
- pub_elem_tables = GetAllTablesPublicationRelations(pub_elem->pubviaroot);
+ pub_elem_tables = GetAllTablesPublicationRelations(pub_elem->pubviaroot,
+ allparttables);
else
{
List *relids,
*schemarelids;
relids = GetPublicationRelations(pub_elem->oid,
+ allparttables ? PUBLICATION_PART_ALL :
pub_elem->pubviaroot ?
PUBLICATION_PART_ROOT :
PUBLICATION_PART_LEAF);
schemarelids = GetAllSchemaPublicationRelations(pub_elem->oid,
+ allparttables ? PUBLICATION_PART_ALL :
pub_elem->pubviaroot ?
PUBLICATION_PART_ROOT :
PUBLICATION_PART_LEAF);
@@ -1187,7 +1195,7 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
* data of the child table to be double-published on the subscriber
* side.
*/
- if (viaroot)
+ if (!allparttables && viaroot)
filter_partitions(table_infos);
/* Construct a tuple descriptor for the result rows. */
@@ -1298,3 +1306,26 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
SRF_RETURN_DONE(funcctx);
}
+
+/*
+ * Get information of the tables in the given publication array.
+ *
+ * Returns pubid, relid, column list, row filter for each table.
+ */
+Datum
+pg_get_publication_tables(PG_FUNCTION_ARGS)
+{
+ return pg_get_publication_tables_internal(fcinfo, false);
+}
+
+/*
+ * Get information of the tables (including all the all partitions) in the
+ * given publication array.
+ *
+ * Returns pubid, relid, column list, row filter for each table.
+ */
+Datum
+pg_get_publication_tables_with_partitions(PG_FUNCTION_ARGS)
+{
+ return pg_get_publication_tables_internal(fcinfo, true);
+}
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 2d8a71ca1e..403b4fc918 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -2116,7 +2116,7 @@ check_publications_origin(WalReceiverConn *wrconn, List *publications,
appendStringInfoString(&cmd,
"SELECT DISTINCT P.pubname AS pubname\n"
"FROM pg_publication P,\n"
- " LATERAL pg_get_publication_tables(P.pubname) GPT\n"
+ " LATERAL pg_get_publication_tables_with_partitions(P.pubname) GPT\n"
" JOIN pg_subscription_rel PS ON (GPT.relid = PS.srrelid),\n"
" pg_class C JOIN pg_namespace N ON (N.oid = C.relnamespace)\n"
"WHERE C.oid = GPT.relid AND P.pubname IN (");
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 18560755d2..8866d0e253 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -12128,6 +12128,15 @@
proargmodes => '{v,o,o,o,o}',
proargnames => '{pubname,pubid,relid,attrs,qual}',
prosrc => 'pg_get_publication_tables' },
+{ oid => '8051',
+ descr => 'get information of the tables(including all partitions) that are part of the specified publications',
+ proname => 'pg_get_publication_tables_with_partitions', prorows => '1000',
+ provariadic => 'text', proretset => 't', provolatile => 's',
+ prorettype => 'record', proargtypes => '_text',
+ proallargtypes => '{_text,oid,oid,int2vector,pg_node_tree}',
+ proargmodes => '{v,o,o,o,o}',
+ proargnames => '{pubname,pubid,relid,attrs,qual}',
+ prosrc => 'pg_get_publication_tables_with_partitions' },
{ oid => '6121',
descr => 'returns whether a relation can be part of a publication',
proname => 'pg_relation_is_publishable', provolatile => 's',
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index 3c2ae2a960..da34daf791 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -146,7 +146,7 @@ typedef enum PublicationPartOpt
extern List *GetPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt);
extern List *GetAllTablesPublications(void);
-extern List *GetAllTablesPublicationRelations(bool pubviaroot);
+extern List *GetAllTablesPublicationRelations(bool pubviaroot, bool allrelatedrels);
extern List *GetPublicationSchemas(Oid pubid);
extern List *GetSchemaPublications(Oid schemaid);
extern List *GetSchemaPublicationRelations(Oid schemaid,
--
2.43.0
v3-0002-Fix-origin-warning-not-thrown-for-publications-on.patchtext/x-patch; charset=US-ASCII; name=v3-0002-Fix-origin-warning-not-thrown-for-publications-on.patchDownload
From 0262ec63bc71e3ade973cc58bc229edbc9766e7f Mon Sep 17 00:00:00 2001
From: Vignesh <vignesh21@gmail.com>
Date: Mon, 20 Jan 2025 21:52:24 +0530
Subject: [PATCH v3 2/2] Fix origin warning not thrown for publications on
partition tables
When checking if a publisher had subscribed to the same table from a different
publisher, the check only considered tables directly specified for the
publication. It did not account for cases where the publication was present on
ancestor tables as well. This has been fixed by including all the
ancestor tables associated with the publication in the check.
---
doc/src/sgml/ref/create_subscription.sgml | 2 +-
src/backend/catalog/pg_publication.c | 59 +++++++++++++++--------
src/backend/commands/publicationcmds.c | 15 +++---
src/backend/commands/subscriptioncmds.c | 2 +-
src/include/catalog/pg_proc.dat | 4 +-
src/include/catalog/pg_publication.h | 9 ++--
6 files changed, 59 insertions(+), 32 deletions(-)
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 37b1b1cbff..e9cbded2ec 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -535,7 +535,7 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
# substitute <pub-names> below with your publication name(s) to be queried
SELECT DISTINCT N.nspname, C.relname
FROM pg_publication P,
- LATERAL pg_get_publication_tables_with_partitions(P.pubname) GPT
+ LATERAL pg_get_publication_all_related_tables(P.pubname) GPT
JOIN pg_subscription_rel PS ON (GPT.relid = PS.srrelid)
JOIN pg_class C ON (C.oid = PS.srrelid)
JOIN pg_namespace N ON (N.oid = C.relnamespace)
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 77fcf14164..a91400ff6c 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -720,7 +720,8 @@ publication_add_schema(Oid pubid, Oid schemaid, bool if_not_exists)
* partitions.
*/
schemaRels = GetSchemaPublicationRelations(schemaid,
- PUBLICATION_PART_ALL);
+ PUBLICATION_PART_ALL,
+ false);
InvalidatePublicationRels(schemaRels);
return myself;
@@ -755,9 +756,12 @@ GetRelationPublications(Oid relid)
*
* This should only be used FOR TABLE publications, the FOR ALL TABLES
* should use GetAllTablesPublicationRelations().
+ * If include_ancestors is true include all the ancestors for the partitioned
+ * table which could insert data to this partitioned table.
*/
List *
-GetPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt)
+GetPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt,
+ bool include_ancestors)
{
List *result;
Relation pubrelsrel;
@@ -784,6 +788,11 @@ GetPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt)
pubrel = (Form_pg_publication_rel) GETSTRUCT(tup);
result = GetPubPartitionOptionRelations(result, pub_partopt,
pubrel->prrelid);
+
+ if (get_rel_relkind(pubrel->prrelid) == RELKIND_PARTITIONED_TABLE &&
+ include_ancestors)
+ result = list_concat_unique_oid(result,
+ get_partition_ancestors(pubrel->prrelid));
}
systable_endscan(scan);
@@ -965,9 +974,12 @@ GetSchemaPublications(Oid schemaid)
/*
* Get the list of publishable relation oids for a specified schema.
+ * If include_ancestors is true include all the ancestors for the partitioned
+ * table which could insert data to this partitioned table.
*/
List *
-GetSchemaPublicationRelations(Oid schemaid, PublicationPartOpt pub_partopt)
+GetSchemaPublicationRelations(Oid schemaid, PublicationPartOpt pub_partopt,
+ bool include_ancestors)
{
Relation classRel;
ScanKeyData key[1];
@@ -1010,6 +1022,11 @@ GetSchemaPublicationRelations(Oid schemaid, PublicationPartOpt pub_partopt)
partitionrels = GetPubPartitionOptionRelations(partitionrels,
pub_partopt,
relForm->oid);
+
+ if (include_ancestors)
+ partitionrels = list_concat_unique_oid(partitionrels,
+ get_partition_ancestors(relForm->oid));
+
result = list_concat_unique_oid(result, partitionrels);
}
}
@@ -1024,7 +1041,8 @@ GetSchemaPublicationRelations(Oid schemaid, PublicationPartOpt pub_partopt)
* publication.
*/
List *
-GetAllSchemaPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt)
+GetAllSchemaPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt,
+ bool include_ancestors)
{
List *result = NIL;
List *pubschemalist = GetPublicationSchemas(pubid);
@@ -1035,7 +1053,8 @@ GetAllSchemaPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt)
Oid schemaid = lfirst_oid(cell);
List *schemaRels = NIL;
- schemaRels = GetSchemaPublicationRelations(schemaid, pub_partopt);
+ schemaRels = GetSchemaPublicationRelations(schemaid, pub_partopt,
+ include_ancestors);
result = list_concat(result, schemaRels);
}
@@ -1091,15 +1110,15 @@ GetPublicationByName(const char *pubname, bool missing_ok)
/*
* Helper function for SQL callables: pg_get_publication_tables and
- * pg_get_publication_tables_with_partitions.
+ * pg_get_publication_all_related_tables.
*
- * If allparttables is true, retrieves tables including all the partitions
- * for the publication.
- * If allparttables is false, retrieves tables based on the publication's
+ * If allrelatedrels is true, retrieves tables including all the partitions and
+ * the ancestors for the publication.
+ * If allrelatedrels is false, retrieves tables based on the publication's
* pubviaroot option.
*/
static Datum
-pg_get_publication_tables_internal(FunctionCallInfo fcinfo, bool allparttables)
+pg_get_publication_tables_internal(FunctionCallInfo fcinfo, bool allrelatedrels)
{
#define NUM_PUBLICATION_TABLES_ELEM 4
FuncCallContext *funcctx;
@@ -1146,22 +1165,24 @@ pg_get_publication_tables_internal(FunctionCallInfo fcinfo, bool allparttables)
*/
if (pub_elem->alltables)
pub_elem_tables = GetAllTablesPublicationRelations(pub_elem->pubviaroot,
- allparttables);
+ allrelatedrels);
else
{
List *relids,
*schemarelids;
relids = GetPublicationRelations(pub_elem->oid,
- allparttables ? PUBLICATION_PART_ALL :
+ allrelatedrels ? PUBLICATION_PART_ALL :
pub_elem->pubviaroot ?
PUBLICATION_PART_ROOT :
- PUBLICATION_PART_LEAF);
+ PUBLICATION_PART_LEAF,
+ allrelatedrels);
schemarelids = GetAllSchemaPublicationRelations(pub_elem->oid,
- allparttables ? PUBLICATION_PART_ALL :
+ allrelatedrels ? PUBLICATION_PART_ALL :
pub_elem->pubviaroot ?
PUBLICATION_PART_ROOT :
- PUBLICATION_PART_LEAF);
+ PUBLICATION_PART_LEAF,
+ allrelatedrels);
pub_elem_tables = list_concat_unique_oid(relids, schemarelids);
}
@@ -1195,7 +1216,7 @@ pg_get_publication_tables_internal(FunctionCallInfo fcinfo, bool allparttables)
* data of the child table to be double-published on the subscriber
* side.
*/
- if (!allparttables && viaroot)
+ if (!allrelatedrels && viaroot)
filter_partitions(table_infos);
/* Construct a tuple descriptor for the result rows. */
@@ -1319,13 +1340,13 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
}
/*
- * Get information of the tables (including all the all partitions) in the
- * given publication array.
+ * Get information of the tables (including all the partitions and the
+ * ancestors) in the given publication array.
*
* Returns pubid, relid, column list, row filter for each table.
*/
Datum
-pg_get_publication_tables_with_partitions(PG_FUNCTION_ARGS)
+pg_get_publication_all_related_tables(PG_FUNCTION_ARGS)
{
return pg_get_publication_tables_internal(fcinfo, true);
}
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 35747b3df5..a3bec7023c 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -957,7 +957,7 @@ AlterPublicationOptions(ParseState *pstate, AlterPublicationStmt *stmt,
AccessShareLock);
root_relids = GetPublicationRelations(pubform->oid,
- PUBLICATION_PART_ROOT);
+ PUBLICATION_PART_ROOT, false);
foreach(lc, root_relids)
{
@@ -1077,7 +1077,7 @@ AlterPublicationOptions(ParseState *pstate, AlterPublicationStmt *stmt,
*/
if (root_relids == NIL)
relids = GetPublicationRelations(pubform->oid,
- PUBLICATION_PART_ALL);
+ PUBLICATION_PART_ALL, false);
else
{
/*
@@ -1091,7 +1091,8 @@ AlterPublicationOptions(ParseState *pstate, AlterPublicationStmt *stmt,
}
schemarelids = GetAllSchemaPublicationRelations(pubform->oid,
- PUBLICATION_PART_ALL);
+ PUBLICATION_PART_ALL,
+ false);
relids = list_concat_unique_oid(relids, schemarelids);
InvalidatePublicationRels(relids);
@@ -1163,7 +1164,8 @@ AlterPublicationTables(AlterPublicationStmt *stmt, HeapTuple tup,
else /* AP_SetObjects */
{
List *oldrelids = GetPublicationRelations(pubid,
- PUBLICATION_PART_ROOT);
+ PUBLICATION_PART_ROOT,
+ false);
List *delrels = NIL;
ListCell *oldlc;
@@ -1314,7 +1316,8 @@ AlterPublicationSchemas(AlterPublicationStmt *stmt,
ListCell *lc;
List *reloids;
- reloids = GetPublicationRelations(pubform->oid, PUBLICATION_PART_ROOT);
+ reloids = GetPublicationRelations(pubform->oid, PUBLICATION_PART_ROOT,
+ false);
foreach(lc, reloids)
{
@@ -1575,7 +1578,7 @@ RemovePublicationSchemaById(Oid psoid)
* partitions.
*/
schemaRels = GetSchemaPublicationRelations(pubsch->pnnspid,
- PUBLICATION_PART_ALL);
+ PUBLICATION_PART_ALL, false);
InvalidatePublicationRels(schemaRels);
CatalogTupleDelete(rel, &tup->t_self);
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 403b4fc918..843de0ccf6 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -2116,7 +2116,7 @@ check_publications_origin(WalReceiverConn *wrconn, List *publications,
appendStringInfoString(&cmd,
"SELECT DISTINCT P.pubname AS pubname\n"
"FROM pg_publication P,\n"
- " LATERAL pg_get_publication_tables_with_partitions(P.pubname) GPT\n"
+ " LATERAL pg_get_publication_all_related_tables(P.pubname) GPT\n"
" JOIN pg_subscription_rel PS ON (GPT.relid = PS.srrelid),\n"
" pg_class C JOIN pg_namespace N ON (N.oid = C.relnamespace)\n"
"WHERE C.oid = GPT.relid AND P.pubname IN (");
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 8866d0e253..083ceee4e5 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -12130,13 +12130,13 @@
prosrc => 'pg_get_publication_tables' },
{ oid => '8051',
descr => 'get information of the tables(including all partitions) that are part of the specified publications',
- proname => 'pg_get_publication_tables_with_partitions', prorows => '1000',
+ proname => 'pg_get_publication_all_related_tables', prorows => '1000',
provariadic => 'text', proretset => 't', provolatile => 's',
prorettype => 'record', proargtypes => '_text',
proallargtypes => '{_text,oid,oid,int2vector,pg_node_tree}',
proargmodes => '{v,o,o,o,o}',
proargnames => '{pubname,pubid,relid,attrs,qual}',
- prosrc => 'pg_get_publication_tables_with_partitions' },
+ prosrc => 'pg_get_publication_all_related_tables' },
{ oid => '6121',
descr => 'returns whether a relation can be part of a publication',
proname => 'pg_relation_is_publishable', provolatile => 's',
diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h
index da34daf791..3ce7e6e816 100644
--- a/src/include/catalog/pg_publication.h
+++ b/src/include/catalog/pg_publication.h
@@ -144,15 +144,18 @@ typedef enum PublicationPartOpt
PUBLICATION_PART_ALL,
} PublicationPartOpt;
-extern List *GetPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt);
+extern List *GetPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt,
+ bool include_ancestors);
extern List *GetAllTablesPublications(void);
extern List *GetAllTablesPublicationRelations(bool pubviaroot, bool allrelatedrels);
extern List *GetPublicationSchemas(Oid pubid);
extern List *GetSchemaPublications(Oid schemaid);
extern List *GetSchemaPublicationRelations(Oid schemaid,
- PublicationPartOpt pub_partopt);
+ PublicationPartOpt pub_partopt,
+ bool include_ancestors);
extern List *GetAllSchemaPublicationRelations(Oid pubid,
- PublicationPartOpt pub_partopt);
+ PublicationPartOpt pub_partopt,
+ bool include_ancestors);
extern List *GetPubPartitionOptionRelations(List *result,
PublicationPartOpt pub_partopt,
Oid relid);
--
2.43.0
On Tuesday, January 21, 2025 1:31 AM vignesh C <vignesh21@gmail.com> wrote:
Hi,
On Mon, 20 Jan 2025 at 17:31, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Sat, Jan 18, 2025 at 10:31 AM vignesh C <vignesh21@gmail.com> wrote:
Attached patch has the fix for this issue which includes the
partition tables also for the publication now and throws a warning
appropriately.The corresponding query (see "To find which tables might potentially
include non-local origins .." on [1]) on the create_subscription doc
page.BTW, the proposed fix is not backpatcheable as it changes the catalog
which requires catversion bump. However, as this is a WARNING case, if
we can't find a fix that can't be backpatched, we can fix it in
HEAD-only.I could not find a way to fix the back version without changing the catalog
version.The attached v3 version has the changes for the same.
Thanks for the patch.
I agree that covering the partitioned table case when checking the non-local
origin data on publisher is an improvement. But I think adding or extending the
SQL functions may not be the appropriate way to fix because the new functions
cannot be used in older PG version and is also not backpatchable.
I am thinking it would be better to use the existing pg_partition_ancestors()
and pg_partition_tree() to verify the same, which can be used in all supported
PG versions and is also backpatchable.
And here is another version which fixed the issue like that. I have not added
tests for it, but I think it's doable to write the something like the testcases
provided by Sergey. This patch does not fix the foreign tabel as that seems to
be a separate issue which can be fixed independtly.
Hi Sergey, if you have the time, could you please verify whether this patch
resolves the partition issue you reported? I've confirmed that it passes the
partitioned tests in the scripts, but I would appreciate your confirmation for
the same.
Best Regards,
Hou zj
Attachments:
v4-0001-Improve-logging-for-data-origin-discrepancies-in-.patchapplication/octet-stream; name=v4-0001-Improve-logging-for-data-origin-discrepancies-in-.patchDownload
From 6c8bfd258bddda371437c52e7f7d9dd9d4b67a0b Mon Sep 17 00:00:00 2001
From: Hou Zhijie <houzj.fnst@cn.fujitsu.com>
Date: Tue, 21 Jan 2025 16:18:34 +0800
Subject: [PATCH v4] Improve logging for data origin discrepancies in table
synchronization
Previously, a WARNING was issued during initial table synchronization with
origin=NONE only when the publisher subscribed to the same table from
other publishers, indicating potential data origination from different origins.
However, it's possible for the publisher to subscribe to the partition
ancestors or partition children of the table from other publishers, which could
also result in mixed-origin data inclusion.
This patch expands the check to consider both the subscribed table's ancestors
and children. A WARNING will now be logged if any of these related tables could
contain data originating from different sources.
---
doc/src/sgml/ref/create_subscription.sgml | 14 ++++++++------
src/backend/commands/subscriptioncmds.c | 15 +++++++++------
2 files changed, 17 insertions(+), 12 deletions(-)
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index c6d87255a39..2a348967dad 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -603,13 +603,15 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<programlisting>
# substitute <pub-names> below with your publication name(s) to be queried
SELECT DISTINCT PT.schemaname, PT.tablename
-FROM pg_publication_tables PT,
+FROM pg_publication_tables PT
+ JOIN pg_class C ON (C.relname = PT.tablename)
+ JOIN pg_namespace N ON (N.nspname = PT.schemaname),
pg_subscription_rel PS
- JOIN pg_class C ON (C.oid = PS.srrelid)
- JOIN pg_namespace N ON (N.oid = C.relnamespace)
-WHERE N.nspname = PT.schemaname AND
- C.relname = PT.tablename AND
- PT.pubname IN (<pub-names>);
+WHERE C.relnamespace = N.oid AND
+ PT.pubname IN (<pub-names>) AND
+ (PS.srrelid = C.oid OR
+ C.oid IN (SELECT relid FROM pg_partition_ancestors(PS.srrelid) UNION
+ SELECT relid FROM pg_partition_tree(PS.srrelid)));
</programlisting></para>
</refsect1>
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 13199740320..a87a9f5bfb7 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -2181,11 +2181,12 @@ AlterSubscriptionOwner_oid(Oid subid, Oid newOwnerId)
}
/*
- * Check and log a warning if the publisher has subscribed to the same table
- * from some other publisher. This check is required only if "copy_data = true"
- * and "origin = none" for CREATE SUBSCRIPTION and
- * ALTER SUBSCRIPTION ... REFRESH statements to notify the user that data
- * having origin might have been copied.
+ * Check and log a warning if the publisher has subscribed to the same table,
+ * its partition ancestors (if it's a partition), or its partition children (if
+ * it's a partitioned table), from some other publishers. This check is
+ * required only if "copy_data = true" and "origin = none" for CREATE
+ * SUBSCRIPTION and ALTER SUBSCRIPTION ... REFRESH statements to notify the
+ * user that data having origin might have been copied.
*
* This check need not be performed on the tables that are already added
* because incremental sync for those tables will happen through WAL and the
@@ -2215,7 +2216,9 @@ check_publications_origin(WalReceiverConn *wrconn, List *publications,
"SELECT DISTINCT P.pubname AS pubname\n"
"FROM pg_publication P,\n"
" LATERAL pg_get_publication_tables(P.pubname) GPT\n"
- " JOIN pg_subscription_rel PS ON (GPT.relid = PS.srrelid),\n"
+ " JOIN pg_subscription_rel PS ON (GPT.relid = PS.srrelid OR"
+ " GPT.relid IN (SELECT relid FROM pg_partition_ancestors(PS.srrelid) UNION"
+ " SELECT relid FROM pg_partition_tree(PS.srrelid))),\n"
" pg_class C JOIN pg_namespace N ON (N.oid = C.relnamespace)\n"
"WHERE C.oid = GPT.relid AND P.pubname IN (");
GetPublicationsStr(publications, &cmd, true);
--
2.30.0.windows.2
On Wed, 22 Jan 2025 at 09:00, Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:
On Tuesday, January 21, 2025 1:31 AM vignesh C <vignesh21@gmail.com> wrote:
Hi,
On Mon, 20 Jan 2025 at 17:31, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Sat, Jan 18, 2025 at 10:31 AM vignesh C <vignesh21@gmail.com> wrote:
Attached patch has the fix for this issue which includes the
partition tables also for the publication now and throws a warning
appropriately.The corresponding query (see "To find which tables might potentially
include non-local origins .." on [1]) on the create_subscription doc
page.BTW, the proposed fix is not backpatcheable as it changes the catalog
which requires catversion bump. However, as this is a WARNING case, if
we can't find a fix that can't be backpatched, we can fix it in
HEAD-only.I could not find a way to fix the back version without changing the catalog
version.The attached v3 version has the changes for the same.
Thanks for the patch.
I agree that covering the partitioned table case when checking the non-local
origin data on publisher is an improvement. But I think adding or extending the
SQL functions may not be the appropriate way to fix because the new functions
cannot be used in older PG version and is also not backpatchable.I am thinking it would be better to use the existing pg_partition_ancestors()
and pg_partition_tree() to verify the same, which can be used in all supported
PG versions and is also backpatchable.And here is another version which fixed the issue like that. I have not added
tests for it, but I think it's doable to write the something like the testcases
provided by Sergey. This patch does not fix the foreign tabel as that seems to
be a separate issue which can be fixed independtly.Hi Sergey, if you have the time, could you please verify whether this patch
resolves the partition issue you reported? I've confirmed that it passes the
partitioned tests in the scripts, but I would appreciate your confirmation for
the same.
Hi Hou-san,
I have created a patch to add a test for the patch.
v5-0001 : same as v4-0001
v5-0002: adds the testcase
Thanks and Regards,
Shlok Kyal
Attachments:
v5-0001-Improve-logging-for-data-origin-discrepancies-in-.patchapplication/octet-stream; name=v5-0001-Improve-logging-for-data-origin-discrepancies-in-.patchDownload
From f95ddc20756f93d4a9bec1fa1edb9da77cd26a6b Mon Sep 17 00:00:00 2001
From: Hou Zhijie <houzj.fnst@cn.fujitsu.com>
Date: Tue, 21 Jan 2025 16:18:34 +0800
Subject: [PATCH v5 1/2] Improve logging for data origin discrepancies in table
synchronization
Previously, a WARNING was issued during initial table synchronization with
origin=NONE only when the publisher subscribed to the same table from
other publishers, indicating potential data origination from different origins.
However, it's possible for the publisher to subscribe to the partition
ancestors or partition children of the table from other publishers, which could
also result in mixed-origin data inclusion.
This patch expands the check to consider both the subscribed table's ancestors
and children. A WARNING will now be logged if any of these related tables could
contain data originating from different sources.
---
doc/src/sgml/ref/create_subscription.sgml | 14 ++++++++------
src/backend/commands/subscriptioncmds.c | 15 +++++++++------
2 files changed, 17 insertions(+), 12 deletions(-)
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 6cf7d4f9a1..8bee9f88a4 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -534,13 +534,15 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<programlisting>
# substitute <pub-names> below with your publication name(s) to be queried
SELECT DISTINCT PT.schemaname, PT.tablename
-FROM pg_publication_tables PT,
+FROM pg_publication_tables PT
+ JOIN pg_class C ON (C.relname = PT.tablename)
+ JOIN pg_namespace N ON (N.nspname = PT.schemaname),
pg_subscription_rel PS
- JOIN pg_class C ON (C.oid = PS.srrelid)
- JOIN pg_namespace N ON (N.oid = C.relnamespace)
-WHERE N.nspname = PT.schemaname AND
- C.relname = PT.tablename AND
- PT.pubname IN (<pub-names>);
+WHERE C.relnamespace = N.oid AND
+ PT.pubname IN (<pub-names>) AND
+ (PS.srrelid = C.oid OR
+ C.oid IN (SELECT relid FROM pg_partition_ancestors(PS.srrelid) UNION
+ SELECT relid FROM pg_partition_tree(PS.srrelid)));
</programlisting></para>
</refsect1>
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 2d8a71ca1e..4aec73bcc6 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -2083,11 +2083,12 @@ AlterSubscriptionOwner_oid(Oid subid, Oid newOwnerId)
}
/*
- * Check and log a warning if the publisher has subscribed to the same table
- * from some other publisher. This check is required only if "copy_data = true"
- * and "origin = none" for CREATE SUBSCRIPTION and
- * ALTER SUBSCRIPTION ... REFRESH statements to notify the user that data
- * having origin might have been copied.
+ * Check and log a warning if the publisher has subscribed to the same table,
+ * its partition ancestors (if it's a partition), or its partition children (if
+ * it's a partitioned table), from some other publishers. This check is
+ * required only if "copy_data = true" and "origin = none" for CREATE
+ * SUBSCRIPTION and ALTER SUBSCRIPTION ... REFRESH statements to notify the
+ * user that data having origin might have been copied.
*
* This check need not be performed on the tables that are already added
* because incremental sync for those tables will happen through WAL and the
@@ -2117,7 +2118,9 @@ check_publications_origin(WalReceiverConn *wrconn, List *publications,
"SELECT DISTINCT P.pubname AS pubname\n"
"FROM pg_publication P,\n"
" LATERAL pg_get_publication_tables(P.pubname) GPT\n"
- " JOIN pg_subscription_rel PS ON (GPT.relid = PS.srrelid),\n"
+ " JOIN pg_subscription_rel PS ON (GPT.relid = PS.srrelid OR"
+ " GPT.relid IN (SELECT relid FROM pg_partition_ancestors(PS.srrelid) UNION"
+ " SELECT relid FROM pg_partition_tree(PS.srrelid))),\n"
" pg_class C JOIN pg_namespace N ON (N.oid = C.relnamespace)\n"
"WHERE C.oid = GPT.relid AND P.pubname IN (");
GetPublicationsStr(publications, &cmd, true);
--
2.34.1
v5-0002-Test-for-Improve-logging-for-data-origin-discrepa.patchapplication/octet-stream; name=v5-0002-Test-for-Improve-logging-for-data-origin-discrepa.patchDownload
From 9fd97e8d843d6b818f734e9c5021478b944d01f0 Mon Sep 17 00:00:00 2001
From: Shlok Kyal <shlok.kyal.oss@gmail.com>
Date: Wed, 22 Jan 2025 20:59:17 +0530
Subject: [PATCH v5 2/2] Test for Improve logging for data origin discrepancies
in table synchronization
---
src/test/subscription/t/030_origin.pl | 69 +++++++++++++++++++++++++++
1 file changed, 69 insertions(+)
diff --git a/src/test/subscription/t/030_origin.pl b/src/test/subscription/t/030_origin.pl
index eb1dcb7271..0ffec5a181 100644
--- a/src/test/subscription/t/030_origin.pl
+++ b/src/test/subscription/t/030_origin.pl
@@ -255,6 +255,75 @@ $node_A->safe_psql('postgres', "DROP TABLE tab_new");
$node_B->safe_psql('postgres', "DROP TABLE tab_new");
$node_A->safe_psql('postgres', "DROP SUBSCRIPTION $subname_AB2");
+###############################################################################
+# Specifying origin = NONE and copy_data = on must raise WARNING if we subscribe
+# to a partitioned table and this table contains any remotely originated data.
+###############################################################################
+
+# create a partition table on node A
+$node_A->safe_psql(
+ 'postgres', qq(
+CREATE SCHEMA ts;
+CREATE TABLE ts.t(id int) PARTITION BY RANGE(id);
+CREATE TABLE ts.part1 PARTITION OF ts.t FOR VALUES FROM (0) TO (5);
+CREATE TABLE ts.part2 PARTITION OF ts.t FOR VALUES FROM (5) TO (10);
+));
+
+# create a table on node B which will act as a source for a partition on node A
+$node_B->safe_psql(
+ 'postgres', qq(
+CREATE SCHEMA ts;
+CREATE TABLE ts.part2(id int);
+CREATE PUBLICATION pub_b_a FOR TABLE ts.part2;
+));
+$node_A->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub_a_b CONNECTION '$node_B_connstr' PUBLICATION pub_b_a;"
+);
+
+# create a table on node C
+$node_C->safe_psql(
+ 'postgres', "
+CREATE SCHEMA ts;
+CREATE TABLE ts.t(id int);");
+
+# create a logical replication setup between node A and node C with
+# subscription on node C having origin = NONE and copy_data = on
+$node_A->safe_psql(
+ 'postgres', "
+CREATE PUBLICATION pub_a_c FOR TABLE ts.t WITH (publish_via_partition_root);
+");
+
+($result, $stdout, $stderr) = $node_C->psql(
+ 'postgres', "
+ CREATE SUBSCRIPTION sub_a_c CONNECTION '$node_A_connstr' PUBLICATION pub_a_c WITH (origin = none, copy_data = on);
+");
+
+# A warning must be logged as a partition of table in node A might have
+# remotely originated data
+like(
+ $stderr,
+ qr/WARNING: ( [A-Z0-9]+:)? subscription "sub_a_c" requested copy_data with origin = NONE but might copy data that had a different origin/,
+ "Create subscription with origin = none and copy_data when the publisher has subscribed a partitioned table"
+);
+
+# clear the operations done by this test
+$node_C->safe_psql(
+ 'postgres', qq(
+DROP SUBSCRIPTION sub_a_c;
+DROP SCHEMA ts CASCADE;
+));
+$node_A->safe_psql(
+ 'postgres', qq(
+DROP SUBSCRIPTION sub_a_b;
+DROP PUBLICATION pub_a_c;
+DROP SCHEMA ts CASCADE;
+));
+$node_B->safe_psql(
+ 'postgres', qq(
+DROP PUBLICATION pub_b_a;
+DROP SCHEMA ts CASCADE;
+));
+
# shutdown
$node_B->stop('fast');
$node_A->stop('fast');
--
2.34.1
22.01.2025 18:41, Shlok Kyal пишет:
On Wed, 22 Jan 2025 at 09:00, Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:On Tuesday, January 21, 2025 1:31 AM vignesh C <vignesh21@gmail.com> wrote:
Hi,
On Mon, 20 Jan 2025 at 17:31, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Sat, Jan 18, 2025 at 10:31 AM vignesh C <vignesh21@gmail.com> wrote:
Attached patch has the fix for this issue which includes the
partition tables also for the publication now and throws a warning
appropriately.The corresponding query (see "To find which tables might potentially
include non-local origins .." on [1]) on the create_subscription doc
page.
BTW, the proposed fix is not backpatcheable as it changes the catalog
which requires catversion bump. However, as this is a WARNING case, if
we can't find a fix that can't be backpatched, we can fix it in
HEAD-only.I could not find a way to fix the back version without changing the catalog
version.The attached v3 version has the changes for the same.
Thanks for the patch.
I agree that covering the partitioned table case when checking the non-local
origin data on publisher is an improvement. But I think adding or extending the
SQL functions may not be the appropriate way to fix because the new functions
cannot be used in older PG version and is also not backpatchable.I am thinking it would be better to use the existing pg_partition_ancestors()
and pg_partition_tree() to verify the same, which can be used in all supported
PG versions and is also backpatchable.And here is another version which fixed the issue like that. I have not added
tests for it, but I think it's doable to write the something like the testcases
provided by Sergey. This patch does not fix the foreign tabel as that seems to
be a separate issue which can be fixed independtly.Hi Sergey, if you have the time, could you please verify whether this patch
resolves the partition issue you reported? I've confirmed that it passes the
partitioned tests in the scripts, but I would appreciate your confirmation for
the same.Hi Hou-san,
I have created a patch to add a test for the patch.
v5-0001 : same as v4-0001
v5-0002: adds the testcaseThanks and Regards,
Shlok Kyal
Hi!
Sorry, I can't do it right now, but I think we need add testcase where
ancestor of published table have different origin.
Also we still don't care about foreign partitions (as I wrote earlier
we should raise an ERROR for such publications). This checking must be
done at publication creation in check_publication_add_relation(), but I
not sure about publication for all tables/for tables in schema because
one foreign table will block publication creation
Thoughts?
On Wed, Jan 22, 2025 at 9:44 PM Sergey Tatarintsev
<s.tatarintsev@postgrespro.ru> wrote:
22.01.2025 18:41, Shlok Kyal пишет:
Also we still don't care about foreign partitions (as I wrote earlier
we should raise an ERROR for such publications).
I think dealing with this separately from the origin vs. partitioned
table issue is better.
--
With Regards,
Amit Kapila.
On Wed, 22 Jan 2025 at 09:00, Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:
On Tuesday, January 21, 2025 1:31 AM vignesh C <vignesh21@gmail.com> wrote:
Hi,
On Mon, 20 Jan 2025 at 17:31, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Sat, Jan 18, 2025 at 10:31 AM vignesh C <vignesh21@gmail.com> wrote:
Attached patch has the fix for this issue which includes the
partition tables also for the publication now and throws a warning
appropriately.The corresponding query (see "To find which tables might potentially
include non-local origins .." on [1]) on the create_subscription doc
page.BTW, the proposed fix is not backpatcheable as it changes the catalog
which requires catversion bump. However, as this is a WARNING case, if
we can't find a fix that can't be backpatched, we can fix it in
HEAD-only.I could not find a way to fix the back version without changing the catalog
version.The attached v3 version has the changes for the same.
Thanks for the patch.
I agree that covering the partitioned table case when checking the non-local
origin data on publisher is an improvement. But I think adding or extending the
SQL functions may not be the appropriate way to fix because the new functions
cannot be used in older PG version and is also not backpatchable.I am thinking it would be better to use the existing pg_partition_ancestors()
and pg_partition_tree() to verify the same, which can be used in all supported
PG versions and is also backpatchable.And here is another version which fixed the issue like that. I have not added
tests for it, but I think it's doable to write the something like the testcases
provided by Sergey. This patch does not fix the foreign tabel as that seems to
be a separate issue which can be fixed independtly.Hi Sergey, if you have the time, could you please verify whether this patch
resolves the partition issue you reported? I've confirmed that it passes the
partitioned tests in the scripts, but I would appreciate your confirmation for
the same.
Hi Hou-san,
I tested the patch, and it is working fine on HEAD.
I also tried to apply the patches to back branches PG17 and PG 16. But
the patch does not apply.
This 'origin' option was added in PG 16. So, this patch will not be
required for PG 15 and back branches.
Thanks and Regards,
Shlok Kyal
On Thu, 23 Jan 2025 at 12:35, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
On Wed, 22 Jan 2025 at 09:00, Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:On Tuesday, January 21, 2025 1:31 AM vignesh C <vignesh21@gmail.com> wrote:
Hi,
On Mon, 20 Jan 2025 at 17:31, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Sat, Jan 18, 2025 at 10:31 AM vignesh C <vignesh21@gmail.com> wrote:
Attached patch has the fix for this issue which includes the
partition tables also for the publication now and throws a warning
appropriately.The corresponding query (see "To find which tables might potentially
include non-local origins .." on [1]) on the create_subscription doc
page.BTW, the proposed fix is not backpatcheable as it changes the catalog
which requires catversion bump. However, as this is a WARNING case, if
we can't find a fix that can't be backpatched, we can fix it in
HEAD-only.I could not find a way to fix the back version without changing the catalog
version.The attached v3 version has the changes for the same.
Thanks for the patch.
I agree that covering the partitioned table case when checking the non-local
origin data on publisher is an improvement. But I think adding or extending the
SQL functions may not be the appropriate way to fix because the new functions
cannot be used in older PG version and is also not backpatchable.I am thinking it would be better to use the existing pg_partition_ancestors()
and pg_partition_tree() to verify the same, which can be used in all supported
PG versions and is also backpatchable.And here is another version which fixed the issue like that. I have not added
tests for it, but I think it's doable to write the something like the testcases
provided by Sergey. This patch does not fix the foreign tabel as that seems to
be a separate issue which can be fixed independtly.Hi Sergey, if you have the time, could you please verify whether this patch
resolves the partition issue you reported? I've confirmed that it passes the
partitioned tests in the scripts, but I would appreciate your confirmation for
the same.Hi Hou-san,
I tested the patch, and it is working fine on HEAD.
I also tried to apply the patches to back branches PG17 and PG 16. But
the patch does not apply.This 'origin' option was added in PG 16. So, this patch will not be
required for PG 15 and back branches.
I have created a patch which applies to both PG17 and PG 16. The
v6-0002 is the test patch. It applies to all the branches (HEAD, PG17,
PG16) correctly.
Thanks and Regards,
Shlok Kyal
Attachments:
v6-0001-Improve-logging-for-data-origin-discrepancies-in-.patchapplication/octet-stream; name=v6-0001-Improve-logging-for-data-origin-discrepancies-in-.patchDownload
From 37966d29319d61736e74ee42fac0d7a9507fea9b Mon Sep 17 00:00:00 2001
From: Hou Zhijie <houzj.fnst@cn.fujitsu.com>
Date: Tue, 21 Jan 2025 16:18:34 +0800
Subject: [PATCH v6 1/2] Improve logging for data origin discrepancies in table
synchronization
Previously, a WARNING was issued during initial table synchronization with
origin=NONE only when the publisher subscribed to the same table from
other publishers, indicating potential data origination from different origins.
However, it's possible for the publisher to subscribe to the partition
ancestors or partition children of the table from other publishers, which could
also result in mixed-origin data inclusion.
This patch expands the check to consider both the subscribed table's ancestors
and children. A WARNING will now be logged if any of these related tables could
contain data originating from different sources.
---
doc/src/sgml/ref/create_subscription.sgml | 14 ++++++++------
src/backend/commands/subscriptioncmds.c | 15 +++++++++------
2 files changed, 17 insertions(+), 12 deletions(-)
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 6cf7d4f9a1..8bee9f88a4 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -534,13 +534,15 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<programlisting>
# substitute <pub-names> below with your publication name(s) to be queried
SELECT DISTINCT PT.schemaname, PT.tablename
-FROM pg_publication_tables PT,
+FROM pg_publication_tables PT
+ JOIN pg_class C ON (C.relname = PT.tablename)
+ JOIN pg_namespace N ON (N.nspname = PT.schemaname),
pg_subscription_rel PS
- JOIN pg_class C ON (C.oid = PS.srrelid)
- JOIN pg_namespace N ON (N.oid = C.relnamespace)
-WHERE N.nspname = PT.schemaname AND
- C.relname = PT.tablename AND
- PT.pubname IN (<pub-names>);
+WHERE C.relnamespace = N.oid AND
+ PT.pubname IN (<pub-names>) AND
+ (PS.srrelid = C.oid OR
+ C.oid IN (SELECT relid FROM pg_partition_ancestors(PS.srrelid) UNION
+ SELECT relid FROM pg_partition_tree(PS.srrelid)));
</programlisting></para>
</refsect1>
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 2d8a71ca1e..4aec73bcc6 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -2083,11 +2083,12 @@ AlterSubscriptionOwner_oid(Oid subid, Oid newOwnerId)
}
/*
- * Check and log a warning if the publisher has subscribed to the same table
- * from some other publisher. This check is required only if "copy_data = true"
- * and "origin = none" for CREATE SUBSCRIPTION and
- * ALTER SUBSCRIPTION ... REFRESH statements to notify the user that data
- * having origin might have been copied.
+ * Check and log a warning if the publisher has subscribed to the same table,
+ * its partition ancestors (if it's a partition), or its partition children (if
+ * it's a partitioned table), from some other publishers. This check is
+ * required only if "copy_data = true" and "origin = none" for CREATE
+ * SUBSCRIPTION and ALTER SUBSCRIPTION ... REFRESH statements to notify the
+ * user that data having origin might have been copied.
*
* This check need not be performed on the tables that are already added
* because incremental sync for those tables will happen through WAL and the
@@ -2117,7 +2118,9 @@ check_publications_origin(WalReceiverConn *wrconn, List *publications,
"SELECT DISTINCT P.pubname AS pubname\n"
"FROM pg_publication P,\n"
" LATERAL pg_get_publication_tables(P.pubname) GPT\n"
- " JOIN pg_subscription_rel PS ON (GPT.relid = PS.srrelid),\n"
+ " JOIN pg_subscription_rel PS ON (GPT.relid = PS.srrelid OR"
+ " GPT.relid IN (SELECT relid FROM pg_partition_ancestors(PS.srrelid) UNION"
+ " SELECT relid FROM pg_partition_tree(PS.srrelid))),\n"
" pg_class C JOIN pg_namespace N ON (N.oid = C.relnamespace)\n"
"WHERE C.oid = GPT.relid AND P.pubname IN (");
GetPublicationsStr(publications, &cmd, true);
--
2.34.1
v6-0001-PG-17-and-PG-16-Improve-logging-for-data-origin-d.patchapplication/octet-stream; name=v6-0001-PG-17-and-PG-16-Improve-logging-for-data-origin-d.patchDownload
From dc4cc741e6023e83dee6d72c0c19db7323ad8287 Mon Sep 17 00:00:00 2001
From: Shlok Kyal <shlok.kyal.oss@gmail.com>
Date: Thu, 23 Jan 2025 12:59:19 +0530
Subject: [PATCH v6 1/2] [PG 17 and PG 16] Improve logging for data origin
discrepancies in table synchronization
Previously, a WARNING was issued during initial table synchronization with
origin=NONE only when the publisher subscribed to the same table from
other publishers, indicating potential data origination from different origins.
However, it's possible for the publisher to subscribe to the partition
ancestors or partition children of the table from other publishers, which could
also result in mixed-origin data inclusion.
This patch expands the check to consider both the subscribed table's ancestors
and children. A WARNING will now be logged if any of these related tables could
contain data originating from different sources.
---
doc/src/sgml/ref/create_subscription.sgml | 14 ++++++++------
src/backend/commands/subscriptioncmds.c | 15 +++++++++------
2 files changed, 17 insertions(+), 12 deletions(-)
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 740b7d9421..8feec74023 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -526,13 +526,15 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<programlisting>
# substitute <pub-names> below with your publication name(s) to be queried
SELECT DISTINCT PT.schemaname, PT.tablename
-FROM pg_publication_tables PT,
+FROM pg_publication_tables PT
+ JOIN pg_class C ON (C.relname = PT.tablename)
+ JOIN pg_namespace N ON (N.nspname = PT.schemaname),
pg_subscription_rel PS
- JOIN pg_class C ON (C.oid = PS.srrelid)
- JOIN pg_namespace N ON (N.oid = C.relnamespace)
-WHERE N.nspname = PT.schemaname AND
- C.relname = PT.tablename AND
- PT.pubname IN (<pub-names>);
+WHERE C.relnamespace = N.oid AND
+ PT.pubname IN (<pub-names>) AND
+ (PS.srrelid = C.oid OR
+ C.oid IN (SELECT relid FROM pg_partition_ancestors(PS.srrelid) UNION
+ SELECT relid FROM pg_partition_tree(PS.srrelid)));
</programlisting></para>
</refsect1>
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 8ecb6e0bb8..9467f58a23 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -2012,11 +2012,12 @@ AlterSubscriptionOwner_oid(Oid subid, Oid newOwnerId)
}
/*
- * Check and log a warning if the publisher has subscribed to the same table
- * from some other publisher. This check is required only if "copy_data = true"
- * and "origin = none" for CREATE SUBSCRIPTION and
- * ALTER SUBSCRIPTION ... REFRESH statements to notify the user that data
- * having origin might have been copied.
+ * Check and log a warning if the publisher has subscribed to the same table,
+ * its partition ancestors (if it's a partition), or its partition children (if
+ * it's a partitioned table), from some other publishers. This check is
+ * required only if "copy_data = true" and "origin = none" for CREATE
+ * SUBSCRIPTION and ALTER SUBSCRIPTION ... REFRESH statements to notify the
+ * user that data having origin might have been copied.
*
* This check need not be performed on the tables that are already added
* because incremental sync for those tables will happen through WAL and the
@@ -2046,7 +2047,9 @@ check_publications_origin(WalReceiverConn *wrconn, List *publications,
"SELECT DISTINCT P.pubname AS pubname\n"
"FROM pg_publication P,\n"
" LATERAL pg_get_publication_tables(P.pubname) GPT\n"
- " JOIN pg_subscription_rel PS ON (GPT.relid = PS.srrelid),\n"
+ " JOIN pg_subscription_rel PS ON (GPT.relid = PS.srrelid OR"
+ " GPT.relid IN (SELECT relid FROM pg_partition_ancestors(PS.srrelid) UNION"
+ " SELECT relid FROM pg_partition_tree(PS.srrelid))),\n"
" pg_class C JOIN pg_namespace N ON (N.oid = C.relnamespace)\n"
"WHERE C.oid = GPT.relid AND P.pubname IN (");
get_publications_str(publications, &cmd, true);
--
2.34.1
v6-0002-Test-for-Improve-logging-for-data-origin-discrepa.patchapplication/octet-stream; name=v6-0002-Test-for-Improve-logging-for-data-origin-discrepa.patchDownload
From fbee24aeace55c21f291b161391b88c433258da5 Mon Sep 17 00:00:00 2001
From: Shlok Kyal <shlok.kyal.oss@gmail.com>
Date: Wed, 22 Jan 2025 20:59:17 +0530
Subject: [PATCH v6 2/2] Test for Improve logging for data origin discrepancies
in table synchronization
---
src/test/subscription/t/030_origin.pl | 69 +++++++++++++++++++++++++++
1 file changed, 69 insertions(+)
diff --git a/src/test/subscription/t/030_origin.pl b/src/test/subscription/t/030_origin.pl
index eb1dcb7271..0ffec5a181 100644
--- a/src/test/subscription/t/030_origin.pl
+++ b/src/test/subscription/t/030_origin.pl
@@ -255,6 +255,75 @@ $node_A->safe_psql('postgres', "DROP TABLE tab_new");
$node_B->safe_psql('postgres', "DROP TABLE tab_new");
$node_A->safe_psql('postgres', "DROP SUBSCRIPTION $subname_AB2");
+###############################################################################
+# Specifying origin = NONE and copy_data = on must raise WARNING if we subscribe
+# to a partitioned table and this table contains any remotely originated data.
+###############################################################################
+
+# create a partition table on node A
+$node_A->safe_psql(
+ 'postgres', qq(
+CREATE SCHEMA ts;
+CREATE TABLE ts.t(id int) PARTITION BY RANGE(id);
+CREATE TABLE ts.part1 PARTITION OF ts.t FOR VALUES FROM (0) TO (5);
+CREATE TABLE ts.part2 PARTITION OF ts.t FOR VALUES FROM (5) TO (10);
+));
+
+# create a table on node B which will act as a source for a partition on node A
+$node_B->safe_psql(
+ 'postgres', qq(
+CREATE SCHEMA ts;
+CREATE TABLE ts.part2(id int);
+CREATE PUBLICATION pub_b_a FOR TABLE ts.part2;
+));
+$node_A->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub_a_b CONNECTION '$node_B_connstr' PUBLICATION pub_b_a;"
+);
+
+# create a table on node C
+$node_C->safe_psql(
+ 'postgres', "
+CREATE SCHEMA ts;
+CREATE TABLE ts.t(id int);");
+
+# create a logical replication setup between node A and node C with
+# subscription on node C having origin = NONE and copy_data = on
+$node_A->safe_psql(
+ 'postgres', "
+CREATE PUBLICATION pub_a_c FOR TABLE ts.t WITH (publish_via_partition_root);
+");
+
+($result, $stdout, $stderr) = $node_C->psql(
+ 'postgres', "
+ CREATE SUBSCRIPTION sub_a_c CONNECTION '$node_A_connstr' PUBLICATION pub_a_c WITH (origin = none, copy_data = on);
+");
+
+# A warning must be logged as a partition of table in node A might have
+# remotely originated data
+like(
+ $stderr,
+ qr/WARNING: ( [A-Z0-9]+:)? subscription "sub_a_c" requested copy_data with origin = NONE but might copy data that had a different origin/,
+ "Create subscription with origin = none and copy_data when the publisher has subscribed a partitioned table"
+);
+
+# clear the operations done by this test
+$node_C->safe_psql(
+ 'postgres', qq(
+DROP SUBSCRIPTION sub_a_c;
+DROP SCHEMA ts CASCADE;
+));
+$node_A->safe_psql(
+ 'postgres', qq(
+DROP SUBSCRIPTION sub_a_b;
+DROP PUBLICATION pub_a_c;
+DROP SCHEMA ts CASCADE;
+));
+$node_B->safe_psql(
+ 'postgres', qq(
+DROP PUBLICATION pub_b_a;
+DROP SCHEMA ts CASCADE;
+));
+
# shutdown
$node_B->stop('fast');
$node_A->stop('fast');
--
2.34.1
On Thursday, January 23, 2025 4:43 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
On Thu, 23 Jan 2025 at 12:35, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
On Wed, 22 Jan 2025 at 09:00, Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:Thanks for the patch.
I agree that covering the partitioned table case when checking the
non-local origin data on publisher is an improvement. But I think
adding or extending the SQL functions may not be the appropriate way
to fix because the new functions cannot be used in older PG version and isalso not backpatchable.
I am thinking it would be better to use the existing
pg_partition_ancestors() and pg_partition_tree() to verify the same,
which can be used in all supported PG versions and is also backpatchable.And here is another version which fixed the issue like that. I have
not added tests for it, but I think it's doable to write the
something like the testcases provided by Sergey. This patch does not
fix the foreign tabel as that seems to be a separate issue which can be fixedindependtly.
Hi Sergey, if you have the time, could you please verify whether
this patch resolves the partition issue you reported? I've confirmed
that it passes the partitioned tests in the scripts, but I would
appreciate your confirmation for the same.Hi Hou-san,
I tested the patch, and it is working fine on HEAD.
I also tried to apply the patches to back branches PG17 and PG 16. But
the patch does not apply.This 'origin' option was added in PG 16. So, this patch will not be
required for PG 15 and back branches.I have created a patch which applies to both PG17 and PG 16. The
v6-0002 is the test patch. It applies to all the branches (HEAD, PG17,
PG16) correctly.
Thanks for the patch. I think the testcases could be improved.
It's not clear why a separate schema is created for tables. I assume it was
initially intended to test TABLES IN SCHEMA but was later modified. If the
separate schema is still necessary, could you please add comments to clarify
its purpose?
Besides, the new table name 'ts' seems a bit unconventional. It would be better
to align with the naming style of existing test cases for consistency and
clarity.
Also, Sergey had suggested adding an more test to verify scenarios where the
table's ancestors are subscribed. It appears this hasn't been added yet. I
think it would be better to add it.
Best Regards,
Hou zj
23.01.2025 15:24, Zhijie Hou (Fujitsu) пишет:
On Thursday, January 23, 2025 4:43 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
On Thu, 23 Jan 2025 at 12:35, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
On Wed, 22 Jan 2025 at 09:00, Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:Thanks for the patch.
I agree that covering the partitioned table case when checking the
non-local origin data on publisher is an improvement. But I think
adding or extending the SQL functions may not be the appropriate way
to fix because the new functions cannot be used in older PG version and isalso not backpatchable.
I am thinking it would be better to use the existing
pg_partition_ancestors() and pg_partition_tree() to verify the same,
which can be used in all supported PG versions and is also backpatchable.And here is another version which fixed the issue like that. I have
not added tests for it, but I think it's doable to write the
something like the testcases provided by Sergey. This patch does not
fix the foreign tabel as that seems to be a separate issue which can be fixedindependtly.
Hi Sergey, if you have the time, could you please verify whether
this patch resolves the partition issue you reported? I've confirmed
that it passes the partitioned tests in the scripts, but I would
appreciate your confirmation for the same.Hi Hou-san,
I tested the patch, and it is working fine on HEAD.
I also tried to apply the patches to back branches PG17 and PG 16. But
the patch does not apply.This 'origin' option was added in PG 16. So, this patch will not be
required for PG 15 and back branches.I have created a patch which applies to both PG17 and PG 16. The
v6-0002 is the test patch. It applies to all the branches (HEAD, PG17,
PG16) correctly.Thanks for the patch. I think the testcases could be improved.
It's not clear why a separate schema is created for tables. I assume it was
initially intended to test TABLES IN SCHEMA but was later modified. If the
separate schema is still necessary, could you please add comments to clarify
its purpose?Besides, the new table name 'ts' seems a bit unconventional. It would be better
to align with the naming style of existing test cases for consistency and
clarity.Also, Sergey had suggested adding an more test to verify scenarios where the
table's ancestors are subscribed. It appears this hasn't been added yet. I
think it would be better to add it.Best Regards,
Hou zj
Hi!
That's right, separate schema was used to test "CREATE PUBLICATION FOR
TABLES IN SCHEMA".
My first patch with test cases contains 3 scenarios:
CREATE PUBLICATION pub_a FOR TABLE ts.t;
CREATE PUBLICATION pub_a_via_root FOR TABLE ts.t WITH
(publish_via_partition_root);
CREATE PUBLICATION pub_a_schema FOR TABLES IN SCHEMA ts WITH
(publish_via_partition_root);
(but not scenario where the table's ancestors are subscribed)
On Thu, 23 Jan 2025 at 17:54, Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:
On Thursday, January 23, 2025 4:43 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
On Thu, 23 Jan 2025 at 12:35, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
On Wed, 22 Jan 2025 at 09:00, Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:Thanks for the patch.
I agree that covering the partitioned table case when checking the
non-local origin data on publisher is an improvement. But I think
adding or extending the SQL functions may not be the appropriate way
to fix because the new functions cannot be used in older PG version and isalso not backpatchable.
I am thinking it would be better to use the existing
pg_partition_ancestors() and pg_partition_tree() to verify the same,
which can be used in all supported PG versions and is also backpatchable.And here is another version which fixed the issue like that. I have
not added tests for it, but I think it's doable to write the
something like the testcases provided by Sergey. This patch does not
fix the foreign tabel as that seems to be a separate issue which can be fixedindependtly.
Hi Sergey, if you have the time, could you please verify whether
this patch resolves the partition issue you reported? I've confirmed
that it passes the partitioned tests in the scripts, but I would
appreciate your confirmation for the same.Hi Hou-san,
I tested the patch, and it is working fine on HEAD.
I also tried to apply the patches to back branches PG17 and PG 16. But
the patch does not apply.This 'origin' option was added in PG 16. So, this patch will not be
required for PG 15 and back branches.I have created a patch which applies to both PG17 and PG 16. The
v6-0002 is the test patch. It applies to all the branches (HEAD, PG17,
PG16) correctly.Thanks for the patch. I think the testcases could be improved.
It's not clear why a separate schema is created for tables. I assume it was
initially intended to test TABLES IN SCHEMA but was later modified. If the
separate schema is still necessary, could you please add comments to clarify
its purpose?
Besides, the new table name 'ts' seems a bit unconventional. It would be better
to align with the naming style of existing test cases for consistency and
clarity.
I think the schema is not required. I have removed it.
Also, Sergey had suggested adding an more test to verify scenarios where the
table's ancestors are subscribed. It appears this hasn't been added yet. I
think it would be better to add it.
I have added the test in the latest patch.
Thanks and Regards,
Shlok Kyal
Attachments:
v7-0001-Improve-logging-for-data-origin-discrepancies-in-.patchapplication/octet-stream; name=v7-0001-Improve-logging-for-data-origin-discrepancies-in-.patchDownload
From e465249800d81f047accdb30348edb3ba39149f1 Mon Sep 17 00:00:00 2001
From: Hou Zhijie <houzj.fnst@cn.fujitsu.com>
Date: Tue, 21 Jan 2025 16:18:34 +0800
Subject: [PATCH v7 1/2] Improve logging for data origin discrepancies in table
synchronization
Previously, a WARNING was issued during initial table synchronization with
origin=NONE only when the publisher subscribed to the same table from
other publishers, indicating potential data origination from different origins.
However, it's possible for the publisher to subscribe to the partition
ancestors or partition children of the table from other publishers, which could
also result in mixed-origin data inclusion.
This patch expands the check to consider both the subscribed table's ancestors
and children. A WARNING will now be logged if any of these related tables could
contain data originating from different sources.
---
doc/src/sgml/ref/create_subscription.sgml | 14 ++++++++------
src/backend/commands/subscriptioncmds.c | 15 +++++++++------
2 files changed, 17 insertions(+), 12 deletions(-)
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 6cf7d4f9a1..8bee9f88a4 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -534,13 +534,15 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<programlisting>
# substitute <pub-names> below with your publication name(s) to be queried
SELECT DISTINCT PT.schemaname, PT.tablename
-FROM pg_publication_tables PT,
+FROM pg_publication_tables PT
+ JOIN pg_class C ON (C.relname = PT.tablename)
+ JOIN pg_namespace N ON (N.nspname = PT.schemaname),
pg_subscription_rel PS
- JOIN pg_class C ON (C.oid = PS.srrelid)
- JOIN pg_namespace N ON (N.oid = C.relnamespace)
-WHERE N.nspname = PT.schemaname AND
- C.relname = PT.tablename AND
- PT.pubname IN (<pub-names>);
+WHERE C.relnamespace = N.oid AND
+ PT.pubname IN (<pub-names>) AND
+ (PS.srrelid = C.oid OR
+ C.oid IN (SELECT relid FROM pg_partition_ancestors(PS.srrelid) UNION
+ SELECT relid FROM pg_partition_tree(PS.srrelid)));
</programlisting></para>
</refsect1>
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 2d8a71ca1e..4aec73bcc6 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -2083,11 +2083,12 @@ AlterSubscriptionOwner_oid(Oid subid, Oid newOwnerId)
}
/*
- * Check and log a warning if the publisher has subscribed to the same table
- * from some other publisher. This check is required only if "copy_data = true"
- * and "origin = none" for CREATE SUBSCRIPTION and
- * ALTER SUBSCRIPTION ... REFRESH statements to notify the user that data
- * having origin might have been copied.
+ * Check and log a warning if the publisher has subscribed to the same table,
+ * its partition ancestors (if it's a partition), or its partition children (if
+ * it's a partitioned table), from some other publishers. This check is
+ * required only if "copy_data = true" and "origin = none" for CREATE
+ * SUBSCRIPTION and ALTER SUBSCRIPTION ... REFRESH statements to notify the
+ * user that data having origin might have been copied.
*
* This check need not be performed on the tables that are already added
* because incremental sync for those tables will happen through WAL and the
@@ -2117,7 +2118,9 @@ check_publications_origin(WalReceiverConn *wrconn, List *publications,
"SELECT DISTINCT P.pubname AS pubname\n"
"FROM pg_publication P,\n"
" LATERAL pg_get_publication_tables(P.pubname) GPT\n"
- " JOIN pg_subscription_rel PS ON (GPT.relid = PS.srrelid),\n"
+ " JOIN pg_subscription_rel PS ON (GPT.relid = PS.srrelid OR"
+ " GPT.relid IN (SELECT relid FROM pg_partition_ancestors(PS.srrelid) UNION"
+ " SELECT relid FROM pg_partition_tree(PS.srrelid))),\n"
" pg_class C JOIN pg_namespace N ON (N.oid = C.relnamespace)\n"
"WHERE C.oid = GPT.relid AND P.pubname IN (");
GetPublicationsStr(publications, &cmd, true);
--
2.34.1
v7-PG_17-PG_16-0001-Improve-logging-for-data-origin-discr.patchapplication/octet-stream; name=v7-PG_17-PG_16-0001-Improve-logging-for-data-origin-discr.patchDownload
From 0473d5c92deca5d15574931ebb56a14ef68b2817 Mon Sep 17 00:00:00 2001
From: Shlok Kyal <shlok.kyal.oss@gmail.com>
Date: Thu, 23 Jan 2025 12:59:19 +0530
Subject: [PATCH v7-PG_17-PG_16 1/2] Improve logging for data origin
discrepancies in table synchronization
Previously, a WARNING was issued during initial table synchronization with
origin=NONE only when the publisher subscribed to the same table from
other publishers, indicating potential data origination from different origins.
However, it's possible for the publisher to subscribe to the partition
ancestors or partition children of the table from other publishers, which could
also result in mixed-origin data inclusion.
This patch expands the check to consider both the subscribed table's ancestors
and children. A WARNING will now be logged if any of these related tables could
contain data originating from different sources.
---
doc/src/sgml/ref/create_subscription.sgml | 14 ++++++++------
src/backend/commands/subscriptioncmds.c | 15 +++++++++------
2 files changed, 17 insertions(+), 12 deletions(-)
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 740b7d9421..8feec74023 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -526,13 +526,15 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<programlisting>
# substitute <pub-names> below with your publication name(s) to be queried
SELECT DISTINCT PT.schemaname, PT.tablename
-FROM pg_publication_tables PT,
+FROM pg_publication_tables PT
+ JOIN pg_class C ON (C.relname = PT.tablename)
+ JOIN pg_namespace N ON (N.nspname = PT.schemaname),
pg_subscription_rel PS
- JOIN pg_class C ON (C.oid = PS.srrelid)
- JOIN pg_namespace N ON (N.oid = C.relnamespace)
-WHERE N.nspname = PT.schemaname AND
- C.relname = PT.tablename AND
- PT.pubname IN (<pub-names>);
+WHERE C.relnamespace = N.oid AND
+ PT.pubname IN (<pub-names>) AND
+ (PS.srrelid = C.oid OR
+ C.oid IN (SELECT relid FROM pg_partition_ancestors(PS.srrelid) UNION
+ SELECT relid FROM pg_partition_tree(PS.srrelid)));
</programlisting></para>
</refsect1>
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 8ecb6e0bb8..9467f58a23 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -2012,11 +2012,12 @@ AlterSubscriptionOwner_oid(Oid subid, Oid newOwnerId)
}
/*
- * Check and log a warning if the publisher has subscribed to the same table
- * from some other publisher. This check is required only if "copy_data = true"
- * and "origin = none" for CREATE SUBSCRIPTION and
- * ALTER SUBSCRIPTION ... REFRESH statements to notify the user that data
- * having origin might have been copied.
+ * Check and log a warning if the publisher has subscribed to the same table,
+ * its partition ancestors (if it's a partition), or its partition children (if
+ * it's a partitioned table), from some other publishers. This check is
+ * required only if "copy_data = true" and "origin = none" for CREATE
+ * SUBSCRIPTION and ALTER SUBSCRIPTION ... REFRESH statements to notify the
+ * user that data having origin might have been copied.
*
* This check need not be performed on the tables that are already added
* because incremental sync for those tables will happen through WAL and the
@@ -2046,7 +2047,9 @@ check_publications_origin(WalReceiverConn *wrconn, List *publications,
"SELECT DISTINCT P.pubname AS pubname\n"
"FROM pg_publication P,\n"
" LATERAL pg_get_publication_tables(P.pubname) GPT\n"
- " JOIN pg_subscription_rel PS ON (GPT.relid = PS.srrelid),\n"
+ " JOIN pg_subscription_rel PS ON (GPT.relid = PS.srrelid OR"
+ " GPT.relid IN (SELECT relid FROM pg_partition_ancestors(PS.srrelid) UNION"
+ " SELECT relid FROM pg_partition_tree(PS.srrelid))),\n"
" pg_class C JOIN pg_namespace N ON (N.oid = C.relnamespace)\n"
"WHERE C.oid = GPT.relid AND P.pubname IN (");
get_publications_str(publications, &cmd, true);
--
2.34.1
v7-0002-Test-for-Improve-logging-for-data-origin-discrepa.patchapplication/octet-stream; name=v7-0002-Test-for-Improve-logging-for-data-origin-discrepa.patchDownload
From 1c4c08e67e6a020d58bd03024cb9487e4f2293c6 Mon Sep 17 00:00:00 2001
From: Shlok Kyal <shlok.kyal.oss@gmail.com>
Date: Wed, 22 Jan 2025 20:59:17 +0530
Subject: [PATCH v7 2/2] Test for Improve logging for data origin discrepancies
in table synchronization
---
src/test/subscription/t/030_origin.pl | 88 +++++++++++++++++++++++++++
1 file changed, 88 insertions(+)
diff --git a/src/test/subscription/t/030_origin.pl b/src/test/subscription/t/030_origin.pl
index eb1dcb7271..18a0b88d6c 100644
--- a/src/test/subscription/t/030_origin.pl
+++ b/src/test/subscription/t/030_origin.pl
@@ -255,6 +255,94 @@ $node_A->safe_psql('postgres', "DROP TABLE tab_new");
$node_B->safe_psql('postgres', "DROP TABLE tab_new");
$node_A->safe_psql('postgres', "DROP SUBSCRIPTION $subname_AB2");
+###############################################################################
+# Specifying origin = NONE and copy_data = on must raise WARNING if we subscribe
+# to a partitioned table and this table contains any remotely originated data.
+###############################################################################
+
+# create a partition table on node A
+$node_A->safe_psql(
+ 'postgres', qq(
+CREATE TABLE tab_main(a int) PARTITION BY RANGE(a);
+CREATE TABLE tab_part1 PARTITION OF tab_main FOR VALUES FROM (0) TO (5);
+CREATE TABLE tab_part2(a int) PARTITION BY RANGE(a);
+CREATE TABLE tab_part2_1 PARTITION OF tab_part2 FOR VALUES FROM (5) TO (10);
+ALTER TABLE tab_main ATTACH PARTITION tab_part2 FOR VALUES FROM (5) to (10);
+));
+
+# create a table on node B which will act as a source for a partition on node A
+$node_B->safe_psql(
+ 'postgres', qq(
+CREATE TABLE tab_part2(a int);
+CREATE PUBLICATION pub_b_a FOR TABLE tab_part2;
+));
+$node_A->safe_psql('postgres',
+ "CREATE SUBSCRIPTION sub_a_b CONNECTION '$node_B_connstr' PUBLICATION pub_b_a;"
+);
+
+# create a table on node C
+$node_C->safe_psql(
+ 'postgres', qq(
+CREATE TABLE tab_main(a int);
+CREATE TABLE tab_part2_1(a int);
+));
+
+# create a logical replication setup between node A and node C with
+# subscription on node C having origin = NONE and copy_data = on
+$node_A->safe_psql(
+ 'postgres', qq(
+CREATE PUBLICATION pub_a_c FOR TABLE tab_main WITH (publish_via_partition_root);
+CREATE PUBLICATION pub_a_c_2 FOR TABLE tab_part2_1;
+));
+
+($result, $stdout, $stderr) = $node_C->psql(
+ 'postgres', "
+ CREATE SUBSCRIPTION sub_a_c CONNECTION '$node_A_connstr' PUBLICATION pub_a_c WITH (origin = none, copy_data = on);
+");
+
+# A warning must be logged as a partition 'tab_part2' in node A is subscribed to
+# node B so partition 'tab_part2' can have remotely originated data
+like(
+ $stderr,
+ qr/WARNING: ( [A-Z0-9]+:)? subscription "sub_a_c" requested copy_data with origin = NONE but might copy data that had a different origin/,
+ "Create subscription with origin = none and copy_data when the publisher's partition is subscribing from different origin"
+);
+$node_C->safe_psql('postgres', "DROP SUBSCRIPTION sub_a_c");
+
+($result, $stdout, $stderr) = $node_C->psql(
+ 'postgres', "
+ CREATE SUBSCRIPTION sub_a_c CONNECTION '$node_A_connstr' PUBLICATION pub_a_c_2 WITH (origin = none, copy_data = on);
+");
+
+# A warning must be logged as ancestor of table 'tab_part2_1' in node A have
+# been subscribed to node B so table 'tab_part2_1' can have remotely originated
+# data
+like(
+ $stderr,
+ qr/WARNING: ( [A-Z0-9]+:)? subscription "sub_a_c" requested copy_data with origin = NONE but might copy data that had a different origin/,
+ "Create subscription with origin = none and copy_data when the publisher's ancestor is subscribing from different origin"
+);
+
+# clear the operations done by this test
+$node_C->safe_psql(
+ 'postgres', qq(
+DROP SUBSCRIPTION sub_a_c;
+DROP TABLE tab_main;
+DROP TABLE tab_part2_1;
+));
+$node_A->safe_psql(
+ 'postgres', qq(
+DROP SUBSCRIPTION sub_a_b;
+DROP PUBLICATION pub_a_c;
+DROP PUBLICATION pub_a_c_2;
+DROP TABLE tab_main;
+));
+$node_B->safe_psql(
+ 'postgres', qq(
+DROP PUBLICATION pub_b_a;
+DROP TABLE tab_part2;
+));
+
# shutdown
$node_B->stop('fast');
$node_A->stop('fast');
--
2.34.1
24.01.2025 07:22, Shlok Kyal пишет:
On Thu, 23 Jan 2025 at 17:54, Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:On Thursday, January 23, 2025 4:43 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
On Thu, 23 Jan 2025 at 12:35, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
On Wed, 22 Jan 2025 at 09:00, Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:Thanks for the patch.
I agree that covering the partitioned table case when checking the
non-local origin data on publisher is an improvement. But I think
adding or extending the SQL functions may not be the appropriate way
to fix because the new functions cannot be used in older PG version and isalso not backpatchable.
I am thinking it would be better to use the existing
pg_partition_ancestors() and pg_partition_tree() to verify the same,
which can be used in all supported PG versions and is also backpatchable.And here is another version which fixed the issue like that. I have
not added tests for it, but I think it's doable to write the
something like the testcases provided by Sergey. This patch does not
fix the foreign tabel as that seems to be a separate issue which can be fixedindependtly.
Hi Sergey, if you have the time, could you please verify whether
this patch resolves the partition issue you reported? I've confirmed
that it passes the partitioned tests in the scripts, but I would
appreciate your confirmation for the same.Hi Hou-san,
I tested the patch, and it is working fine on HEAD.
I also tried to apply the patches to back branches PG17 and PG 16. But
the patch does not apply.This 'origin' option was added in PG 16. So, this patch will not be
required for PG 15 and back branches.I have created a patch which applies to both PG17 and PG 16. The
v6-0002 is the test patch. It applies to all the branches (HEAD, PG17,
PG16) correctly.Thanks for the patch. I think the testcases could be improved.
It's not clear why a separate schema is created for tables. I assume it was
initially intended to test TABLES IN SCHEMA but was later modified. If the
separate schema is still necessary, could you please add comments to clarify
its purpose?
Besides, the new table name 'ts' seems a bit unconventional. It would be better
to align with the naming style of existing test cases for consistency and
clarity.I think the schema is not required. I have removed it.
Also, Sergey had suggested adding an more test to verify scenarios where the
table's ancestors are subscribed. It appears this hasn't been added yet. I
think it would be better to add it.I have added the test in the latest patch.
Thanks and Regards,
Shlok Kyal
Hello!
I think there is another problem - publisher can modify publication
during the execution of "CREATE SUBSCRIPTION" (Rarely, this situation
occurred in my tests.)
I.e.:
1. Publisher: CREATE PUBLICATION ... FOR TABLE t1;
2. Subscriber (starts CREATE SUBSCRIPTION ...): check_publications_origin()
3. Publisher: ALTER PUBLICATION ... ADD TABLE t2;
4. Subscriber (still process CREATE SUBSCRIPTION ...): fetch_table_list()
So, we check publication with only t1, but fetch t1,t2
I think we must start transaction on publisher while executing
CreateSubscription() on subscriber (Or may be take an lock on publisher )
Thoughts?
On Fri, Jan 24, 2025 at 5:43 PM Sergey Tatarintsev
<s.tatarintsev@postgrespro.ru> wrote:
I think there is another problem - publisher can modify publication
during the execution of "CREATE SUBSCRIPTION" (Rarely, this situation
occurred in my tests.)I.e.:
1. Publisher: CREATE PUBLICATION ... FOR TABLE t1;
2. Subscriber (starts CREATE SUBSCRIPTION ...): check_publications_origin()
3. Publisher: ALTER PUBLICATION ... ADD TABLE t2;
4. Subscriber (still process CREATE SUBSCRIPTION ...): fetch_table_list()
So, we check publication with only t1, but fetch t1,t2
I think we must start transaction on publisher while executing
CreateSubscription() on subscriber (Or may be take an lock on publisher )
We don't want to make the code complex for this, especially in
back-branches as that can introduce more bugs. We already document
that "... it is the user's responsibility to make the necessary checks
to ensure the copied data origins are really as wanted or not.". We
should try to give this WARNING in possible scenarios without
complicating the code too much. Anyway, even after this WARNING it is
possible that the data is not copied from the publisher and there is
no danger of copying the wrong origin. So, I suggest leaving the above
case as it is.
--
With Regards,
Amit Kapila.
23.01.2025 09:46, Amit Kapila пишет:
On Wed, Jan 22, 2025 at 9:44 PM Sergey Tatarintsev
<s.tatarintsev@postgrespro.ru> wrote:22.01.2025 18:41, Shlok Kyal пишет:
Also we still don't care about foreign partitions (as I wrote earlier
we should raise an ERROR for such publications).I think dealing with this separately from the origin vs. partitioned
table issue is better.
ok! let it be so
I made a patch for such cases. You can see it here:
/messages/by-id/c78766fa-4eff-4805-ad9c-868f02954ad4@postgrespro.ru
On Fri, 24 Jan 2025 at 09:52, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
I have added the test in the latest patch.
Few comments:
1) Let's rearrange this query slightly so that the "PT.pubname IN
(<pub-names>)" appears at the end, the reason being that it will
be easy to copy/paste and edit it to include the publications names if
it is at the end:
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -534,13 +534,15 @@ CREATE SUBSCRIPTION <replaceable
class="parameter">subscription_name</replaceabl
<programlisting>
# substitute <pub-names> below with your publication name(s) to
be queried
SELECT DISTINCT PT.schemaname, PT.tablename
-FROM pg_publication_tables PT,
+FROM pg_publication_tables PT
+ JOIN pg_class C ON (C.relname = PT.tablename)
+ JOIN pg_namespace N ON (N.nspname = PT.schemaname),
pg_subscription_rel PS
- JOIN pg_class C ON (C.oid = PS.srrelid)
- JOIN pg_namespace N ON (N.oid = C.relnamespace)
-WHERE N.nspname = PT.schemaname AND
- C.relname = PT.tablename AND
- PT.pubname IN (<pub-names>);
+WHERE C.relnamespace = N.oid AND
+ PT.pubname IN (<pub-names>) AND
+ (PS.srrelid = C.oid OR
+ C.oid IN (SELECT relid FROM pg_partition_ancestors(PS.srrelid) UNION
+ SELECT relid FROM pg_partition_tree(PS.srrelid)));
2) The same should be handled in the PG17 version patch too.
3) Currently the setup is done like:
node_B(table tab_part2 - publication pub_b_a) replicating to
node_A(sub_a_b subscription)
node_A(table tab_main - publication pub_a_c) replicating to node_C(sub_a_c)
+###############################################################################
+# Specifying origin = NONE and copy_data = on must raise WARNING if
we subscribe
+# to a partitioned table and this table contains any remotely originated data.
+###############################################################################
+
+# create a partition table on node A
+$node_A->safe_psql(
+ 'postgres', qq(
+CREATE TABLE tab_main(a int) PARTITION BY RANGE(a);
+CREATE TABLE tab_part1 PARTITION OF tab_main FOR VALUES FROM (0) TO (5);
+CREATE TABLE tab_part2(a int) PARTITION BY RANGE(a);
+CREATE TABLE tab_part2_1 PARTITION OF tab_part2 FOR VALUES FROM (5) TO (10);
+ALTER TABLE tab_main ATTACH PARTITION tab_part2 FOR VALUES FROM (5) to (10);
+));
+
+# create a table on node B which will act as a source for a partition on node A
+$node_B->safe_psql(
+ 'postgres', qq(
Can we change this like below to make review easier:
node_A(table tab_part2 - publication pub_b_a) replicating to
node_B(sub_a_b subscription)
node_B(table tab_main - publication pub_a_c) replicating to node_C(sub_a_c)
Also add something similar like above to the comment.
Regards,
Vignesh
On Wed, 29 Jan 2025 at 15:58, vignesh C <vignesh21@gmail.com> wrote:
On Fri, 24 Jan 2025 at 09:52, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
I have added the test in the latest patch.
Few comments: 1) Let's rearrange this query slightly so that the "PT.pubname IN (<pub-names>)" appears at the end, the reason being that it will be easy to copy/paste and edit it to include the publications names if it is at the end: +++ b/doc/src/sgml/ref/create_subscription.sgml @@ -534,13 +534,15 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl <programlisting> # substitute <pub-names> below with your publication name(s) to be queried SELECT DISTINCT PT.schemaname, PT.tablename -FROM pg_publication_tables PT, +FROM pg_publication_tables PT + JOIN pg_class C ON (C.relname = PT.tablename) + JOIN pg_namespace N ON (N.nspname = PT.schemaname), pg_subscription_rel PS - JOIN pg_class C ON (C.oid = PS.srrelid) - JOIN pg_namespace N ON (N.oid = C.relnamespace) -WHERE N.nspname = PT.schemaname AND - C.relname = PT.tablename AND - PT.pubname IN (<pub-names>); +WHERE C.relnamespace = N.oid AND + PT.pubname IN (<pub-names>) AND + (PS.srrelid = C.oid OR + C.oid IN (SELECT relid FROM pg_partition_ancestors(PS.srrelid) UNION + SELECT relid FROM pg_partition_tree(PS.srrelid)));2) The same should be handled in the PG17 version patch too.
3) Currently the setup is done like:
node_B(table tab_part2 - publication pub_b_a) replicating to
node_A(sub_a_b subscription)
node_A(table tab_main - publication pub_a_c) replicating to node_C(sub_a_c)+############################################################################### +# Specifying origin = NONE and copy_data = on must raise WARNING if we subscribe +# to a partitioned table and this table contains any remotely originated data. +############################################################################### + +# create a partition table on node A +$node_A->safe_psql( + 'postgres', qq( +CREATE TABLE tab_main(a int) PARTITION BY RANGE(a); +CREATE TABLE tab_part1 PARTITION OF tab_main FOR VALUES FROM (0) TO (5); +CREATE TABLE tab_part2(a int) PARTITION BY RANGE(a); +CREATE TABLE tab_part2_1 PARTITION OF tab_part2 FOR VALUES FROM (5) TO (10); +ALTER TABLE tab_main ATTACH PARTITION tab_part2 FOR VALUES FROM (5) to (10); +)); + +# create a table on node B which will act as a source for a partition on node A +$node_B->safe_psql( + 'postgres', qq(Can we change this like below to make review easier:
node_A(table tab_part2 - publication pub_b_a) replicating to
node_B(sub_a_b subscription)
node_B(table tab_main - publication pub_a_c) replicating to node_C(sub_a_c)Also add something similar like above to the comment.
I have addressed the comments. Here is an updated patch.
Thanks and Regards,
Shlok Kyal
Attachments:
v8-0001-Improve-logging-for-data-origin-discrepancies-in-.patchapplication/x-patch; name=v8-0001-Improve-logging-for-data-origin-discrepancies-in-.patchDownload
From ab9f7ffa4810959577b1d0690519717e8543485a Mon Sep 17 00:00:00 2001
From: Hou Zhijie <houzj.fnst@cn.fujitsu.com>
Date: Tue, 21 Jan 2025 16:18:34 +0800
Subject: [PATCH v8 1/2] Improve logging for data origin discrepancies in table
synchronization
Previously, a WARNING was issued during initial table synchronization with
origin=NONE only when the publisher subscribed to the same table from
other publishers, indicating potential data origination from different origins.
However, it's possible for the publisher to subscribe to the partition
ancestors or partition children of the table from other publishers, which could
also result in mixed-origin data inclusion.
This patch expands the check to consider both the subscribed table's ancestors
and children. A WARNING will now be logged if any of these related tables could
contain data originating from different sources.
---
doc/src/sgml/ref/create_subscription.sgml | 12 +++++++-----
src/backend/commands/subscriptioncmds.c | 15 +++++++++------
2 files changed, 16 insertions(+), 11 deletions(-)
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 6cf7d4f9a1..57dec28a5d 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -534,12 +534,14 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<programlisting>
# substitute <pub-names> below with your publication name(s) to be queried
SELECT DISTINCT PT.schemaname, PT.tablename
-FROM pg_publication_tables PT,
+FROM pg_publication_tables PT
+ JOIN pg_class C ON (C.relname = PT.tablename)
+ JOIN pg_namespace N ON (N.nspname = PT.schemaname),
pg_subscription_rel PS
- JOIN pg_class C ON (C.oid = PS.srrelid)
- JOIN pg_namespace N ON (N.oid = C.relnamespace)
-WHERE N.nspname = PT.schemaname AND
- C.relname = PT.tablename AND
+WHERE C.relnamespace = N.oid AND
+ (PS.srrelid = C.oid OR
+ C.oid IN (SELECT relid FROM pg_partition_ancestors(PS.srrelid) UNION
+ SELECT relid FROM pg_partition_tree(PS.srrelid))) AND
PT.pubname IN (<pub-names>);
</programlisting></para>
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 2d8a71ca1e..4aec73bcc6 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -2083,11 +2083,12 @@ AlterSubscriptionOwner_oid(Oid subid, Oid newOwnerId)
}
/*
- * Check and log a warning if the publisher has subscribed to the same table
- * from some other publisher. This check is required only if "copy_data = true"
- * and "origin = none" for CREATE SUBSCRIPTION and
- * ALTER SUBSCRIPTION ... REFRESH statements to notify the user that data
- * having origin might have been copied.
+ * Check and log a warning if the publisher has subscribed to the same table,
+ * its partition ancestors (if it's a partition), or its partition children (if
+ * it's a partitioned table), from some other publishers. This check is
+ * required only if "copy_data = true" and "origin = none" for CREATE
+ * SUBSCRIPTION and ALTER SUBSCRIPTION ... REFRESH statements to notify the
+ * user that data having origin might have been copied.
*
* This check need not be performed on the tables that are already added
* because incremental sync for those tables will happen through WAL and the
@@ -2117,7 +2118,9 @@ check_publications_origin(WalReceiverConn *wrconn, List *publications,
"SELECT DISTINCT P.pubname AS pubname\n"
"FROM pg_publication P,\n"
" LATERAL pg_get_publication_tables(P.pubname) GPT\n"
- " JOIN pg_subscription_rel PS ON (GPT.relid = PS.srrelid),\n"
+ " JOIN pg_subscription_rel PS ON (GPT.relid = PS.srrelid OR"
+ " GPT.relid IN (SELECT relid FROM pg_partition_ancestors(PS.srrelid) UNION"
+ " SELECT relid FROM pg_partition_tree(PS.srrelid))),\n"
" pg_class C JOIN pg_namespace N ON (N.oid = C.relnamespace)\n"
"WHERE C.oid = GPT.relid AND P.pubname IN (");
GetPublicationsStr(publications, &cmd, true);
--
2.34.1
v8-PG_17-PG_16-0001-Improve-logging-for-data-origin-discr.patchapplication/x-patch; name=v8-PG_17-PG_16-0001-Improve-logging-for-data-origin-discr.patchDownload
From 7da437f55ec13134d070d1a1aeed143dbc856a98 Mon Sep 17 00:00:00 2001
From: Shlok Kyal <shlok.kyal.oss@gmail.com>
Date: Thu, 23 Jan 2025 12:59:19 +0530
Subject: [PATCH v8-PG_17-PG_16] Improve logging for data origin discrepancies
in table synchronization
Previously, a WARNING was issued during initial table synchronization with
origin=NONE only when the publisher subscribed to the same table from
other publishers, indicating potential data origination from different origins.
However, it's possible for the publisher to subscribe to the partition
ancestors or partition children of the table from other publishers, which could
also result in mixed-origin data inclusion.
This patch expands the check to consider both the subscribed table's ancestors
and children. A WARNING will now be logged if any of these related tables could
contain data originating from different sources.
---
doc/src/sgml/ref/create_subscription.sgml | 12 +++++++-----
src/backend/commands/subscriptioncmds.c | 15 +++++++++------
2 files changed, 16 insertions(+), 11 deletions(-)
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 740b7d9421..c9c8dd440d 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -526,12 +526,14 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<programlisting>
# substitute <pub-names> below with your publication name(s) to be queried
SELECT DISTINCT PT.schemaname, PT.tablename
-FROM pg_publication_tables PT,
+FROM pg_publication_tables PT
+ JOIN pg_class C ON (C.relname = PT.tablename)
+ JOIN pg_namespace N ON (N.nspname = PT.schemaname),
pg_subscription_rel PS
- JOIN pg_class C ON (C.oid = PS.srrelid)
- JOIN pg_namespace N ON (N.oid = C.relnamespace)
-WHERE N.nspname = PT.schemaname AND
- C.relname = PT.tablename AND
+WHERE C.relnamespace = N.oid AND
+ (PS.srrelid = C.oid OR
+ C.oid IN (SELECT relid FROM pg_partition_ancestors(PS.srrelid) UNION
+ SELECT relid FROM pg_partition_tree(PS.srrelid))) AND
PT.pubname IN (<pub-names>);
</programlisting></para>
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 8ecb6e0bb8..9467f58a23 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -2012,11 +2012,12 @@ AlterSubscriptionOwner_oid(Oid subid, Oid newOwnerId)
}
/*
- * Check and log a warning if the publisher has subscribed to the same table
- * from some other publisher. This check is required only if "copy_data = true"
- * and "origin = none" for CREATE SUBSCRIPTION and
- * ALTER SUBSCRIPTION ... REFRESH statements to notify the user that data
- * having origin might have been copied.
+ * Check and log a warning if the publisher has subscribed to the same table,
+ * its partition ancestors (if it's a partition), or its partition children (if
+ * it's a partitioned table), from some other publishers. This check is
+ * required only if "copy_data = true" and "origin = none" for CREATE
+ * SUBSCRIPTION and ALTER SUBSCRIPTION ... REFRESH statements to notify the
+ * user that data having origin might have been copied.
*
* This check need not be performed on the tables that are already added
* because incremental sync for those tables will happen through WAL and the
@@ -2046,7 +2047,9 @@ check_publications_origin(WalReceiverConn *wrconn, List *publications,
"SELECT DISTINCT P.pubname AS pubname\n"
"FROM pg_publication P,\n"
" LATERAL pg_get_publication_tables(P.pubname) GPT\n"
- " JOIN pg_subscription_rel PS ON (GPT.relid = PS.srrelid),\n"
+ " JOIN pg_subscription_rel PS ON (GPT.relid = PS.srrelid OR"
+ " GPT.relid IN (SELECT relid FROM pg_partition_ancestors(PS.srrelid) UNION"
+ " SELECT relid FROM pg_partition_tree(PS.srrelid))),\n"
" pg_class C JOIN pg_namespace N ON (N.oid = C.relnamespace)\n"
"WHERE C.oid = GPT.relid AND P.pubname IN (");
get_publications_str(publications, &cmd, true);
--
2.34.1
v8-0002-Test-for-Improve-logging-for-data-origin-discrepa.patchapplication/x-patch; name=v8-0002-Test-for-Improve-logging-for-data-origin-discrepa.patchDownload
From 10d4cab23486d04004114ef415cbd4b79ff31d0e Mon Sep 17 00:00:00 2001
From: Shlok Kyal <shlok.kyal.oss@gmail.com>
Date: Wed, 22 Jan 2025 20:59:17 +0530
Subject: [PATCH v8 2/2] Test for Improve logging for data origin discrepancies
in table synchronization
---
src/test/subscription/t/030_origin.pl | 104 ++++++++++++++++++++++++++
1 file changed, 104 insertions(+)
diff --git a/src/test/subscription/t/030_origin.pl b/src/test/subscription/t/030_origin.pl
index eb1dcb7271..390623d2da 100644
--- a/src/test/subscription/t/030_origin.pl
+++ b/src/test/subscription/t/030_origin.pl
@@ -255,6 +255,110 @@ $node_A->safe_psql('postgres', "DROP TABLE tab_new");
$node_B->safe_psql('postgres', "DROP TABLE tab_new");
$node_A->safe_psql('postgres', "DROP SUBSCRIPTION $subname_AB2");
+###############################################################################
+# Specifying origin = NONE and copy_data = on must raise WARNING if we subscribe
+# to a partitioned table and this table contains any remotely originated data.
+#
+# node_B
+# __________________________
+# | tab_main | --------------> node_C (tab_main)
+# |__________________________|
+# | tab_part1 | tab_part2 | <-------------- node_A (tab_part2)
+# |____________|_____________|
+# | tab_part2_1 |
+# |_____________|
+#
+# node_B
+# __________________________
+# | tab_main |
+# |__________________________|
+# | tab_part1 | tab_part2 | <-------------- node_A (tab_part2)
+# |____________|_____________|
+# | tab_part2_1 | --------------> node_C (tab_part2_1)
+# |_____________|
+###############################################################################
+
+# create a table on node A which will act as a source for a partition on node B
+$node_A->safe_psql(
+ 'postgres', qq(
+CREATE TABLE tab_part2(a int);
+CREATE PUBLICATION pub_a_b FOR TABLE tab_part2;
+));
+
+# create a partition table on node B
+$node_B->safe_psql(
+ 'postgres', qq(
+CREATE TABLE tab_main(a int) PARTITION BY RANGE(a);
+CREATE TABLE tab_part1 PARTITION OF tab_main FOR VALUES FROM (0) TO (5);
+CREATE TABLE tab_part2(a int) PARTITION BY RANGE(a);
+CREATE TABLE tab_part2_1 PARTITION OF tab_part2 FOR VALUES FROM (5) TO (10);
+ALTER TABLE tab_main ATTACH PARTITION tab_part2 FOR VALUES FROM (5) to (10);
+CREATE SUBSCRIPTION sub_a_b CONNECTION '$node_A_connstr' PUBLICATION pub_a_b;
+));
+
+# create a table on node C
+$node_C->safe_psql(
+ 'postgres', qq(
+CREATE TABLE tab_main(a int);
+CREATE TABLE tab_part2_1(a int);
+));
+
+# create a logical replication setup between node B and node C with
+# subscription on node C having origin = NONE and copy_data = on
+$node_B->safe_psql(
+ 'postgres', qq(
+CREATE PUBLICATION pub_b_c FOR TABLE tab_main WITH (publish_via_partition_root);
+CREATE PUBLICATION pub_b_c_2 FOR TABLE tab_part2_1;
+));
+
+($result, $stdout, $stderr) = $node_C->psql(
+ 'postgres', "
+ CREATE SUBSCRIPTION sub_b_c CONNECTION '$node_B_connstr' PUBLICATION pub_b_c WITH (origin = none, copy_data = on);
+");
+
+# A warning must be logged as a partition 'tab_part2' in node B is subscribed to
+# node A so partition 'tab_part2' can have remotely originated data
+like(
+ $stderr,
+ qr/WARNING: ( [A-Z0-9]+:)? subscription "sub_b_c" requested copy_data with origin = NONE but might copy data that had a different origin/,
+ "Create subscription with origin = none and copy_data when the publisher's partition is subscribing from different origin"
+);
+$node_C->safe_psql('postgres', "DROP SUBSCRIPTION sub_b_c");
+
+($result, $stdout, $stderr) = $node_C->psql(
+ 'postgres', "
+ CREATE SUBSCRIPTION sub_b_c CONNECTION '$node_B_connstr' PUBLICATION pub_b_c_2 WITH (origin = none, copy_data = on);
+");
+
+# A warning must be logged as ancestor of table 'tab_part2_1' in node A have
+# been subscribed to node B so table 'tab_part2_1' can have remotely originated
+# data
+like(
+ $stderr,
+ qr/WARNING: ( [A-Z0-9]+:)? subscription "sub_b_c" requested copy_data with origin = NONE but might copy data that had a different origin/,
+ "Create subscription with origin = none and copy_data when the publisher's ancestor is subscribing from different origin"
+);
+
+# clear the operations done by this test
+$node_C->safe_psql(
+ 'postgres', qq(
+DROP SUBSCRIPTION sub_b_c;
+DROP TABLE tab_main;
+DROP TABLE tab_part2_1;
+));
+$node_B->safe_psql(
+ 'postgres', qq(
+DROP SUBSCRIPTION sub_a_b;
+DROP PUBLICATION pub_b_c;
+DROP PUBLICATION pub_b_c_2;
+DROP TABLE tab_main;
+));
+$node_A->safe_psql(
+ 'postgres', qq(
+DROP PUBLICATION pub_a_b;
+DROP TABLE tab_part2;
+));
+
# shutdown
$node_B->stop('fast');
$node_A->stop('fast');
--
2.34.1
On Wednesday, January 29, 2025 8:19 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
I have addressed the comments. Here is an updated patch.
Thanks for updating the patch. The patches look mostly OK to me, I only have
one minor comments in 0002.
1.
+CREATE PUBLICATION pub_b_c_2 FOR TABLE tab_part2_1;
+));
+
+($result, $stdout, $stderr) = $node_C->psql(
+ 'postgres', "
+ CREATE SUBSCRIPTION sub_b_c CONNECTION '$node_B_connstr' PUBLICATION pub_b_c WITH (origin = none, copy_data = on);
+");
The naming style of new publications and subscriptions doesn't seem consistent
with existing ones in 030_origin.
Best Regards,
Hou zj
On Thu, 20 Feb 2025 at 16:32, Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:
On Wednesday, January 29, 2025 8:19 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
I have addressed the comments. Here is an updated patch.
Thanks for updating the patch. The patches look mostly OK to me, I only have
one minor comments in 0002.1.
+CREATE PUBLICATION pub_b_c_2 FOR TABLE tab_part2_1; +)); + +($result, $stdout, $stderr) = $node_C->psql( + 'postgres', " + CREATE SUBSCRIPTION sub_b_c CONNECTION '$node_B_connstr' PUBLICATION pub_b_c WITH (origin = none, copy_data = on); +");The naming style of new publications and subscriptions doesn't seem consistent
with existing ones in 030_origin.
I have addressed the comments and attached the updated v9 patch.
I have also combined the 0001 and 0002 patch and updated the commit
message as per off list discussion.
Thanks and Regards
Shlok Kyal
Attachments:
v9-0001-Fix-a-WARNING-for-data-origin-discrepancies.patchapplication/x-patch; name=v9-0001-Fix-a-WARNING-for-data-origin-discrepancies.patchDownload
From 9ecf7c638c41c161da36b8862e26f42449d55641 Mon Sep 17 00:00:00 2001
From: Hou Zhijie <houzj.fnst@cn.fujitsu.com>
Date: Tue, 21 Jan 2025 16:18:34 +0800
Subject: [PATCH v9] Fix a WARNING for data origin discrepancies.
Previously, a WARNING was issued at the time of defining a subscription
with origin=NONE only when the publisher subscribed to the same table from
other publishers, indicating potential data origination from different
origins. However, the publisher can subscribe to the partition ancestors
or partition children of the table from other publishers, which could also
result in mixed-origin data inclusion. So, give a WARNING in those cases
as well.
Reported-by: Sergey Tatarintsev <s.tatarintsev@postgrespro.ru>
Author: Hou Zhijie <houzj.fnst@fujitsu.com>
Author: Shlok Kyal <shlok.kyal.oss@gmail.com>
Reviewed-by: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Backpatch-through: 16, where it was introduced
---
doc/src/sgml/ref/create_subscription.sgml | 12 +--
src/backend/commands/subscriptioncmds.c | 15 ++--
src/test/subscription/t/030_origin.pl | 104 ++++++++++++++++++++++
3 files changed, 120 insertions(+), 11 deletions(-)
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 6cf7d4f9a1a..57dec28a5df 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -534,12 +534,14 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<programlisting>
# substitute <pub-names> below with your publication name(s) to be queried
SELECT DISTINCT PT.schemaname, PT.tablename
-FROM pg_publication_tables PT,
+FROM pg_publication_tables PT
+ JOIN pg_class C ON (C.relname = PT.tablename)
+ JOIN pg_namespace N ON (N.nspname = PT.schemaname),
pg_subscription_rel PS
- JOIN pg_class C ON (C.oid = PS.srrelid)
- JOIN pg_namespace N ON (N.oid = C.relnamespace)
-WHERE N.nspname = PT.schemaname AND
- C.relname = PT.tablename AND
+WHERE C.relnamespace = N.oid AND
+ (PS.srrelid = C.oid OR
+ C.oid IN (SELECT relid FROM pg_partition_ancestors(PS.srrelid) UNION
+ SELECT relid FROM pg_partition_tree(PS.srrelid))) AND
PT.pubname IN (<pub-names>);
</programlisting></para>
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 2d8a71ca1e1..4aec73bcc6b 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -2083,11 +2083,12 @@ AlterSubscriptionOwner_oid(Oid subid, Oid newOwnerId)
}
/*
- * Check and log a warning if the publisher has subscribed to the same table
- * from some other publisher. This check is required only if "copy_data = true"
- * and "origin = none" for CREATE SUBSCRIPTION and
- * ALTER SUBSCRIPTION ... REFRESH statements to notify the user that data
- * having origin might have been copied.
+ * Check and log a warning if the publisher has subscribed to the same table,
+ * its partition ancestors (if it's a partition), or its partition children (if
+ * it's a partitioned table), from some other publishers. This check is
+ * required only if "copy_data = true" and "origin = none" for CREATE
+ * SUBSCRIPTION and ALTER SUBSCRIPTION ... REFRESH statements to notify the
+ * user that data having origin might have been copied.
*
* This check need not be performed on the tables that are already added
* because incremental sync for those tables will happen through WAL and the
@@ -2117,7 +2118,9 @@ check_publications_origin(WalReceiverConn *wrconn, List *publications,
"SELECT DISTINCT P.pubname AS pubname\n"
"FROM pg_publication P,\n"
" LATERAL pg_get_publication_tables(P.pubname) GPT\n"
- " JOIN pg_subscription_rel PS ON (GPT.relid = PS.srrelid),\n"
+ " JOIN pg_subscription_rel PS ON (GPT.relid = PS.srrelid OR"
+ " GPT.relid IN (SELECT relid FROM pg_partition_ancestors(PS.srrelid) UNION"
+ " SELECT relid FROM pg_partition_tree(PS.srrelid))),\n"
" pg_class C JOIN pg_namespace N ON (N.oid = C.relnamespace)\n"
"WHERE C.oid = GPT.relid AND P.pubname IN (");
GetPublicationsStr(publications, &cmd, true);
diff --git a/src/test/subscription/t/030_origin.pl b/src/test/subscription/t/030_origin.pl
index eb1dcb72715..38c51a725b8 100644
--- a/src/test/subscription/t/030_origin.pl
+++ b/src/test/subscription/t/030_origin.pl
@@ -255,6 +255,110 @@ $node_A->safe_psql('postgres', "DROP TABLE tab_new");
$node_B->safe_psql('postgres', "DROP TABLE tab_new");
$node_A->safe_psql('postgres', "DROP SUBSCRIPTION $subname_AB2");
+###############################################################################
+# Specifying origin = NONE and copy_data = on must raise WARNING if we subscribe
+# to a partitioned table and this table contains any remotely originated data.
+#
+# node_B
+# __________________________
+# | tab_main | --------------> node_C (tab_main)
+# |__________________________|
+# | tab_part1 | tab_part2 | <-------------- node_A (tab_part2)
+# |____________|_____________|
+# | tab_part2_1 |
+# |_____________|
+#
+# node_B
+# __________________________
+# | tab_main |
+# |__________________________|
+# | tab_part1 | tab_part2 | <-------------- node_A (tab_part2)
+# |____________|_____________|
+# | tab_part2_1 | --------------> node_C (tab_part2_1)
+# |_____________|
+###############################################################################
+
+# create a table on node A which will act as a source for a partition on node B
+$node_A->safe_psql(
+ 'postgres', qq(
+CREATE TABLE tab_part2(a int);
+CREATE PUBLICATION tab_pub_A FOR TABLE tab_part2;
+));
+
+# create a partition table on node B
+$node_B->safe_psql(
+ 'postgres', qq(
+CREATE TABLE tab_main(a int) PARTITION BY RANGE(a);
+CREATE TABLE tab_part1 PARTITION OF tab_main FOR VALUES FROM (0) TO (5);
+CREATE TABLE tab_part2(a int) PARTITION BY RANGE(a);
+CREATE TABLE tab_part2_1 PARTITION OF tab_part2 FOR VALUES FROM (5) TO (10);
+ALTER TABLE tab_main ATTACH PARTITION tab_part2 FOR VALUES FROM (5) to (10);
+CREATE SUBSCRIPTION tab_sub_A_B CONNECTION '$node_A_connstr' PUBLICATION tab_pub_A;
+));
+
+# create a table on node C
+$node_C->safe_psql(
+ 'postgres', qq(
+CREATE TABLE tab_main(a int);
+CREATE TABLE tab_part2_1(a int);
+));
+
+# create a logical replication setup between node B and node C with
+# subscription on node C having origin = NONE and copy_data = on
+$node_B->safe_psql(
+ 'postgres', qq(
+CREATE PUBLICATION tab_pub_B FOR TABLE tab_main WITH (publish_via_partition_root);
+CREATE PUBLICATION tab_pub_B_2 FOR TABLE tab_part2_1;
+));
+
+($result, $stdout, $stderr) = $node_C->psql(
+ 'postgres', "
+ CREATE SUBSCRIPTION tab_sub_B_C CONNECTION '$node_B_connstr' PUBLICATION tab_pub_B WITH (origin = none, copy_data = on);
+");
+
+# A warning must be logged as a partition 'tab_part2' in node B is subscribed to
+# node A so partition 'tab_part2' can have remotely originated data
+like(
+ $stderr,
+ qr/WARNING: ( [A-Z0-9]+:)? subscription "tab_sub_b_c" requested copy_data with origin = NONE but might copy data that had a different origin/,
+ "Create subscription with origin = none and copy_data when the publisher's partition is subscribing from different origin"
+);
+$node_C->safe_psql('postgres', "DROP SUBSCRIPTION tab_sub_B_C");
+
+($result, $stdout, $stderr) = $node_C->psql(
+ 'postgres', "
+ CREATE SUBSCRIPTION tab_sub_B_C CONNECTION '$node_B_connstr' PUBLICATION tab_pub_B_2 WITH (origin = none, copy_data = on);
+");
+
+# A warning must be logged as ancestor of table 'tab_part2_1' in node B is
+# subscribed to node A so table 'tab_part2_1' can have remotely originated
+# data
+like(
+ $stderr,
+ qr/WARNING: ( [A-Z0-9]+:)? subscription "tab_sub_b_c" requested copy_data with origin = NONE but might copy data that had a different origin/,
+ "Create subscription with origin = none and copy_data when the publisher's ancestor is subscribing from different origin"
+);
+
+# clear the operations done by this test
+$node_C->safe_psql(
+ 'postgres', qq(
+DROP SUBSCRIPTION tab_sub_B_C;
+DROP TABLE tab_main;
+DROP TABLE tab_part2_1;
+));
+$node_B->safe_psql(
+ 'postgres', qq(
+DROP SUBSCRIPTION tab_sub_A_B;
+DROP PUBLICATION tab_pub_B;
+DROP PUBLICATION tab_pub_B_2;
+DROP TABLE tab_main;
+));
+$node_A->safe_psql(
+ 'postgres', qq(
+DROP PUBLICATION tab_pub_A;
+DROP TABLE tab_part2;
+));
+
# shutdown
$node_B->stop('fast');
$node_A->stop('fast');
--
2.41.0.windows.3
v9_PG_17-PG_16-0001-Fix-a-WARNING-for-data-origin-discrep.patchapplication/x-patch; name=v9_PG_17-PG_16-0001-Fix-a-WARNING-for-data-origin-discrep.patchDownload
From b80b2e29b96fbb132991299825d6f0e2093dc66f Mon Sep 17 00:00:00 2001
From: Shlok Kyal <shlok.kyal.oss@gmail.com>
Date: Thu, 23 Jan 2025 12:59:19 +0530
Subject: [PATCH v9_PG_17-PG_16] Fix a WARNING for data origin discrepancies.
Previously, a WARNING was issued at the time of defining a subscription
with origin=NONE only when the publisher subscribed to the same table from
other publishers, indicating potential data origination from different
origins. However, the publisher can subscribe to the partition ancestors
or partition children of the table from other publishers, which could also
result in mixed-origin data inclusion. So, give a WARNING in those cases
as well.
Reported-by: Sergey Tatarintsev <s.tatarintsev@postgrespro.ru>
Author: Hou Zhijie <houzj.fnst@fujitsu.com>
Author: Shlok Kyal <shlok.kyal.oss@gmail.com>
Reviewed-by: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Backpatch-through: 16, where it was introduced
---
doc/src/sgml/ref/create_subscription.sgml | 12 +--
src/backend/commands/subscriptioncmds.c | 15 ++--
src/test/subscription/t/030_origin.pl | 104 ++++++++++++++++++++++
3 files changed, 120 insertions(+), 11 deletions(-)
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 740b7d94210..c9c8dd440dc 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -526,12 +526,14 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
<programlisting>
# substitute <pub-names> below with your publication name(s) to be queried
SELECT DISTINCT PT.schemaname, PT.tablename
-FROM pg_publication_tables PT,
+FROM pg_publication_tables PT
+ JOIN pg_class C ON (C.relname = PT.tablename)
+ JOIN pg_namespace N ON (N.nspname = PT.schemaname),
pg_subscription_rel PS
- JOIN pg_class C ON (C.oid = PS.srrelid)
- JOIN pg_namespace N ON (N.oid = C.relnamespace)
-WHERE N.nspname = PT.schemaname AND
- C.relname = PT.tablename AND
+WHERE C.relnamespace = N.oid AND
+ (PS.srrelid = C.oid OR
+ C.oid IN (SELECT relid FROM pg_partition_ancestors(PS.srrelid) UNION
+ SELECT relid FROM pg_partition_tree(PS.srrelid))) AND
PT.pubname IN (<pub-names>);
</programlisting></para>
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 8ecb6e0bb87..9467f58a23d 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -2012,11 +2012,12 @@ AlterSubscriptionOwner_oid(Oid subid, Oid newOwnerId)
}
/*
- * Check and log a warning if the publisher has subscribed to the same table
- * from some other publisher. This check is required only if "copy_data = true"
- * and "origin = none" for CREATE SUBSCRIPTION and
- * ALTER SUBSCRIPTION ... REFRESH statements to notify the user that data
- * having origin might have been copied.
+ * Check and log a warning if the publisher has subscribed to the same table,
+ * its partition ancestors (if it's a partition), or its partition children (if
+ * it's a partitioned table), from some other publishers. This check is
+ * required only if "copy_data = true" and "origin = none" for CREATE
+ * SUBSCRIPTION and ALTER SUBSCRIPTION ... REFRESH statements to notify the
+ * user that data having origin might have been copied.
*
* This check need not be performed on the tables that are already added
* because incremental sync for those tables will happen through WAL and the
@@ -2046,7 +2047,9 @@ check_publications_origin(WalReceiverConn *wrconn, List *publications,
"SELECT DISTINCT P.pubname AS pubname\n"
"FROM pg_publication P,\n"
" LATERAL pg_get_publication_tables(P.pubname) GPT\n"
- " JOIN pg_subscription_rel PS ON (GPT.relid = PS.srrelid),\n"
+ " JOIN pg_subscription_rel PS ON (GPT.relid = PS.srrelid OR"
+ " GPT.relid IN (SELECT relid FROM pg_partition_ancestors(PS.srrelid) UNION"
+ " SELECT relid FROM pg_partition_tree(PS.srrelid))),\n"
" pg_class C JOIN pg_namespace N ON (N.oid = C.relnamespace)\n"
"WHERE C.oid = GPT.relid AND P.pubname IN (");
get_publications_str(publications, &cmd, true);
diff --git a/src/test/subscription/t/030_origin.pl b/src/test/subscription/t/030_origin.pl
index 056561f0084..e2e5b4db213 100644
--- a/src/test/subscription/t/030_origin.pl
+++ b/src/test/subscription/t/030_origin.pl
@@ -208,6 +208,110 @@ $node_A->safe_psql('postgres', "DROP TABLE tab_new");
$node_B->safe_psql('postgres', "DROP TABLE tab_new");
$node_A->safe_psql('postgres', "DROP SUBSCRIPTION $subname_AB2");
+###############################################################################
+# Specifying origin = NONE and copy_data = on must raise WARNING if we subscribe
+# to a partitioned table and this table contains any remotely originated data.
+#
+# node_B
+# __________________________
+# | tab_main | --------------> node_C (tab_main)
+# |__________________________|
+# | tab_part1 | tab_part2 | <-------------- node_A (tab_part2)
+# |____________|_____________|
+# | tab_part2_1 |
+# |_____________|
+#
+# node_B
+# __________________________
+# | tab_main |
+# |__________________________|
+# | tab_part1 | tab_part2 | <-------------- node_A (tab_part2)
+# |____________|_____________|
+# | tab_part2_1 | --------------> node_C (tab_part2_1)
+# |_____________|
+###############################################################################
+
+# create a table on node A which will act as a source for a partition on node B
+$node_A->safe_psql(
+ 'postgres', qq(
+CREATE TABLE tab_part2(a int);
+CREATE PUBLICATION tab_pub_A FOR TABLE tab_part2;
+));
+
+# create a partition table on node B
+$node_B->safe_psql(
+ 'postgres', qq(
+CREATE TABLE tab_main(a int) PARTITION BY RANGE(a);
+CREATE TABLE tab_part1 PARTITION OF tab_main FOR VALUES FROM (0) TO (5);
+CREATE TABLE tab_part2(a int) PARTITION BY RANGE(a);
+CREATE TABLE tab_part2_1 PARTITION OF tab_part2 FOR VALUES FROM (5) TO (10);
+ALTER TABLE tab_main ATTACH PARTITION tab_part2 FOR VALUES FROM (5) to (10);
+CREATE SUBSCRIPTION tab_sub_A_B CONNECTION '$node_A_connstr' PUBLICATION tab_pub_A;
+));
+
+# create a table on node C
+$node_C->safe_psql(
+ 'postgres', qq(
+CREATE TABLE tab_main(a int);
+CREATE TABLE tab_part2_1(a int);
+));
+
+# create a logical replication setup between node B and node C with
+# subscription on node C having origin = NONE and copy_data = on
+$node_B->safe_psql(
+ 'postgres', qq(
+CREATE PUBLICATION tab_pub_B FOR TABLE tab_main WITH (publish_via_partition_root);
+CREATE PUBLICATION tab_pub_B_2 FOR TABLE tab_part2_1;
+));
+
+($result, $stdout, $stderr) = $node_C->psql(
+ 'postgres', "
+ CREATE SUBSCRIPTION tab_sub_B_C CONNECTION '$node_B_connstr' PUBLICATION tab_pub_B WITH (origin = none, copy_data = on);
+");
+
+# A warning must be logged as a partition 'tab_part2' in node B is subscribed to
+# node A so partition 'tab_part2' can have remotely originated data
+like(
+ $stderr,
+ qr/WARNING: ( [A-Z0-9]+:)? subscription "tab_sub_b_c" requested copy_data with origin = NONE but might copy data that had a different origin/,
+ "Create subscription with origin = none and copy_data when the publisher's partition is subscribing from different origin"
+);
+$node_C->safe_psql('postgres', "DROP SUBSCRIPTION tab_sub_B_C");
+
+($result, $stdout, $stderr) = $node_C->psql(
+ 'postgres', "
+ CREATE SUBSCRIPTION tab_sub_B_C CONNECTION '$node_B_connstr' PUBLICATION tab_pub_B_2 WITH (origin = none, copy_data = on);
+");
+
+# A warning must be logged as ancestor of table 'tab_part2_1' in node B is
+# subscribed to node A so table 'tab_part2_1' can have remotely originated
+# data
+like(
+ $stderr,
+ qr/WARNING: ( [A-Z0-9]+:)? subscription "tab_sub_b_c" requested copy_data with origin = NONE but might copy data that had a different origin/,
+ "Create subscription with origin = none and copy_data when the publisher's ancestor is subscribing from different origin"
+);
+
+# clear the operations done by this test
+$node_C->safe_psql(
+ 'postgres', qq(
+DROP SUBSCRIPTION tab_sub_B_C;
+DROP TABLE tab_main;
+DROP TABLE tab_part2_1;
+));
+$node_B->safe_psql(
+ 'postgres', qq(
+DROP SUBSCRIPTION tab_sub_A_B;
+DROP PUBLICATION tab_pub_B;
+DROP PUBLICATION tab_pub_B_2;
+DROP TABLE tab_main;
+));
+$node_A->safe_psql(
+ 'postgres', qq(
+DROP PUBLICATION tab_pub_A;
+DROP TABLE tab_part2;
+));
+
# shutdown
$node_B->stop('fast');
$node_A->stop('fast');
--
2.41.0.windows.3