Data is copied twice when specifying both child and parent table in publication
Hi,
In another logical replication related thread[1]/messages/by-id/CAJcOf-eBhDUT2J5zs8Z0qEMiZUdhinX+buGX3GN4V83fPnZV3Q@mail.gmail.com, my colleague Greg found that
if publish_via_partition_root is true, then the child table's data will be
copied twice when adding both child and parent table to the publication.
Example:
-----
Pub:
create table tbl1 (a int) partition by range (a);
create table tbl1_part1 partition of tbl1 for values from (1) to (10);
create table tbl1_part2 partition of tbl1 for values from (10) to (20);
create publication pub for table tbl1, tbl1_part1 with (publish_via_partition_root=on);
insert into tbl1_part1 values(1);
Sub:
create table tbl1 (a int) partition by range (a);
create table tbl1_part1 partition of tbl1 for values from (1) to (10);
create table tbl1_part2 partition of tbl1 for values from (10) to (20);
create subscription sub CONNECTION 'dbname=postgres port=10000' publication pub;
-- data is copied twice
select * from tbl1_part1;
a
---
1
1
-----
The reason is that the subscriber will fetch the table list from publisher
using the following sql[2]select * from pg_publication_tables; pubname | schemaname | tablename ---------+------------+------------ pub | public | tbl1 pub | public | tbl1_part1 and the subscriber will execute table
synchronization for each table in the query results in this case. But
tbl1_part1 is a partition of tbl1, so the data of tbl1_part1 was copied twice.
[2]: select * from pg_publication_tables; pubname | schemaname | tablename ---------+------------+------------ pub | public | tbl1 pub | public | tbl1_part1
select * from pg_publication_tables;
pubname | schemaname | tablename
---------+------------+------------
pub | public | tbl1
pub | public | tbl1_part1
IMO, it looks like a bug and it's more natural to only execute the table
synchronization for the parent table in the above case. Because as the document
said: if publish_via_partition_root is true, "changes in a partitioned table
(or on its partitions) contained in the publication will be published using the
identity and schema of the partitioned table rather than that of the individual
partitions that are actually changed;"
To fix it, I think we should fix function GetPublicationRelations which
generate data for the view pg_publication_tables and make it only show the
parent table if publish_via_partition_root is true. And for other future
feature like schema level publication, we can also follow this to exclude
partitions if their parent is specified by FOR TABLE in the same publication.
Attach a patch to fix it.
Thoughts ?
[1]: /messages/by-id/CAJcOf-eBhDUT2J5zs8Z0qEMiZUdhinX+buGX3GN4V83fPnZV3Q@mail.gmail.com
Best regards,
Hou zhijie
Attachments:
0001-fix-double-publish.patchapplication/octet-stream; name=0001-fix-double-publish.patchDownload+67-9
On Friday, October 15, 2021 7:23 PM houzj.fnst@fujitsu.com wrote:
Attach a patch to fix it.
Attach a new version patch which refactor the fix code in a cleaner way.
Best regards,
Hou zj
Attachments:
v2-0001-fix-double-publish.patchapplication/octet-stream; name=v2-0001-fix-double-publish.patchDownload+87-5
On Sat, Oct 16, 2021 at 5:30 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
On Friday, October 15, 2021 7:23 PM houzj.fnst@fujitsu.com wrote:
Attach a patch to fix it.
Attach a new version patch which refactor the fix code in a cleaner way.
I have not debugged it yet to find out why, but with the patch
applied, the original double-publish problem that I reported
(converted to just use TABLE rather than ALL TABLES IN SCHEMA) still
occurs.
The steps are below:
CREATE SCHEMA sch;
CREATE SCHEMA sch1;
CREATE TABLE sch.sale (sale_date date not null, country_code text,
product_sku text, units integer) PARTITION BY RANGE (sale_date);
CREATE TABLE sch1.sale_201901 PARTITION OF sch.sale FOR VALUES FROM
('2019-01-01') TO ('2019-02-01');
CREATE TABLE sch1.sale_201902 PARTITION OF sch.sale FOR VALUES FROM
('2019-02-01') TO ('2019-03-01');
(1) PUB: CREATE PUBLICATION pub FOR TABLE sch1.sale_201901,
sch1.sale_201902 WITH (publish_via_partition_root=true);
(2) SUB: CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres
host=localhost port=5432' PUBLICATION pub;
(3) PUB: INSERT INTO sch.sale VALUES('2019-01-01', 'AU', 'cpu', 5),
('2019-01-02', 'AU', 'disk', 8);
(4) SUB: SELECT * FROM sch.sale;
(5) PUB: ALTER PUBLICATION pub ADD TABLE sch.sale;
(6) SUB: ALTER SUBSCRIPTION sub REFRESH PUBLICATION;
(7) SUB: SELECT * FROM sch.sale;
Regards,
Greg Nancarrow
Fujitsu Australia
On Mon, Oct 18, 2021 at 8:27 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Sat, Oct 16, 2021 at 5:30 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:On Friday, October 15, 2021 7:23 PM houzj.fnst@fujitsu.com wrote:
Attach a patch to fix it.
Attach a new version patch which refactor the fix code in a cleaner way.
I have not debugged it yet to find out why, but with the patch
applied, the original double-publish problem that I reported
(converted to just use TABLE rather than ALL TABLES IN SCHEMA) still
occurs.
Yeah, I think this is a variant of the problem being fixed by
Hou-San's patch. I think one possible idea to investigate is that on
the subscriber-side, after fetching tables, we check the already
subscribed tables and if the child tables already exist then we ignore
the parent table and vice versa. We might want to consider the case
where a user has toggled the "publish_via_partition_root" parameter.
It seems both these behaviours/problems exist since commit 17b9e7f9
(Support adding partitioned tables to publication). Adding Amit L and
Peter E (people involved in this work) to know their opinion?
--
With Regards,
Amit Kapila.
On Saturday, October 16, 2021 2:31 PM houzj.fnst@fujitsu.com wrote:
On Friday, October 15, 2021 7:23 PM houzj.fnst@fujitsu.com wrote:
Attach a patch to fix it.
Attach a new version patch which refactor the fix code in a cleaner way.
Although the discussion about the partition behavior[1]/messages/by-id/CAA4eK1JC5sy5M_UVoGdgubHN2--peYqApOJkT=FLCq+VUxqerQ@mail.gmail.com is going on,
attach a refactored fix patch which make the pg_publication_tables view be
consistent for FOR TABLE and FOR ALL TABLES here in case someone want
to have a look.
[1]: /messages/by-id/CAA4eK1JC5sy5M_UVoGdgubHN2--peYqApOJkT=FLCq+VUxqerQ@mail.gmail.com
Best regards,
Hou zj
Attachments:
v3-0001-fix-double-publish.patchapplication/octet-stream; name=v3-0001-fix-double-publish.patchDownload+65-1
On Mon, Oct 18, 2021 at 3:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Oct 18, 2021 at 8:27 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Sat, Oct 16, 2021 at 5:30 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:On Friday, October 15, 2021 7:23 PM houzj.fnst@fujitsu.com wrote:
Attach a patch to fix it.
Attach a new version patch which refactor the fix code in a cleaner way.
I have not debugged it yet to find out why, but with the patch
applied, the original double-publish problem that I reported
(converted to just use TABLE rather than ALL TABLES IN SCHEMA) still
occurs.Yeah, I think this is a variant of the problem being fixed by
Hou-San's patch. I think one possible idea to investigate is that on
the subscriber-side, after fetching tables, we check the already
subscribed tables and if the child tables already exist then we ignore
the parent table and vice versa. We might want to consider the case
where a user has toggled the "publish_via_partition_root" parameter.It seems both these behaviours/problems exist since commit 17b9e7f9
(Support adding partitioned tables to publication). Adding Amit L and
Peter E (people involved in this work) to know their opinion?
I can imagine that the behavior seen here may look surprising, but not
sure if I would call it a bug as such. I do remember thinking about
this case and the current behavior is how I may have coded it to be.
Looking at this command in Hou-san's email:
create publication pub for table tbl1, tbl1_part1 with
(publish_via_partition_root=on);
It's adding both the root partitioned table and the leaf partition
*explicitly*, and it's not clear to me if the latter's inclusion in
the publication should be assumed because the former is found to have
been added to the publication, that is, as far as the latter's
visibility to the subscriber is concerned. It's not a stretch to
imagine that a user may write the command this way to account for a
subscriber node on which tbl1 and tbl1_part1 are unrelated tables.
I don't think we assume anything on the publisher side regarding the
state/configuration of tables on the subscriber side, at least with
publication commands where tables are added to a publication
explicitly, so it is up to the user to make sure that the tables are
not added duplicatively. One may however argue that the way we've
decided to handle FOR ALL TABLES does assume something about
partitions where it skips advertising them to subscribers when
publish_via_partition_root flag is set to true, but that is exactly to
avoid the duplication of data that goes to a subscriber.
--
Amit Langote
EDB: http://www.enterprisedb.com
On Mon, Oct 18, 2021 at 2:32 PM Amit Langote <amitlangote09@gmail.com> wrote:
On Mon, Oct 18, 2021 at 3:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Oct 18, 2021 at 8:27 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Sat, Oct 16, 2021 at 5:30 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:On Friday, October 15, 2021 7:23 PM houzj.fnst@fujitsu.com wrote:
Attach a patch to fix it.
Attach a new version patch which refactor the fix code in a cleaner way.
I have not debugged it yet to find out why, but with the patch
applied, the original double-publish problem that I reported
(converted to just use TABLE rather than ALL TABLES IN SCHEMA) still
occurs.Yeah, I think this is a variant of the problem being fixed by
Hou-San's patch. I think one possible idea to investigate is that on
the subscriber-side, after fetching tables, we check the already
subscribed tables and if the child tables already exist then we ignore
the parent table and vice versa. We might want to consider the case
where a user has toggled the "publish_via_partition_root" parameter.It seems both these behaviours/problems exist since commit 17b9e7f9
(Support adding partitioned tables to publication). Adding Amit L and
Peter E (people involved in this work) to know their opinion?I can imagine that the behavior seen here may look surprising, but not
sure if I would call it a bug as such. I do remember thinking about
this case and the current behavior is how I may have coded it to be.Looking at this command in Hou-san's email:
create publication pub for table tbl1, tbl1_part1 with
(publish_via_partition_root=on);It's adding both the root partitioned table and the leaf partition
*explicitly*, and it's not clear to me if the latter's inclusion in
the publication should be assumed because the former is found to have
been added to the publication, that is, as far as the latter's
visibility to the subscriber is concerned. It's not a stretch to
imagine that a user may write the command this way to account for a
subscriber node on which tbl1 and tbl1_part1 are unrelated tables.I don't think we assume anything on the publisher side regarding the
state/configuration of tables on the subscriber side, at least with
publication commands where tables are added to a publication
explicitly, so it is up to the user to make sure that the tables are
not added duplicatively. One may however argue that the way we've
decided to handle FOR ALL TABLES does assume something about
partitions where it skips advertising them to subscribers when
publish_via_partition_root flag is set to true, but that is exactly to
avoid the duplication of data that goes to a subscriber.
I think the same confusion will then apply to the new feature (For All
Tables In Schema) being discussed [1]/messages/by-id/OS0PR01MB5716B523961FE338EB9B3F9A94BC9@OS0PR01MB5716.jpnprd01.prod.outlook.com (that is a bit long thread so
shared the email where the latest patch version is posted). There
also, the partitioned table and partition can be in a different
schema. We either need to follow "For All Tables" or "For Table"
behavior. Then, there is also an argument that such behavior is not
documented, and by reading "publish_via_partition_root", it is not
clear why would the user expect the current behavior?
Also, what about Greg's case [2]/messages/by-id/CAJcOf-eQR_=q0f4ZVHd342QdLvBd_995peSr4xCU05hrS3TeTg@mail.gmail.com, where I think it is clear that the
subscriber also has partitions?
[1]: /messages/by-id/OS0PR01MB5716B523961FE338EB9B3F9A94BC9@OS0PR01MB5716.jpnprd01.prod.outlook.com
[2]: /messages/by-id/CAJcOf-eQR_=q0f4ZVHd342QdLvBd_995peSr4xCU05hrS3TeTg@mail.gmail.com
--
With Regards,
Amit Kapila.
On Monday, October 18, 2021 5:03 PM Amit Langote <amitlangote09@gmail.com> wrote:
I can imagine that the behavior seen here may look surprising, but not
sure if I would call it a bug as such. I do remember thinking about
this case and the current behavior is how I may have coded it to be.Looking at this command in Hou-san's email:
create publication pub for table tbl1, tbl1_part1 with
(publish_via_partition_root=on);It's adding both the root partitioned table and the leaf partition
*explicitly*, and it's not clear to me if the latter's inclusion in
the publication should be assumed because the former is found to have
been added to the publication, that is, as far as the latter's
visibility to the subscriber is concerned. It's not a stretch to
imagine that a user may write the command this way to account for a
subscriber node on which tbl1 and tbl1_part1 are unrelated tables.I don't think we assume anything on the publisher side regarding the
state/configuration of tables on the subscriber side, at least with
publication commands where tables are added to a publication
explicitly, so it is up to the user to make sure that the tables are
not added duplicatively. One may however argue that the way we've
decided to handle FOR ALL TABLES does assume something about
partitions where it skips advertising them to subscribers when
publish_via_partition_root flag is set to true, but that is exactly to
avoid the duplication of data that goes to a subscriber.
Hi,
Thanks for the explanation.
I think one reason that I consider this behavior a bug is that: If we add
both the root partitioned table and the leaf partition explicitly to the
publication (and set publish_via_partition_root = on), the behavior of the
apply worker is inconsistent with the behavior of table sync worker.
In this case, all changes in the leaf the partition will be applied using the
identity and schema of the partitioned(root) table. But for the table sync, it
will execute table sync for both the leaf and the root table which cause
duplication of data.
Wouldn't it be better to make the behavior consistent here ?
Best regards,
Hou zj
On Mon, Oct 18, 2021 at 2:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Oct 18, 2021 at 2:32 PM Amit Langote <amitlangote09@gmail.com> wrote:
Looking at this command in Hou-san's email:
create publication pub for table tbl1, tbl1_part1 with
(publish_via_partition_root=on);It's adding both the root partitioned table and the leaf partition
*explicitly*, and it's not clear to me if the latter's inclusion in
the publication should be assumed because the former is found to have
been added to the publication, that is, as far as the latter's
visibility to the subscriber is concerned. It's not a stretch to
imagine that a user may write the command this way to account for a
subscriber node on which tbl1 and tbl1_part1 are unrelated tables.I don't think we assume anything on the publisher side regarding the
state/configuration of tables on the subscriber side, at least with
publication commands where tables are added to a publication
explicitly, so it is up to the user to make sure that the tables are
not added duplicatively. One may however argue that the way we've
decided to handle FOR ALL TABLES does assume something about
partitions where it skips advertising them to subscribers when
publish_via_partition_root flag is set to true, but that is exactly to
avoid the duplication of data that goes to a subscriber.I think the same confusion will then apply to the new feature (For All
Tables In Schema) being discussed [1] (that is a bit long thread so
shared the email where the latest patch version is posted). There
also, the partitioned table and partition can be in a different
schema.
Sorry, I wanted to say that table partition and partitioned table can
be in the same schema. Now, if the user publishes all tables in a
schema, if we want to follow the "For All Tables" behavior then we
should skip the leaf table and publish only the parent table, OTOH, if
want to follow "For Table" behavior, we need to publish both
partitioned table and partition table. I feel it is better to be
consistent here in all three cases ("For Table", "For All Tables", and
"For All Tables In Schema") as it will be easier to explain and
document it.
--
With Regards,
Amit Kapila.
On Tue, Oct 19, 2021 at 8:45 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Oct 18, 2021 at 2:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Oct 18, 2021 at 2:32 PM Amit Langote <amitlangote09@gmail.com> wrote:
Looking at this command in Hou-san's email:
create publication pub for table tbl1, tbl1_part1 with
(publish_via_partition_root=on);It's adding both the root partitioned table and the leaf partition
*explicitly*, and it's not clear to me if the latter's inclusion in
the publication should be assumed because the former is found to have
been added to the publication, that is, as far as the latter's
visibility to the subscriber is concerned. It's not a stretch to
imagine that a user may write the command this way to account for a
subscriber node on which tbl1 and tbl1_part1 are unrelated tables.I don't think we assume anything on the publisher side regarding the
state/configuration of tables on the subscriber side, at least with
publication commands where tables are added to a publication
explicitly, so it is up to the user to make sure that the tables are
not added duplicatively. One may however argue that the way we've
decided to handle FOR ALL TABLES does assume something about
partitions where it skips advertising them to subscribers when
publish_via_partition_root flag is set to true, but that is exactly to
avoid the duplication of data that goes to a subscriber.I think the same confusion will then apply to the new feature (For All
Tables In Schema) being discussed [1] (that is a bit long thread so
shared the email where the latest patch version is posted). There
also, the partitioned table and partition can be in a different
schema.Sorry, I wanted to say that table partition and partitioned table can
be in the same schema. Now, if the user publishes all tables in a
schema, if we want to follow the "For All Tables" behavior then we
should skip the leaf table and publish only the parent table, OTOH, if
want to follow "For Table" behavior, we need to publish both
partitioned table and partition table. I feel it is better to be
consistent here in all three cases ("For Table", "For All Tables", and
"For All Tables In Schema") as it will be easier to explain and
document it.
Thinking some more about it, I think we also have a problem when the
partitioned and partition tables are in different schemas especially
when the user created a publication having a combination of "For
Table" and "For All Tables In Schema", see below:
create schema sch1;
create schema sch2;
create table sch1.tbl1 (a int) partition by range ( a );
create table sch2.tbl1_part1 partition of sch1.tbl1 for values from
(1) to (101);
create table sch1.tbl1_part2 partition of sch1.tbl1 for values from
(101) to (200);
create publication mypub for table sch2.tbl1_part1, all tables in
schema sch1 WITH (publish_via_partition_root = true);
Now, here if we follow the rules of "For Table", then we should get
both partitioned and partition tables which will be different from the
case when both are in the same schema considering we follow "For All
Tables" behavior in "For All Tables In Schema" case.
The point is that as we extend the current feature, I think the
complications will increase if we don't have a consistent behavior for
all cases and it will also be difficult to explain it to users.
--
With Regards,
Amit Kapila.
On Tue, Oct 19, 2021 at 8:17 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
Thanks for the explanation.
I think one reason that I consider this behavior a bug is that: If we add
both the root partitioned table and the leaf partition explicitly to the
publication (and set publish_via_partition_root = on), the behavior of the
apply worker is inconsistent with the behavior of table sync worker.In this case, all changes in the leaf the partition will be applied using the
identity and schema of the partitioned(root) table. But for the table sync, it
will execute table sync for both the leaf and the root table which cause
duplication of data.Wouldn't it be better to make the behavior consistent here ?
I agree with the point, whether we are doing the initial sync or we
are doing transaction streaming the behavior should be the same. I
think the right behavior should be that even if user has given both
parent table and the child table in the published table list, it
should sync it only once, because consider the case where we add a
same table twice e.g (CREATE PUBLICATION mypub FOR TABLE t1,t1;) but
in that case also we consider this table only once and there will be
no duplicate data.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Mon, Oct 18, 2021 at 5:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I have not debugged it yet to find out why, but with the patch
applied, the original double-publish problem that I reported
(converted to just use TABLE rather than ALL TABLES IN SCHEMA) still
occurs.Yeah, I think this is a variant of the problem being fixed by
Hou-San's patch. I think one possible idea to investigate is that on
the subscriber-side, after fetching tables, we check the already
subscribed tables and if the child tables already exist then we ignore
the parent table and vice versa. We might want to consider the case
where a user has toggled the "publish_via_partition_root" parameter.It seems both these behaviours/problems exist since commit 17b9e7f9
(Support adding partitioned tables to publication). Adding Amit L and
Peter E (people involved in this work) to know their opinion?
Actually, at least with the scenario I gave steps for, after looking
at it again and debugging, I think that the behavior is understandable
and not a bug.
The reason is that the INSERTed data is first published though the
partitions, since initially there is no partitioned table in the
publication (so publish_via_partition_root=true doesn't have any
effect). But then adding the partitioned table to the publication and
refreshing the publication in the subscriber, the data is then
published "using the identity and schema of the partitioned table" due
to publish_via_partition_root=true. Note that the corresponding table
in the subscriber may well be a non-partitioned table (or the
partitions arranged differently) so the data does need to be
replicated again.
Regards,
Greg Nancarrow
Fujitsu Australia
On Wed, Oct 20, 2021 at 12:44 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Mon, Oct 18, 2021 at 5:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I have not debugged it yet to find out why, but with the patch
applied, the original double-publish problem that I reported
(converted to just use TABLE rather than ALL TABLES IN SCHEMA) still
occurs.Yeah, I think this is a variant of the problem being fixed by
Hou-San's patch. I think one possible idea to investigate is that on
the subscriber-side, after fetching tables, we check the already
subscribed tables and if the child tables already exist then we ignore
the parent table and vice versa. We might want to consider the case
where a user has toggled the "publish_via_partition_root" parameter.It seems both these behaviours/problems exist since commit 17b9e7f9
(Support adding partitioned tables to publication). Adding Amit L and
Peter E (people involved in this work) to know their opinion?Actually, at least with the scenario I gave steps for, after looking
at it again and debugging, I think that the behavior is understandable
and not a bug.
The reason is that the INSERTed data is first published though the
partitions, since initially there is no partitioned table in the
publication (so publish_via_partition_root=true doesn't have any
effect). But then adding the partitioned table to the publication and
refreshing the publication in the subscriber, the data is then
published "using the identity and schema of the partitioned table" due
to publish_via_partition_root=true. Note that the corresponding table
in the subscriber may well be a non-partitioned table (or the
partitions arranged differently) so the data does need to be
replicated again.
I don't think this behavior is consistent, I mean for the initial sync
we will replicate the duplicate data, whereas for later streaming we
will only replicate it once. From the user POW, this behavior doesn't
look correct.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Wed, Oct 20, 2021 at 1:32 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Wed, Oct 20, 2021 at 12:44 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Mon, Oct 18, 2021 at 5:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I have not debugged it yet to find out why, but with the patch
applied, the original double-publish problem that I reported
(converted to just use TABLE rather than ALL TABLES IN SCHEMA) still
occurs.Yeah, I think this is a variant of the problem being fixed by
Hou-San's patch. I think one possible idea to investigate is that on
the subscriber-side, after fetching tables, we check the already
subscribed tables and if the child tables already exist then we ignore
the parent table and vice versa. We might want to consider the case
where a user has toggled the "publish_via_partition_root" parameter.It seems both these behaviours/problems exist since commit 17b9e7f9
(Support adding partitioned tables to publication). Adding Amit L and
Peter E (people involved in this work) to know their opinion?Actually, at least with the scenario I gave steps for, after looking
at it again and debugging, I think that the behavior is understandable
and not a bug.
The reason is that the INSERTed data is first published though the
partitions, since initially there is no partitioned table in the
publication (so publish_via_partition_root=true doesn't have any
effect). But then adding the partitioned table to the publication and
refreshing the publication in the subscriber, the data is then
published "using the identity and schema of the partitioned table" due
to publish_via_partition_root=true. Note that the corresponding table
in the subscriber may well be a non-partitioned table (or the
partitions arranged differently) so the data does need to be
replicated again.
Even if the partitions are arranged differently why would the user
expect the same data to be replicated twice?
I don't think this behavior is consistent, I mean for the initial sync
we will replicate the duplicate data, whereas for later streaming we
will only replicate it once. From the user POW, this behavior doesn't
look correct.
+1.
--
With Regards,
Amit Kapila.
On Wed, Oct 20, 2021 at 7:02 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
Actually, at least with the scenario I gave steps for, after looking
at it again and debugging, I think that the behavior is understandable
and not a bug.
The reason is that the INSERTed data is first published though the
partitions, since initially there is no partitioned table in the
publication (so publish_via_partition_root=true doesn't have any
effect). But then adding the partitioned table to the publication and
refreshing the publication in the subscriber, the data is then
published "using the identity and schema of the partitioned table" due
to publish_via_partition_root=true. Note that the corresponding table
in the subscriber may well be a non-partitioned table (or the
partitions arranged differently) so the data does need to be
replicated again.I don't think this behavior is consistent, I mean for the initial sync
we will replicate the duplicate data, whereas for later streaming we
will only replicate it once. From the user POW, this behavior doesn't
look correct.
The scenario I gave steps for didn't have any table data when the
subscription was made, so the initial sync did not replicate any data.
I was referring to the double-publish that occurs when
publish_via_partition_root=true and then the partitioned table is
added to the publication and the subscriber does ALTER SUBSCRIPTION
... REFRESH PUBLICATION.
If I modify my example to include both the partitioned table and
(explicitly) its child partitions in the publication, and insert some
data on the publisher side prior to the subscription, then I am seeing
duplicate data on the initial sync on the subscriber side, and I would
agree that this doesn't seem correct.
Regards,
Greg Nancarrow
Fujitsu Australia
On Wed, Oct 20, 2021 at 7:59 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Actually, at least with the scenario I gave steps for, after looking
at it again and debugging, I think that the behavior is understandable
and not a bug.
The reason is that the INSERTed data is first published though the
partitions, since initially there is no partitioned table in the
publication (so publish_via_partition_root=true doesn't have any
effect). But then adding the partitioned table to the publication and
refreshing the publication in the subscriber, the data is then
published "using the identity and schema of the partitioned table" due
to publish_via_partition_root=true. Note that the corresponding table
in the subscriber may well be a non-partitioned table (or the
partitions arranged differently) so the data does need to be
replicated again.Even if the partitions are arranged differently why would the user
expect the same data to be replicated twice?
It's the same data, but published in different ways because of changes
the user made to the publication.
I am not talking in general, I am specifically referring to the
scenario I gave steps for.
In the example scenario I gave, initially when the subscription was
made, the publication just explicitly included the partitions, but
publish_via_partition_root was true. So in this case it publishes
through the individual partitions (as no partitioned table is present
in the publication). Then on the publisher side, the partitioned table
was then added to the publication and then ALTER SUBSCRIPTION ...
REFRESH PUBLICATION done on the subscriber side. Now that the
partitioned table is present in the publication and
publish_via_partition_root is true, it is "published using the
identity and schema of the partitioned table rather than that of the
individual partitions that are actually changed". So the data is
replicated again.
This scenario didn't use initial table data, so initial table sync
didn't come into play (although as I previously posted, I can see a
double-publish issue on initial sync if data is put in the table prior
to subscription and partitions have been explicitly added to the
publication).
Regards,
Greg Nancarrow
Fujitsu Australia
On Wed, Oct 20, 2021 at 3:03 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Wed, Oct 20, 2021 at 7:59 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Actually, at least with the scenario I gave steps for, after looking
at it again and debugging, I think that the behavior is understandable
and not a bug.
The reason is that the INSERTed data is first published though the
partitions, since initially there is no partitioned table in the
publication (so publish_via_partition_root=true doesn't have any
effect). But then adding the partitioned table to the publication and
refreshing the publication in the subscriber, the data is then
published "using the identity and schema of the partitioned table" due
to publish_via_partition_root=true. Note that the corresponding table
in the subscriber may well be a non-partitioned table (or the
partitions arranged differently) so the data does need to be
replicated again.Even if the partitions are arranged differently why would the user
expect the same data to be replicated twice?It's the same data, but published in different ways because of changes
the user made to the publication.
I am not talking in general, I am specifically referring to the
scenario I gave steps for.
In the example scenario I gave, initially when the subscription was
made, the publication just explicitly included the partitions, but
publish_via_partition_root was true. So in this case it publishes
through the individual partitions (as no partitioned table is present
in the publication). Then on the publisher side, the partitioned table
was then added to the publication and then ALTER SUBSCRIPTION ...
REFRESH PUBLICATION done on the subscriber side. Now that the
partitioned table is present in the publication and
publish_via_partition_root is true, it is "published using the
identity and schema of the partitioned table rather than that of the
individual partitions that are actually changed". So the data is
replicated again.
I don't see why data need to be replicated again even in that case.
Can you see any such duplicate data replicated for non-partitioned
tables?
This scenario didn't use initial table data, so initial table sync
didn't come into play
It will be equivalent to initial sync because the tablesync worker
would copy the entire data again in this case unless during refresh we
pass copy_data as false.
--
With Regards,
Amit Kapila.
On Wed, Oct 20, 2021 at 9:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
I don't see why data need to be replicated again even in that case.
Can you see any such duplicate data replicated for non-partitioned
tables?
If my example is slightly modified to use the same-named tables on the
subscriber side, but without partitioning, i.e.:
PUB:
CREATE SCHEMA sch;
CREATE SCHEMA sch1;
CREATE TABLE sch.sale (sale_date date not null, country_code text,
product_sku text, units integer) PARTITION BY RANGE (sale_date);
CREATE TABLE sch1.sale_201901 PARTITION OF sch.sale FOR VALUES FROM
('2019-01-01') TO ('2019-02-01');
CREATE TABLE sch1.sale_201902 PARTITION OF sch.sale FOR VALUES FROM
('2019-02-01') TO ('2019-03-01');
SUB:
CREATE SCHEMA sch;
CREATE SCHEMA sch1;
CREATE TABLE sch.sale (sale_date date not null, country_code text,
product_sku text, units integer);
CREATE TABLE sch1.sale_201901 (sale_date date not null, country_code
text, product_sku text, units integer);
CREATE TABLE sch1.sale_201902 (sale_date date not null, country_code
text, product_sku text, units integer);
then the INSERTed data on the publisher side gets replicated to the
subscriber's "sch1.sale_201901" and "sch1.sale_201902" tables (only),
depending on the date values.
Now if the partitioned table is then added to the publication and
ALTER SUBSCRIPTION ... REFRESH PUBLICATION done by the subscriber,
then the current functionality is that the existing sch.sale data is
replicated (only) to the subscriber's "sch.sale" table (even though
data had been replicated previously to the "sch1.sale_201901" and
"sch1.sale_201902" tables, only).
So, just to be clear, you think that this current functionality isn't
correct (i.e. no data should be replicated on the REFRESH in this
case)?
I think it's debatable because here copy_data=true and sch.sale was
not a previously-subscribed table (so pre-existing data in that table
should be copied, in accordance with the current documentation).
Regards,
Greg Nancarrow
Fujitsu Australia
Hi,
I just wanted to bring to your attention an earlier thread [1]/messages/by-id/CAHut+PvJMRB-ZyC80we2kiUFv4cVjmA6jxXpEMhm1rmz=1ryeA@mail.gmail.com in
which I had already encountered/reported behaviour that is either
exactly the same or is closely related to what is being discussed in
this current thread. If it is different please take that into account
also.
------
[1]: /messages/by-id/CAHut+PvJMRB-ZyC80we2kiUFv4cVjmA6jxXpEMhm1rmz=1ryeA@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
On Thu, Oct 21, 2021 at 2:56 PM Peter Smith <smithpb2250@gmail.com> wrote:
I just wanted to bring to your attention an earlier thread [1] in
which I had already encountered/reported behaviour that is either
exactly the same or is closely related to what is being discussed in
this current thread. If it is different please take that into account
also.------
[1] /messages/by-id/CAHut+PvJMRB-ZyC80we2kiUFv4cVjmA6jxXpEMhm1rmz=1ryeA@mail.gmail.com
Thanks, I was able to reproduce that behavior, which is similar (but
in that case the publish_via_partition_root flag is toggled with the
partitioned table present in the publication, whereas in the case
being discussed the presence of the partitioned table in the
publication is toggled with publish_via_partition_root always true).
What seems to happen internally when a partitioned table is published
is that when publish_via_partition_root=true the subscriber to that
publication is effectively subscribed to the parent partitioned table
but not the child partitions. If publish_via_partition_root is changed
to false and the subscriber refreshes, the partitioned table is
unsubscribed and it subscribes to the partitions. This explains why
data gets "re-copied" when this happens, because then it is
subscribing to a "new" table and copy_data=true by default.
Regards,
Greg Nancarrow
Fujitsu Australia