Handle infinite recursion in logical replication setup
Hi,
In logical replication, currently Walsender sends the data that is
generated locally and the data that are replicated from other
instances. This results in infinite recursion in circular logical
replication setup.
Here the user is trying to have a 2-way replication setup with node 1
publishing data to node2 and node2 publishing data to node1, so that
the user can perform dml operations from any node, it can act as a
2-way multi master replication setup.
This problem can be reproduced with the following steps:
-- Instance 1
create publication pub1 for table t1;
create table t1(c1 int);
-- Instance 2
create table t1(c1 int);
create publication pub2 for table t1;
create subscription sub1 CONNECTION 'dbname=postgres port=5432'
publication pub1;
-- Instance 1
create subscription sub2 CONNECTION 'dbname=postgres port=5433'
publication pub2; insert into t1 values(10);
In this scenario, the Walsender in publisher pub1 sends data to the
apply worker in subscriber sub1, the apply worker in sub1 maps the
data to local tables and applies the individual changes as they are
received. Then the Walsender in publisher pub2 sends data to the apply
worker in subscriber sub2, the apply worker in sub2 maps the data to
local tables and applies the individual changes as they are received.
This process repeats infinitely.
Currently we do not differentiate if the data is locally generated
data, or a replicated data and we send both the data which causes
infinite recursion.
We could see that the record count has increased significantly within sometime:
select count(*) from t1;
count
--------------
4000000
(1 row)
If the table had primary key constraint, we could notice that the
first insert is successful and when the same insert is sent back, the
insert fails because of constraint error:
2022-02-23 09:28:43.592 IST [14743] ERROR: duplicate key value
violates unique constraint "t1_pkey"
2022-02-23 09:28:43.592 IST [14743] DETAIL: Key (c1)=(10) already exists.
2022-02-23 09:28:43.592 IST [14743] CONTEXT: processing remote data
during "INSERT" for replication target relation "public.t1" in
transaction 727 at 2022-02-23 09:28:43.406738+05:30
2022-02-23 09:28:43.593 IST [14678] LOG: background worker "logical
replication worker" (PID 14743) exited with exit code 1
2022-02-23 09:28:48.608 IST [14745] LOG: logical replication apply
worker for subscription "sub2" has started
2022-02-23 09:28:48.624 IST [14745] ERROR: duplicate key value
violates unique constraint "t1_pkey"
2022-02-23 09:28:48.624 IST [14745] DETAIL: Key (c1)=(10) already exists.
2022-02-23 09:28:48.624 IST [14745] CONTEXT: processing remote data
during "INSERT" for replication target relation "public.t1" in
transaction 727 at 2022-02-23 09:28:43.406738+05:30
2022-02-23 09:28:48.626 IST [14678] LOG: background worker "logical
replication worker" (PID 14745) exited with exit code 1
The same problem can occur in any circular node setup like 3 nodes,
4node etc like: a) node1 publishing to node2 b) node2 publishing to
node3 c) node3 publishing back to node1.
Here there are two problems for the user: a) incremental
synchronization of table sending both local data and replicated data
by walsender b) Table synchronization of table using copy command
sending both local data and replicated data
For the first problem "Incremental synchronization of table by
Walsender" can be solved by:
Currently the locally generated data does not have replication origin
associated and the data that has originated from another instance will
have a replication origin associated. We could use this information to
differentiate locally generated data and replicated data and send only
the locally generated data. This "only_local" could be provided as an
option while subscription is created:
ex: CREATE SUBSCRIPTION sub1 CONNECTION 'dbname =postgres port=5433'
PUBLICATION pub1 with (only_local = on);
I have attached a basic patch for this, if the idea is accepted, I
will work further to test more scenarios, add documentation, and test
and post an updated patch.
For the second problem, Table synchronization of table including local
data and replicated data using copy command.
Let us consider the following scenario:
a) node1 publishing to node2 b) node2 publishing to node1. Here in
this case node1 will have replicated data from node2 and vice versa.
In the above if user wants to include node3 to subscribe data from
node2. Users will have to create a subscription in node3 to get the
data from node2. During table synchronization we send the complete
table data from node2 to node3. Node2 will have local data from node2
and also replicated data from node1. Currently we don't have an option
to differentiate between the locally generated data and replicated
data in the heap which will cause infinite recursion as described
above.
To handle this if user has specified only_local option, we could throw
a warning or error out while creating subscription in this case, we
could have a column srreplicateddata in pg_subscription_rel which
could indicate if the table has any replicated data or not:
postgres=# select * from pg_subscription_rel;
srsubid | srrelid | srsubstate | srsublsn | srreplicateddata
---------+---------+------------+-----------+------------------
16389 | 16384 | r | 0/14A4640 | t
16389 | 16385 | r | 0/14A4690 | f
(1 row)
In the above example, srreplicateddata with true indicates, tabel t1
whose relid is 16384 has replicated data and the other row having
srreplicateddata as false indicates table t2 whose relid is 16385
does not have replicated data.
When creating a new subscription, the subscriber will connect to the
publisher and check if the relation has replicated data by checking
srreplicateddata in pg_subscription_rel table.
If the table has any replicated data, log a warning or error for this.
Also, we could document the steps on how to handle the initial sync like:
a) Complete the ongoing transactions on this table in the replication
setup nodes i.e. node1 and node2 in the above case, so that the table
data is consistent, b) Once there are no ongoing transaction, Copy the
table data using copy command from any one of the nodes, c) create
subscription with copy_data option as off d) Perform further
transactions on the table e) All the further transactions performed
will be handled by the walsender which will take care of skipping
replicated data and sending only the local data. i.e. node2 will send
the locally generated data to node3.
I'm not sure if there is any other better way to handle this. If there
is a better way, we could handle it accordingly.
Thoughts?
Regards,
Vignesh
Attachments:
v1-0001-Skip-replication-of-non-local-data.patchapplication/x-patch; name=v1-0001-Skip-replication-of-non-local-data.patchDownload+205-37
Hi Vignesh,
In logical replication, currently Walsender sends the data that is
generated locally and the data that are replicated from other
instances. This results in infinite recursion in circular logical
replication setup.
Thank you for good explanation. I understand that this fix can be used
for a bidirectional replication.
Here there are two problems for the user: a) incremental
synchronization of table sending both local data and replicated data
by walsender b) Table synchronization of table using copy command
sending both local data and replicated data
So you wanted to solve these two problem and currently focused on
the first one, right? We can check one by one.
For the first problem "Incremental synchronization of table by
Walsender" can be solved by:
Currently the locally generated data does not have replication origin
associated and the data that has originated from another instance will
have a replication origin associated. We could use this information to
differentiate locally generated data and replicated data and send only
the locally generated data. This "only_local" could be provided as an
option while subscription is created:
ex: CREATE SUBSCRIPTION sub1 CONNECTION 'dbname =postgres port=5433'
PUBLICATION pub1 with (only_local = on);
Sounds good, but I cannot distinguish whether the assumption will keep.
I played with your patch, but it could not be applied to current master.
I tested from bd74c40 and I confirmed infinite loop was not appeared.
local_only could not be set from ALTER SUBSCRIPTION command.
Is it expected?
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
On Tue, Mar 1, 2022 at 4:12 PM kuroda.hayato@fujitsu.com
<kuroda.hayato@fujitsu.com> wrote:
Hi Vignesh,
In logical replication, currently Walsender sends the data that is
generated locally and the data that are replicated from other
instances. This results in infinite recursion in circular logical
replication setup.Thank you for good explanation. I understand that this fix can be used
for a bidirectional replication.
Once these issues are resolved, it can be used for bi-directional
logical replication.
Here there are two problems for the user: a) incremental
synchronization of table sending both local data and replicated data
by walsender b) Table synchronization of table using copy command
sending both local data and replicated dataSo you wanted to solve these two problem and currently focused on
the first one, right? We can check one by one.
Yes.
For the first problem "Incremental synchronization of table by
Walsender" can be solved by:
Currently the locally generated data does not have replication origin
associated and the data that has originated from another instance will
have a replication origin associated. We could use this information to
differentiate locally generated data and replicated data and send only
the locally generated data. This "only_local" could be provided as an
option while subscription is created:
ex: CREATE SUBSCRIPTION sub1 CONNECTION 'dbname =postgres port=5433'
PUBLICATION pub1 with (only_local = on);Sounds good, but I cannot distinguish whether the assumption will keep.
Replication origin is created by the apply worker and it will be used
for all the transactions received from the walsender. I feel the
replication origin will be present always.
I played with your patch, but it could not be applied to current master.
I tested from bd74c40 and I confirmed infinite loop was not appeared.
I will post an updated version for this soon.
local_only could not be set from ALTER SUBSCRIPTION command.
Is it expected?
I wanted to get the opinion from others too just to make sure the
approach is right. I will fix this including the documentation, test,
etc in the later versions.
Regards,
Vignesh
On Wed, Feb 23, 2022 at 11:59 AM vignesh C <vignesh21@gmail.com> wrote:
...
...
I have attached a basic patch for this, if the idea is accepted, I
will work further to test more scenarios, add documentation, and test
and post an updated patch.
For the second problem, Table synchronization of table including local
data and replicated data using copy command.Let us consider the following scenario:
a) node1 publishing to node2 b) node2 publishing to node1. Here in
this case node1 will have replicated data from node2 and vice versa.In the above if user wants to include node3 to subscribe data from
node2. Users will have to create a subscription in node3 to get the
data from node2. During table synchronization we send the complete
table data from node2 to node3. Node2 will have local data from node2
and also replicated data from node1. Currently we don't have an option
to differentiate between the locally generated data and replicated
data in the heap which will cause infinite recursion as described
above.To handle this if user has specified only_local option, we could throw
a warning or error out while creating subscription in this case, we
could have a column srreplicateddata in pg_subscription_rel which
could indicate if the table has any replicated data or not:
postgres=# select * from pg_subscription_rel;
srsubid | srrelid | srsubstate | srsublsn | srreplicateddata
---------+---------+------------+-----------+------------------
16389 | 16384 | r | 0/14A4640 | t
16389 | 16385 | r | 0/14A4690 | f
(1 row)In the above example, srreplicateddata with true indicates, tabel t1
whose relid is 16384 has replicated data and the other row having
srreplicateddata as false indicates table t2 whose relid is 16385
does not have replicated data.
When creating a new subscription, the subscriber will connect to the
publisher and check if the relation has replicated data by checking
srreplicateddata in pg_subscription_rel table.
If the table has any replicated data, log a warning or error for this.
If you want to give the error in this case, then I think we need to
provide an option to the user to allow copy. One possibility could be
to extend existing copy_data option as 'false', 'true', 'force'. For
'false', there shouldn't be any change, for 'true', if 'only_local'
option is also set and the new column indicates replicated data then
give an error, for 'force', we won't give an error even if the
conditions as mentioned for 'true' case are met, rather we will allow
copy in this case.
Also, we could document the steps on how to handle the initial sync like:
a) Complete the ongoing transactions on this table in the replication
setup nodes i.e. node1 and node2 in the above case, so that the table
data is consistent, b) Once there are no ongoing transaction, Copy the
table data using copy command from any one of the nodes, c) create
subscription with copy_data option as off d) Perform further
transactions on the table e) All the further transactions performed
will be handled by the walsender which will take care of skipping
replicated data and sending only the local data. i.e. node2 will send
the locally generated data to node3.I'm not sure if there is any other better way to handle this.
I could think of the below options for users to set up bi-directional
replication for the same table.
Option-1:
There is no pre-existing data in the tables that are going to
participate in bi-directional replication. In such a case, Users can
create pub/sub (with only_local option as proposed by you) on both
nodes before starting any writes on tables. This will allow
bi-directional replication for the required tables. Now, if the user
wants one of the nodes to join at a later point, then the strategy in
Option-2/3 could be used.
Option-2:
One of the nodes (say node-1) has some pre-existing data and another
node (say node-2) doesn't have any pre-existing data. In this case,
the user can set up pub/sub (with only_local and copy_data as 'false'
options) for node-1 first before any of the operations on node-2.
Then, it can set up pub/sub on node-2. This will allow bi-directional
replication for the required tables.
Option-3:
Both the nodes have some pre-existing data. I think the easiest option
could be to truncate data on one of the nodes and set up pub/sub on
both nodes. See, one way to achieve it among two nodes as below:
Node-1:
Table t1 has data
1, 2, 3, 4
Publication for t1, pub1: Create Publication pub1 For Table t1;
Node-2:
Table t1 has data
5, 6, 7, 8
Publication for t1, pub1_2: Create Publication pub1_2 For Table t1;
Now, Create Subscription for pub1 on node1: Create Subscription sub1_2
Connection '<node-1 details>' Publication pub1 WITH (only_local =
true);
Node-1:
Begin;
# Disallow truncates to be published
Alter Publication pub1 Set (publish = 'insert, update, delete');
Truncate t1;
Create Subscription sub1 Connection '<node-2 details>' Publication
pub1 WITH (only_local = true, copy_data = 'force');
Alter Publication pub1 Set (publish = 'insert, update, delete, truncate');
Commit;
I think this will allow the bi-directional replication between two
nodes. In this scheme, the user needs to manually perform some steps
including truncate of the table on one of the nodes which she might or
might not like but at least there will be a way to set up a
bi-directional replication on two nodes for same table operations
which is not possible now.
I think one can even imagine using and extending this functionality so
that users don't need to perform TRUNCATE on one of the nodes. Say, in
the above case for tablesync phase, we make both nodes to start a
transaction, create a slot on another node (with USE_SNAPSHOT option),
and then allow copy from another node. I think it will be important to
allow copy on each node once the slots are created and the initial
snapshot is established.
For more than two nodes, I think we can suggest having either of the
option-1 or 2 for setup. But, there could be other ways as well
depending on how the user wants to do the setup.
Thoughts?
--
With Regards,
Amit Kapila.
On Tue, Mar 1, 2022 at 4:12 PM kuroda.hayato@fujitsu.com
<kuroda.hayato@fujitsu.com> wrote:
Hi Vignesh,
In logical replication, currently Walsender sends the data that is
generated locally and the data that are replicated from other
instances. This results in infinite recursion in circular logical
replication setup.Thank you for good explanation. I understand that this fix can be used
for a bidirectional replication.Here there are two problems for the user: a) incremental
synchronization of table sending both local data and replicated data
by walsender b) Table synchronization of table using copy command
sending both local data and replicated dataSo you wanted to solve these two problem and currently focused on
the first one, right? We can check one by one.For the first problem "Incremental synchronization of table by
Walsender" can be solved by:
Currently the locally generated data does not have replication origin
associated and the data that has originated from another instance will
have a replication origin associated. We could use this information to
differentiate locally generated data and replicated data and send only
the locally generated data. This "only_local" could be provided as an
option while subscription is created:
ex: CREATE SUBSCRIPTION sub1 CONNECTION 'dbname =postgres port=5433'
PUBLICATION pub1 with (only_local = on);Sounds good, but I cannot distinguish whether the assumption will keep.
I played with your patch, but it could not be applied to current master.
I tested from bd74c40 and I confirmed infinite loop was not appeared.
Rebased the patch on top of head
local_only could not be set from ALTER SUBSCRIPTION command.
Is it expected?
Modified
Thanks for the comments, the attached patch has the changes for the same.
Regards,
Vignesh
Attachments:
v2-0001-Skip-replication-of-non-local-data.patchtext/x-patch; charset=US-ASCII; name=v2-0001-Skip-replication-of-non-local-data.patchDownload+345-40
FYI, the v2 patch did not apply to HEAD
[postgres@CentOS7-x64 oss_postgres_misc]$ git apply
../patches_misc/v2-0001-Skip-replication-of-non-local-data.patch
--verbose
...
error: patch failed: src/backend/replication/slotfuncs.c:231
error: src/backend/replication/slotfuncs.c: patch does not apply
------
Kind Regards,
Peter Smith.
Fujitsu Australia.
On Wed, Feb 23, 2022 at 11:59 AM vignesh C <vignesh21@gmail.com> wrote:
Here there are two problems for the user: a) incremental
synchronization of table sending both local data and replicated data
by walsender b) Table synchronization of table using copy command
sending both local data and replicated dataFor the first problem "Incremental synchronization of table by
Walsender" can be solved by:
Currently the locally generated data does not have replication origin
associated and the data that has originated from another instance will
have a replication origin associated. We could use this information to
differentiate locally generated data and replicated data and send only
the locally generated data. This "only_local" could be provided as an
option while subscription is created:
ex: CREATE SUBSCRIPTION sub1 CONNECTION 'dbname =postgres port=5433'
PUBLICATION pub1 with (only_local = on);
I haven't yet gone through the patch, but I have a question about the
idea. Suppose I want to set up a logical replication like,
node1->node2->node3->node1. So how would I create the subscriber at
node1? only_local=on or off?. I mean on node1, I want the changes
from node3 which are generated on node3 or which are replicated from
node2 but I do not want changes that are replicated from node1 itself?
So if I set only_local=on then node1 will not get the changes
replicated from node2, is that right? and If I set only_local=off then
it will create the infinite loop again? So how are we protecting
against this case?
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Mon, Mar 7, 2022 at 9:26 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Wed, Feb 23, 2022 at 11:59 AM vignesh C <vignesh21@gmail.com> wrote:
Here there are two problems for the user: a) incremental
synchronization of table sending both local data and replicated data
by walsender b) Table synchronization of table using copy command
sending both local data and replicated dataFor the first problem "Incremental synchronization of table by
Walsender" can be solved by:
Currently the locally generated data does not have replication origin
associated and the data that has originated from another instance will
have a replication origin associated. We could use this information to
differentiate locally generated data and replicated data and send only
the locally generated data. This "only_local" could be provided as an
option while subscription is created:
ex: CREATE SUBSCRIPTION sub1 CONNECTION 'dbname =postgres port=5433'
PUBLICATION pub1 with (only_local = on);I haven't yet gone through the patch, but I have a question about the
idea. Suppose I want to set up a logical replication like,
node1->node2->node3->node1. So how would I create the subscriber at
node1? only_local=on or off?. I mean on node1, I want the changes
from node3 which are generated on node3 or which are replicated from
node2 but I do not want changes that are replicated from node1 itself?
So if I set only_local=on then node1 will not get the changes
replicated from node2, is that right? and If I set only_local=off then
it will create the infinite loop again? So how are we protecting
against this case?
In the above topology if you want local changes from both node3 and
node2 then I think the way to get that would be you have to create two
subscriptions on node1. The first one points to node2 (with
only_local=off) and the second one points to node3 (with only_local
=off). Will that address your case or am I missing something?
--
With Regards,
Amit Kapila.
On Mon, Mar 7, 2022 at 10:05 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Mar 7, 2022 at 9:26 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Wed, Feb 23, 2022 at 11:59 AM vignesh C <vignesh21@gmail.com> wrote:
Here there are two problems for the user: a) incremental
synchronization of table sending both local data and replicated data
by walsender b) Table synchronization of table using copy command
sending both local data and replicated dataFor the first problem "Incremental synchronization of table by
Walsender" can be solved by:
Currently the locally generated data does not have replication origin
associated and the data that has originated from another instance will
have a replication origin associated. We could use this information to
differentiate locally generated data and replicated data and send only
the locally generated data. This "only_local" could be provided as an
option while subscription is created:
ex: CREATE SUBSCRIPTION sub1 CONNECTION 'dbname =postgres port=5433'
PUBLICATION pub1 with (only_local = on);I haven't yet gone through the patch, but I have a question about the
idea. Suppose I want to set up a logical replication like,
node1->node2->node3->node1. So how would I create the subscriber at
node1? only_local=on or off?. I mean on node1, I want the changes
from node3 which are generated on node3 or which are replicated from
node2 but I do not want changes that are replicated from node1 itself?
So if I set only_local=on then node1 will not get the changes
replicated from node2, is that right? and If I set only_local=off then
it will create the infinite loop again? So how are we protecting
against this case?In the above topology if you want local changes from both node3 and
node2 then I think the way to get that would be you have to create two
subscriptions on node1. The first one points to node2 (with
only_local=off) and the second one points to node3 (with only_local
=off).
Sorry, I intend to say 'only_local=on' at both places in my previous email.
--
With Regards,
Amit Kapila.
Hi Vignesh, I also have not looked at the patch yet, but I have what
seems like a very fundamental (and possibly dumb) question...
Basically, I do not understand the choice of syntax for setting things up.
IMO that "only-local" option sounds very similar to the other
PUBLICATION ("publish") options which decide the kinds of things that
will be published. So it feels more natural for me to think of the
publisher as being the one to decide what will be published.
e.g.
option 1:
CREATE PUBLICATION p1 FOR TABLE t1;
CREATE SUBSCRITION s1 ... FOR PUBLICATION p1 WITH (only_local = true);
option 2:
CREATE PUBLICATION p1 FOR TABLE t1 WEHRE (publish = 'only_local');
CREATE SUBSCRITION s1 ... FOR PUBLICATION p1;
~~
IIUC the patch is using option 1. My first impression was it feels
back-to-front for the SUBSCRIPTION telling the PUBLICATION what to
publish.
So, why does the patch use syntax option 1?
------
Kind Regards,
Peter Smith
Fujitsu Australia
On Mon, Mar 7, 2022 at 3:56 PM Peter Smith <smithpb2250@gmail.com> wrote:
Hi Vignesh, I also have not looked at the patch yet, but I have what
seems like a very fundamental (and possibly dumb) question...Basically, I do not understand the choice of syntax for setting things up.
IMO that "only-local" option sounds very similar to the other
PUBLICATION ("publish") options which decide the kinds of things that
will be published. So it feels more natural for me to think of the
publisher as being the one to decide what will be published.e.g.
option 1:
CREATE PUBLICATION p1 FOR TABLE t1;
CREATE SUBSCRITION s1 ... FOR PUBLICATION p1 WITH (only_local = true);option 2:
CREATE PUBLICATION p1 FOR TABLE t1 WEHRE (publish = 'only_local');
CREATE SUBSCRITION s1 ... FOR PUBLICATION p1;
Sorry, I mean to write WITH.
option 2:
CREATE PUBLICATION p1 FOR TABLE t1 WITH (publish = 'only_local');
CREATE SUBSCRITION s1 ... FOR PUBLICATION p1;
Show quoted text
~~
IIUC the patch is using option 1. My first impression was it feels
back-to-front for the SUBSCRIPTION telling the PUBLICATION what to
publish.So, why does the patch use syntax option 1?
------
Kind Regards,
Peter Smith
Fujitsu Australia
On Mon, Mar 7, 2022 at 10:26 AM Peter Smith <smithpb2250@gmail.com> wrote:
Hi Vignesh, I also have not looked at the patch yet, but I have what
seems like a very fundamental (and possibly dumb) question...Basically, I do not understand the choice of syntax for setting things up.
IMO that "only-local" option sounds very similar to the other
PUBLICATION ("publish") options which decide the kinds of things that
will be published. So it feels more natural for me to think of the
publisher as being the one to decide what will be published.e.g.
option 1:
CREATE PUBLICATION p1 FOR TABLE t1;
CREATE SUBSCRITION s1 ... FOR PUBLICATION p1 WITH (only_local = true);option 2:
CREATE PUBLICATION p1 FOR TABLE t1 WEHRE (publish = 'only_local');
CREATE SUBSCRITION s1 ... FOR PUBLICATION p1;~~
IIUC the patch is using option 1. My first impression was it feels
back-to-front for the SUBSCRIPTION telling the PUBLICATION what to
publish.So, why does the patch use syntax option 1?
I felt the advantage with keeping it at the subscription side is that,
the subscriber from one node can subscribe with only_local option on
and a different subscriber from a different node can subscribe with
only_local option as off. This might not be possible with having the
option at publisher side. Having it at the subscriber side might give
more flexibility for the user.
Regards,
Vignesh
On Tue, Mar 1, 2022 at 4:12 PM kuroda.hayato@fujitsu.com
<kuroda.hayato@fujitsu.com> wrote:
Hi Vignesh,
In logical replication, currently Walsender sends the data that is
generated locally and the data that are replicated from other
instances. This results in infinite recursion in circular logical
replication setup.Thank you for good explanation. I understand that this fix can be used
for a bidirectional replication.Here there are two problems for the user: a) incremental
synchronization of table sending both local data and replicated data
by walsender b) Table synchronization of table using copy command
sending both local data and replicated dataSo you wanted to solve these two problem and currently focused on
the first one, right? We can check one by one.For the first problem "Incremental synchronization of table by
Walsender" can be solved by:
Currently the locally generated data does not have replication origin
associated and the data that has originated from another instance will
have a replication origin associated. We could use this information to
differentiate locally generated data and replicated data and send only
the locally generated data. This "only_local" could be provided as an
option while subscription is created:
ex: CREATE SUBSCRIPTION sub1 CONNECTION 'dbname =postgres port=5433'
PUBLICATION pub1 with (only_local = on);Sounds good, but I cannot distinguish whether the assumption will keep.
I played with your patch, but it could not be applied to current master.
I tested from bd74c40 and I confirmed infinite loop was not appeared.local_only could not be set from ALTER SUBSCRIPTION command.
Is it expected?
I felt changing only_local option might be useful for the user while
modifying the subscription like setting it with a different set of
publications. Changes for this are included in the v2 patch attached
at [1].
[2]: /messages/by-id/CALDaNm0WSo5369pr2eN1obTGBeiJU9cQdF6Ju1sC4hMQNy5BfQ@mail.gmail.com
Regards,
Vignesh
Dear Peter,
So, why does the patch use syntax option 1?
IMU it might be useful for the following case.
Assuming that multi-master configuration with node1, node2.
Node1 has a publication pub1 and a subscription sub2, node2 has pub2 and sub1.
From that situation, please consider that new node node3 is added
that subscribe some changes from node2.
If the feature is introduced as option1, new publication must be defined in node2.
If that is introduced as option2, however, maybe pub2 can be reused.
i.e. multiple declaration of publications can be avoided.
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
On Mon, Mar 7, 2022 at 4:20 PM vignesh C <vignesh21@gmail.com> wrote:
On Mon, Mar 7, 2022 at 10:26 AM Peter Smith <smithpb2250@gmail.com> wrote:
Hi Vignesh, I also have not looked at the patch yet, but I have what
seems like a very fundamental (and possibly dumb) question...Basically, I do not understand the choice of syntax for setting things up.
IMO that "only-local" option sounds very similar to the other
PUBLICATION ("publish") options which decide the kinds of things that
will be published. So it feels more natural for me to think of the
publisher as being the one to decide what will be published.e.g.
option 1:
CREATE PUBLICATION p1 FOR TABLE t1;
CREATE SUBSCRITION s1 ... FOR PUBLICATION p1 WITH (only_local = true);option 2:
CREATE PUBLICATION p1 FOR TABLE t1 WEHRE (publish = 'only_local');
CREATE SUBSCRITION s1 ... FOR PUBLICATION p1;~~
IIUC the patch is using option 1. My first impression was it feels
back-to-front for the SUBSCRIPTION telling the PUBLICATION what to
publish.So, why does the patch use syntax option 1?
I felt the advantage with keeping it at the subscription side is that,
the subscriber from one node can subscribe with only_local option on
and a different subscriber from a different node can subscribe with
only_local option as off. This might not be possible with having the
option at publisher side. Having it at the subscriber side might give
more flexibility for the user.
OK. Option 2 needs two publications for that scenario. IMO it's more
intuitive this way, but maybe you wanted to avoid the extra
publications?
node0:
CREATE PUBLICATION p1 FOR TABLE t1;
CREATE PUBLICATION p1_local FOR TABLE t1 WITH (publish = 'only_local');
node1: CREATE SUBSCRIPTION s1 ... FOR PUBLICATION p1_local;
node2: CREATE SUBSCRIPTION s1 ... FOR PUBLICATION p1;
------
Kind Regards,
Peter Smith.
Fujitsu Australia
On Mon, Mar 7, 2022 at 5:12 PM kuroda.hayato@fujitsu.com
<kuroda.hayato@fujitsu.com> wrote:
Dear Peter,
So, why does the patch use syntax option 1?
IMU it might be useful for the following case.
Assuming that multi-master configuration with node1, node2.
Node1 has a publication pub1 and a subscription sub2, node2 has pub2 and sub1.From that situation, please consider that new node node3 is added
that subscribe some changes from node2.If the feature is introduced as option1, new publication must be defined in node2.
If that is introduced as option2, however, maybe pub2 can be reused.
i.e. multiple declaration of publications can be avoided.
Yes. Thanks for the example. I had the same observation in my last post [1]/messages/by-id/CAHut+PtRxiQR_4UFLNThg-NNRV447FvwtcR-BvqMzjyMJXKwfw@mail.gmail.com
------
[1]: /messages/by-id/CAHut+PtRxiQR_4UFLNThg-NNRV447FvwtcR-BvqMzjyMJXKwfw@mail.gmail.com
Kind Regards,
Peter Smith
Fujitsu Australia.
On Mon, Mar 7, 2022 at 11:45 AM Peter Smith <smithpb2250@gmail.com> wrote:
On Mon, Mar 7, 2022 at 4:20 PM vignesh C <vignesh21@gmail.com> wrote:
On Mon, Mar 7, 2022 at 10:26 AM Peter Smith <smithpb2250@gmail.com> wrote:
Hi Vignesh, I also have not looked at the patch yet, but I have what
seems like a very fundamental (and possibly dumb) question...Basically, I do not understand the choice of syntax for setting things up.
IMO that "only-local" option sounds very similar to the other
PUBLICATION ("publish") options which decide the kinds of things that
will be published. So it feels more natural for me to think of the
publisher as being the one to decide what will be published.e.g.
option 1:
CREATE PUBLICATION p1 FOR TABLE t1;
CREATE SUBSCRITION s1 ... FOR PUBLICATION p1 WITH (only_local = true);option 2:
CREATE PUBLICATION p1 FOR TABLE t1 WEHRE (publish = 'only_local');
CREATE SUBSCRITION s1 ... FOR PUBLICATION p1;~~
IIUC the patch is using option 1. My first impression was it feels
back-to-front for the SUBSCRIPTION telling the PUBLICATION what to
publish.So, why does the patch use syntax option 1?
I felt the advantage with keeping it at the subscription side is that,
the subscriber from one node can subscribe with only_local option on
and a different subscriber from a different node can subscribe with
only_local option as off. This might not be possible with having the
option at publisher side. Having it at the subscriber side might give
more flexibility for the user.OK. Option 2 needs two publications for that scenario. IMO it's more
intuitive this way, but maybe you wanted to avoid the extra
publications?
Yes, I wanted to avoid the extra publication creation that you pointed
out. Option 1 can handle this scenario without creating the extra
publications:
node0: CREATE PUBLICATION p1 FOR TABLE t1;
node1: CREATE SUBSCRIPTION s1 ... FOR PUBLICATION p1 with (only_local = on);
node2: CREATE SUBSCRIPTION s1 ... FOR PUBLICATION p1 with (only_local = off);
I'm ok with both the approaches, now that this scenario can be handled
by using both the options. i.e providing only_local option as an
option while creating publication or providing only_local option as an
option while creating subscription as Peter has pointed out at [1]/messages/by-id/CAHut+PsAWaETh9VMymbBfMrqiE1KuqMq+wpBg0s7eMzwLATr+w@mail.gmail.com.
option 1:
CREATE PUBLICATION p1 FOR TABLE t1;
CREATE SUBSCRITION s1 ... FOR PUBLICATION p1 WITH (only_local = true);
option 2:
CREATE PUBLICATION p1 FOR TABLE t1 WITH (publish = 'only_local');
CREATE SUBSCRITION s1 ... FOR PUBLICATION p1;
Shall we get a few opinions on this and take it in that direction?
[1]: /messages/by-id/CAHut+PsAWaETh9VMymbBfMrqiE1KuqMq+wpBg0s7eMzwLATr+w@mail.gmail.com
Regards,
Vignesh
Dear Vignesh,
I felt changing only_local option might be useful for the user while
modifying the subscription like setting it with a different set of
publications. Changes for this are included in the v2 patch attached
at [1].
+1, thanks. I'll post if I notice something to say.
Shall we get a few opinions on this and take it in that direction?
I prefer subscriber-option, but I also think both are reasonable.
+1 about asking other reviewers.
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
On Mon, Mar 7, 2022 at 10:15 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
I haven't yet gone through the patch, but I have a question about the
idea. Suppose I want to set up a logical replication like,
node1->node2->node3->node1. So how would I create the subscriber at
node1? only_local=on or off?. I mean on node1, I want the changes
from node3 which are generated on node3 or which are replicated from
node2 but I do not want changes that are replicated from node1 itself?
So if I set only_local=on then node1 will not get the changes
replicated from node2, is that right? and If I set only_local=off then
it will create the infinite loop again? So how are we protecting
against this case?In the above topology if you want local changes from both node3 and
node2 then I think the way to get that would be you have to create two
subscriptions on node1. The first one points to node2 (with
only_local=off) and the second one points to node3 (with only_local
=off).Sorry, I intend to say 'only_local=on' at both places in my previous email.
Hmm okay, so for this topology we will have to connect node1 directly
to node2 as well as to node3 but can not cascade the changes. I was
wondering can it be done without using the extra connection between
node2 to node1? I mean instead of making this a boolean flag that
whether we want local change or remote change, can't we control the
changes based on the origin id? Such that node1 will get the local
changes of node3 but with using the same subscription it will get
changes from node3 which are originated from node2 but it will not
receive the changes which are originated from node1.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Mon, Mar 7, 2022 at 6:17 PM vignesh C <vignesh21@gmail.com> wrote:
On Mon, Mar 7, 2022 at 11:45 AM Peter Smith <smithpb2250@gmail.com> wrote:
On Mon, Mar 7, 2022 at 4:20 PM vignesh C <vignesh21@gmail.com> wrote:
On Mon, Mar 7, 2022 at 10:26 AM Peter Smith <smithpb2250@gmail.com> wrote:
Hi Vignesh, I also have not looked at the patch yet, but I have what
seems like a very fundamental (and possibly dumb) question...Basically, I do not understand the choice of syntax for setting things up.
IMO that "only-local" option sounds very similar to the other
PUBLICATION ("publish") options which decide the kinds of things that
will be published. So it feels more natural for me to think of the
publisher as being the one to decide what will be published.e.g.
option 1:
CREATE PUBLICATION p1 FOR TABLE t1;
CREATE SUBSCRITION s1 ... FOR PUBLICATION p1 WITH (only_local = true);option 2:
CREATE PUBLICATION p1 FOR TABLE t1 WEHRE (publish = 'only_local');
CREATE SUBSCRITION s1 ... FOR PUBLICATION p1;~~
IIUC the patch is using option 1. My first impression was it feels
back-to-front for the SUBSCRIPTION telling the PUBLICATION what to
publish.So, why does the patch use syntax option 1?
I felt the advantage with keeping it at the subscription side is that,
the subscriber from one node can subscribe with only_local option on
and a different subscriber from a different node can subscribe with
only_local option as off. This might not be possible with having the
option at publisher side. Having it at the subscriber side might give
more flexibility for the user.OK. Option 2 needs two publications for that scenario. IMO it's more
intuitive this way, but maybe you wanted to avoid the extra
publications?Yes, I wanted to avoid the extra publication creation that you pointed
out. Option 1 can handle this scenario without creating the extra
publications:
node0: CREATE PUBLICATION p1 FOR TABLE t1;
node1: CREATE SUBSCRIPTION s1 ... FOR PUBLICATION p1 with (only_local = on);
node2: CREATE SUBSCRIPTION s1 ... FOR PUBLICATION p1 with (only_local = off);I'm ok with both the approaches, now that this scenario can be handled
by using both the options. i.e providing only_local option as an
option while creating publication or providing only_local option as an
option while creating subscription as Peter has pointed out at [1].
option 1:
CREATE PUBLICATION p1 FOR TABLE t1;
CREATE SUBSCRITION s1 ... FOR PUBLICATION p1 WITH (only_local = true);option 2:
CREATE PUBLICATION p1 FOR TABLE t1 WITH (publish = 'only_local');
CREATE SUBSCRITION s1 ... FOR PUBLICATION p1;Shall we get a few opinions on this and take it in that direction?
[1] - /messages/by-id/CAHut+PsAWaETh9VMymbBfMrqiE1KuqMq+wpBg0s7eMzwLATr+w@mail.gmail.com
Regards,
Vignesh
BTW here is a counter-example to your scenario from earlier.
Let's say I have a publication p1 and p2 and want to subscribe to p1
with only_local=true, and p2 with only_local = false;
Using the current OPtion 1 syntax you cannot do this with a single
subscription because the option is tied to the subscription.
But using syntax Option 2 you may be able to do it.
Option 1:
CREATE PUBLICATION p1 FOR TABLE t1;
CREATE PUBLICATION p2 FOR TABLE t2;
CREATE SUBSCRIPTION s1 ... FOR PUBLICATION p1 WITH (local_only = true);
CREATE SUBSCRIPTION s2 ... FOR PUBLICATION p1 WITH (local_only = false);
Option 2:
CREATE PUBLICATION p1 FOR TABLE t1 WITH (publish = 'local_only');
CREATE PUBLICATION p2 FOR TABLE t2;
CREATE SUBSCRIPTION s1 ... FOR PUBLICATION p1, p2;
------
Kind Regards,
Peter Smith.
Fujitsu Australia