Adding a '--two-phase' option to 'pg_createsubscriber' utility.
Hi all,
I am writing to propose the addition of the two_phase option in
pg_createsubscriber. As discussed in [1]/messages/by-id/CAA4eK1+7gRxiK3xNEH03CY3mMKxqi7BVz2k3VxxVZp8fgjHZxg@mail.gmail.com, supporting this feature
during the development of pg_createsubscriber was planned for this
version.
Currently, pg_createsubscriber creates subscriptions with the
two_phase option set to false. Enabling the two_phase option after a
subscription has been created is not straightforward, as it requires
the subscription to be disabled first. This patch aims to address this
limitation by incorporating the two_phase option into
pg_createsubscriber which will help create subscription with two_phase
option and make use of the advantages of creating subscription with
two_phase option.
The attached patch has the changes for the same.
[1]: /messages/by-id/CAA4eK1+7gRxiK3xNEH03CY3mMKxqi7BVz2k3VxxVZp8fgjHZxg@mail.gmail.com
Thanks and regards,
Shubham Khanna.
Attachments:
v1-0001-Add-support-for-enabling-disabling-two-phase-comm.patchapplication/octet-stream; name=v1-0001-Add-support-for-enabling-disabling-two-phase-comm.patchDownload+115-4
On Mon, 9 Dec 2024 at 16:25, Shubham Khanna <khannashubham1197@gmail.com> wrote:
Hi all,
I am writing to propose the addition of the two_phase option in
pg_createsubscriber. As discussed in [1], supporting this feature
during the development of pg_createsubscriber was planned for this
version.
Yes this will be useful.
Currently, pg_createsubscriber creates subscriptions with the
two_phase option set to false. Enabling the two_phase option after a
subscription has been created is not straightforward, as it requires
the subscription to be disabled first. This patch aims to address this
limitation by incorporating the two_phase option into
pg_createsubscriber which will help create subscription with two_phase
option and make use of the advantages of creating subscription with
two_phase option.
The attached patch has the changes for the same.
Few comments:
1) You can keep it with no argument similar to how dry-run is handled:
@@ -1872,6 +1881,7 @@ main(int argc, char **argv)
{"publisher-server", required_argument, NULL, 'P'},
{"socketdir", required_argument, NULL, 's'},
{"recovery-timeout", required_argument, NULL, 't'},
+ {"two_phase", optional_argument, NULL, 'T'},
{"subscriber-username", required_argument, NULL, 'U'},
{"verbose", no_argument, NULL, 'v'},
{"version", no_argument, NULL, 'V'},
2) After making it to no argument option, if this option is set, just
set the value, strcmp's are not required:
+ case 'T':
+ if (optarg != NULL)
+ {
+ if (strcmp(optarg, "on") == 0)
+ opt.two_phase = true;
+ else if (strcmp(optarg, "off") == 0)
+ opt.two_phase = false;
+ else
+ pg_fatal("invalid
value for --two-phase: must be 'on' or 'off'");
+ }
+ else
+ opt.two_phase = true; /*
Default to true if no argument
+
* is provided */
+ break;
3) You can add a link to create subscription documentation page of
two_phase option here:
+ Enables or disables two-phase commit for the subscription.
+ When the option is provided without a value, it defaults to
+ <literal>on</literal>. Specify <literal>on</literal> to enable or
+ <literal>off</literal> to disable.
+ Two-phase commit ensures atomicity in logical replication for prepared
+ transactions. By default, this option is enabled unless explicitly set
+ to <literal>off</literal>.
4) Instead of adding new tests, can we include the prepare transaction
and prepare transaction verification to the existing tests itself?
+# Set up node A as primary
+my $node_a = PostgreSQL::Test::Cluster->new('node_a');
+my $aconnstr = $node_a->connstr;
+$node_a->init(allows_streaming => 'logical');
+$node_a->append_conf(
+ 'postgresql.conf', qq[
+ autovacuum = off
+ wal_level = logical
+ max_wal_senders = 10
+ max_worker_processes = 8
+ max_prepared_transactions = 10
+]);
+
+$node_a->start;
Regards,
Vignesh
Dear Shubham,
Thanks for the proposal!
I am writing to propose the addition of the two_phase option in
pg_createsubscriber. As discussed in [1], supporting this feature
during the development of pg_createsubscriber was planned for this
version.
Yes, that was the pending item.
The attached patch has the changes for the same.
There are API related comments.
01.
I think the two-phase should be off by default. Because default value for CREATE
SUBSCRIPTION command is off, and your patch breaks pre-existing style.
02.
API style should be changed: no need to require the argument. I think it is enough
to provide "enable-twophase" and "T" option which specify the two_phase to true.
Best regards,
Hayato Kuroda
FUJITSU LIMITED
Hi Shubham,
Here are some review comments for the patch v1-0001.
Note -- I think Kuroda-san's idea to use a new switch like
'--enable-two-phase' would eliminate lots of my review comments below
about the inconsistencies with CREATE SUBSCRIBER, and the current
confusion about defaults and argument values etc.
======
Commit message
1.
By default, two-phase commit is enabled if the option is provided without
arguments. Users can explicitly set it to 'on' or 'off' using '--two-phase=on'
or '--two-phase=off'.
~
But you don't say what is the default if the option is omitted entirely.
~~~
2.
Notably, the replication for prepared transactions functions regardless of the
initial two-phase setting on the replication slot. However, the user cannot
change the setting after the subscription is created unless a future command,
such as 'ALTER SUBSCRIPTION ... SET (two-phase = on)', is supported.
This provides flexibility for future enhancements.
~
Typo in ALTER example with the option name /two-phase/two_phase/
~~~
3.
Documentation has been updated to reflect the new option, and test cases have
been added to validate various scenarios, including proper validation of the
'--two-phase' flag and its combinations with other options.
/flag/option/ ??
======
doc/src/sgml/ref/pg_createsubscriber.sgml
4.
+ <varlistentry>
+ <term><option>-T</option></term>
+ <term><option>--two_phase</option></term>
+ <listitem>
+ <para>
+ Enables or disables two-phase commit for the subscription.
+ When the option is provided without a value, it defaults to
+ <literal>on</literal>. Specify <literal>on</literal> to enable or
+ <literal>off</literal> to disable.
+ Two-phase commit ensures atomicity in logical replication for prepared
+ transactions. By default, this option is enabled unless explicitly set
+ to <literal>off</literal>.
+ </para>
+ </listitem>
+ </varlistentry>
+
4a.
Typo -- the option you made is 'two-phase', not 'two_phase'
~
4b.
When you say "By default, this option is enabled unless explicitly set
to off." I take that as meaning it is default *enabled* even when the
option is entirely omitted.
But, that would be contrary to the behaviour of a normal CREATE
SUBSCRIPTION 'two_phase' option, so I wonder why should
pg_createsubscriber have a different default. Is this intentional? IMO
if no switch is specified then I thought it should be the same as the
CREATE SUBSCRIPTION default (i.e. false).
~
4c.
This "-T" option should be moved to be adjacent (below) "-t" to keep
everything consistently in A-Z order.
======
src/bin/pg_basebackup/pg_createsubscriber.c
5.
bool made_replslot; /* replication slot was created */
bool made_publication; /* publication was created */
+ bool two_phase; /* two-phase option was created */
I am not sure what "option was created" even means. Cut/paste error?
Also, perhaps this field belongs with the 1st group of fields in this
struct, instead of with those made_xxx fields.
~~~
store_pub_sub_info:
6.
+ /* Store two-phase option */
+ dbinfo[i].two_phase = opt->two_phase;
+
The comment says the same as what the code is doing which seems
unhelpful/redundant. And if you rearrange the struct fields as
previously suggested, this assignment should also move.
~~~
7.
pg_log_debug("subscriber(%d): subscription: %s ; connection string: %s", i,
dbinfo[i].subname ? dbinfo[i].subname : "(auto)",
dbinfo[i].subconninfo);
+ pg_log_debug("publisher(%d): two-phase: %s", i,
+ dbinfo[i].two_phase ? "true" : "false");
"two_phase" is a subscription option. So shouldn't this added
pg_log_debug info be part of the preceding pg_log_debug about the
subscription?
~~~
main:
8.
{"recovery-timeout", required_argument, NULL, 't'},
+ {"two_phase", optional_argument, NULL, 'T'},
{"subscriber-username", required_argument, NULL, 'U'},
AFAIK this option was supposed to be 'two-phase' (dash instead of
underscore is consistent with the other pg_createsubscriber options)
~~~
9.
opt.sub_username = NULL;
+ opt.two_phase = true;
As previously mentioned this default is not the same as the CREATE
SUBSCRIPTION default for 'two_phase', and I find that inconsistency to
be confusing.
~~~
10.
+ case 'T':
+ if (optarg != NULL)
+ {
+ if (strcmp(optarg, "on") == 0)
+ opt.two_phase = true;
+ else if (strcmp(optarg, "off") == 0)
+ opt.two_phase = false;
+ else
+ pg_fatal("invalid value for --two-phase: must be 'on' or 'off'");
+ }
+ else
+ opt.two_phase = true; /* Default to true if no argument
+ * is provided */
+ break;
I wonder if users familiar with the CREATE SUBSCRIPTION 'two_phase'
might find it strange that the only values accepted here are 'on' and
'off' but now all the other boolean variants.
======
.../t/040_pg_createsubscriber.pl
11.
+# Run pg_createsubscriber on a promoted server with two_phase=on
+command_ok(
+ [
+ 'pg_createsubscriber', '--verbose',
+ '--recovery-timeout', "$PostgreSQL::Test::Utils::timeout_default",
+ '--pgdata', $node_b->data_dir,
+ '--publisher-server', $node_a->connstr($db10),
+ '--subscriber-port', $node_b->port,
+ '--database', $db10,
+ '--two_phase=on'
+ ],
+ 'created subscription with two-phase commit enabled');
Hmm. I expect this should have been specified as '--two-phase=on'
(dash instead of underscore for consistency with all other switches of
pg_createsubscriber) but previous typos (e.g. #8 above) have
compounded the confusion.
======
Kind Regards,
Peter Smith.
Fujitsu Australia
On Mon, Dec 9, 2024 at 7:43 PM vignesh C <vignesh21@gmail.com> wrote:
On Mon, 9 Dec 2024 at 16:25, Shubham Khanna <khannashubham1197@gmail.com> wrote:
Hi all,
I am writing to propose the addition of the two_phase option in
pg_createsubscriber. As discussed in [1], supporting this feature
during the development of pg_createsubscriber was planned for this
version.Yes this will be useful.
Currently, pg_createsubscriber creates subscriptions with the
two_phase option set to false. Enabling the two_phase option after a
subscription has been created is not straightforward, as it requires
the subscription to be disabled first. This patch aims to address this
limitation by incorporating the two_phase option into
pg_createsubscriber which will help create subscription with two_phase
option and make use of the advantages of creating subscription with
two_phase option.
The attached patch has the changes for the same.Few comments: 1) You can keep it with no argument similar to how dry-run is handled: @@ -1872,6 +1881,7 @@ main(int argc, char **argv) {"publisher-server", required_argument, NULL, 'P'}, {"socketdir", required_argument, NULL, 's'}, {"recovery-timeout", required_argument, NULL, 't'}, + {"two_phase", optional_argument, NULL, 'T'}, {"subscriber-username", required_argument, NULL, 'U'}, {"verbose", no_argument, NULL, 'v'}, {"version", no_argument, NULL, 'V'},
Changed the argument to 'no_argument'.
2) After making it to no argument option, if this option is set, just set the value, strcmp's are not required: + case 'T': + if (optarg != NULL) + { + if (strcmp(optarg, "on") == 0) + opt.two_phase = true; + else if (strcmp(optarg, "off") == 0) + opt.two_phase = false; + else + pg_fatal("invalid value for --two-phase: must be 'on' or 'off'"); + } + else + opt.two_phase = true; /* Default to true if no argument + * is provided */ + break;
Updated the switch case according to the modified argument.
3) You can add a link to create subscription documentation page of two_phase option here: + Enables or disables two-phase commit for the subscription. + When the option is provided without a value, it defaults to + <literal>on</literal>. Specify <literal>on</literal> to enable or + <literal>off</literal> to disable. + Two-phase commit ensures atomicity in logical replication for prepared + transactions. By default, this option is enabled unless explicitly set + to <literal>off</literal>.
Added the link to create subscription in the documentation of
pg_createsubscriber.
4) Instead of adding new tests, can we include the prepare transaction and prepare transaction verification to the existing tests itself? +# Set up node A as primary +my $node_a = PostgreSQL::Test::Cluster->new('node_a'); +my $aconnstr = $node_a->connstr; +$node_a->init(allows_streaming => 'logical'); +$node_a->append_conf( + 'postgresql.conf', qq[ + autovacuum = off + wal_level = logical + max_wal_senders = 10 + max_worker_processes = 8 + max_prepared_transactions = 10 +]); + +$node_a->start;
Removed the new test case and added it to the existing test cases.
The attached Patch contains the required changes.
Thanks and regards,
Shubham Khanna.
Attachments:
v2-0001-Add-support-for-enabling-disabling-two-phase-comm.patchapplication/octet-stream; name=v2-0001-Add-support-for-enabling-disabling-two-phase-comm.patchDownload+51-7
On Tue, Dec 10, 2024 at 7:42 AM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
Dear Shubham,
Thanks for the proposal!
I am writing to propose the addition of the two_phase option in
pg_createsubscriber. As discussed in [1], supporting this feature
during the development of pg_createsubscriber was planned for this
version.Yes, that was the pending item.
The attached patch has the changes for the same.
There are API related comments.
01.
I think the two-phase should be off by default. Because default value for CREATE
SUBSCRIPTION command is off, and your patch breaks pre-existing style.02.
API style should be changed: no need to require the argument. I think it is enough
to provide "enable-twophase" and "T" option which specify the two_phase to true.
I have fixed the given comments. The v2 version patch attached at [1]/messages/by-id/CAHv8RjLcdmz=_RMwveuDdr8i7r=09TAwtOnFmXeaia_v2RmnYA@mail.gmail.com
has the changes for the same.
[1]: /messages/by-id/CAHv8RjLcdmz=_RMwveuDdr8i7r=09TAwtOnFmXeaia_v2RmnYA@mail.gmail.com
Thanks and Regards,
Shubham Khanna.
On Tue, Dec 10, 2024 at 8:05 AM Peter Smith <smithpb2250@gmail.com> wrote:
Hi Shubham,
Here are some review comments for the patch v1-0001.
Note -- I think Kuroda-san's idea to use a new switch like
'--enable-two-phase' would eliminate lots of my review comments below
about the inconsistencies with CREATE SUBSCRIBER, and the current
confusion about defaults and argument values etc.======
Commit message1.
By default, two-phase commit is enabled if the option is provided without
arguments. Users can explicitly set it to 'on' or 'off' using '--two-phase=on'
or '--two-phase=off'.~
But you don't say what is the default if the option is omitted entirely.
~~~
2.
Notably, the replication for prepared transactions functions regardless of the
initial two-phase setting on the replication slot. However, the user cannot
change the setting after the subscription is created unless a future command,
such as 'ALTER SUBSCRIPTION ... SET (two-phase = on)', is supported.
This provides flexibility for future enhancements.~
Typo in ALTER example with the option name /two-phase/two_phase/
~~~
3.
Documentation has been updated to reflect the new option, and test cases have
been added to validate various scenarios, including proper validation of the
'--two-phase' flag and its combinations with other options./flag/option/ ??
Modified the commit message according to the suggested changes.
======
doc/src/sgml/ref/pg_createsubscriber.sgml4. + <varlistentry> + <term><option>-T</option></term> + <term><option>--two_phase</option></term> + <listitem> + <para> + Enables or disables two-phase commit for the subscription. + When the option is provided without a value, it defaults to + <literal>on</literal>. Specify <literal>on</literal> to enable or + <literal>off</literal> to disable. + Two-phase commit ensures atomicity in logical replication for prepared + transactions. By default, this option is enabled unless explicitly set + to <literal>off</literal>. + </para> + </listitem> + </varlistentry> +4a.
Typo -- the option you made is 'two-phase', not 'two_phase'~
Fixed.
4b.
When you say "By default, this option is enabled unless explicitly set
to off." I take that as meaning it is default *enabled* even when the
option is entirely omitted.But, that would be contrary to the behaviour of a normal CREATE
SUBSCRIPTION 'two_phase' option, so I wonder why should
pg_createsubscriber have a different default. Is this intentional? IMO
if no switch is specified then I thought it should be the same as the
CREATE SUBSCRIPTION default (i.e. false).~
Modified the documentation on the basis of the latest changes added to
the patch.
4c.
This "-T" option should be moved to be adjacent (below) "-t" to keep
everything consistently in A-Z order.
Fixed.
======
src/bin/pg_basebackup/pg_createsubscriber.c5.
bool made_replslot; /* replication slot was created */
bool made_publication; /* publication was created */
+ bool two_phase; /* two-phase option was created */I am not sure what "option was created" even means. Cut/paste error?
Also, perhaps this field belongs with the 1st group of fields in this
struct, instead of with those made_xxx fields.~~~
store_pub_sub_info:
6. + /* Store two-phase option */ + dbinfo[i].two_phase = opt->two_phase; +The comment says the same as what the code is doing which seems
unhelpful/redundant. And if you rearrange the struct fields as
previously suggested, this assignment should also move.
Fixed.
7. pg_log_debug("subscriber(%d): subscription: %s ; connection string: %s", i, dbinfo[i].subname ? dbinfo[i].subname : "(auto)", dbinfo[i].subconninfo); + pg_log_debug("publisher(%d): two-phase: %s", i, + dbinfo[i].two_phase ? "true" : "false");"two_phase" is a subscription option. So shouldn't this added
pg_log_debug info be part of the preceding pg_log_debug about the
subscription?
Fixed.
main:
8.
{"recovery-timeout", required_argument, NULL, 't'},
+ {"two_phase", optional_argument, NULL, 'T'},
{"subscriber-username", required_argument, NULL, 'U'},AFAIK this option was supposed to be 'two-phase' (dash instead of
underscore is consistent with the other pg_createsubscriber options)
Fixed.
9.
opt.sub_username = NULL;
+ opt.two_phase = true;As previously mentioned this default is not the same as the CREATE
SUBSCRIPTION default for 'two_phase', and I find that inconsistency to
be confusing.
Changed the default option to false as suggested.
10. + case 'T': + if (optarg != NULL) + { + if (strcmp(optarg, "on") == 0) + opt.two_phase = true; + else if (strcmp(optarg, "off") == 0) + opt.two_phase = false; + else + pg_fatal("invalid value for --two-phase: must be 'on' or 'off'"); + } + else + opt.two_phase = true; /* Default to true if no argument + * is provided */ + break;I wonder if users familiar with the CREATE SUBSCRIPTION 'two_phase'
might find it strange that the only values accepted here are 'on' and
'off' but now all the other boolean variants.
Fixed.
======
.../t/040_pg_createsubscriber.pl11. +# Run pg_createsubscriber on a promoted server with two_phase=on +command_ok( + [ + 'pg_createsubscriber', '--verbose', + '--recovery-timeout', "$PostgreSQL::Test::Utils::timeout_default", + '--pgdata', $node_b->data_dir, + '--publisher-server', $node_a->connstr($db10), + '--subscriber-port', $node_b->port, + '--database', $db10, + '--two_phase=on' + ], + 'created subscription with two-phase commit enabled');Hmm. I expect this should have been specified as '--two-phase=on'
(dash instead of underscore for consistency with all other switches of
pg_createsubscriber) but previous typos (e.g. #8 above) have
compounded the confusion.
Fixed.
The v2 version patch attached at [1]/messages/by-id/CAHv8RjLcdmz=_RMwveuDdr8i7r=09TAwtOnFmXeaia_v2RmnYA@mail.gmail.com has the changes for the same.
[1]: /messages/by-id/CAHv8RjLcdmz=_RMwveuDdr8i7r=09TAwtOnFmXeaia_v2RmnYA@mail.gmail.com
Thanks and Regards,
Shubham Khanna.
Hi Shubham,
Here are some review comments for patch v2-0001.
======
GENERAL - the new option name
1.
I am not sure if this new switch needed to be called
'--enable-two-phase'; probably just calling it '--two-phase' would
mean the same because specifying the switch already implies "enabling"
it to me.
Perhaps either name is fine. What do others think?
======
Commit message
2.
This patch introduces the '--enable-two-phase' option to the
'pg_createsubscriber' utility, allowing users to enable or disable two-phase
commit for subscriptions during their creation.
~
It seems a bit strange IMO to say "enable or disable", because this is
only for "enable", and the default is disable as described in the
following sentence.
~~~
3.
By default, two-phase commit is disabled if the option is provided without
arguments.
~
But, this option now has no arguments so the part "if the option is
provided without arguments" is not applicable anymore and should be
removed. Or, if you want to say something you could say "if the option
is not provided."
======
doc/src/sgml/ref/pg_createsubscriber.sgml
4.
+ <varlistentry>
+ <term><option>-T</option></term>
+ <term><option>--enable-two-phase</option></term>
+ <listitem>
+ <para>
+ Enables two-phase commit for the subscription. When the option is
+ provided, it is explicitly enabled. By default, two-phase commit is
+ <literal>off</literal>.
+ Two-phase commit ensures atomicity in logical replication for prepared
+ transactions. See the
+ <link linkend="sql-createsubscription-params-with-two-phase"><literal>two_phase</literal></link>
+ documentation for more details.
+ </para>
+ </listitem>
+ </varlistentry>
+
I felt this was more verbose than necessary. IMO you only needed to
say something very simple; the user can chase the link to learn more
about two_phase if they want to.
SUGGESTION
Enables <link linkend="sql-createsubscription-params-with-two-phase"><literal>two_phase</literal></link>
commit for the subscription. The default is <literal>false</literal>.
======
src/bin/pg_basebackup/pg_createsubscriber.c
usage:
5.
printf(_(" -t, --recovery-timeout=SECS seconds to wait for
recovery to end\n"));
+ printf(_(" -T, --enable-two-phase enable two-phase commit for the
subscription\n"));
printf(_(" -U, --subscriber-username=NAME user name for subscriber
connection\n"));
AFAICT the patch is wrongly using tabs here instead of spaces.
~~~
store_pub_sub_info:
6.
+ dbinfo[i].two_phase = opt->two_phase; /* Set two-phase option */
I still think this comment only states the obvious so it is not
helpful. Can remove it.
~~~
7.
dbinfo[i].made_publication = false;
+
/* Fill subscriber attributes */
This whitespace change is unrelated to this patch.
~~~
8.
- pg_log_debug("subscriber(%d): subscription: %s ; connection string: %s", i,
+ pg_log_debug("subscriber(%d): subscription: %s ; connection string:
%s, --enable-two-phase: %s", i,
dbinfo[i].subname ? dbinfo[i].subname : "(auto)",
- dbinfo[i].subconninfo);
+ dbinfo[i].subconninfo,
+ dbinfo[i].two_phase ? "true" : "false");
IMO say "two_phase" here, not "--enable-two-phase".
======
.../t/040_pg_createsubscriber.pl
9.
max_worker_processes = 8
+max_prepared_transactions = 10
});
AFAICT the comment for this test code said:
# Restore default settings here but only apply it after testing standby. Some
# standby settings should not be a lower setting than on the primary.
But max_prepared_transactions = 10 is not the default setting for this GUC.
~~~
10.
is($node_s->safe_psql('postgres', 'SELECT pg_catalog.pg_is_in_recovery()'),
't', 'standby is in recovery');
+
$node_s->stop;
This whitespace change has nothing to do with the patch.
~~~
11.
- 'replslot1'
+ 'replslot1', '--enable-two-phase'
The comment for this test only says
# pg_createsubscriber can run without --databases option
Now it is doing more, so maybe it should give more details like "In
passing, also test the --enable-two-phase option."
~~~
12.
+# Verify that the prepared transaction is replicated to the subscriber
+my $count_prepared_s =
+ $node_s->safe_psql($db1, "SELECT count(*) FROM pg_prepared_xacts;");
+
+is($count_prepared_s, qq(0), 'Prepared transaction replicated to subscriber');
+
Is this test OK? It says to verify it is replicated. How does checking
subscriber has zero pg_prepared_xacts ensure replication occurred?
======
Please see the attached NITPICKS diffs which includes some (not all)
of the above suggestions.
======
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachments:
PS_NITPICKS_v2.txttext/plain; charset=US-ASCII; name=PS_NITPICKS_v2.txtDownload+5-10
On Tue, 10 Dec 2024 at 17:17, Shubham Khanna
<khannashubham1197@gmail.com> wrote:
On Mon, Dec 9, 2024 at 7:43 PM vignesh C <vignesh21@gmail.com> wrote:
The attached Patch contains the required changes.
Few comments:
1) This is not correct, currently enabling two_phase option using
alter subscription is supported:
Notably, the replication for prepared transactions functions regardless of the
initial two-phase setting on the replication slot. However, the user cannot
change the setting after the subscription is created unless a future command,
such as 'ALTER SUBSCRIPTION ... SET (two_phase = on)', is supported.
This provides flexibility for future enhancements.
2) You can enable-two-phase on a non dry-run mode test case, as the
verification will not be possible in dry-run mode:
# pg_createsubscriber can run without --databases option
@@ -352,7 +355,7 @@ command_ok(
$node_p->connstr($db1), '--socketdir',
$node_s->host, '--subscriber-port',
$node_s->port, '--replication-slot',
- 'replslot1'
+ 'replslot1', '--enable-two-phase'
3) I felt this line can be removed "Two-phase commit ensures atomicity
in logical replication for prepared transactions." as this information
is available at prepare transaction and create subscription page
documentation:
+ <literal>off</literal>.
+ Two-phase commit ensures atomicity in logical replication for prepared
+ transactions. See the
+ <link linkend="sql-createsubscription-params-with-two-phase"><literal>two_phase</literal></link>
+ documentation for more details.
4) This change is not required as it is not needed for the patch:
dbinfo[i].made_replslot = false;
dbinfo[i].made_publication = false;
+
/* Fill subscriber attributes */
conninfo = concat_conninfo_dbname(sub_base_conninfo, cell->val);
5) Similarly here too:
@@ -341,6 +343,7 @@ command_ok(
$node_s->start;
is($node_s->safe_psql('postgres', 'SELECT pg_catalog.pg_is_in_recovery()'),
't', 'standby is in recovery');
+
$node_s->stop;
Regards,
Vignesh
On Wed, Dec 11, 2024 at 4:21 AM Peter Smith <smithpb2250@gmail.com> wrote:
Hi Shubham,
Here are some review comments for patch v2-0001.
======
GENERAL - the new option name1.
I am not sure if this new switch needed to be called
'--enable-two-phase'; probably just calling it '--two-phase' would
mean the same because specifying the switch already implies "enabling"
it to me.Perhaps either name is fine. What do others think?
======
Commit message2.
This patch introduces the '--enable-two-phase' option to the
'pg_createsubscriber' utility, allowing users to enable or disable two-phase
commit for subscriptions during their creation.~
It seems a bit strange IMO to say "enable or disable", because this is
only for "enable", and the default is disable as described in the
following sentence.~~~
3.
By default, two-phase commit is disabled if the option is provided without
arguments.~
But, this option now has no arguments so the part "if the option is
provided without arguments" is not applicable anymore and should be
removed. Or, if you want to say something you could say "if the option
is not provided."======
doc/src/sgml/ref/pg_createsubscriber.sgml4. + <varlistentry> + <term><option>-T</option></term> + <term><option>--enable-two-phase</option></term> + <listitem> + <para> + Enables two-phase commit for the subscription. When the option is + provided, it is explicitly enabled. By default, two-phase commit is + <literal>off</literal>. + Two-phase commit ensures atomicity in logical replication for prepared + transactions. See the + <link linkend="sql-createsubscription-params-with-two-phase"><literal>two_phase</literal></link> + documentation for more details. + </para> + </listitem> + </varlistentry> +I felt this was more verbose than necessary. IMO you only needed to
say something very simple; the user can chase the link to learn more
about two_phase if they want to.SUGGESTION
Enables <link linkend="sql-createsubscription-params-with-two-phase"><literal>two_phase</literal></link>
commit for the subscription. The default is <literal>false</literal>.======
src/bin/pg_basebackup/pg_createsubscriber.cusage:
5.
printf(_(" -t, --recovery-timeout=SECS seconds to wait for
recovery to end\n"));
+ printf(_(" -T, --enable-two-phase enable two-phase commit for the
subscription\n"));
printf(_(" -U, --subscriber-username=NAME user name for subscriber
connection\n"));AFAICT the patch is wrongly using tabs here instead of spaces.
~~~
store_pub_sub_info:
6.
+ dbinfo[i].two_phase = opt->two_phase; /* Set two-phase option */I still think this comment only states the obvious so it is not
helpful. Can remove it.~~~
7.
dbinfo[i].made_publication = false;
+
/* Fill subscriber attributes */
This whitespace change is unrelated to this patch.~~~
8. - pg_log_debug("subscriber(%d): subscription: %s ; connection string: %s", i, + pg_log_debug("subscriber(%d): subscription: %s ; connection string: %s, --enable-two-phase: %s", i, dbinfo[i].subname ? dbinfo[i].subname : "(auto)", - dbinfo[i].subconninfo); + dbinfo[i].subconninfo, + dbinfo[i].two_phase ? "true" : "false");IMO say "two_phase" here, not "--enable-two-phase".
======
.../t/040_pg_createsubscriber.pl9.
max_worker_processes = 8
+max_prepared_transactions = 10
});AFAICT the comment for this test code said:
# Restore default settings here but only apply it after testing standby. Some
# standby settings should not be a lower setting than on the primary.But max_prepared_transactions = 10 is not the default setting for this GUC.
~~~
10.
is($node_s->safe_psql('postgres', 'SELECT pg_catalog.pg_is_in_recovery()'),
't', 'standby is in recovery');
+
$node_s->stop;This whitespace change has nothing to do with the patch.
~~~
11. - 'replslot1' + 'replslot1', '--enable-two-phase'The comment for this test only says
# pg_createsubscriber can run without --databases optionNow it is doing more, so maybe it should give more details like "In
passing, also test the --enable-two-phase option."~~~
12. +# Verify that the prepared transaction is replicated to the subscriber +my $count_prepared_s = + $node_s->safe_psql($db1, "SELECT count(*) FROM pg_prepared_xacts;"); + +is($count_prepared_s, qq(0), 'Prepared transaction replicated to subscriber'); +Is this test OK? It says to verify it is replicated. How does checking
subscriber has zero pg_prepared_xacts ensure replication occurred?======
Please see the attached NITPICKS diffs which includes some (not all)
of the above suggestions.======
I have fixed the given comments. The attached patch contains the
suggested changes.
Thanks and regards,
Shubham Khanna.
Attachments:
v3-0001-Add-support-for-two-phase-commit-in-pg_createsubs.patchapplication/octet-stream; name=v3-0001-Add-support-for-two-phase-commit-in-pg_createsubs.patchDownload+48-9
On Wed, Dec 11, 2024 at 10:59 AM vignesh C <vignesh21@gmail.com> wrote:
On Tue, 10 Dec 2024 at 17:17, Shubham Khanna
<khannashubham1197@gmail.com> wrote:On Mon, Dec 9, 2024 at 7:43 PM vignesh C <vignesh21@gmail.com> wrote:
The attached Patch contains the required changes.
Few comments:
1) This is not correct, currently enabling two_phase option using
alter subscription is supported:
Notably, the replication for prepared transactions functions regardless of the
initial two-phase setting on the replication slot. However, the user cannot
change the setting after the subscription is created unless a future command,
such as 'ALTER SUBSCRIPTION ... SET (two_phase = on)', is supported.
This provides flexibility for future enhancements.2) You can enable-two-phase on a non dry-run mode test case, as the verification will not be possible in dry-run mode: # pg_createsubscriber can run without --databases option @@ -352,7 +355,7 @@ command_ok( $node_p->connstr($db1), '--socketdir', $node_s->host, '--subscriber-port', $node_s->port, '--replication-slot', - 'replslot1' + 'replslot1', '--enable-two-phase'3) I felt this line can be removed "Two-phase commit ensures atomicity in logical replication for prepared transactions." as this information is available at prepare transaction and create subscription page documentation: + <literal>off</literal>. + Two-phase commit ensures atomicity in logical replication for prepared + transactions. See the + <link linkend="sql-createsubscription-params-with-two-phase"><literal>two_phase</literal></link> + documentation for more details.4) This change is not required as it is not needed for the patch:
dbinfo[i].made_replslot = false;
dbinfo[i].made_publication = false;
+
/* Fill subscriber attributes */
conninfo = concat_conninfo_dbname(sub_base_conninfo, cell->val);5) Similarly here too:
@@ -341,6 +343,7 @@ command_ok(
$node_s->start;
is($node_s->safe_psql('postgres', 'SELECT pg_catalog.pg_is_in_recovery()'),
't', 'standby is in recovery');
+
$node_s->stop;
I have fixed the given comments. The v3 version patch attached at [1]/messages/by-id/CAHv8RjJ8NuHLQUhkqE-fy5k+3nZUdX9QngrLaa7iS0TEJNicow@mail.gmail.com
has the changes for the same.
[1]: /messages/by-id/CAHv8RjJ8NuHLQUhkqE-fy5k+3nZUdX9QngrLaa7iS0TEJNicow@mail.gmail.com
Thanks and Regards,
Shubham Khanna.
On Wed, 11 Dec 2024 at 11:21, Shubham Khanna
<khannashubham1197@gmail.com> wrote:
On Wed, Dec 11, 2024 at 4:21 AM Peter Smith <smithpb2250@gmail.com> wrote:
I have fixed the given comments. The attached patch contains the
suggested changes.
Since all the subscriptions are created based on the two_phase option
provided, there is no need to store this for each database:
@@ -53,6 +54,7 @@ struct LogicalRepInfo
char *pubname; /* publication name */
char *subname; /* subscription name */
char *replslotname; /* replication slot name */
+ bool two_phase; /* two-phase enabled
for the subscription */
dbinfo[i].dbname = cell->val;
+ dbinfo[i].two_phase = opt->two_phase;
if (num_pubs > 0)
How about we handle something like in the attached changes.
Regards,
Vignesh
Attachments:
Remove_dbwise_two_phase.patchtext/x-patch; charset=US-ASCII; name=Remove_dbwise_two_phase.patchDownload+12-10
On Wed, Dec 11, 2024 at 2:08 PM vignesh C <vignesh21@gmail.com> wrote:
On Wed, 11 Dec 2024 at 11:21, Shubham Khanna
<khannashubham1197@gmail.com> wrote:On Wed, Dec 11, 2024 at 4:21 AM Peter Smith <smithpb2250@gmail.com> wrote:
I have fixed the given comments. The attached patch contains the
suggested changes.Since all the subscriptions are created based on the two_phase option provided, there is no need to store this for each database: @@ -53,6 +54,7 @@ struct LogicalRepInfo char *pubname; /* publication name */ char *subname; /* subscription name */ char *replslotname; /* replication slot name */ + bool two_phase; /* two-phase enabled for the subscription */dbinfo[i].dbname = cell->val;
+ dbinfo[i].two_phase = opt->two_phase;
if (num_pubs > 0)How about we handle something like in the attached changes.
Thank you for pointing this out and for suggesting the changes. I
agree with your approach.
Also, I found a mistake in getopt_long and fixed it in this version of
the patch.
The attached patch contains the suggested changes.
Thanks and regards,
Shubham Khanna.
Attachments:
v4-0001-Add-support-for-two-phase-commit-in-pg_createsubs.patchapplication/octet-stream; name=v4-0001-Add-support-for-two-phase-commit-in-pg_createsubs.patchDownload+56-15
Hi Shubham,
Here are some review comments for the patch v4-0001.
======
GENERAL.
1.
After reading Vignesh's last review and then the pg_createsubscriber
documentation I see there can be multiple databases simultaneously
specified (by writing multiple -d switches) and in that case each one
gets its own:
--publication
--replication-slot
--subscription
OTOH, this new '--enable-two-phase' switch is just a blanket two_phase
enablement across all subscriptions. There is no way for the user to
say if they want it enabled for some subscriptions (on some databases)
but not for others. I suppose this was intentional and OK (right?),
but this point needs to be clarified in the docs.
======
doc/src/sgml/ref/pg_createsubscriber.sgml
2.
+ <para>
+ Enables <link
linkend="sql-createsubscription-params-with-two-phase"><literal>two_phase</literal></link>
+ commit for the subscription. The default is <literal>false</literal>.
+ </para>
Following on from the previous review comment. Since this switch is
applied to all database subscriptions I think the help needs to say
that. Something like.
"If there are multiple subscriptions specified, this option applies to
all of them."
~~~
3.
In the "Prerequisites" sections of the docs, it gives requirements for
various GUC settings on the source server and the target server. So,
should there be another note here advising to configure the
'max_prepared_transactions' GUC when the '--enable-two-phase' is
specified?
~~~
4. "Warnings" section includes the following:
pg_createsubscriber sets up logical replication with two-phase commit
disabled. This means that any prepared transactions will be replicated
at the time of COMMIT PREPARED, without advance preparation. Once
setup is complete, you can manually drop and re-create the
subscription(s) with the two_phase option enabled.
~
The above text is from the "Warnings" section, but it seems stale
information that needs to be updated due to the introduction of this
new '--enable-two-phase' option.
======
src/bin/pg_basebackup/pg_createsubscriber.c
usage:
5.
printf(_(" -t, --recovery-timeout=SECS seconds to wait for
recovery to end\n"));
+ printf(_(" -T, --enable-two-phase enable two-phase commit
for the subscription\n"));
Given the previous review comments (#1/#2 etc), I was wondering if it
might be better to say more like "enable two-phase commit for all
subscriptions".
======
.../t/040_pg_createsubscriber.pl
6.
+is($count_prepared_s, qq(1), 'Prepared transaction replicated to subscriber');
Should there also have been an earlier check (way back before the
PREPARE) just to make sure this count was initially 0?
======
Kind Regards,
Peter Smith.
Fujitsu Australia
Dear Shubham,
Thank you for pointing this out and for suggesting the changes. I
agree with your approach.
Also, I found a mistake in getopt_long and fixed it in this version of
the patch.
The attached patch contains the suggested changes.
Thanks for updating the patch. I think the patch looks mostly OK.
Regarding the test code - I think we should directly refer the pg_subscription catalog,
and confirm that subtwophase is 'p'. IIUC, we can wait until
all subscriptions. Subtwophasestate are 'e', by using poll_query_until() and [1]SELECT count(1) = 0 FROM pg_subscription WHERE subtwophasestate NOT IN ('e');.
[1]: SELECT count(1) = 0 FROM pg_subscription WHERE subtwophasestate NOT IN ('e');
Best regards,
Hayato Kuroda
FUJITSU LIMITED
On Thu, 12 Dec 2024 at 08:14, Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
Dear Shubham,
Thank you for pointing this out and for suggesting the changes. I
agree with your approach.
Also, I found a mistake in getopt_long and fixed it in this version of
the patch.
The attached patch contains the suggested changes.Thanks for updating the patch. I think the patch looks mostly OK.
Regarding the test code - I think we should directly refer the pg_subscription catalog,
and confirm that subtwophase is 'p'. IIUC, we can wait until
all subscriptions. Subtwophasestate are 'e', by using poll_query_until() and [1].[1]: SELECT count(1) = 0 FROM pg_subscription WHERE subtwophasestate NOT IN ('e');
Yes, that approach is better. Also this is not required after checking
using pg_subscription:
+# Prepare a transaction on the publisher
+$node_p->safe_psql(
+ $db1, qq[
+ BEGIN;
+ INSERT INTO tbl1 SELECT generate_series(1, 10);
+ PREPARE TRANSACTION 'test_prepare';
+]);
+
# Start subscriber
$node_s->start;
+# Verify that the prepared transaction is replicated to the subscriber
+my $count_prepared_s =
+ $node_s->safe_psql($db1, "SELECT count(*) FROM pg_prepared_xacts;");
+
+is($count_prepared_s, qq(1), 'Prepared transaction replicated to subscriber');
Regards,
Vignesh
On Thu, Dec 12, 2024 at 6:04 AM Peter Smith <smithpb2250@gmail.com> wrote:
Hi Shubham,
Here are some review comments for the patch v4-0001.
======
GENERAL.1.
After reading Vignesh's last review and then the pg_createsubscriber
documentation I see there can be multiple databases simultaneously
specified (by writing multiple -d switches) and in that case each one
gets its own:
--publication
--replication-slot
--subscriptionOTOH, this new '--enable-two-phase' switch is just a blanket two_phase
enablement across all subscriptions. There is no way for the user to
say if they want it enabled for some subscriptions (on some databases)
but not for others. I suppose this was intentional and OK (right?),
but this point needs to be clarified in the docs.
Fixed.
======
doc/src/sgml/ref/pg_createsubscriber.sgml2. + <para> + Enables <link linkend="sql-createsubscription-params-with-two-phase"><literal>two_phase</literal></link> + commit for the subscription. The default is <literal>false</literal>. + </para>Following on from the previous review comment. Since this switch is
applied to all database subscriptions I think the help needs to say
that. Something like."If there are multiple subscriptions specified, this option applies to
all of them."~~~
Fixed.
3.
In the "Prerequisites" sections of the docs, it gives requirements for
various GUC settings on the source server and the target server. So,
should there be another note here advising to configure the
'max_prepared_transactions' GUC when the '--enable-two-phase' is
specified?~~~
Fixed.
4. "Warnings" section includes the following:
pg_createsubscriber sets up logical replication with two-phase commit
disabled. This means that any prepared transactions will be replicated
at the time of COMMIT PREPARED, without advance preparation. Once
setup is complete, you can manually drop and re-create the
subscription(s) with the two_phase option enabled.~
The above text is from the "Warnings" section, but it seems stale
information that needs to be updated due to the introduction of this
new '--enable-two-phase' option.
Fixed.
======
src/bin/pg_basebackup/pg_createsubscriber.cusage:
5.
printf(_(" -t, --recovery-timeout=SECS seconds to wait for
recovery to end\n"));
+ printf(_(" -T, --enable-two-phase enable two-phase commit
for the subscription\n"));Given the previous review comments (#1/#2 etc), I was wondering if it
might be better to say more like "enable two-phase commit for all
subscriptions".
Fixed.
======
.../t/040_pg_createsubscriber.pl6.
+is($count_prepared_s, qq(1), 'Prepared transaction replicated to subscriber');Should there also have been an earlier check (way back before the
PREPARE) just to make sure this count was initially 0?
Removed this and added a new test case instead of this.
The attached patch contains the suggested changes.
Thanks and regards,
Shubham Khanna.
Attachments:
v5-0001-Add-support-for-two-phase-commit-in-pg_createsubs.patchapplication/octet-stream; name=v5-0001-Add-support-for-two-phase-commit-in-pg_createsubs.patchDownload+67-23
On Thu, Dec 12, 2024 at 8:14 AM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
Dear Shubham,
Thank you for pointing this out and for suggesting the changes. I
agree with your approach.
Also, I found a mistake in getopt_long and fixed it in this version of
the patch.
The attached patch contains the suggested changes.Thanks for updating the patch. I think the patch looks mostly OK.
Regarding the test code - I think we should directly refer the pg_subscription catalog,
and confirm that subtwophase is 'p'. IIUC, we can wait until
all subscriptions. Subtwophasestate are 'e', by using poll_query_until() and [1].[1]: SELECT count(1) = 0 FROM pg_subscription WHERE subtwophasestate NOT IN ('e');
I have fixed the given comment. The v5 version patch attached at [1]/messages/by-id/CAHv8Rj+hd2MTNRs4AsK6=hRqvV6JC9g2tTAJwGjrNfXg6vhD8g@mail.gmail.com
has the changes for the same.
[1]: /messages/by-id/CAHv8Rj+hd2MTNRs4AsK6=hRqvV6JC9g2tTAJwGjrNfXg6vhD8g@mail.gmail.com
Thanks and Regards,
Shubham Khanna.
On Thu, Dec 12, 2024 at 9:34 AM vignesh C <vignesh21@gmail.com> wrote:
On Thu, 12 Dec 2024 at 08:14, Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:Dear Shubham,
Thank you for pointing this out and for suggesting the changes. I
agree with your approach.
Also, I found a mistake in getopt_long and fixed it in this version of
the patch.
The attached patch contains the suggested changes.Thanks for updating the patch. I think the patch looks mostly OK.
Regarding the test code - I think we should directly refer the pg_subscription catalog,
and confirm that subtwophase is 'p'. IIUC, we can wait until
all subscriptions. Subtwophasestate are 'e', by using poll_query_until() and [1].[1]: SELECT count(1) = 0 FROM pg_subscription WHERE subtwophasestate NOT IN ('e');
Yes, that approach is better. Also this is not required after checking using pg_subscription: +# Prepare a transaction on the publisher +$node_p->safe_psql( + $db1, qq[ + BEGIN; + INSERT INTO tbl1 SELECT generate_series(1, 10); + PREPARE TRANSACTION 'test_prepare'; +]); + # Start subscriber $node_s->start;+# Verify that the prepared transaction is replicated to the subscriber +my $count_prepared_s = + $node_s->safe_psql($db1, "SELECT count(*) FROM pg_prepared_xacts;"); + +is($count_prepared_s, qq(1), 'Prepared transaction replicated to subscriber');
I have fixed the given comment. The v5 version patch attached at [1]/messages/by-id/CAHv8Rj+hd2MTNRs4AsK6=hRqvV6JC9g2tTAJwGjrNfXg6vhD8g@mail.gmail.com
has the changes for the same.
[1]: /messages/by-id/CAHv8Rj+hd2MTNRs4AsK6=hRqvV6JC9g2tTAJwGjrNfXg6vhD8g@mail.gmail.com
Thanks and Regards,
Shubham Khanna.
Hi Shubham,
Here are my review comments for the patch v5-0001.
======
doc/src/sgml/ref/pg_createsubscriber.sgml
1.
- must accept local connections.
+ must accept local connections. If you are planning to use the
+ --enable-two-phase switch then you will also need to set the
+ <xref linkend="guc-max-prepared-transactions"/> appropriately.
Should use sgml <option> markup for "--enable-two-phase".
~~~
2.
- <application>pg_createsubscriber</application> sets up logical
- replication with two-phase commit disabled. This means that any
- prepared transactions will be replicated at the time
- of <command>COMMIT PREPARED</command>, without advance preparation.
- Once setup is complete, you can manually drop and re-create the
- subscription(s) with
+ If --enable-two-phase switch is not specified, the
+ <application>pg_createsubscriber</application> sets up logical replication
+ with two-phase commit disabled. This means that any prepared transactions
+ will be replicated at the time of <command>COMMIT PREPARED</command>,
+ without advance preparation. Once setup is complete, you can manually drop
+ and re-create the subscription(s) with
Should use sgml <option> markup for "--enable-two-phase".
======
.../t/040_pg_createsubscriber.pl
3.
+# Verify that the subtwophase is 'p' in the pg_subscription catalog
+my $poll_query_until = $node_s->safe_psql('postgres',
+ "SELECT count(1) = 0 FROM pg_subscription WHERE subtwophasestate NOT
IN ('e');"
+);
+
+is($poll_query_until, qq(t),
+ 'Timed out while waiting for subscriber to enable twophase');
3a.
Hmm. Does this code match the comment? The comment says verify
subtwophase is 'p' (aka "pending") but AFAICT the code is actually
waiting until every subtwophase is 'e' (aka "enabled").
~
3b.
Also, if you are going to name char-codes (like 'p') in comments, it
might be helpful to include the equivalent words, saving readers from
having to search the documentation to find the meaning.
======
Kind Regards,
Peter Smith.
Fujitsu Australia