PGDOCS - Logical replication GUCs - added some xrefs
Hi hackers.
There is a docs Logical Replication section "31.10 Configuration
Settings" [1] which describes some logical replication GUCs, and
details on how they interact with each other and how to take that into
account when setting their values.
There is another docs Server Configuration section "20.6 Replication"
[2] which lists the replication-related GUC parameters, and what they
are for.
Currently AFAIK those two pages are unconnected, but I felt it might
be helpful if some of the parameters in the list [2] had xref links to
the additional logical replication configuration information [1]. PSA
a patch to do that.
~~
Meanwhile, I also suspect that the main blurb at the top of [1] is not
entirely correct... it says "These settings control the behaviour of
the built-in streaming replication feature", although some of the GUCs
mentioned later in this section are clearly for "logical replication".
Thoughts?
------
[1] 31.10 Configuration Settings - https://www.postgresql.org/docs/current/logical-replication-config.html
[2] 20.6 Replication - https://www.postgresql.org/docs/current/runtime-config-replication.html
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachments:
v1-0001-Logical-replication-GUCs-added-some-docs-xrefs.patch (application/octet-stream)
From f1872c227d1029468f81eef8964acec143270a85 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Mon, 24 Oct 2022 18:32:26 +1100
Subject: [PATCH v1] Logical replication GUCs - added some docs xrefs
---
doc/src/sgml/config.sgml | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 6c64933..10a9e21 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4234,6 +4234,12 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
not <link linkend="catalog-pg-replication-origin">pg_replication_origin</link>)
will prevent the server from starting.
</para>
+
+ <para>
+ See <xref linkend="logical-replication-config"/> for more details
+ about setting <varname>max_replication_slots</varname> for logical
+ replication.
+ </para>
</listitem>
</varlistentry>
@@ -4952,7 +4958,8 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
</para>
<para>
Logical replication workers are taken from the pool defined by
- <varname>max_worker_processes</varname>.
+ <varname>max_worker_processes</varname>. See
+ <xref linkend="logical-replication-config"/> for more details.
</para>
<para>
The default value is 4. This parameter can only be set at server
@@ -4978,7 +4985,8 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
</para>
<para>
The synchronization workers are taken from the pool defined by
- <varname>max_logical_replication_workers</varname>.
+ <varname>max_logical_replication_workers</varname>. See
+ <xref linkend="logical-replication-config"/> for more details.
</para>
<para>
The default value is 2. This parameter can only be set in the
--
1.8.3.1
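The two pool relationships touched by the patch above (logical replication workers are taken from max_worker_processes, and synchronization workers are taken from max_logical_replication_workers) can be sketched as a small sanity check. This is only an illustration: the function name and the messages are hypothetical, and the checks are a simplified reading of the docs text, not something the server is known to enforce in this form.

```python
from typing import List


def check_worker_pools(max_worker_processes: int,
                       max_logical_replication_workers: int,
                       max_sync_workers_per_subscription: int) -> List[str]:
    """Return a list of warnings for settings that cannot all be honoured.

    Hypothetical helper: it only encodes the "taken from the pool defined
    by" relationships described in the docs text.
    """
    problems = []
    # Logical replication workers come out of the max_worker_processes pool.
    if max_logical_replication_workers > max_worker_processes:
        problems.append(
            "max_logical_replication_workers exceeds max_worker_processes")
    # Synchronization workers come out of the logical replication worker pool.
    if max_sync_workers_per_subscription > max_logical_replication_workers:
        problems.append(
            "max_sync_workers_per_subscription exceeds "
            "max_logical_replication_workers")
    return problems


if __name__ == "__main__":
    # 4 and 2 are the defaults quoted in the patch; 8 is the usual
    # max_worker_processes default.
    print(check_worker_pools(8, 4, 2))
```

An empty list means the three values are at least consistent with each other; it says nothing about whether they are large enough for a given number of subscriptions.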
On Mon, 24 Oct 2022 at 13:15, Peter Smith <smithpb2250@gmail.com> wrote:
> Hi hackers.
> There is a docs Logical Replication section "31.10 Configuration
> Settings" [1] which describes some logical replication GUCs, and
> details on how they interact with each other and how to take that into
> account when setting their values.
> There is another docs Server Configuration section "20.6 Replication"
> [2] which lists the replication-related GUC parameters, and what they
> are for.
> Currently AFAIK those two pages are unconnected, but I felt it might
> be helpful if some of the parameters in the list [2] had xref links to
> the additional logical replication configuration information [1]. PSA
> a patch to do that.
> ~~
> Meanwhile, I also suspect that the main blurb top of [1] is not
> entirely correct... it says "These settings control the behaviour of
> the built-in streaming replication feature", although some of the GUCs
> mentioned later in this section are clearly for "logical replication".
The introduction mainly talks about streaming replication, and the page
[1] subsection "Subscribers" clearly mentions that these
configurations are for logical replication. As we already have a
separate page [2] that details the logical replication configuration,
it might be better to move the "Subscribers" section from [1] to [2].
[1] 20.6 Replication - https://www.postgresql.org/docs/current/runtime-config-replication.html
[2] 31.10 Configuration Settings - https://www.postgresql.org/docs/current/logical-replication-config.html
Regards,
Vignesh
On Sun, Nov 13, 2022 at 11:47 AM vignesh C <vignesh21@gmail.com> wrote:
> On Mon, 24 Oct 2022 at 13:15, Peter Smith <smithpb2250@gmail.com> wrote:
> > Hi hackers.
> > There is a docs Logical Replication section "31.10 Configuration
> > Settings" [1] which describes some logical replication GUCs, and
> > details on how they interact with each other and how to take that into
> > account when setting their values.
> > There is another docs Server Configuration section "20.6 Replication"
> > [2] which lists the replication-related GUC parameters, and what they
> > are for.
> > Currently AFAIK those two pages are unconnected, but I felt it might
> > be helpful if some of the parameters in the list [2] had xref links to
> > the additional logical replication configuration information [1]. PSA
> > a patch to do that.
> > ~~
> > Meanwhile, I also suspect that the main blurb top of [1] is not
> > entirely correct... it says "These settings control the behaviour of
> > the built-in streaming replication feature", although some of the GUCs
> > mentioned later in this section are clearly for "logical replication".
> The introduction mainly talks about streaming replication and the page
> [1] subsection "Subscribers" clearly mentions that these
> configurations are for logical replication. As we already have a
> separate page [2] to detail about logical replication configurations,
> it might be better to move the "subscribers" section from [1] to [2].
> [1] 20.6 Replication -
> https://www.postgresql.org/docs/current/runtime-config-replication.html
> [2] 31.10 Configuration Settings -
> https://www.postgresql.org/docs/current/logical-replication-config.html
Thanks, Vignesh. Your suggestion (to move that "Subscribers" section)
seemed like a good idea to me, so PSA my patch v2 to implement that.
Now, on the Streaming Replication page
- the blurb has a reference to information about logical replication config
- the "Subscribers" section was relocated to the other page
Now, on the Logical Replication "Configuration Settings" page
- there are new subsections for "Publishers", "Subscribers" (copied), "Notes"
- some wording is rearranged but the content is basically the same as before
------
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachments:
v2-0001-Logical-replication-GUCs-consolidated.patch (application/octet-stream)
From de8a835b7793c1deed16462aec3e5dca3933d816 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Tue, 15 Nov 2022 16:26:32 +1100
Subject: [PATCH v2] Logical replication GUCs - consolidated.
Combines all the logical replication configuration settings on one page
with new Publisher/Subscriber sub-sections.
---
doc/src/sgml/config.sgml | 80 +++------------------
doc/src/sgml/logical-replication.sgml | 130 ++++++++++++++++++++++++++++------
2 files changed, 120 insertions(+), 90 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 559eb89..4e2559d 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4162,6 +4162,11 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
across the cluster without problems if that is required.
</para>
+ <para>
+ For <firstterm>logical replication</firstterm> configuration settings refer
+ to <xref linkend="logical-replication-config"/>.
+ </para>
+
<sect2 id="runtime-config-replication-sender">
<title>Sending Servers</title>
@@ -4234,6 +4239,12 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
not <link linkend="catalog-pg-replication-origin">pg_replication_origin</link>)
will prevent the server from starting.
</para>
+
+ <para>
+ See <xref linkend="logical-replication-config"/> for more details
+ about setting <varname>max_replication_slots</varname> for logical
+ replication.
+ </para>
</listitem>
</varlistentry>
@@ -4922,75 +4933,6 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
</variablelist>
</sect2>
- <sect2 id="runtime-config-replication-subscriber">
- <title>Subscribers</title>
-
- <para>
- These settings control the behavior of a logical replication subscriber.
- Their values on the publisher are irrelevant.
- </para>
-
- <para>
- Note that <varname>wal_receiver_timeout</varname>,
- <varname>wal_receiver_status_interval</varname> and
- <varname>wal_retrieve_retry_interval</varname> configuration parameters
- affect the logical replication workers as well.
- </para>
-
- <variablelist>
-
- <varlistentry id="guc-max-logical-replication-workers" xreflabel="max_logical_replication_workers">
- <term><varname>max_logical_replication_workers</varname> (<type>integer</type>)
- <indexterm>
- <primary><varname>max_logical_replication_workers</varname> configuration parameter</primary>
- </indexterm>
- </term>
- <listitem>
- <para>
- Specifies maximum number of logical replication workers. This includes
- both apply workers and table synchronization workers.
- </para>
- <para>
- Logical replication workers are taken from the pool defined by
- <varname>max_worker_processes</varname>.
- </para>
- <para>
- The default value is 4. This parameter can only be set at server
- start.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry id="guc-max-sync-workers-per-subscription" xreflabel="max_sync_workers_per_subscription">
- <term><varname>max_sync_workers_per_subscription</varname> (<type>integer</type>)
- <indexterm>
- <primary><varname>max_sync_workers_per_subscription</varname> configuration parameter</primary>
- </indexterm>
- </term>
- <listitem>
- <para>
- Maximum number of synchronization workers per subscription. This
- parameter controls the amount of parallelism of the initial data copy
- during the subscription initialization or when new tables are added.
- </para>
- <para>
- Currently, there can be only one synchronization worker per table.
- </para>
- <para>
- The synchronization workers are taken from the pool defined by
- <varname>max_logical_replication_workers</varname>.
- </para>
- <para>
- The default value is 2. This parameter can only be set in the
- <filename>postgresql.conf</filename> file or on the server command
- line.
- </para>
- </listitem>
- </varlistentry>
-
- </variablelist>
- </sect2>
-
</sect1>
<sect1 id="runtime-config-query">
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index f875638..4e07392 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -1768,28 +1768,116 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER
Logical replication requires several configuration options to be set.
</para>
- <para>
- On the publisher side, <varname>wal_level</varname> must be set to
- <literal>logical</literal>, and <varname>max_replication_slots</varname>
- must be set to at least the number of subscriptions expected to connect,
- plus some reserve for table synchronization. And
- <varname>max_wal_senders</varname> should be set to at least the same as
- <varname>max_replication_slots</varname> plus the number of physical
- replicas that are connected at the same time.
- </para>
+ <sect2 id="logical-replication-config-publisher">
+ <title>Publishers</title>
+
+ <para>
+ <varname>wal_level</varname> must be set to <literal>logical</literal>.
+ </para>
+
+ <para>
+ <varname>max_replication_slots</varname> must be set to at least the number
+ of subscriptions expected to connect, plus some reserve for table
+ synchronization.
+ </para>
+
+ <para>
+ <varname>max_wal_senders</varname> should be set to at least the same as
+ <varname>max_replication_slots</varname>, plus the number of physical
+ replicas that are connected at the same time.
+ </para>
+
+ </sect2>
+
+ <sect2 id="logical-replication-config-subscriber">
+ <title>Subscribers</title>
+
+ <para>
+ <varname>max_replication_slots</varname> must be set to at least the number
+ of subscriptions that will be added to the subscriber, plus some reserve for
+ table synchronization.
+ </para>
+
+ <para>
+ The following settings control the behavior of a logical replication subscriber.
+ Their values on the publisher are irrelevant:
+ </para>
+
+ <variablelist>
+
+ <varlistentry id="guc-max-logical-replication-workers" xreflabel="max_logical_replication_workers">
+ <term><varname>max_logical_replication_workers</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>max_logical_replication_workers</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Specifies maximum number of logical replication workers. This must be set
+ to at least the number of subscriptions (for apply workers), plus some
+ reserve for the table synchronization workers.
+ </para>
+ <para>
+ Logical replication workers are taken from the pool defined by
+ <varname>max_worker_processes</varname>.
+ </para>
+ <para>
+ The default value is 4. This parameter can only be set at server
+ start.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry id="guc-max-sync-workers-per-subscription" xreflabel="max_sync_workers_per_subscription">
+ <term><varname>max_sync_workers_per_subscription</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>max_sync_workers_per_subscription</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Maximum number of synchronization workers per subscription. This
+ parameter controls the amount of parallelism of the initial data copy
+ during the subscription initialization or when new tables are added.
+ </para>
+ <para>
+ Currently, there can be only one synchronization worker per table.
+ </para>
+ <para>
+ The synchronization workers are taken from the pool defined by
+ <varname>max_logical_replication_workers</varname>.
+ </para>
+ <para>
+ The default value is 2. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command
+ line.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+ </sect2>
+
+ <sect2 id="logical-replication-config-notes">
+ <title>Notes</title>
+
+ <para>
+ <varname>wal_receiver_timeout</varname>,
+ <varname>wal_receiver_status_interval</varname> and
+ <varname>wal_retrieve_retry_interval</varname> configuration parameters
+ affect the logical replication workers as well.
+ </para>
+
+ <para>
+ <varname>max_worker_processes</varname> may need to be adjusted to
+ accommodate for replication workers, at least
+ <varname>max_logical_replication_workers</varname> + <literal>1</literal>.
+ Note that some extensions and parallel queries also take worker slots
+ from <varname>max_worker_processes</varname>.
+ </para>
+
+ </sect2>
- <para>
- <varname>max_replication_slots</varname> must also be set on the subscriber.
- It should be set to at least the number of subscriptions that will be added
- to the subscriber, plus some reserve for table synchronization.
- <varname>max_logical_replication_workers</varname> must be set to at least
- the number of subscriptions, again plus some reserve for the table
- synchronization. Additionally the <varname>max_worker_processes</varname>
- may need to be adjusted to accommodate for replication workers, at least
- (<varname>max_logical_replication_workers</varname>
- + <literal>1</literal>). Note that some extensions and parallel queries
- also take worker slots from <varname>max_worker_processes</varname>.
- </para>
</sect1>
<sect1 id="logical-replication-quick-setup">
--
1.8.3.1
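As a rough illustration of the sizing rules that the patch above places in the "Publishers", "Subscribers" and "Notes" subsections, the arithmetic can be sketched in a few lines of Python. The Workload class, its field names and the two helper functions are hypothetical and exist only for this example. The size of the "reserve for table synchronization" is an input because the docs text does not give a number for it.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of a deployment (names are illustrative)."""
    subscriptions: int          # subscriptions expected to connect / be added
    sync_reserve: int           # "some reserve" for table synchronization
    physical_replicas: int = 0  # physical replicas connected at the same time


def publisher_minimums(w: Workload) -> dict:
    """Lower bounds on the publisher, per the rules quoted in the patch."""
    slots = w.subscriptions + w.sync_reserve
    return {
        "wal_level": "logical",
        "max_replication_slots": slots,
        # at least max_replication_slots, plus connected physical replicas
        "max_wal_senders": slots + w.physical_replicas,
    }


def subscriber_minimums(w: Workload) -> dict:
    """Lower bounds on the subscriber, per the rules quoted in the patch."""
    workers = w.subscriptions + w.sync_reserve
    return {
        "max_replication_slots": w.subscriptions + w.sync_reserve,
        "max_logical_replication_workers": workers,
        # at least max_logical_replication_workers + 1; extensions and
        # parallel queries also draw on this pool, so more may be needed
        "max_worker_processes": workers + 1,
    }


if __name__ == "__main__":
    w = Workload(subscriptions=3, sync_reserve=2, physical_replicas=1)
    print(publisher_minimums(w))
    print(subscriber_minimums(w))
```

These are minimums only. Since some extensions and parallel queries also take slots from max_worker_processes, a real setting would normally be higher than the value computed here.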
On Tue, 15 Nov 2022 at 11:17, Peter Smith <smithpb2250@gmail.com> wrote:
> On Sun, Nov 13, 2022 at 11:47 AM vignesh C <vignesh21@gmail.com> wrote:
> > On Mon, 24 Oct 2022 at 13:15, Peter Smith <smithpb2250@gmail.com> wrote:
> > > Hi hackers.
> > > There is a docs Logical Replication section "31.10 Configuration
> > > Settings" [1] which describes some logical replication GUCs, and
> > > details on how they interact with each other and how to take that into
> > > account when setting their values.
> > > There is another docs Server Configuration section "20.6 Replication"
> > > [2] which lists the replication-related GUC parameters, and what they
> > > are for.
> > > Currently AFAIK those two pages are unconnected, but I felt it might
> > > be helpful if some of the parameters in the list [2] had xref links to
> > > the additional logical replication configuration information [1]. PSA
> > > a patch to do that.
> > > ~~
> > > Meanwhile, I also suspect that the main blurb top of [1] is not
> > > entirely correct... it says "These settings control the behaviour of
> > > the built-in streaming replication feature", although some of the GUCs
> > > mentioned later in this section are clearly for "logical replication".
> > The introduction mainly talks about streaming replication and the page
> > [1] subsection "Subscribers" clearly mentions that these
> > configurations are for logical replication. As we already have a
> > separate page [2] to detail about logical replication configurations,
> > it might be better to move the "subscribers" section from [1] to [2].
> > [1] 20.6 Replication -
> > https://www.postgresql.org/docs/current/runtime-config-replication.html
> > [2] 31.10 Configuration Settings -
> > https://www.postgresql.org/docs/current/logical-replication-config.html
> Thanks, Vignesh. Your suggestion (to move that "Subscribers" section)
> seemed like a good idea to me, so PSA my patch v2 to implement that.
> Now, on the Streaming Replication page
> - the blurb has a reference to information about logical replication config
> - the "Subscribers" section was relocated to the other page
> Now, on the Logical Replication "Configuration Settings" page
> - there are new subsections for "Publishers", "Subscribers" (copied), "Notes"
> - some wording is rearranged but the content is basically the same as before
One suggestion:
The subscriber entries include the data type and default value, but
the publisher entries do not. We could keep the format consistent
across the publisher and subscriber configurations.
+ <para>
+ <varname>wal_level</varname> must be set to <literal>logical</literal>.
+ </para>
+ <term><varname>max_logical_replication_workers</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>max_logical_replication_workers</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Specifies maximum number of logical replication workers. This must be set
+ to at least the number of subscriptions (for apply workers), plus some
+ reserve for the table synchronization workers.
+ </para>
+ <para>
If we don't want to keep the same format, we could give a link to
runtime-config-replication, where the data type and default are defined for
the publisher configurations max_replication_slots and max_wal_senders.
Regards,
Vignesh
On Wed, Nov 16, 2022 at 10:24 PM vignesh C <vignesh21@gmail.com> wrote:
> ...
> One suggestion:
> The format of subscribers includes the data type and default values,
> the format of publishers does not include data type and default
> values. We can try to maintain the consistency for both publisher and
> subscriber configurations.
> + <para>
> + <varname>wal_level</varname> must be set to <literal>logical</literal>.
> + </para>
> + <term><varname>max_logical_replication_workers</varname> (<type>integer</type>)
> + <indexterm>
> + <primary><varname>max_logical_replication_workers</varname> configuration parameter</primary>
> + </indexterm>
> + </term>
> + <listitem>
> + <para>
> + Specifies maximum number of logical replication workers. This must be set
> + to at least the number of subscriptions (for apply workers), plus some
> + reserve for the table synchronization workers.
> + </para>
> + <para>
> If we don't want to keep the same format, we could give a link to
> runtime-config-replication where data type and default is defined for
> publisher configurations max_replication_slots and max_wal_senders.
Thanks for your suggestions.
I have included xref links to the original definitions, rather than
defining the same GUC in multiple places.
PSA v3.
------
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachments:
v3-0001-Logical-replication-GUCs-consolidated.patch (application/octet-stream)
From cd48f7160240497a2af3a6d897d5aed21642c064 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Wed, 23 Nov 2022 08:59:44 +1100
Subject: [PATCH v3] Logical replication GUCs - consolidated.
Combines all the logical replication configuration settings on one page
with new Publisher/Subscriber sub-sections.
---
doc/src/sgml/config.sgml | 80 +++------------------
doc/src/sgml/logical-replication.sgml | 132 ++++++++++++++++++++++++++++------
2 files changed, 121 insertions(+), 91 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 24b1624..c61bb33 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4166,6 +4166,11 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
across the cluster without problems if that is required.
</para>
+ <para>
+ For <firstterm>logical replication</firstterm> configuration settings refer
+ to <xref linkend="logical-replication-config"/>.
+ </para>
+
<sect2 id="runtime-config-replication-sender">
<title>Sending Servers</title>
@@ -4238,6 +4243,12 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
not <link linkend="catalog-pg-replication-origin">pg_replication_origin</link>)
will prevent the server from starting.
</para>
+
+ <para>
+ See <xref linkend="logical-replication-config"/> for more details
+ about setting <varname>max_replication_slots</varname> for logical
+ replication.
+ </para>
</listitem>
</varlistentry>
@@ -4926,75 +4937,6 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
</variablelist>
</sect2>
- <sect2 id="runtime-config-replication-subscriber">
- <title>Subscribers</title>
-
- <para>
- These settings control the behavior of a logical replication subscriber.
- Their values on the publisher are irrelevant.
- </para>
-
- <para>
- Note that <varname>wal_receiver_timeout</varname>,
- <varname>wal_receiver_status_interval</varname> and
- <varname>wal_retrieve_retry_interval</varname> configuration parameters
- affect the logical replication workers as well.
- </para>
-
- <variablelist>
-
- <varlistentry id="guc-max-logical-replication-workers" xreflabel="max_logical_replication_workers">
- <term><varname>max_logical_replication_workers</varname> (<type>integer</type>)
- <indexterm>
- <primary><varname>max_logical_replication_workers</varname> configuration parameter</primary>
- </indexterm>
- </term>
- <listitem>
- <para>
- Specifies maximum number of logical replication workers. This includes
- both apply workers and table synchronization workers.
- </para>
- <para>
- Logical replication workers are taken from the pool defined by
- <varname>max_worker_processes</varname>.
- </para>
- <para>
- The default value is 4. This parameter can only be set at server
- start.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry id="guc-max-sync-workers-per-subscription" xreflabel="max_sync_workers_per_subscription">
- <term><varname>max_sync_workers_per_subscription</varname> (<type>integer</type>)
- <indexterm>
- <primary><varname>max_sync_workers_per_subscription</varname> configuration parameter</primary>
- </indexterm>
- </term>
- <listitem>
- <para>
- Maximum number of synchronization workers per subscription. This
- parameter controls the amount of parallelism of the initial data copy
- during the subscription initialization or when new tables are added.
- </para>
- <para>
- Currently, there can be only one synchronization worker per table.
- </para>
- <para>
- The synchronization workers are taken from the pool defined by
- <varname>max_logical_replication_workers</varname>.
- </para>
- <para>
- The default value is 2. This parameter can only be set in the
- <filename>postgresql.conf</filename> file or on the server command
- line.
- </para>
- </listitem>
- </varlistentry>
-
- </variablelist>
- </sect2>
-
</sect1>
<sect1 id="runtime-config-query">
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index f875638..30f0358 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -1765,31 +1765,119 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER
<title>Configuration Settings</title>
<para>
- Logical replication requires several configuration options to be set.
+ Logical replication requires several configuration parameters to be set.
</para>
- <para>
- On the publisher side, <varname>wal_level</varname> must be set to
- <literal>logical</literal>, and <varname>max_replication_slots</varname>
- must be set to at least the number of subscriptions expected to connect,
- plus some reserve for table synchronization. And
- <varname>max_wal_senders</varname> should be set to at least the same as
- <varname>max_replication_slots</varname> plus the number of physical
- replicas that are connected at the same time.
- </para>
+ <sect2 id="logical-replication-config-publisher">
+ <title>Publishers</title>
+
+ <para>
+ <xref linkend="guc-wal-level"/> must be set to <literal>logical</literal>.
+ </para>
+
+ <para>
+ <xref linkend="guc-max-replication-slots"/> must be set to at least the number
+ of subscriptions expected to connect, plus some reserve for table
+ synchronization.
+ </para>
+
+ <para>
+ <xref linkend="guc-max-wal-senders"/> should be set to at least the same as
+ <varname>max_replication_slots</varname>, plus the number of physical
+ replicas that are connected at the same time.
+ </para>
+
+ </sect2>
+
+ <sect2 id="logical-replication-config-subscriber">
+ <title>Subscribers</title>
+
+ <para>
+ <xref linkend="guc-max-replication-slots"/> must be set to at least the number
+ of subscriptions that will be added to the subscriber, plus some reserve for
+ table synchronization.
+ </para>
+
+ <para>
+ The following settings control the behavior of a logical replication subscriber.
+ Their values on the publisher are irrelevant:
+ </para>
+
+ <variablelist>
+
+ <varlistentry id="guc-max-logical-replication-workers" xreflabel="max_logical_replication_workers">
+ <term><varname>max_logical_replication_workers</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>max_logical_replication_workers</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Specifies maximum number of logical replication workers. This must be set
+ to at least the number of subscriptions (for apply workers), plus some
+ reserve for the table synchronization workers.
+ </para>
+ <para>
+ Logical replication workers are taken from the pool defined by
+ <xref linkend="guc-max-worker-processes"/>.
+ </para>
+ <para>
+ The default value is 4. This parameter can only be set at server
+ start.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry id="guc-max-sync-workers-per-subscription" xreflabel="max_sync_workers_per_subscription">
+ <term><varname>max_sync_workers_per_subscription</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>max_sync_workers_per_subscription</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Maximum number of synchronization workers per subscription. This
+ parameter controls the amount of parallelism of the initial data copy
+ during the subscription initialization or when new tables are added.
+ </para>
+ <para>
+ Currently, there can be only one synchronization worker per table.
+ </para>
+ <para>
+ The synchronization workers are taken from the pool defined by
+ <xref linkend="guc-max-logical-replication-workers"/>.
+ </para>
+ <para>
+ The default value is 2. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command
+ line.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+ </sect2>
+
+ <sect2 id="logical-replication-config-notes">
+ <title>Notes</title>
+
+ <para>
+ Logical replication workers are also affected by
+ <xref linkend="guc-wal-receiver-timeout"/>,
+ <xref linkend="guc-wal-receiver-status-interval"/> and
+ <xref linkend="guc-wal-retrieve-retry-interval"/>.
+ </para>
+
+ <para>
+ Configuration parameter <xref linkend="guc-max-worker-processes"/> may need
+ to be adjusted to accommodate for replication workers, at least
+ (<xref linkend="guc-max-logical-replication-workers"/> + <literal>1</literal>).
+ Some extensions and parallel queries also take worker slots from
+ <varname>max_worker_processes</varname>.
+ </para>
+
+ </sect2>
- <para>
- <varname>max_replication_slots</varname> must also be set on the subscriber.
- It should be set to at least the number of subscriptions that will be added
- to the subscriber, plus some reserve for table synchronization.
- <varname>max_logical_replication_workers</varname> must be set to at least
- the number of subscriptions, again plus some reserve for the table
- synchronization. Additionally the <varname>max_worker_processes</varname>
- may need to be adjusted to accommodate for replication workers, at least
- (<varname>max_logical_replication_workers</varname>
- + <literal>1</literal>). Note that some extensions and parallel queries
- also take worker slots from <varname>max_worker_processes</varname>.
- </para>
</sect1>
<sect1 id="logical-replication-quick-setup">
--
1.8.3.1
On Wed, Nov 23, 2022 at 9:16 AM Peter Smith <smithpb2250@gmail.com> wrote:
> On Wed, Nov 16, 2022 at 10:24 PM vignesh C <vignesh21@gmail.com> wrote:
> ...
> > One suggestion:
> > The format of subscribers includes the data type and default values,
> > the format of publishers does not include data type and default
> > values. We can try to maintain the consistency for both publisher and
> > subscriber configurations.
> > + <para>
> > + <varname>wal_level</varname> must be set to <literal>logical</literal>.
> > + </para>
> > + <term><varname>max_logical_replication_workers</varname> (<type>integer</type>)
> > + <indexterm>
> > + <primary><varname>max_logical_replication_workers</varname> configuration parameter</primary>
> > + </indexterm>
> > + </term>
> > + <listitem>
> > + <para>
> > + Specifies maximum number of logical replication workers. This must be set
> > + to at least the number of subscriptions (for apply workers), plus some
> > + reserve for the table synchronization workers.
> > + </para>
> > + <para>
> > If we don't want to keep the same format, we could give a link to
> > runtime-config-replication where data type and default is defined for
> > publisher configurations max_replication_slots and max_wal_senders.
> Thanks for your suggestions.
> I have included xref links to the original definitions, rather than
> defining the same GUC in multiple places.
> PSA v3.
I updated the patch. The content is unchanged from v3 but the links
are modified so now they render with the correct <varname> format for
the GUC names.
PSA v4.
------
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachments:
v4-0001-Logical-replication-GUCs-consolidated.patch (application/octet-stream)
From 2da6ad40e2bd868fdb3f0c328da03ce2c32eef0b Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Thu, 24 Nov 2022 10:35:30 +1100
Subject: [PATCH v4] Logical replication GUCs - consolidated.
Combines all the logical replication configuration settings on one page
with new Publisher/Subscriber sub-sections.
---
doc/src/sgml/config.sgml | 80 +++-----------------
doc/src/sgml/logical-replication.sgml | 135 ++++++++++++++++++++++++++++------
2 files changed, 124 insertions(+), 91 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 24b1624..c61bb33 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4166,6 +4166,11 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
across the cluster without problems if that is required.
</para>
+ <para>
+ For <firstterm>logical replication</firstterm> configuration settings refer
+ to <xref linkend="logical-replication-config"/>.
+ </para>
+
<sect2 id="runtime-config-replication-sender">
<title>Sending Servers</title>
@@ -4238,6 +4243,12 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
not <link linkend="catalog-pg-replication-origin">pg_replication_origin</link>)
will prevent the server from starting.
</para>
+
+ <para>
+ See <xref linkend="logical-replication-config"/> for more details
+ about setting <varname>max_replication_slots</varname> for logical
+ replication.
+ </para>
</listitem>
</varlistentry>
@@ -4926,75 +4937,6 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
</variablelist>
</sect2>
- <sect2 id="runtime-config-replication-subscriber">
- <title>Subscribers</title>
-
- <para>
- These settings control the behavior of a logical replication subscriber.
- Their values on the publisher are irrelevant.
- </para>
-
- <para>
- Note that <varname>wal_receiver_timeout</varname>,
- <varname>wal_receiver_status_interval</varname> and
- <varname>wal_retrieve_retry_interval</varname> configuration parameters
- affect the logical replication workers as well.
- </para>
-
- <variablelist>
-
- <varlistentry id="guc-max-logical-replication-workers" xreflabel="max_logical_replication_workers">
- <term><varname>max_logical_replication_workers</varname> (<type>integer</type>)
- <indexterm>
- <primary><varname>max_logical_replication_workers</varname> configuration parameter</primary>
- </indexterm>
- </term>
- <listitem>
- <para>
- Specifies maximum number of logical replication workers. This includes
- both apply workers and table synchronization workers.
- </para>
- <para>
- Logical replication workers are taken from the pool defined by
- <varname>max_worker_processes</varname>.
- </para>
- <para>
- The default value is 4. This parameter can only be set at server
- start.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry id="guc-max-sync-workers-per-subscription" xreflabel="max_sync_workers_per_subscription">
- <term><varname>max_sync_workers_per_subscription</varname> (<type>integer</type>)
- <indexterm>
- <primary><varname>max_sync_workers_per_subscription</varname> configuration parameter</primary>
- </indexterm>
- </term>
- <listitem>
- <para>
- Maximum number of synchronization workers per subscription. This
- parameter controls the amount of parallelism of the initial data copy
- during the subscription initialization or when new tables are added.
- </para>
- <para>
- Currently, there can be only one synchronization worker per table.
- </para>
- <para>
- The synchronization workers are taken from the pool defined by
- <varname>max_logical_replication_workers</varname>.
- </para>
- <para>
- The default value is 2. This parameter can only be set in the
- <filename>postgresql.conf</filename> file or on the server command
- line.
- </para>
- </listitem>
- </varlistentry>
-
- </variablelist>
- </sect2>
-
</sect1>
<sect1 id="runtime-config-query">
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index f875638..cff9546 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -1765,31 +1765,122 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER
<title>Configuration Settings</title>
<para>
- Logical replication requires several configuration options to be set.
+ Logical replication requires several configuration parameters to be set.
</para>
- <para>
- On the publisher side, <varname>wal_level</varname> must be set to
- <literal>logical</literal>, and <varname>max_replication_slots</varname>
- must be set to at least the number of subscriptions expected to connect,
- plus some reserve for table synchronization. And
- <varname>max_wal_senders</varname> should be set to at least the same as
- <varname>max_replication_slots</varname> plus the number of physical
- replicas that are connected at the same time.
- </para>
+ <sect2 id="logical-replication-config-publisher">
+ <title>Publishers</title>
+
+ <para>
+ <link linkend="guc-wal-level"><varname>wal_level</varname></link> must be
+ set to <literal>logical</literal>.
+ </para>
+
+ <para>
+ <link linkend="guc-max-replication-slots"><varname>max_replication_slots</varname></link>
+ must be set to at least the number of subscriptions expected to connect,
+ plus some reserve for table synchronization.
+ </para>
+
+ <para>
+ <link linkend="guc-max-wal-senders"><varname>max_wal_senders</varname></link>
+ should be set to at least the same as
+ <varname>max_replication_slots</varname>, plus the number of physical
+ replicas that are connected at the same time.
+ </para>
+
+ </sect2>
+
+ <sect2 id="logical-replication-config-subscriber">
+ <title>Subscribers</title>
+
+ <para>
+ <link linkend="guc-max-replication-slots"><varname>max_replication_slots</varname></link>
+ must be set to at least the number of subscriptions that will be added to
+ the subscriber, plus some reserve for table synchronization.
+ </para>
+
+ <para>
+ The following settings control the behavior of a logical replication subscriber.
+ Their values on the publisher are irrelevant:
+ </para>
+
+ <variablelist>
+
+ <varlistentry id="guc-max-logical-replication-workers" xreflabel="max_logical_replication_workers">
+ <term><varname>max_logical_replication_workers</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>max_logical_replication_workers</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Specifies maximum number of logical replication workers. This must be set
+ to at least the number of subscriptions (for apply workers), plus some
+ reserve for the table synchronization workers.
+ </para>
+ <para>
+ Logical replication workers are taken from the pool defined by
+ <link linkend="guc-max-worker-processes"><varname>max_worker_processes</varname></link>.
+ </para>
+ <para>
+ The default value is 4. This parameter can only be set at server
+ start.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry id="guc-max-sync-workers-per-subscription" xreflabel="max_sync_workers_per_subscription">
+ <term><varname>max_sync_workers_per_subscription</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>max_sync_workers_per_subscription</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Maximum number of synchronization workers per subscription. This
+ parameter controls the amount of parallelism of the initial data copy
+ during the subscription initialization or when new tables are added.
+ </para>
+ <para>
+ Currently, there can be only one synchronization worker per table.
+ </para>
+ <para>
+ The synchronization workers are taken from the pool defined by
+ <link linkend="guc-max-logical-replication-workers"><varname>max_logical_replication_workers</varname></link>.
+ </para>
+ <para>
+ The default value is 2. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command
+ line.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+ </sect2>
+
+ <sect2 id="logical-replication-config-notes">
+ <title>Notes</title>
+
+ <para>
+ Logical replication workers are also affected by
+ <link linkend="guc-wal-receiver-timeout"><varname>wal_receiver_timeout</varname></link>,
+ <link linkend="guc-wal-receiver-status-interval"><varname>wal_receiver_status_interval</varname></link> and
+ <link linkend="guc-wal-retrieve-retry-interval"><varname>wal_receiver_retry_interval</varname></link>.
+ </para>
+
+ <para>
+ Configuration parameter
+ <link linkend="guc-max-worker-processes"><varname>max_worker_processes</varname></link>
+ may need to be adjusted to accommodate for replication workers, at least (
+ <link linkend="guc-max-logical-replication-workers"><varname>max_logical_replication_workers</varname></link>
+ + <literal>1</literal>). Some extensions and parallel queries also take
+ worker slots from <varname>max_worker_processes</varname>.
+ </para>
+
+ </sect2>
- <para>
- <varname>max_replication_slots</varname> must also be set on the subscriber.
- It should be set to at least the number of subscriptions that will be added
- to the subscriber, plus some reserve for table synchronization.
- <varname>max_logical_replication_workers</varname> must be set to at least
- the number of subscriptions, again plus some reserve for the table
- synchronization. Additionally the <varname>max_worker_processes</varname>
- may need to be adjusted to accommodate for replication workers, at least
- (<varname>max_logical_replication_workers</varname>
- + <literal>1</literal>). Note that some extensions and parallel queries
- also take worker slots from <varname>max_worker_processes</varname>.
- </para>
</sect1>
<sect1 id="logical-replication-quick-setup">
--
1.8.3.1
Your patch moves the description of the subscriber-related configuration
parameters from config.sgml to logical-replication.sgml. But
config.sgml is supposed to contain *all* configuration parameters. If
we're going to start splitting this up and moving things around then
we'd need a more comprehensive plan than this individual patch. (I'm
not suggesting that we actually do this.)
On Fri, Nov 25, 2022 at 9:23 PM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
Your patch moves the description of the subscriber-related configuration
parameters from config.sgml to logical-replication.sgml. But
config.sgml is supposed to contain *all* configuration parameters. If
we're going to start splitting this up and moving things around then
we'd need a more comprehensive plan than this individual patch. (I'm
not suggesting that we actually do this.)
OK, thanks for the information.
This v5 patch now only adds some previously missing cross-references
and tidies the Chapter 31.10 "Configuration Settings" section.
Meanwhile, the Subscriber GUC descriptions are left on the
config.sgml, where you said they are supposed to be.
------
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachments:
v5-0001-Logical-replication-GUCs-links-and-tidy.patch (application/octet-stream)
From b237546ccfb3240cfc82b258230b114b3cea02ca Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Tue, 29 Nov 2022 13:29:09 +1100
Subject: [PATCH v5] Logical replication GUCs - links and tidy
---
doc/src/sgml/config.sgml | 12 +++++
doc/src/sgml/logical-replication.sgml | 88 ++++++++++++++++++++++++++---------
2 files changed, 78 insertions(+), 22 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 82df89b..48e531f 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4166,6 +4166,11 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
across the cluster without problems if that is required.
</para>
+ <para>
+ For <firstterm>logical replication</firstterm> configuration settings refer
+ also to <xref linkend="logical-replication-config"/>.
+ </para>
+
<sect2 id="runtime-config-replication-sender">
<title>Sending Servers</title>
@@ -4238,6 +4243,12 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
not <link linkend="catalog-pg-replication-origin">pg_replication_origin</link>)
will prevent the server from starting.
</para>
+
+ <para>
+ See <xref linkend="logical-replication-config"/> for more details
+ about setting <varname>max_replication_slots</varname> for logical
+ replication.
+ </para>
</listitem>
</varlistentry>
@@ -4914,6 +4925,7 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
<para>
These settings control the behavior of a logical replication subscriber.
Their values on the publisher are irrelevant.
+ See <xref linkend="logical-replication-config"/> for more details.
</para>
<para>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index f875638..dd51940 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -1765,31 +1765,75 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER
<title>Configuration Settings</title>
<para>
- Logical replication requires several configuration options to be set.
+ Logical replication requires several configuration parameters to be set.
</para>
- <para>
- On the publisher side, <varname>wal_level</varname> must be set to
- <literal>logical</literal>, and <varname>max_replication_slots</varname>
- must be set to at least the number of subscriptions expected to connect,
- plus some reserve for table synchronization. And
- <varname>max_wal_senders</varname> should be set to at least the same as
- <varname>max_replication_slots</varname> plus the number of physical
- replicas that are connected at the same time.
- </para>
+ <sect2 id="logical-replication-config-publisher">
+ <title>Publishers</title>
+
+ <para>
+ <link linkend="guc-wal-level"><varname>wal_level</varname></link> must be
+ set to <literal>logical</literal>.
+ </para>
+
+ <para>
+ <link linkend="guc-max-replication-slots"><varname>max_replication_slots</varname></link>
+ must be set to at least the number of subscriptions expected to connect,
+ plus some reserve for table synchronization.
+ </para>
+
+ <para>
+ <link linkend="guc-max-wal-senders"><varname>max_wal_senders</varname></link>
+ should be set to at least the same as
+ <varname>max_replication_slots</varname>, plus the number of physical
+ replicas that are connected at the same time.
+ </para>
+
+ </sect2>
+
+ <sect2 id="logical-replication-config-subscriber">
+ <title>Subscribers</title>
+
+ <para>
+ <link linkend="guc-max-replication-slots"><varname>max_replication_slots</varname></link>
+ must be set to at least the number of subscriptions that will be added to
+ the subscriber, plus some reserve for table synchronization.
+ </para>
+
+ <para>
+ <link linkend="guc-max-logical-replication-workers"><varname>max_logical_replication_workers</varname></link>
+ must be set to at least the number of subscriptions (for apply workers), plus
+ some reserve for the table synchronization workers.
+ </para>
+
+ <para>
+ <link linkend="guc-max-sync-workers-per-subscription"><varname>max_sync_workers_per_subscription</varname></link>
+ controls the amount of parallelism of the initial data copy during the
+ subscription initialization or when new tables are added.
+ </para>
+ </sect2>
+
+ <sect2 id="logical-replication-config-notes">
+ <title>Notes</title>
+
+ <para>
+ Logical replication workers are also affected by
+ <link linkend="guc-wal-receiver-timeout"><varname>wal_receiver_timeout</varname></link>,
+ <link linkend="guc-wal-receiver-status-interval"><varname>wal_receiver_status_interval</varname></link> and
+ <link linkend="guc-wal-retrieve-retry-interval"><varname>wal_receiver_retry_interval</varname></link>.
+ </para>
+
+ <para>
+ Configuration parameter
+ <link linkend="guc-max-worker-processes"><varname>max_worker_processes</varname></link>
+ may need to be adjusted to accommodate for replication workers, at least (
+ <link linkend="guc-max-logical-replication-workers"><varname>max_logical_replication_workers</varname></link>
+ + <literal>1</literal>). Some extensions and parallel queries also take
+ worker slots from <varname>max_worker_processes</varname>.
+ </para>
+
+ </sect2>
- <para>
- <varname>max_replication_slots</varname> must also be set on the subscriber.
- It should be set to at least the number of subscriptions that will be added
- to the subscriber, plus some reserve for table synchronization.
- <varname>max_logical_replication_workers</varname> must be set to at least
- the number of subscriptions, again plus some reserve for the table
- synchronization. Additionally the <varname>max_worker_processes</varname>
- may need to be adjusted to accommodate for replication workers, at least
- (<varname>max_logical_replication_workers</varname>
- + <literal>1</literal>). Note that some extensions and parallel queries
- also take worker slots from <varname>max_worker_processes</varname>.
- </para>
</sect1>
<sect1 id="logical-replication-quick-setup">
--
1.8.3.1
Hi,
On Mon, Oct 24, 2022 at 12:45 AM Peter Smith <smithpb2250@gmail.com> wrote:
Hi hackers.
There is a docs Logical Replication section "31.10 Configuration
Settings" [1] which describes some logical replication GUCs, and
details on how they interact with each other and how to take that into
account when setting their values.
There is another docs Server Configuration section "20.6 Replication"
[2] which lists the replication-related GUC parameters, and what they
are for.
Currently AFAIK those two pages are unconnected, but I felt it might
be helpful if some of the parameters in the list [2] had xref links to
the additional logical replication configuration information [1]. PSA
a patch to do that.
+1 on the patch. Some feedback on v5 below.
+ <para>
+ For <firstterm>logical replication</firstterm> configuration settings refer
+ also to <xref linkend="logical-replication-config"/>.
+ </para>
+
I feel the top paragraph needs to explain terminology for logical
replication like it does for physical replication in addition to linking to
the logical replication config page. I'm recommending this as we use terms
like subscriber etc. in description of parameters without introducing them
first.
As an example, something like below might work.
These settings control the behavior of the built-in streaming replication
feature (see Section 27.2.5) and logical replication (link).
For physical replication, servers will be either a primary or a standby
server. Primaries can send data, while standbys are always receivers of
replicated data. When cascading replication (see Section 27.2.7) is used,
standby servers can also be senders, as well as receivers. Parameters are
mainly for sending and standby servers, though some parameters have meaning
only on the primary server. Settings may vary across the cluster without
problems if that is required.
For logical replication, servers will either be publishers (also called
senders in the sections below) or subscribers. Publishers are ....,
Subscribers are...
+ <para>
+ See <xref linkend="logical-replication-config"/> for more details
+ about setting <varname>max_replication_slots</varname> for logical
+ replication.
+ </para>
The link doesn't add any new information regarding max_replication_slots
other than "to reserve some for table sync" and has a good amount of
unrelated info. I think it might be useful to just put a line here asking
to reserve some for table sync instead of linking to the entire logical
replication config section.
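For illustration only (this is not part of the patch): the publisher-side
sizing rules from section 31.10 can be written as simple arithmetic. The
function name and its three arguments are hypothetical planning inputs made
up for this sketch; the docs only say "plus some reserve for table
synchronization" and do not give a number for that reserve.

```python
def publisher_minimums(subscriptions, tablesync_reserve, physical_replicas):
    """Return lower bounds for publisher settings, following the doc wording.

    All arguments are hypothetical planning inputs, not PostgreSQL settings.
    """
    for name, value in (
        ("subscriptions", subscriptions),
        ("tablesync_reserve", tablesync_reserve),
        ("physical_replicas", physical_replicas),
    ):
        if value < 0:
            raise ValueError(f"{name} must not be negative")

    # At least the number of subscriptions expected to connect,
    # plus some reserve for table synchronization.
    max_replication_slots = subscriptions + tablesync_reserve

    # At least the same as max_replication_slots, plus the number of
    # physical replicas that are connected at the same time.
    max_wal_senders = max_replication_slots + physical_replicas

    return {
        "wal_level": "logical",
        "max_replication_slots": max_replication_slots,
        "max_wal_senders": max_wal_senders,
    }
```

With 3 expected subscriptions, an assumed reserve of 2, and 1 physical
replica, this gives max_replication_slots of at least 5 and max_wal_senders
of at least 6.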
- Logical replication requires several configuration options to be set.
+ Logical replication requires several configuration parameters to be set.
May not be needed? The docs have references to both options and parameters
but I don't feel strongly about it. Feel free to use what you prefer.
I think we should add an additional line to the intro here saying that
parameters are mostly relevant to only one of the subscriber or publisher.
Maybe a better written version of "While max_replication_slots means
different things on the publisher and subscriber, all other parameters are
relevant only on either the publisher or the subscriber."
+ <sect2 id="logical-replication-config-notes">
+ <title>Notes</title>
I don't think we need this sub-section. If I understand correctly, these
parameters are effective only on the subscriber side. So, any reason to not
include them in that section?
+
+ <para>
+ Logical replication workers are also affected by
+ <link linkend="guc-wal-receiver-timeout"><varname>wal_receiver_timeout</varname></link>,
+ <link linkend="guc-wal-receiver-status-interval"><varname>wal_receiver_status_interval</varname></link> and
+ <link linkend="guc-wal-retrieve-retry-interval"><varname>wal_receiver_retry_interval</varname></link>.
+ </para>
+
I like moving this; it makes more sense here. Should we remove it from
config.sgml? It seems a bit out of place there as we generally talk only
about individual parameters there and this line is general logical
replication subscriber advice which is more suited to
logical-replication.sgml.
+ <para>
+ Configuration parameter
+ <link linkend="guc-max-worker-processes"><varname>max_worker_processes</varname></link>
+ may need to be adjusted to accommodate for replication workers, at least (
+ <link linkend="guc-max-logical-replication-workers"><varname>max_logical_replication_workers</varname></link>
+ + <literal>1</literal>). Some extensions and parallel queries also take
+ worker slots from <varname>max_worker_processes</varname>.
+ </para>
+
+ </sect2>
I think we should move this to the subscriber section as said above. It's
useful to know this and people might skip over the notes.
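For illustration only (not part of the patch): the subscriber-side rules
quoted above can be sketched the same way. The function and argument names
are hypothetical, and "other_workers" is an assumed estimate of the worker
slots taken by extensions and parallel queries, which the docs mention but
do not quantify.

```python
def subscriber_minimums(subscriptions, tablesync_reserve, other_workers=0):
    """Return lower bounds for subscriber settings, following the doc wording.

    All arguments are hypothetical planning inputs, not PostgreSQL settings.
    """
    if min(subscriptions, tablesync_reserve, other_workers) < 0:
        raise ValueError("inputs must not be negative")

    # At least the number of subscriptions that will be added to the
    # subscriber, plus some reserve for table synchronization.
    max_replication_slots = subscriptions + tablesync_reserve

    # At least the number of subscriptions (for apply workers), plus some
    # reserve for the table synchronization workers.
    max_logical_replication_workers = subscriptions + tablesync_reserve

    # At least max_logical_replication_workers + 1; extensions and parallel
    # queries draw from the same pool, so their share is added on top here.
    max_worker_processes = max_logical_replication_workers + 1 + other_workers

    return {
        "max_replication_slots": max_replication_slots,
        "max_logical_replication_workers": max_logical_replication_workers,
        "max_worker_processes": max_worker_processes,
    }
```

With 2 subscriptions, an assumed reserve of 2, and 8 slots set aside for
other background work, max_worker_processes comes out as at least 13.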
~~
Meanwhile, I also suspect that the main blurb top of [1] is not
entirely correct... it says "These settings control the behaviour of
the built-in streaming replication feature", although some of the GUCs
mentioned later in this section are clearly for "logical replication".
Thoughts?
I shared an idea above.
Regards,
Samay
------
[1] 31.10 Configuration Settings -
https://www.postgresql.org/docs/current/logical-replication-config.html
[2] 20.6 Replication -
https://www.postgresql.org/docs/current/runtime-config-replication.html
Kind Regards,
Peter Smith.
Fujitsu Australia
On Tue, Dec 6, 2022 at 5:57 AM samay sharma <smilingsamay@gmail.com> wrote:
Hi,
On Mon, Oct 24, 2022 at 12:45 AM Peter Smith <smithpb2250@gmail.com> wrote:
Hi hackers.
There is a docs Logical Replication section "31.10 Configuration
Settings" [1] which describes some logical replication GUCs, and
details on how they interact with each other and how to take that into
account when setting their values.
There is another docs Server Configuration section "20.6 Replication"
[2] which lists the replication-related GUC parameters, and what they
are for.
Currently AFAIK those two pages are unconnected, but I felt it might
be helpful if some of the parameters in the list [2] had xref links to
the additional logical replication configuration information [1]. PSA
a patch to do that.
+1 on the patch. Some feedback on v5 below.
Thanks for your detailed review comments!
I have changed most things according to your suggestions. Please check patch v6.
+ <para>
+ For <firstterm>logical replication</firstterm> configuration settings refer
+ also to <xref linkend="logical-replication-config"/>.
+ </para>
+
I feel the top paragraph needs to explain terminology for logical
replication like it does for physical replication in addition to linking to
the logical replication config page. I'm recommending this as we use terms
like subscriber etc. in description of parameters without introducing them
first.
As an example, something like below might work.
These settings control the behavior of the built-in streaming replication feature (see Section 27.2.5) and logical replication (link).
For physical replication, servers will be either a primary or a standby server. Primaries can send data, while standbys are always receivers of replicated data. When cascading replication (see Section 27.2.7) is used, standby servers can also be senders, as well as receivers. Parameters are mainly for sending and standby servers, though some parameters have meaning only on the primary server. Settings may vary across the cluster without problems if that is required.
For logical replication, servers will either be publishers (also called senders in the sections below) or subscribers. Publishers are ...., Subscribers are...
OK. I split this blurb into 2 parts – streaming and logical
replication. The streaming replication part is the same as before. The
logical replication part is new.
+ <para>
+ See <xref linkend="logical-replication-config"/> for more details
+ about setting <varname>max_replication_slots</varname> for logical
+ replication.
+ </para>
The link doesn't add any new information regarding max_replication_slots
other than "to reserve some for table sync" and has a good amount of
unrelated info. I think it might be useful to just put a line here asking
to reserve some for table sync instead of linking to the entire logical
replication config section.
OK. I copied the tablesync note back to config.sgml definition of
'max_replication_slots' and removed the link as suggested. Frankly, I
also thought it is a bit strange that the max_replication_slots in the
“Sending Servers” section was describing this parameter for
“Subscribers”. OTOH, I did not want to split the definition in half so
instead, I’ve added another Subscriber <varlistentry> that just refers
back to this place. It looks like an improvement to me.
- Logical replication requires several configuration options to be set.
+ Logical replication requires several configuration parameters to be set.
May not be needed? The docs have references to both options and parameters
but I don't feel strongly about it. Feel free to use what you prefer.
OK. I removed this.
I think we should add an additional line to the intro here saying that
parameters are mostly relevant to only one of the subscriber or publisher.
Maybe a better written version of "While max_replication_slots means
different things on the publisher and subscriber, all other parameters are
relevant only on either the publisher or the subscriber."
OK. Done but with slightly different wording to that.
+ <sect2 id="logical-replication-config-notes">
+ <title>Notes</title>
I don't think we need this sub-section. If I understand correctly, these
parameters are effective only on the subscriber side. So, any reason to not
include them in that section?
OK. I moved these notes into the "Subscribers" section as suggested,
and removed "Notes".
+
+ <para>
+ Logical replication workers are also affected by
+ <link linkend="guc-wal-receiver-timeout"><varname>wal_receiver_timeout</varname></link>,
+ <link linkend="guc-wal-receiver-status-interval"><varname>wal_receiver_status_interval</varname></link> and
+ <link linkend="guc-wal-retrieve-retry-interval"><varname>wal_receiver_retry_interval</varname></link>.
+ </para>
+
I like moving this; it makes more sense here. Should we remove it from
config.sgml? It seems a bit out of place there as we generally talk only
about individual parameters there and this line is general logical
replication subscriber advice which is more suited to
logical-replication.sgml.
OK. I agree, it looked repetitive since the link to the
logical-replication page is nearby this information anyway, so I’ve
removed it from the config.sgml as you suggested.
+ <para>
+ Configuration parameter
+ <link linkend="guc-max-worker-processes"><varname>max_worker_processes</varname></link>
+ may need to be adjusted to accommodate for replication workers, at least (
+ <link linkend="guc-max-logical-replication-workers"><varname>max_logical_replication_workers</varname></link>
+ + <literal>1</literal>). Some extensions and parallel queries also take
+ worker slots from <varname>max_worker_processes</varname>.
+ </para>
+
+ </sect2>
I think we should move this to the subscriber section as said above. It's
useful to know this and people might skip over the notes.
OK. Done.
------
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachments:
v6-0001-Logical-replication-GUCs-links-and-tidy.patch (application/octet-stream)
From f39c5ebb13053a13f56d03a4cd085f4b7df927d8 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Wed, 7 Dec 2022 17:43:41 +1100
Subject: [PATCH v6] Logical replication GUCs - links and tidy
---
doc/src/sgml/config.sgml | 42 +++++++++++++++----
doc/src/sgml/logical-replication.sgml | 78 +++++++++++++++++++++++++----------
2 files changed, 91 insertions(+), 29 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index ff6fcd9..30b4e28 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4156,7 +4156,13 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
<para>
These settings control the behavior of the built-in
<firstterm>streaming replication</firstterm> feature (see
- <xref linkend="streaming-replication"/>). Servers will be either a
+ <xref linkend="streaming-replication"/>), and the built-in
+ <firstterm>logical replication</firstterm> feature (see
+ <xref linkend="logical-replication"/>).
+ </para>
+
+ <para>
+ For <emphasis>streaming replication</emphasis>, servers will be either a
primary or a standby server. Primaries can send data, while standbys
are always receivers of replicated data. When cascading replication
(see <xref linkend="cascading-replication"/>) is used, standby servers
@@ -4166,6 +4172,20 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
across the cluster without problems if that is required.
</para>
+ <para>
+ For <emphasis>logical replication</emphasis>, <firstterm>publishers</firstterm>
+ (servers that do <link linkend="sql-createpublication"><command>CREATE PUBLICATION</command></link>)
+ replicate data to <firstterm>subscribers</firstterm>
+ (servers that do <link linkend="sql-createsubscription"><command>CREATE SUBSCRIPTION</command></link>).
+ Servers can also be publishers and subscribers at the same time. Note,
+ the following sections refers to publishers as "senders". The parameter
+ <literal>max_replication_slots</literal> has a different meaning for the
+ publisher and subscriber, but all other parameters are relevant only to
+ one side of the replication. For more details about logical replication
+ configuration settings refer to
+ <xref linkend="logical-replication-config"/>.
+ </para>
+
<sect2 id="runtime-config-replication-sender">
<title>Sending Servers</title>
@@ -4237,6 +4257,9 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
<link linkend="view-pg-replication-origin-status">pg_replication_origin_status</link>,
not <link linkend="catalog-pg-replication-origin">pg_replication_origin</link>)
will prevent the server from starting.
+ <literal>max_replication_slots</literal> must be set to at least the
+ number of subscriptions that will be added to the subscriber, plus some
+ reserve for table synchronization.
</para>
</listitem>
</varlistentry>
@@ -4914,17 +4937,20 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
<para>
These settings control the behavior of a logical replication subscriber.
Their values on the publisher are irrelevant.
- </para>
-
- <para>
- Note that <varname>wal_receiver_timeout</varname>,
- <varname>wal_receiver_status_interval</varname> and
- <varname>wal_retrieve_retry_interval</varname> configuration parameters
- affect the logical replication workers as well.
+ See <xref linkend="logical-replication-config"/> for more details.
</para>
<variablelist>
+ <varlistentry>
+ <term><varname>max_replication_slots</varname> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ See <xref linkend="guc-max-replication-slots"/>.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-max-logical-replication-workers" xreflabel="max_logical_replication_workers">
<term><varname>max_logical_replication_workers</varname> (<type>integer</type>)
<indexterm>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index f875638..7ea6560 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -1768,28 +1768,64 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER
Logical replication requires several configuration options to be set.
</para>
- <para>
- On the publisher side, <varname>wal_level</varname> must be set to
- <literal>logical</literal>, and <varname>max_replication_slots</varname>
- must be set to at least the number of subscriptions expected to connect,
- plus some reserve for table synchronization. And
- <varname>max_wal_senders</varname> should be set to at least the same as
- <varname>max_replication_slots</varname> plus the number of physical
- replicas that are connected at the same time.
- </para>
+ <sect2 id="logical-replication-config-publisher">
+ <title>Publishers</title>
+
+ <para>
+ <link linkend="guc-wal-level"><varname>wal_level</varname></link> must be
+ set to <literal>logical</literal>.
+ </para>
+
+ <para>
+ <link linkend="guc-max-replication-slots"><varname>max_replication_slots</varname></link>
+ must be set to at least the number of subscriptions expected to connect,
+ plus some reserve for table synchronization.
+ </para>
+
+ <para>
+ <link linkend="guc-max-wal-senders"><varname>max_wal_senders</varname></link>
+ should be set to at least the same as
+ <varname>max_replication_slots</varname>, plus the number of physical
+ replicas that are connected at the same time.
+ </para>
+
+ </sect2>
+
+ <sect2 id="logical-replication-config-subscriber">
+ <title>Subscribers</title>
+
+ <para>
+ <link linkend="guc-max-replication-slots"><varname>max_replication_slots</varname></link>
+ must be set to at least the number of subscriptions that will be added to
+ the subscriber, plus some reserve for table synchronization.
+ </para>
+
+ <para>
+ <link linkend="guc-max-logical-replication-workers"><varname>max_logical_replication_workers</varname></link>
+ must be set to at least the number of subscriptions (for apply workers), plus
+ some reserve for the table synchronization workers. Configuration parameter
+ <link linkend="guc-max-worker-processes"><varname>max_worker_processes</varname></link>
+ may need to be adjusted to accommodate for replication workers, at least (
+ <link linkend="guc-max-logical-replication-workers"><varname>max_logical_replication_workers</varname></link>
+ + <literal>1</literal>). Note, some extensions and parallel queries also
+ take worker slots from <varname>max_worker_processes</varname>.
+ </para>
+
+ <para>
+ <link linkend="guc-max-sync-workers-per-subscription"><varname>max_sync_workers_per_subscription</varname></link>
+ controls the amount of parallelism of the initial data copy during the
+ subscription initialization or when new tables are added.
+ </para>
+
+ <para>
+ Logical replication workers are also affected by
+ <link linkend="guc-wal-receiver-timeout"><varname>wal_receiver_timeout</varname></link>,
+ <link linkend="guc-wal-receiver-status-interval"><varname>wal_receiver_status_interval</varname></link> and
+ <link linkend="guc-wal-retrieve-retry-interval"><varname>wal_retrieve_retry_interval</varname></link>.
+ </para>
+
+ </sect2>
- <para>
- <varname>max_replication_slots</varname> must also be set on the subscriber.
- It should be set to at least the number of subscriptions that will be added
- to the subscriber, plus some reserve for table synchronization.
- <varname>max_logical_replication_workers</varname> must be set to at least
- the number of subscriptions, again plus some reserve for the table
- synchronization. Additionally the <varname>max_worker_processes</varname>
- may need to be adjusted to accommodate for replication workers, at least
- (<varname>max_logical_replication_workers</varname>
- + <literal>1</literal>). Note that some extensions and parallel queries
- also take worker slots from <varname>max_worker_processes</varname>.
- </para>
</sect1>
<sect1 id="logical-replication-quick-setup">
--
1.8.3.1
Hi,
On Tue, Dec 6, 2022 at 11:12 PM Peter Smith <smithpb2250@gmail.com> wrote:
On Tue, Dec 6, 2022 at 5:57 AM samay sharma <smilingsamay@gmail.com> wrote:
Hi,
On Mon, Oct 24, 2022 at 12:45 AM Peter Smith <smithpb2250@gmail.com> wrote:
Hi hackers.
There is a docs Logical Replication section "31.10 Configuration
Settings" [1] which describes some logical replication GUCs, and
details on how they interact with each other and how to take that into
account when setting their values.
There is another docs Server Configuration section "20.6 Replication"
[2] which lists the replication-related GUC parameters, and what they
are for.
Currently AFAIK those two pages are unconnected, but I felt it might
be helpful if some of the parameters in the list [2] had xref links to
the additional logical replication configuration information [1]. PSA
a patch to do that.
+1 on the patch. Some feedback on v5 below.
Thanks for your detailed review comments!
I have changed most things according to your suggestions. Please check
patch v6.
Thanks for the changes. See a few points of feedback below.
+ <para>
+ For <emphasis>logical replication</emphasis>, <firstterm>publishers</firstterm>
+ (servers that do <link linkend="sql-createpublication"><command>CREATE PUBLICATION</command></link>)
+ replicate data to <firstterm>subscribers</firstterm>
+ (servers that do <link linkend="sql-createsubscription"><command>CREATE SUBSCRIPTION</command></link>).
+ Servers can also be publishers and subscribers at the same time. Note,
+ the following sections refers to publishers as "senders". The parameter
+ <literal>max_replication_slots</literal> has a different meaning for the
+ publisher and subscriber, but all other parameters are relevant only to
+ one side of the replication. For more details about logical replication
+ configuration settings refer to
+ <xref linkend="logical-replication-config"/>.
+ </para>
The second last line seems a bit odd here. In my last round of feedback, I
had meant to add the line "The parameter .... " onwards to the top of
logical-replication-config.sgml.
What if we made the top of logical-replication-config.sgml like below?
Logical replication requires several configuration options to be set. Most
configuration options are relevant only on one side of the replication
(i.e. publisher or subscriber). However, max_replication_slots is
applicable on both sides but has different meanings on each side.
+ <para>
+ For <firstterm>logical replication</firstterm> configuration settings refer
+ also to <xref linkend="logical-replication-config"/>.
+ </para>
+
I feel the top paragraph needs to explain terminology for logical
replication like it does for physical replication in addition to linking to
the logical replication config page. I'm recommending this as we use terms
like subscriber etc. in description of parameters without introducing them
first.
As an example, something like below might work.
These settings control the behavior of the built-in streaming
replication feature (see Section 27.2.5) and logical replication (link).
For physical replication, servers will be either a primary or a standby
server. Primaries can send data, while standbys are always receivers of
replicated data. When cascading replication (see Section 27.2.7) is used,
standby servers can also be senders, as well as receivers. Parameters are
mainly for sending and standby servers, though some parameters have meaning
only on the primary server. Settings may vary across the cluster without
problems if that is required.
For logical replication, servers will either be publishers (also called
senders in the sections below) or subscribers. Publishers are ....,
Subscribers are...
OK. I split this blurb into 2 parts – streaming and logical
replication. The streaming replication part is the same as before. The
logical replication part is new.
+ <para>
+ See <xref linkend="logical-replication-config"/> for more details
+ about setting <varname>max_replication_slots</varname> for logical
+ replication.
+ </para>
The link doesn't add any new information regarding max_replication_slots
other than "to reserve some for table sync" and has a good amount of
unrelated info. I think it might be useful to just put a line here asking
to reserve some for table sync instead of linking to the entire logical
replication config section.
OK. I copied the tablesync note back to config.sgml definition of
'max_replication_slots' and removed the link as suggested. Frankly, I
also thought it is a bit strange that the max_replication_slots in the
“Sending Servers” section was describing this parameter for
“Subscribers”. OTOH, I did not want to split the definition in half so
instead, I’ve added another Subscriber <varlistentry> that just refers
back to this place. It looks like an improvement to me.
Hmm, I agree this is a tricky scenario. However, to me, it seems odd to
mention the parameter twice as this chapter of the docs just lists each
parameter and describes them. So, I'd probably remove the reference to it
in the subscriber section. We should describe its usage in different
places in the logical replication part of the docs (as we do).
- Logical replication requires several configuration options to be set.
+ Logical replication requires several configuration parameters to be set.
May not be needed? The docs have references to both options and
parameters but I don't feel strongly about it. Feel free to use what you
prefer.
OK. I removed this.
I think we should add an additional line to the intro here saying that
parameters are mostly relevant only to one of the subscriber or publisher.
Maybe a better written version of "While max_replication_slots means
different things on the publisher and subscriber, all other parameters are
relevant only on either the publisher or the subscriber."
OK. Done but with slightly different wording to that.
+ <sect2 id="logical-replication-config-notes">
+ <title>Notes</title>
I don't think we need this sub-section. If I understand correctly, these
parameters are effective only on the subscriber side. So, any reason to not
include them in that section?
OK. I moved these notes into the "Subscribers" section as suggested,
and removed "Notes".
+
+ <para>
+ Logical replication workers are also affected by
+ <link linkend="guc-wal-receiver-timeout"><varname>wal_receiver_timeout</varname></link>,
+ <link linkend="guc-wal-receiver-status-interval"><varname>wal_receiver_status_interval</varname></link> and
+ <link linkend="guc-wal-retrieve-retry-interval"><varname>wal_retrieve_retry_interval</varname></link>.
+ </para>
+
I like moving this; it makes more sense here. Should we remove it from
config.sgml? It seems a bit out of place there as we generally talk only
about individual parameters there and this line is general logical
replication subscriber advice which is more suited to
logical-replication.sgml
OK. I agree, it looked repetitive since the link to the
logical-replication page is nearby this information anyway, so I’ve
removed it from the config.sgml as you suggested.
+ <para>
+ Configuration parameter
+ <link linkend="guc-max-worker-processes"><varname>max_worker_processes</varname></link>
+ may need to be adjusted to accommodate for replication workers, at least (
+ <link linkend="guc-max-logical-replication-workers"><varname>max_logical_replication_workers</varname></link>
+ + <literal>1</literal>). Some extensions and parallel queries also take
+ worker slots from <varname>max_worker_processes</varname>.
+ </para>
+
+ </sect2>
I think we should move this to the subscriber section as said above.
It's useful to know this and people might skip over the notes.
OK. Done.
+ <para>
+ <link linkend="guc-max-logical-replication-workers"><varname>max_logical_replication_workers</varname></link>
+ must be set to at least the number of subscriptions (for apply workers), plus
+ some reserve for the table synchronization workers. Configuration parameter
+ <link linkend="guc-max-worker-processes"><varname>max_worker_processes</varname></link>
+ may need to be adjusted to accommodate for replication workers, at least (
+ <link linkend="guc-max-logical-replication-workers"><varname>max_logical_replication_workers</varname></link>
+ + <literal>1</literal>). Note, some extensions and parallel queries also
+ take worker slots from <varname>max_worker_processes</varname>.
+ </para>
Maybe do max_worker_processes in a new line like the rest.
Regards,
Samay
Microsoft
------
Kind Regards,
Peter Smith.
Fujitsu Australia
On Thu, Dec 8, 2022 at 10:49 AM samay sharma <smilingsamay@gmail.com> wrote:
...
Thanks for the changes. See a few points of feedback below.
Patch v7 addresses this feedback. PSA.
+ <para>
+ For <emphasis>logical replication</emphasis>, <firstterm>publishers</firstterm>
+ (servers that do <link linkend="sql-createpublication"><command>CREATE PUBLICATION</command></link>)
+ replicate data to <firstterm>subscribers</firstterm>
+ (servers that do <link linkend="sql-createsubscription"><command>CREATE SUBSCRIPTION</command></link>).
+ Servers can also be publishers and subscribers at the same time. Note,
+ the following sections refers to publishers as "senders". The parameter
+ <literal>max_replication_slots</literal> has a different meaning for the
+ publisher and subscriber, but all other parameters are relevant only to
+ one side of the replication. For more details about logical replication
+ configuration settings refer to
+ <xref linkend="logical-replication-config"/>.
+ </para>
The second last line seems a bit odd here. In my last round of feedback, I had meant to add the line "The parameter .... " onwards to the top of logical-replication-config.sgml.
What if we made the top of logical-replication-config.sgml like below?
Logical replication requires several configuration options to be set. Most configuration options are relevant only on one side of the replication (i.e. publisher or subscriber). However, max_replication_slots is applicable on both sides but has different meanings on each side.
OK. Moving this note is not quite following the same pattern as the
"streaming replication" intro blurb, but anyway it looks fine when
moved, so I've done as suggested.
OK. I copied the tablesync note back to config.sgml definition of
'max_replication_slots' and removed the link as suggested. Frankly, I
also thought it is a bit strange that the max_replication_slots in the
“Sending Servers” section was describing this parameter for
“Subscribers”. OTOH, I did not want to split the definition in half so
instead, I’ve added another Subscriber <varlistentry> that just refers
back to this place. It looks like an improvement to me.
Hmm, I agree this is a tricky scenario. However, to me, it seems odd to mention the parameter twice as this chapter of the docs just lists each parameter and describes them. So, I'd probably remove the reference to it in the subscriber section. We should describe its usage in different places in the logical replication part of the docs (as we do).
The 'max_replication_slots' is problematic because it is almost like
having 2 different GUCs that happen to have the same name. So I
preferred it also gets a mention in the “Subscriber” section to make
it obvious that it wears 2 hats, but IIUC you prefer that 2nd mention
is not present because typically each GUC should appear once only in
this chapter. TBH, I think both ways could be successfully argued for
or against -- so I’m just going to leave this as-is for now and let
the committer decide.
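As a rough illustration of the "2 hats" point (not part of the patch), the
sizing rules stated in the proposed doc text can be sketched in Python. The
function names, class names and the example numbers are hypothetical, and the
results are only lower bounds: as the doc text notes, extensions and parallel
queries also take worker slots from max_worker_processes, which this sketch
does not account for.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PublisherMinimums:
    wal_level: str
    max_replication_slots: int
    max_wal_senders: int


@dataclass(frozen=True)
class SubscriberMinimums:
    max_replication_slots: int
    max_logical_replication_workers: int
    max_worker_processes: int


def publisher_minimums(expected_subscriptions, tablesync_reserve, physical_replicas):
    """Lower bounds on the publisher, following the proposed doc wording."""
    # Slots: subscriptions expected to connect, plus a reserve for table sync.
    slots = expected_subscriptions + tablesync_reserve
    # WAL senders: at least max_replication_slots, plus physical replicas
    # that are connected at the same time.
    return PublisherMinimums("logical", slots, slots + physical_replicas)


def subscriber_minimums(subscriptions, tablesync_reserve):
    """Lower bounds on the subscriber, following the proposed doc wording."""
    # Same parameter name as on the publisher, but here it is sized from the
    # subscriptions added to this server (plus a table sync reserve).
    slots = subscriptions + tablesync_reserve
    # One apply worker per subscription, plus table synchronization workers.
    workers = subscriptions + tablesync_reserve
    # max_worker_processes must cover at least the replication workers + 1.
    return SubscriberMinimums(slots, workers, workers + 1)


if __name__ == "__main__":
    print(publisher_minimums(expected_subscriptions=3, tablesync_reserve=2, physical_replicas=1))
    print(subscriber_minimums(subscriptions=3, tablesync_reserve=2))
```

The two functions are kept separate on purpose: the same parameter name feeds
a different calculation on each side, which is the reason for mentioning it
in both places.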
+ <para>
+ <link linkend="guc-max-logical-replication-workers"><varname>max_logical_replication_workers</varname></link>
+ must be set to at least the number of subscriptions (for apply workers), plus
+ some reserve for the table synchronization workers. Configuration parameter
+ <link linkend="guc-max-worker-processes"><varname>max_worker_processes</varname></link>
+ may need to be adjusted to accommodate for replication workers, at least (
+ <link linkend="guc-max-logical-replication-workers"><varname>max_logical_replication_workers</varname></link>
+ + <literal>1</literal>). Note, some extensions and parallel queries also
+ take worker slots from <varname>max_worker_processes</varname>.
+ </para>
Maybe do max_worker_processes in a new line like the rest.
OK. Done as suggested.
------
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachments:
v7-0001-Logical-replication-GUCs-links-and-tidy.patch (application/octet-stream)
From 75577a3363bcd0c97eac8b82d5b0233ab363dd18 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Thu, 8 Dec 2022 15:55:05 +1100
Subject: [PATCH v7] Logical replication GUCs - links and tidy
---
doc/src/sgml/config.sgml | 39 ++++++++++++----
doc/src/sgml/logical-replication.sgml | 86 ++++++++++++++++++++++++++---------
2 files changed, 95 insertions(+), 30 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index ff6fcd9..46b18cf 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4156,7 +4156,13 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
<para>
These settings control the behavior of the built-in
<firstterm>streaming replication</firstterm> feature (see
- <xref linkend="streaming-replication"/>). Servers will be either a
+ <xref linkend="streaming-replication"/>), and the built-in
+ <firstterm>logical replication</firstterm> feature (see
+ <xref linkend="logical-replication"/>).
+ </para>
+
+ <para>
+ For <emphasis>streaming replication</emphasis>, servers will be either a
primary or a standby server. Primaries can send data, while standbys
are always receivers of replicated data. When cascading replication
(see <xref linkend="cascading-replication"/>) is used, standby servers
@@ -4166,6 +4172,17 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
across the cluster without problems if that is required.
</para>
+ <para>
+ For <emphasis>logical replication</emphasis>, <firstterm>publishers</firstterm>
+ (servers that do <link linkend="sql-createpublication"><command>CREATE PUBLICATION</command></link>)
+ replicate data to <firstterm>subscribers</firstterm>
+ (servers that do <link linkend="sql-createsubscription"><command>CREATE SUBSCRIPTION</command></link>).
+ Servers can also be publishers and subscribers at the same time. Note,
+ the following sections refer to publishers as "senders". For more details
+ about logical replication configuration settings refer to
+ <xref linkend="logical-replication-config"/>.
+ </para>
+
<sect2 id="runtime-config-replication-sender">
<title>Sending Servers</title>
@@ -4237,6 +4254,9 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
<link linkend="view-pg-replication-origin-status">pg_replication_origin_status</link>,
not <link linkend="catalog-pg-replication-origin">pg_replication_origin</link>)
will prevent the server from starting.
+ <literal>max_replication_slots</literal> must be set to at least the
+ number of subscriptions that will be added to the subscriber, plus some
+ reserve for table synchronization.
</para>
</listitem>
</varlistentry>
@@ -4914,17 +4934,20 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
<para>
These settings control the behavior of a logical replication subscriber.
Their values on the publisher are irrelevant.
- </para>
-
- <para>
- Note that <varname>wal_receiver_timeout</varname>,
- <varname>wal_receiver_status_interval</varname> and
- <varname>wal_retrieve_retry_interval</varname> configuration parameters
- affect the logical replication workers as well.
+ See <xref linkend="logical-replication-config"/> for more details.
</para>
<variablelist>
+ <varlistentry>
+ <term><varname>max_replication_slots</varname> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ See <xref linkend="guc-max-replication-slots"/>.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-max-logical-replication-workers" xreflabel="max_logical_replication_workers">
<term><varname>max_logical_replication_workers</varname> (<type>integer</type>)
<indexterm>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index f875638..1f22616 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -1765,31 +1765,73 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER
<title>Configuration Settings</title>
<para>
- Logical replication requires several configuration options to be set.
+ Logical replication requires several configuration options to be set. Most
+ options are relevant only on one side of the replication. However,
+ <varname>max_replication_slots</varname> is used on both the publisher and
+ the subscriber, but it has a different meaning for each.
</para>
- <para>
- On the publisher side, <varname>wal_level</varname> must be set to
- <literal>logical</literal>, and <varname>max_replication_slots</varname>
- must be set to at least the number of subscriptions expected to connect,
- plus some reserve for table synchronization. And
- <varname>max_wal_senders</varname> should be set to at least the same as
- <varname>max_replication_slots</varname> plus the number of physical
- replicas that are connected at the same time.
- </para>
+ <sect2 id="logical-replication-config-publisher">
+ <title>Publishers</title>
+
+ <para>
+ <link linkend="guc-wal-level"><varname>wal_level</varname></link> must be
+ set to <literal>logical</literal>.
+ </para>
+
+ <para>
+ <link linkend="guc-max-replication-slots"><varname>max_replication_slots</varname></link>
+ must be set to at least the number of subscriptions expected to connect,
+ plus some reserve for table synchronization.
+ </para>
+
+ <para>
+ <link linkend="guc-max-wal-senders"><varname>max_wal_senders</varname></link>
+ should be set to at least the same as
+ <varname>max_replication_slots</varname>, plus the number of physical
+ replicas that are connected at the same time.
+ </para>
+
+ </sect2>
+
+ <sect2 id="logical-replication-config-subscriber">
+ <title>Subscribers</title>
+
+ <para>
+ <link linkend="guc-max-replication-slots"><varname>max_replication_slots</varname></link>
+ must be set to at least the number of subscriptions that will be added to
+ the subscriber, plus some reserve for table synchronization.
+ </para>
+
+ <para>
+ <link linkend="guc-max-logical-replication-workers"><varname>max_logical_replication_workers</varname></link>
+ must be set to at least the number of subscriptions (for apply workers), plus
+ some reserve for the table synchronization workers.
+ </para>
+
+ <para>
+ <link linkend="guc-max-worker-processes"><varname>max_worker_processes</varname></link>
+ may need to be adjusted to accommodate for replication workers, at least
+ (<link linkend="guc-max-logical-replication-workers"><varname>max_logical_replication_workers</varname></link>
+ + <literal>1</literal>). Note, some extensions and parallel queries also
+ take worker slots from <varname>max_worker_processes</varname>.
+ </para>
+
+ <para>
+ <link linkend="guc-max-sync-workers-per-subscription"><varname>max_sync_workers_per_subscription</varname></link>
+ controls the amount of parallelism of the initial data copy during the
+ subscription initialization or when new tables are added.
+ </para>
+
+ <para>
+ Logical replication workers are also affected by
+ <link linkend="guc-wal-receiver-timeout"><varname>wal_receiver_timeout</varname></link>,
+ <link linkend="guc-wal-receiver-status-interval"><varname>wal_receiver_status_interval</varname></link> and
+ <link linkend="guc-wal-retrieve-retry-interval"><varname>wal_retrieve_retry_interval</varname></link>.
+ </para>
+
+ </sect2>
- <para>
- <varname>max_replication_slots</varname> must also be set on the subscriber.
- It should be set to at least the number of subscriptions that will be added
- to the subscriber, plus some reserve for table synchronization.
- <varname>max_logical_replication_workers</varname> must be set to at least
- the number of subscriptions, again plus some reserve for the table
- synchronization. Additionally the <varname>max_worker_processes</varname>
- may need to be adjusted to accommodate for replication workers, at least
- (<varname>max_logical_replication_workers</varname>
- + <literal>1</literal>). Note that some extensions and parallel queries
- also take worker slots from <varname>max_worker_processes</varname>.
- </para>
</sect1>
<sect1 id="logical-replication-quick-setup">
--
1.8.3.1
Hi,
On Wed, Dec 7, 2022 at 9:20 PM Peter Smith <smithpb2250@gmail.com> wrote:
On Thu, Dec 8, 2022 at 10:49 AM samay sharma <smilingsamay@gmail.com> wrote:
...
Thanks for the changes. See a few points of feedback below.
Patch v7 addresses this feedback. PSA.
+ <para>
+ For <emphasis>logical replication</emphasis>, <firstterm>publishers</firstterm>
+ (servers that do <link linkend="sql-createpublication"><command>CREATE PUBLICATION</command></link>)
+ replicate data to <firstterm>subscribers</firstterm>
+ (servers that do <link linkend="sql-createsubscription"><command>CREATE SUBSCRIPTION</command></link>).
+ Servers can also be publishers and subscribers at the same time. Note,
+ the following sections refers to publishers as "senders". The parameter
+ <literal>max_replication_slots</literal> has a different meaning for the
+ publisher and subscriber, but all other parameters are relevant only to
+ one side of the replication. For more details about logical replication
+ configuration settings refer to
+ <xref linkend="logical-replication-config"/>.
+ </para>
The second last line seems a bit odd here. In my last round of feedback,
I had meant to add the line "The parameter .... " onwards to the top of
logical-replication-config.sgml.
What if we made the top of logical-replication-config.sgml like below?
Logical replication requires several configuration options to be set.
Most configuration options are relevant only on one side of the replication
(i.e. publisher or subscriber). However, max_replication_slots is
applicable on both sides but has different meanings on each side.
OK. Moving this note is not quite following the same pattern as the
"streaming replication" intro blurb, but anyway it looks fine when
moved, so I've done as suggested.
OK. I copied the tablesync note back to config.sgml definition of
'max_replication_slots' and removed the link as suggested. Frankly, I
also thought it is a bit strange that the max_replication_slots in the
“Sending Servers” section was describing this parameter for
“Subscribers”. OTOH, I did not want to split the definition in half so
instead, I’ve added another Subscriber <varlistentry> that just refers
back to this place. It looks like an improvement to me.
Hmm, I agree this is a tricky scenario. However, to me, it seems odd to
mention the parameter twice as this chapter of the docs just lists each
parameter and describes them. So, I'd probably remove the reference to it
in the subscriber section. We should describe its usage in different
places in the logical replication part of the docs (as we do).
The 'max_replication_slots' is problematic because it is almost like
having 2 different GUCs that happen to have the same name. So I
preferred it also gets a mention in the “Subscriber” section to make
it obvious that it wears 2 hats, but IIUC you prefer that 2nd mention
is not present because typically each GUC should appear once only in
this chapter. TBH, I think both ways could be successfully argued for
or against -- so I’m just going to leave this as-is for now and let
the committer decide.
Sounds fair.
I don't have any other feedback. This looks good to me.
Also, I don't see this patch in the 2023/01 commitfest. Might be worth
moving to that one.
Regards,
Samay
Microsoft
+ <para>
+ <link linkend="guc-max-logical-replication-workers"><varname>max_logical_replication_workers</varname></link>
+ must be set to at least the number of subscriptions (for apply workers), plus
+ some reserve for the table synchronization workers. Configuration parameter
+ <link linkend="guc-max-worker-processes"><varname>max_worker_processes</varname></link>
+ may need to be adjusted to accommodate for replication workers, at least (
+ <link linkend="guc-max-logical-replication-workers"><varname>max_logical_replication_workers</varname></link>
+ + <literal>1</literal>). Note, some extensions and parallel queries also
+ take worker slots from <varname>max_worker_processes</varname>.
+ </para>
Maybe do max_worker_processes in a new line like the rest.
OK. Done as suggested.
------
Kind Regards,
Peter Smith.
Fujitsu Australia
On Sat, Dec 10, 2022 at 5:10 AM samay sharma <smilingsamay@gmail.com> wrote:
I don't have any other feedback. This looks good to me.
Also, I don't see this patch in the 2023/01 commitfest. Might be worth moving to that one.
Hmm, it was already recorded in the 2022-11 commitfest [1], so I
assumed it would just carry forward to the next one.
Anyway, I've added it again to 2023-01 commitfest [2]. Thanks for telling me.
------
[1]: 2022-11 CF - https://commitfest.postgresql.org/40/3959/
[2]: 2023-01 CF - https://commitfest.postgresql.org/41/4061/
Kind Regards,
Peter Smith.
Fujitsu Australia.
Peter Smith <smithpb2250@gmail.com> writes:
On Sat, Dec 10, 2022 at 5:10 AM samay sharma <smilingsamay@gmail.com> wrote:
Also, I don't see this patch in the 2023/01 commitfest. Might be worth moving to that one.
Hmm, it was already recorded in the 2022-11 commitfest [1], so I
assumed it would just carry forward to the next one.
Ian is still working on closing out the November 'fest :-(.
I suspect that in a day or so that one will get moved, and
you will have duplicate entries in the January 'fest.
regards, tom lane
On 2022-Dec-07, samay sharma wrote:
On Tue, Dec 6, 2022 at 11:12 PM Peter Smith <smithpb2250@gmail.com> wrote:
OK. I copied the tablesync note back to config.sgml definition of
'max_replication_slots' and removed the link as suggested. Frankly, I
also thought it is a bit strange that the max_replication_slots in the
“Sending Servers” section was describing this parameter for
“Subscribers”. OTOH, I did not want to split the definition in half so
instead, I’ve added another Subscriber <varlistentry> that just refers
back to this place. It looks like an improvement to me.
Hmm, I agree this is a tricky scenario. However, to me, it seems odd to
mention the parameter twice as this chapter of the docs just lists each
parameter and describes them. So, I'd probably remove the reference to it
in the subscriber section. We should describe its usage in different
places in the logical replication part of the docs (as we do).
I agree this is tricky. However, because they essentially have
completely different behaviors on each side, and because we're
documenting each side separately, to me it makes more sense to document
each behavior separately, so I've split it. I also added mention at
each side that the other one exists. My rationale is that a user is
likely going to search for stuff to set on one side first, then for
stuff to set on the other side. So doing it this way maximizes
helpfulness (or so I hope anyway). I also added a separate index entry.
--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
"I love the Postgres community. It's all about doing things _properly_. :-)"
(David Garamond)
On 2022-Dec-11, Tom Lane wrote:
Ian is still working on closing out the November 'fest :-(.
I suspect that in a day or so that one will get moved, and
you will have duplicate entries in the January 'fest.
I've marked both as committed.
--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
"Here's a general engineering tip: if the non-fun part is too complex for you
to figure out, that might indicate the fun part is too ambitious." (John Naylor)
/messages/by-id/CAFBsxsG4OWHBbSDM=sSeXrQGOtkPiOEOuME4yD7Ce41NtaAD9g@mail.gmail.com
On Tue, Dec 13, 2022 at 6:25 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
On 2022-Dec-07, samay sharma wrote:
On Tue, Dec 6, 2022 at 11:12 PM Peter Smith <smithpb2250@gmail.com> wrote:
OK. I copied the tablesync note back to config.sgml definition of
'max_replication_slots' and removed the link as suggested. Frankly, I
also thought it is a bit strange that the max_replication_slots in the
“Sending Servers” section was describing this parameter for
“Subscribers”. OTOH, I did not want to split the definition in half so
instead, I’ve added another Subscriber <varlistentry> that just refers
back to this place. It looks like an improvement to me.
Hmm, I agree this is a tricky scenario. However, to me, it seems odd to
mention the parameter twice as this chapter of the docs just lists each
parameter and describes them. So, I'd probably remove the reference to it
in the subscriber section. We should describe its usage in different
places in the logical replication part of the docs (as we do).
I agree this is tricky. However, because they essentially have
completely different behaviors on each side, and because we're
documenting each side separately, to me it makes more sense to document
each behavior separately, so I've split it. I also added mention at
each side that the other one exists. My rationale is that a user is
likely going to search for stuff to set on one side first, then for
stuff to set on the other side. So doing it this way maximizes
helpfulness (or so I hope anyway). I also added a separate index entry.
LGTM. Thank you for pushing this.
------
Kind Regards,
Peter Smith.
Fujitsu Australia.