Separate GUC for replication origins
Hi,
We usually use max_replication_slots as the maximum number of replication
origins. It predates the logical replication infrastructure (commit
5aa2350426c). However, it is confusing (if it is the first time you are dealing
with logical replication. Why do I need to set replication slots on subscriber
if the replication slots are stored on publisher?) and it has a limitation (you
cannot have a setup where you want to restrict the number of replication slots
and have a high number of subscriptions or vice-versa). The initial commit also
mentions that it is not adequate (origin.c).
/*
* XXX: max_replication_slots is arguably the wrong thing to use, as here
* we keep the replay state of *remote* transactions. But for now it seems
* sufficient to reuse it, rather than introduce a separate GUC.
*/
and another comment suggests that we should have a separate GUC for it.
/*
* Base address into a shared memory array of replication states of size
* max_replication_slots.
*
* XXX: Should we use a separate variable to size this rather than
* max_replication_slots?
*/
The commit a8500750ca0 is an attempt to clarify that the max_replication_slots
can have different meanings depending on the node role (publisher or
subscriber).
I'm attaching a patch that adds max_replication_origins. It basically replaces
all of the points that refers to max_replication_slots on the subscriber. It
uses the same default value as max_replication_slots (10). I did nothing to
keep the backward compatibility. I mean has a special value (like -1) that
means same value as max_replication_slots. Is it worth it? (If not, it should
be mentioned in the commit message.)
--
Euler Taveira
EDB https://www.enterprisedb.com/
Attachments:
v1-0001-Separate-GUC-for-replication-origins.patchtext/x-patch; name="=?UTF-8?Q?v1-0001-Separate-GUC-for-replication-origins.patch?="Download+83-86
On 10.12.24 19:41, Euler Taveira wrote:
I'm attaching a patch that adds max_replication_origins. It basically
replaces
all of the points that refers to max_replication_slots on the subscriber. It
uses the same default value as max_replication_slots (10). I did nothing to
keep the backward compatibility. I mean has a special value (like -1) that
means same value as max_replication_slots. Is it worth it? (If not, it
should
be mentioned in the commit message.)
I think some backward compatibility tweak like that would be useful.
Otherwise, the net effect of this is that most people will have to do
one more thing than before to keep their systems working. Also, all
deployment and automation scripts would have to be updated for this.
On Thu, Dec 19, 2024, at 10:31 AM, Peter Eisentraut wrote:
On 10.12.24 19:41, Euler Taveira wrote:
I'm attaching a patch that adds max_replication_origins. It basically
replaces
all of the points that refers to max_replication_slots on the subscriber. It
uses the same default value as max_replication_slots (10). I did nothing to
keep the backward compatibility. I mean has a special value (like -1) that
means same value as max_replication_slots. Is it worth it? (If not, it
should
be mentioned in the commit message.)I think some backward compatibility tweak like that would be useful.
Otherwise, the net effect of this is that most people will have to do
one more thing than before to keep their systems working. Also, all
deployment and automation scripts would have to be updated for this.
This new patch includes the special value (-1) that uses max_replication_slots
instead.
--
Euler Taveira
EDB https://www.enterprisedb.com/
Attachments:
v2-0001-Separate-GUC-for-replication-origins.patchtext/x-patch; name="=?UTF-8?Q?v2-0001-Separate-GUC-for-replication-origins.patch?="Download+111-86
Hi Euler,
As you may have already read, separating the max_replication_slots parameter
is also discussed in the following thread.
/messages/by-id/CAA4eK1KF4MbyWmPj9ctNo8Fiei=K91RGYtzV5ELeCvR=_rqNgg@mail.gmail.com
21ac553b42d53249e42
I agree with separating the parameters.
I think we need to discuss the names of the parameter.
Parameter names were also discussed in the above thread.
It was suggested to name them 'max_replication_origin_states' or
'max_tracked_replication_origins'.
Personally, I find 'max_replication_origin_states' easier to understand,
although it is a long name.
Best Regards,
Keisuke Kuroda
NTT Comware
On Wed, Jan 8, 2025 at 7:39 AM Euler Taveira <euler@eulerto.com> wrote:
On Thu, Dec 19, 2024, at 10:31 AM, Peter Eisentraut wrote:
On 10.12.24 19:41, Euler Taveira wrote:
I'm attaching a patch that adds max_replication_origins. It basically
replaces
all of the points that refers to max_replication_slots on the subscriber. It
uses the same default value as max_replication_slots (10). I did nothing to
keep the backward compatibility. I mean has a special value (like -1) that
means same value as max_replication_slots. Is it worth it? (If not, it
should
be mentioned in the commit message.)I think some backward compatibility tweak like that would be useful.
Otherwise, the net effect of this is that most people will have to do
one more thing than before to keep their systems working. Also, all
deployment and automation scripts would have to be updated for this.This new patch includes the special value (-1) that uses max_replication_slots
instead.
Thank you for working on this. The proposed idea makes sense to me.
Here are some comments on v2 patch:
---
/* Report this after the initial starting message for consistency. */
- if (max_replication_slots == 0)
+ if (max_replication_origins == 0)
ereport(ERROR,
(errcode(ERRCODE_CONFIGURATION_LIMIT_EXCEEDED),
errmsg("cannot start logical
replication workers when \"max_replication_slots\"=0")));
Need to update the error message too.
---
+ {"max_replication_origins",
+ PGC_POSTMASTER,
+ REPLICATION_SUBSCRIBERS,
+ gettext_noop("Sets the maximum number of
simultaneously defined replication origins."),
+ NULL
+ },
I think the description is not accurate; this GUC controls the maximum
number of simultaneous replication origins that can be setup.
---
Given that max_replication_origins doesn't control the maximum number
of replication origins that can be defined, probably we need to find a
better name. As Kuroda-san already mentioned some proposed names,
max_tracked_replication_origins or max_replication_origin_states seem
reasonable to me.
---
+#include "utils/guc_hooks.h"
I think #include'ing guc.h would be more appropriate.
------
ereport(PANIC,
(errcode(ERRCODE_DATA_CORRUPTED),
errmsg("replication slot checkpoint has wrong
checksum %u, expected %u",
crc, file_crc)));
I don't understand why we use the term "replication slot checkpoint"
in an error message for the replication origin code. It could be fixed
in a separate patch for back-patching.
---
It's not related to the proposed patch but I found a suspicious
behavior; when the server startup we try to restore the
replorigin_checkpoint file only when max_replication_slots (or
max_replication_origins with the patch) > 0. Therefore, we don't get
the error message "could not find free replication state, increase
\"max_replication_slots\"" when it's set to 0, even if we have some
replication states in the replorigin_checkpoint file. Which seems
quite confusing.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Fri, Jan 24, 2025, at 8:12 PM, Masahiko Sawada wrote:
Here are some comments on v2 patch:
--- /* Report this after the initial starting message for consistency. */ - if (max_replication_slots == 0) + if (max_replication_origins == 0) ereport(ERROR, (errcode(ERRCODE_CONFIGURATION_LIMIT_EXCEEDED), errmsg("cannot start logical replication workers when \"max_replication_slots\"=0")));Need to update the error message too.
Good catch!
--- + {"max_replication_origins", + PGC_POSTMASTER, + REPLICATION_SUBSCRIBERS, + gettext_noop("Sets the maximum number of simultaneously defined replication origins."), + NULL + },I think the description is not accurate; this GUC controls the maximum
number of simultaneous replication origins that can be setup.
Instead of "defined" I use "configured". It seems closer to "setup".
---
Given that max_replication_origins doesn't control the maximum number
of replication origins that can be defined, probably we need to find a
better name. As Kuroda-san already mentioned some proposed names,
max_tracked_replication_origins or max_replication_origin_states seem
reasonable to me.
The max_replication_origins name is not accurate. I chose it because (a) it is
a runtime limit, (b) it is short and (c) a description can provide the exact
meaning. I think the proposed names don't still reflect the exact meaning. The
"tracked" word provides the meaning that the replication origin is tracked but
in this case it should mean "setup". An existing replication origin that is not
in use is tracked although its information is not available in the
pg_replication_origin_status. The "states" word doesn't make sense in this
context. Do you mean "status" (same as the view name)?
Under reflection, an accurate name is max_replication_origin_session_setup. A
counter argument is that it is a long name (top-5 length).
postgres=# select n, length(n) from (values('max_replication_origins'),
('max_tracked_replication_origins'),('max_replication_origin_states'),
('max_replication_origin_session_setup')) as gucs(n);
n | length
--------------------------------------+--------
max_replication_origins | 23
max_tracked_replication_origins | 31
max_replication_origin_states | 29
max_replication_origin_session_setup | 36
(4 rows)
postgres=# select name, length(name) from pg_settings order by 2 desc
limit 15;
name | length
---------------------------------------------+--------
max_parallel_apply_workers_per_subscription | 43
ssl_passphrase_command_supports_reload | 38
autovacuum_vacuum_insert_scale_factor | 37
autovacuum_multixact_freeze_max_age | 35
debug_logical_replication_streaming | 35
idle_in_transaction_session_timeout | 35
autovacuum_vacuum_insert_threshold | 34
log_parameter_max_length_on_error | 33
vacuum_multixact_freeze_table_age | 33
max_sync_workers_per_subscription | 33
client_connection_check_interval | 32
max_parallel_maintenance_workers | 32
shared_memory_size_in_huge_pages | 32
restrict_nonsystem_relation_kind | 32
autovacuum_analyze_scale_factor | 31
(15 rows)
--- +#include "utils/guc_hooks.h"I think #include'ing guc.h would be more appropriate.
Fixed.
I also updated the pg_createsubscriber documentation that refers to
max_replication_slots.
Since we don't have an agreement about the name, I still kept
max_replication_origins.
--
Euler Taveira
EDB https://www.enterprisedb.com/
Attachments:
v3-0001-Separate-GUC-for-replication-origins.patchtext/x-patch; name="=?UTF-8?Q?v3-0001-Separate-GUC-for-replication-origins.patch?="Download+113-88
On Wed, Feb 5, 2025 at 8:17 AM Euler Taveira <euler@eulerto.com> wrote:
Under reflection, an accurate name is max_replication_origin_session_setup. A
counter argument is that it is a long name (top-5 length).postgres=# select n, length(n) from (values('max_replication_origins'),
('max_tracked_replication_origins'),('max_replication_origin_states'),
('max_replication_origin_session_setup')) as gucs(n);
n | length
--------------------------------------+--------
max_replication_origins | 23
max_tracked_replication_origins | 31
max_replication_origin_states | 29
max_replication_origin_session_setup | 36
(4 rows)
The other possibility is max_replication_origin_sessions.
--
With Regards,
Amit Kapila.
On Wed, Feb 5, 2025, at 1:56 AM, Amit Kapila wrote:
On Wed, Feb 5, 2025 at 8:17 AM Euler Taveira <euler@eulerto.com> wrote:
Under reflection, an accurate name is max_replication_origin_session_setup. A
counter argument is that it is a long name (top-5 length).postgres=# select n, length(n) from (values('max_replication_origins'),
('max_tracked_replication_origins'),('max_replication_origin_states'),
('max_replication_origin_session_setup')) as gucs(n);
n | length
--------------------------------------+--------
max_replication_origins | 23
max_tracked_replication_origins | 31
max_replication_origin_states | 29
max_replication_origin_session_setup | 36
(4 rows)The other possibility is max_replication_origin_sessions.
Looks reasonable to me.
Opinions?
--
Euler Taveira
EDB https://www.enterprisedb.com/
On Wed, Feb 5, 2025 at 4:39 PM Euler Taveira <euler@eulerto.com> wrote:
On Wed, Feb 5, 2025, at 1:56 AM, Amit Kapila wrote:
On Wed, Feb 5, 2025 at 8:17 AM Euler Taveira <euler@eulerto.com> wrote:
Under reflection, an accurate name is max_replication_origin_session_setup. A
counter argument is that it is a long name (top-5 length).postgres=# select n, length(n) from (values('max_replication_origins'),
('max_tracked_replication_origins'),('max_replication_origin_states'),
('max_replication_origin_session_setup')) as gucs(n);
n | length
--------------------------------------+--------
max_replication_origins | 23
max_tracked_replication_origins | 31
max_replication_origin_states | 29
max_replication_origin_session_setup | 36
(4 rows)The other possibility is max_replication_origin_sessions.
Looks reasonable to me.
Opinions?
+1
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Wed, Feb 5, 2025, at 9:49 PM, Masahiko Sawada wrote:
On Wed, Feb 5, 2025 at 4:39 PM Euler Taveira <euler@eulerto.com> wrote:
On Wed, Feb 5, 2025, at 1:56 AM, Amit Kapila wrote:
On Wed, Feb 5, 2025 at 8:17 AM Euler Taveira <euler@eulerto.com> wrote:
Under reflection, an accurate name is max_replication_origin_session_setup. A
counter argument is that it is a long name (top-5 length).postgres=# select n, length(n) from (values('max_replication_origins'),
('max_tracked_replication_origins'),('max_replication_origin_states'),
('max_replication_origin_session_setup')) as gucs(n);
n | length
--------------------------------------+--------
max_replication_origins | 23
max_tracked_replication_origins | 31
max_replication_origin_states | 29
max_replication_origin_session_setup | 36
(4 rows)The other possibility is max_replication_origin_sessions.
Looks reasonable to me.
Opinions?
+1
Here is another patch that only changes the GUC name to
max_replication_origin_sessions.
--
Euler Taveira
EDB https://www.enterprisedb.com/
Attachments:
v4-0001-Separate-GUC-for-replication-origins.patchtext/x-patch; name="=?UTF-8?Q?v4-0001-Separate-GUC-for-replication-origins.patch?="Download+118-91
On Tue, Feb 11, 2025 at 12:27 PM Euler Taveira <euler@eulerto.com> wrote:
On Wed, Feb 5, 2025, at 9:49 PM, Masahiko Sawada wrote:
On Wed, Feb 5, 2025 at 4:39 PM Euler Taveira <euler@eulerto.com> wrote:
On Wed, Feb 5, 2025, at 1:56 AM, Amit Kapila wrote:
On Wed, Feb 5, 2025 at 8:17 AM Euler Taveira <euler@eulerto.com> wrote:
Under reflection, an accurate name is max_replication_origin_session_setup. A
counter argument is that it is a long name (top-5 length).postgres=# select n, length(n) from (values('max_replication_origins'),
('max_tracked_replication_origins'),('max_replication_origin_states'),
('max_replication_origin_session_setup')) as gucs(n);
n | length
--------------------------------------+--------
max_replication_origins | 23
max_tracked_replication_origins | 31
max_replication_origin_states | 29
max_replication_origin_session_setup | 36
(4 rows)The other possibility is max_replication_origin_sessions.
Looks reasonable to me.
Opinions?
+1
Here is another patch that only changes the GUC name to
max_replication_origin_sessions.
Thank you for updating the patch! The patch looks mostly good to me.
Here is one minor point:
In logical-replication.sgml there is another place we need to update
(in 29.13.2. Prepare for subscriber upgrades):
<listitem>
<para>
The new cluster must have
<link linkend="guc-max-replication-slots"><varname>max_replication_slots</varname></link>
configured to a value greater than or equal to the number of
subscriptions present in the old cluster.
</para>
</listitem>
</itemizedlist>
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Thu, Feb 13, 2025 at 6:48 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Thank you for updating the patch! The patch looks mostly good to me.
+ /*
+ * Prior to PostgreSQL 18, max_replication_slots was used to set the
+ * number of replication origins. For backward compatibility, -1 indicates
+ * to use the fallback value (max_replication_slots).
+ */
+ if (max_replication_origin_sessions == -1)
Shouldn't we let users use max_replication_origin_sessions for
subscribers? Maintaining this mapping for a long time can create
confusion such that some users will keep using max_replication_slots
for origins, and others will start using
max_replication_origin_sessions. Such a mapping would be useful if we
were doing something like this in back-branches, but doing it in a
major version appears to be more of a maintenance burden.
--
With Regards,
Amit Kapila.
On Sat, Mar 1, 2025, at 10:08 AM, Amit Kapila wrote:
On Thu, Feb 13, 2025 at 6:48 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Thank you for updating the patch! The patch looks mostly good to me.
+ /* + * Prior to PostgreSQL 18, max_replication_slots was used to set the + * number of replication origins. For backward compatibility, -1 indicates + * to use the fallback value (max_replication_slots). + */ + if (max_replication_origin_sessions == -1)Shouldn't we let users use max_replication_origin_sessions for
subscribers? Maintaining this mapping for a long time can create
confusion such that some users will keep using max_replication_slots
for origins, and others will start using
max_replication_origin_sessions. Such a mapping would be useful if we
were doing something like this in back-branches, but doing it in a
major version appears to be more of a maintenance burden.
It was my initial proposal. Of course, it breaks compatibility but it
forces the users to adopt this new GUC. It will be a smooth adoption
because if we use the same value as max_replication_slots it means
most of the scenarios will be covered (10 is a good start point for the
majority of replication scenarios in my experience).
However, Peter E suggested that it would be a good idea to have a
backward compatibility so I added it.
We need to decide which option we want:
1. no backward compatibility and max_replication_origin_sessions = 10 as
default value. It might break scenarios that have the number of
subscriptions greater than the default value.
2. backward compatibility for a certain number of releases until the
tools can be adjusted. However, the applications that use it were
adjusted as part of this patch. The drawback here is that once we
accept -1, we cannot remove it without breaking compatibility.
My preference was 1 but I'm fine with 2 too. Since it is a major
release, it is natural to introduce features that break things.
--
Euler Taveira
EDB https://www.enterprisedb.com/
On Wed, Mar 5, 2025 at 6:24 AM Euler Taveira <euler@eulerto.com> wrote:
On Sat, Mar 1, 2025, at 10:08 AM, Amit Kapila wrote:
On Thu, Feb 13, 2025 at 6:48 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Thank you for updating the patch! The patch looks mostly good to me.
+ /* + * Prior to PostgreSQL 18, max_replication_slots was used to set the + * number of replication origins. For backward compatibility, -1 indicates + * to use the fallback value (max_replication_slots). + */ + if (max_replication_origin_sessions == -1)Shouldn't we let users use max_replication_origin_sessions for
subscribers? Maintaining this mapping for a long time can create
confusion such that some users will keep using max_replication_slots
for origins, and others will start using
max_replication_origin_sessions. Such a mapping would be useful if we
were doing something like this in back-branches, but doing it in a
major version appears to be more of a maintenance burden.It was my initial proposal. Of course, it breaks compatibility but it
forces the users to adopt this new GUC. It will be a smooth adoption
because if we use the same value as max_replication_slots it means
most of the scenarios will be covered (10 is a good start point for the
majority of replication scenarios in my experience).However, Peter E suggested that it would be a good idea to have a
backward compatibility so I added it.We need to decide which option we want:
1. no backward compatibility and max_replication_origin_sessions = 10 as
default value. It might break scenarios that have the number of
subscriptions greater than the default value.
To avoid breakage, can we add a check during the upgrade to ensure
that users have set max_replication_origin_sessions appropriately? See
the current check in function
check_new_cluster_subscription_configuration(). It uses
max_replication_slots, we can change it to use
max_replication_origin_sessions for versions greater than or equal to
18. Will that address this concern?
2. backward compatibility for a certain number of releases until the
tools can be adjusted. However, the applications that use it were
adjusted as part of this patch. The drawback here is that once we
accept -1, we cannot remove it without breaking compatibility.
Right, I find that path will add more maintenance burden in terms of
remembering this and finding a way to deprecate such a check.
My preference was 1 but I'm fine with 2 too. Since it is a major
release, it is natural to introduce features that break things.
+1.
Does anyone else have an opinion on this?
--
With Regards,
Amit Kapila.
On Tue, Mar 4, 2025 at 10:42 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Mar 5, 2025 at 6:24 AM Euler Taveira <euler@eulerto.com> wrote:
On Sat, Mar 1, 2025, at 10:08 AM, Amit Kapila wrote:
On Thu, Feb 13, 2025 at 6:48 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Thank you for updating the patch! The patch looks mostly good to me.
+ /* + * Prior to PostgreSQL 18, max_replication_slots was used to set the + * number of replication origins. For backward compatibility, -1 indicates + * to use the fallback value (max_replication_slots). + */ + if (max_replication_origin_sessions == -1)Shouldn't we let users use max_replication_origin_sessions for
subscribers? Maintaining this mapping for a long time can create
confusion such that some users will keep using max_replication_slots
for origins, and others will start using
max_replication_origin_sessions. Such a mapping would be useful if we
were doing something like this in back-branches, but doing it in a
major version appears to be more of a maintenance burden.It was my initial proposal. Of course, it breaks compatibility but it
forces the users to adopt this new GUC. It will be a smooth adoption
because if we use the same value as max_replication_slots it means
most of the scenarios will be covered (10 is a good start point for the
majority of replication scenarios in my experience).However, Peter E suggested that it would be a good idea to have a
backward compatibility so I added it.We need to decide which option we want:
1. no backward compatibility and max_replication_origin_sessions = 10 as
default value. It might break scenarios that have the number of
subscriptions greater than the default value.To avoid breakage, can we add a check during the upgrade to ensure
that users have set max_replication_origin_sessions appropriately? See
the current check in function
check_new_cluster_subscription_configuration(). It uses
max_replication_slots, we can change it to use
max_replication_origin_sessions for versions greater than or equal to
18. Will that address this concern?
I think that the patch already replaced max_replication_slots with
max_replication_origin_sessions in that function.
2. backward compatibility for a certain number of releases until the
tools can be adjusted. However, the applications that use it were
adjusted as part of this patch. The drawback here is that once we
accept -1, we cannot remove it without breaking compatibility.Right, I find that path will add more maintenance burden in terms of
remembering this and finding a way to deprecate such a check.My preference was 1 but I'm fine with 2 too. Since it is a major
release, it is natural to introduce features that break things.+1.
+1
A major release would be a good timing to break off the relationship
between the number of origins and the number of replication slots.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Wed, Mar 5, 2025 at 12:42 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Tue, Mar 4, 2025 at 10:42 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Mar 5, 2025 at 6:24 AM Euler Taveira <euler@eulerto.com> wrote:
On Sat, Mar 1, 2025, at 10:08 AM, Amit Kapila wrote:
On Thu, Feb 13, 2025 at 6:48 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Thank you for updating the patch! The patch looks mostly good to me.
+ /* + * Prior to PostgreSQL 18, max_replication_slots was used to set the + * number of replication origins. For backward compatibility, -1 indicates + * to use the fallback value (max_replication_slots). + */ + if (max_replication_origin_sessions == -1)Shouldn't we let users use max_replication_origin_sessions for
subscribers? Maintaining this mapping for a long time can create
confusion such that some users will keep using max_replication_slots
for origins, and others will start using
max_replication_origin_sessions. Such a mapping would be useful if we
were doing something like this in back-branches, but doing it in a
major version appears to be more of a maintenance burden.It was my initial proposal. Of course, it breaks compatibility but it
forces the users to adopt this new GUC. It will be a smooth adoption
because if we use the same value as max_replication_slots it means
most of the scenarios will be covered (10 is a good start point for the
majority of replication scenarios in my experience).However, Peter E suggested that it would be a good idea to have a
backward compatibility so I added it.We need to decide which option we want:
1. no backward compatibility and max_replication_origin_sessions = 10 as
default value. It might break scenarios that have the number of
subscriptions greater than the default value.To avoid breakage, can we add a check during the upgrade to ensure
that users have set max_replication_origin_sessions appropriately? See
the current check in function
check_new_cluster_subscription_configuration(). It uses
max_replication_slots, we can change it to use
max_replication_origin_sessions for versions greater than or equal to
18. Will that address this concern?I think that the patch already replaced max_replication_slots with
max_replication_origin_sessions in that function.
Then, that should be sufficient to avoid upgrade related risks.
--
With Regards,
Amit Kapila.
On 11.02.25 21:25, Euler Taveira wrote:
Here is another patch that only changes the GUC name to
max_replication_origin_sessions.
I think the naming and description of this is still confusing.
What does this name mean? There is (I think) no such thing as a
"replication origin session". So why would there be a maximum of those?
The description in the documentation, after the patch, says "Specifies
how many replication origins (see Chapter 48) can be tracked
simultaneously". But Chapter 48 does not say anything about what it
means for a replication slot to be tracked. The only use of the word
tracked in that chapter is to say that replication slots do the tracking
of replication progress.
Both of these are confusing independently, but moreover we are now using
two different words ("sessions" and "tracked") for apparently the same
thing, but neither of which is adequately documented in those terms
anywhere else.
I agree that the originally proposed name max_replication_origins is not
good, because you can "create" (using pg_replication_origin_create())
more than the configured maximum. What is the term for what the setting
actually controls? How many are "active"? "In use"? Per session? etc.
On Wed, Mar 5, 2025 at 4:38 PM Peter Eisentraut <peter@eisentraut.org> wrote:
On 11.02.25 21:25, Euler Taveira wrote:
Here is another patch that only changes the GUC name to
max_replication_origin_sessions.I think the naming and description of this is still confusing.
...
...
I agree that the originally proposed name max_replication_origins is not
good, because you can "create" (using pg_replication_origin_create())
more than the configured maximum. What is the term for what the setting
actually controls? How many are "active"? "In use"? Per session? etc.
It controls the number of active sessions using origin. The idea is
that to track replication progress via replication_origin we need to
do replorigin_session_setup(). If you look in the code, we have used
the term replorigin_session* in many places, so we thought of naming
this as max_replication_origin_sessions. But the other options could
be max_active_replication_origins or max_replication_origins_in_use.
--
With Regards,
Amit Kapila.
On Thu, Mar 6, 2025, at 6:55 AM, Amit Kapila wrote:
On Wed, Mar 5, 2025 at 4:38 PM Peter Eisentraut <peter@eisentraut.org> wrote:
On 11.02.25 21:25, Euler Taveira wrote:
Here is another patch that only changes the GUC name to
max_replication_origin_sessions.I think the naming and description of this is still confusing.
...
...I agree that the originally proposed name max_replication_origins is not
good, because you can "create" (using pg_replication_origin_create())
more than the configured maximum. What is the term for what the setting
actually controls? How many are "active"? "In use"? Per session? etc.It controls the number of active sessions using origin. The idea is
that to track replication progress via replication_origin we need to
do replorigin_session_setup(). If you look in the code, we have used
the term replorigin_session* in many places, so we thought of naming
this as max_replication_origin_sessions. But the other options could
be max_active_replication_origins or max_replication_origins_in_use.
The word "session" is correlated to "replication origin" but requires some
knowledge to know the replication progress tracking design. The word "active"
can express the fact that it was setup and is currently in use. I vote for
max_active_replication_origins.
--
Euler Taveira
EDB https://www.enterprisedb.com/
On Thu, Mar 6, 2025 at 6:36 PM Euler Taveira <euler@eulerto.com> wrote:
On Thu, Mar 6, 2025, at 6:55 AM, Amit Kapila wrote:
On Wed, Mar 5, 2025 at 4:38 PM Peter Eisentraut <peter@eisentraut.org> wrote:
On 11.02.25 21:25, Euler Taveira wrote:
Here is another patch that only changes the GUC name to
max_replication_origin_sessions.I think the naming and description of this is still confusing.
...
...I agree that the originally proposed name max_replication_origins is not
good, because you can "create" (using pg_replication_origin_create())
more than the configured maximum. What is the term for what the setting
actually controls? How many are "active"? "In use"? Per session? etc.It controls the number of active sessions using origin. The idea is
that to track replication progress via replication_origin we need to
do replorigin_session_setup(). If you look in the code, we have used
the term replorigin_session* in many places, so we thought of naming
this as max_replication_origin_sessions. But the other options could
be max_active_replication_origins or max_replication_origins_in_use.The word "session" is correlated to "replication origin" but requires some
knowledge to know the replication progress tracking design. The word "active"
can express the fact that it was setup and is currently in use. I vote for
max_active_replication_origins.
Sounds reasonable. Let's go with max_active_replication_origins then,
unless people think otherwise.
--
With Regards,
Amit Kapila.