[Patch] add new parameter to pg_replication_origin_session_setup

Started by Doruk Yilmazover 1 year ago50 messageshackers
Jump to latest
#1Doruk Yilmaz
doruk@mixrank.com

Hello all,
While working on our internal tools that utilise replication, we
realised that a new parameter was added to the internal C function
corresponding to pg_replication_origin_session_setup.
However this parameter wasn't included in the user-facing API [1]https://www.postgresql.org/docs/current/functions-admin.html#PG-REPLICATION-ORIGIN-SESSION-SETUP ---.

In 'src/backend/replication/logical/origin.c' at line 1359,
pg_replication_origin_session_setup function calls

replorigin_session_setup(origin, 0);

where currently 0 is assigned to the acquired_by parameter of the
replorigin_session_setup.

I made this patch to the master which adds a way to control this
parameter by adding a new version of the
pg_replication_origin_session_setup function with user facing
parameters 'text int4' in place of the current 'text' while keeping
the existing variant
(ensuring backwards compatibility). Could someone take a look at it?

[1]: https://www.postgresql.org/docs/current/functions-admin.html#PG-REPLICATION-ORIGIN-SESSION-SETUP ---
---

Thanks for the help,
Doruk Yılmaz

Attachments:

0001-pg_replication_origin_session_setup-new-parameter.patchtext/x-patch; charset=US-ASCII; name=0001-pg_replication_origin_session_setup-new-parameter.patchDownload+33-2
#2Euler Taveira
euler@eulerto.com
In reply to: Doruk Yilmaz (#1)
Re: [Patch] add new parameter to pg_replication_origin_session_setup

On Mon, Aug 12, 2024, at 3:43 PM, Doruk Yilmaz wrote:

Hello all,

Hi!

While working on our internal tools that utilise replication, we
realised that a new parameter was added to the internal C function
corresponding to pg_replication_origin_session_setup.
However this parameter wasn't included in the user-facing API [1].

I'm curious about your use case. Is it just because the internal function has a
different signature or your tool is capable of apply logical replication changes
in parallel using the SQL API?

I made this patch to the master which adds a way to control this
parameter by adding a new version of the
pg_replication_origin_session_setup function with user facing
parameters 'text int4' in place of the current 'text' while keeping
the existing variant
(ensuring backwards compatibility). Could someone take a look at it?

I did a quick look at your patch and have a few suggestions.

* no documentation changes. Since the function you are changing has a new
signature, this change should be reflected in the documentation.
* no need for a new internal function. The second parameter (PID) can be
optional and defaults to 0 in this case. See how we changed the
pg_create_logical_replication_slot along the years add some IN parameters like
twophase and failover in the recent versions.
* add a CF entry [1]https://commitfest.postgresql.org/49/ for this patch so we don't forget it. Another advantage is
that this patch is covered by CI [2]https://wiki.postgresql.org/wiki/Cfbot[3]http://cfbot.cputube.org/index.html.

[1]: https://commitfest.postgresql.org/49/
[2]: https://wiki.postgresql.org/wiki/Cfbot
[3]: http://cfbot.cputube.org/index.html

--
Euler Taveira
EDB https://www.enterprisedb.com/

#3Doruk Yilmaz
doruk@mixrank.com
In reply to: Euler Taveira (#2)
Re: [Patch] add new parameter to pg_replication_origin_session_setup

Hello again,

On Tue, Aug 13, 2024 at 12:48 AM Euler Taveira <euler@eulerto.com> wrote:

I'm curious about your use case. Is it just because the internal function has a
different signature or your tool is capable of apply logical replication changes
in parallel using the SQL API?

The latter is correct, it applies logical replication changes in parallel.
Since multiple connections may commit, we need all of them to be able
to advance the replication origin.

* no documentation changes. Since the function you are changing has a new
signature, this change should be reflected in the documentation.
* no need for a new internal function. The second parameter (PID) can be
optional and defaults to 0 in this case. See how we changed the
pg_create_logical_replication_slot along the years add some IN parameters like
twophase and failover in the recent versions.

I updated/rewrote the patch to reflect these suggestions.
It now has the same DEFAULT 0 style used in pg_create_logical_replication_slot.
I also updated the documentation.

* add a CF entry [1] for this patch so we don't forget it. Another advantage is
that this patch is covered by CI [2][3].

Sadly I still can't log in to the Commitfest due to the cool-off
period. I will create an entry as soon as this period ends.

Thanks for all the feedback,
Doruk Yılmaz

Attachments:

v2-0001-pg_replication_origin_session_setup-new-parameter.patchtext/x-patch; charset=US-ASCII; name=v2-0001-pg_replication_origin_session_setup-new-parameter.patchDownload+15-7
#4Euler Taveira
euler@eulerto.com
In reply to: Doruk Yilmaz (#3)
Re: [Patch] add new parameter to pg_replication_origin_session_setup

On Thu, Aug 15, 2024, at 5:53 PM, Doruk Yilmaz wrote:

Hello again,

On Tue, Aug 13, 2024 at 12:48 AM Euler Taveira <euler@eulerto.com> wrote:

I'm curious about your use case. Is it just because the internal function has a
different signature or your tool is capable of apply logical replication changes
in parallel using the SQL API?

The latter is correct, it applies logical replication changes in parallel.
Since multiple connections may commit, we need all of them to be able
to advance the replication origin.

* no documentation changes. Since the function you are changing has a new
signature, this change should be reflected in the documentation.
* no need for a new internal function. The second parameter (PID) can be
optional and defaults to 0 in this case. See how we changed the
pg_create_logical_replication_slot along the years add some IN parameters like
twophase and failover in the recent versions.

I updated/rewrote the patch to reflect these suggestions.
It now has the same DEFAULT 0 style used in pg_create_logical_replication_slot.
I also updated the documentation.

[after a long hiatus...]

I tested your patch again and it does what is advertised. I changed your patch
a bit. The main change was the documentation. You didn't explain what this new
parameter is for. I tried to explain but don't want to add lots of details.
(There is a section that explain how parallel apply processes work behind the
scenes.) I also renamed it from acquired_by to pid to be more descriptive. I
fixed some white space issues too. I noticed that there are no tests. This
doesn't appear to be a shortcoming from this patch but we need to cover some of
these replication functions with an additional test file in another patch.
Finally, I wrote a commit message and it is RfC.

session 1:

postgres=# select * from pg_replication_origin;
roident | roname
---------+--------
(0 rows)

postgres=# SELECT pg_backend_pid();
pg_backend_pid
----------------
260732
(1 row)

postgres=# SELECT pg_replication_origin_create('test');
pg_replication_origin_create
------------------------------
1
(1 row)

postgres=# SELECT pg_replication_origin_session_setup('test', 0);
pg_replication_origin_session_setup
-------------------------------------

(1 row)

postgres=# select * from pg_replication_origin;
roident | roname
---------+--------
1 | test
(1 row)

session 2:

postgres=# SELECT pg_replication_origin_session_setup('test', 260732);
pg_replication_origin_session_setup
-------------------------------------

(1 row)

session 3:

postgres=# SELECT pg_replication_origin_session_setup('test', 12345);
ERROR: could not find replication state slot for replication origin with OID 1 which was acquired by 12345

--
Euler Taveira
EDB https://www.enterprisedb.com/

Attachments:

v3-0001-pg_replication_origin_session_setup-pid-parameter.patchtext/x-patch; name="=?UTF-8?Q?v3-0001-pg=5Freplication=5Forigin=5Fsession=5Fsetup-pid-parame?= =?UTF-8?Q?ter.patch?="Download+19-5
#5Doruk Yilmaz
doruk@mixrank.com
In reply to: Doruk Yilmaz (#1)
Re: [Patch] add new parameter to pg_replication_origin_session_setup

I noticed that the patch needs rebasing, so here is the rebased version.
Hopefully it makes to the commitfest.

Doruk Yılmaz

Attachments:

v4-0001-pg_replication_origin_session_setup-pid-parameter.patchapplication/x-patch; name=v4-0001-pg_replication_origin_session_setup-pid-parameter.patchDownload+19-5
#6Amit Kapila
amit.kapila16@gmail.com
In reply to: Euler Taveira (#4)
Re: [Patch] add new parameter to pg_replication_origin_session_setup

On Thu, Jan 9, 2025 at 3:26 AM Euler Taveira <euler@eulerto.com> wrote:

On Thu, Aug 15, 2024, at 5:53 PM, Doruk Yilmaz wrote:

Hello again,

On Tue, Aug 13, 2024 at 12:48 AM Euler Taveira <euler@eulerto.com> wrote:

I'm curious about your use case. Is it just because the internal function has a
different signature or your tool is capable of apply logical replication changes
in parallel using the SQL API?

The latter is correct, it applies logical replication changes in parallel.
Since multiple connections may commit, we need all of them to be able
to advance the replication origin.

To use replication_origin by multiple processes, one must maintain the
commit order as we do internally by allowing the leader process to
wait for the parallel worker to finish the commit. See comments atop
replorigin_session_setup(). Now, we could expose the pid parameter as
proposed by the patch after documenting the additional requirements,
but I am afraid that users may directly start using the API without
following the commit order principle, which can lead to incorrect
replication. So, isn't it better to do something to avoid the misuse
of this feature before exposing it?

--
With Regards,
Amit Kapila.

#7Doruk Yilmaz
doruk@mixrank.com
In reply to: Amit Kapila (#6)
Re: [Patch] add new parameter to pg_replication_origin_session_setup

On Mon, Mar 3, 2025 at 6:39 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

To use replication_origin by multiple processes, one must maintain the
commit order as we do internally by allowing the leader process to
wait for the parallel worker to finish the commit. See comments atop
replorigin_session_setup(). Now, we could expose the pid parameter as
proposed by the patch after documenting the additional requirements,
but I am afraid that users may directly start using the API without
following the commit order principle, which can lead to incorrect
replication. So, isn't it better to do something to avoid the misuse
of this feature before exposing it?

Wouldn't mentioning/describing needing to follow the commit order
principle on the documentation be enough for this?
It is quite an advanced feature that I don't believe person intending
to use it won't start with reading documentation first.

Is there any updates on the commit? I see that intended commitfest window ended.

Thanks,
Doruk Yılmaz

#8Amit Kapila
amit.kapila16@gmail.com
In reply to: Doruk Yilmaz (#7)
Re: [Patch] add new parameter to pg_replication_origin_session_setup

On Tue, Jul 29, 2025 at 2:43 AM Doruk Yilmaz <doruk@mixrank.com> wrote:

On Mon, Mar 3, 2025 at 6:39 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

To use replication_origin by multiple processes, one must maintain the
commit order as we do internally by allowing the leader process to
wait for the parallel worker to finish the commit. See comments atop
replorigin_session_setup(). Now, we could expose the pid parameter as
proposed by the patch after documenting the additional requirements,
but I am afraid that users may directly start using the API without
following the commit order principle, which can lead to incorrect
replication. So, isn't it better to do something to avoid the misuse
of this feature before exposing it?

Wouldn't mentioning/describing needing to follow the commit order
principle on the documentation be enough for this?
It is quite an advanced feature that I don't believe person intending
to use it won't start with reading documentation first.

That is true but I still feel there has to be some mechanism where we
can catch and give an ERROR to the user, if it doesn't follow the
same. For example, pg_replication_origin_advance() always allows going
backwards in terms of LSN which means if one doesn't follow commit
order, it can lead to breaking the replication as after restart the
client can ask to start replication from some prior point.

Is there any updates on the commit?

I think we are still under discussion about the requirements and
design for this API. Can you tell us the use case? Did you also intend
to use it for parallel apply, if so, can you also tell at a high
level, how you are planning to manage origin? It will help us to
extend the API(s) in a meaningful way.

--
With Regards,
Amit Kapila.

#9Doruk Yilmaz
doruk@mixrank.com
In reply to: Amit Kapila (#8)
Re: [Patch] add new parameter to pg_replication_origin_session_setup

On Mon, Jul 29, 2025 at 8:13 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

That is true but I still feel there has to be some mechanism where we
can catch and give an ERROR to the user, if it doesn't follow the
same. For example, pg_replication_origin_advance() always allows going
backwards in terms of LSN which means if one doesn't follow commit
order, it can lead to breaking the replication as after restart the
client can ask to start replication from some prior point.

If you have any ideas for safeguards or API changes, I'd be happy to
help implement them or discuss them.

Can you tell us the use case? Did you also intend to use it for parallel apply, if so, can you also tell at a high
level, how you are planning to manage origin?

Yes, we use it for parallel apply. We have a custom logical
replication system that applies changes using multiple worker
processes, each with their own database connection.
Our use case requires multiple connections to be able to advance the
same replication origin. We handle this by having a master process
coordinate the workers, where each worker process calls
pg_replication_origin_session_setup with the master's PID as the
second parameter.
We may send operations out of order but we always commit in order, so
there's no chance of creating inconsistencies. There's the chance of
deadlocks, but these can be detected. It's really similar to the
existing parallel apply implementation - the main difference is that
we're applying from jsonl files instead of directly from another
database.
Currently we use a local patch to expose the PID parameter, but having
this upstream would be great. It causes a lot of headaches for us to
use a patched PostgreSQL.
Thanks,
Doruk Yılmaz

#10Amit Kapila
amit.kapila16@gmail.com
In reply to: Doruk Yilmaz (#9)
Re: [Patch] add new parameter to pg_replication_origin_session_setup

On Wed, Jul 30, 2025 at 12:00 AM Doruk Yilmaz <doruk@mixrank.com> wrote:

On Mon, Jul 29, 2025 at 8:13 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

That is true but I still feel there has to be some mechanism where we
can catch and give an ERROR to the user, if it doesn't follow the
same. For example, pg_replication_origin_advance() always allows going
backwards in terms of LSN which means if one doesn't follow commit
order, it can lead to breaking the replication as after restart the
client can ask to start replication from some prior point.

If you have any ideas for safeguards or API changes, I'd be happy to
help implement them or discuss them.

Can you tell us the use case? Did you also intend to use it for parallel apply, if so, can you also tell at a high
level, how you are planning to manage origin?

Yes, we use it for parallel apply. We have a custom logical
replication system that applies changes using multiple worker
processes, each with their own database connection.
Our use case requires multiple connections to be able to advance the
same replication origin.

How do you advance the origin? Did you use
pg_replication_origin_advance()? If so, you should be aware that it
can be used for initial setup; see comment in that API code: "Can't
sensibly pass a local commit to be flushed at checkpoint - this xact
hasn't committed yet. This is why this function should be used to set
up the initial replication state, but not for replay." I wonder if you
are using pg_replication_origin_advance(), won't its current
implementation has the potential to cause a problem for your usecase?
I think the problem it can cause is it may miss a transaction to apply
after restart because we can use remote_lsn without a corresponding
transaction (local_lsn) flushed on the subscriber. This can happen
because ideally we want the transaction that is not successfully
flushed to be replayed after restart.

In general, I was thinking of adding a restriction
pg_replication_origin_advance() such that it gives an ERROR when a
user tries to move remote_lsn backward unless requested explicitly.

It would be good to know the opinion of others involved in the
original change of maintaining commit order for parallel apply of
large transactions.

--
With Regards,
Amit Kapila.

#11Doruk Yilmaz
doruk@mixrank.com
In reply to: Amit Kapila (#10)
Re: [Patch] add new parameter to pg_replication_origin_session_setup

On Mon, Aug 11, 2025 at 9:44 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

How do you advance the origin? Did you use > pg_replication_origin_advance()? If so, you should be aware that it
can be used for initial setup; see comment in that API code...

No, we don't use pg_replication_origin_advance(). We use
pg_replication_origin_xact_setup() instead as I mentioned before.

Each worker does the following:
1. Sets up its own replication-origin session with
pg_replication_origin_session_setup() (using the master process PID).
2. Applies changes inside transactions.
3. Right before commit, calls pg_replication_origin_xact_setup(lsn,
commit_timestamp).
4. Commits only if everything succeeded, so the origin only advances
on a real commit.

That way, the origin LSN moves forward only when the transaction is
actually committed. If something fails or the process crashes, the
origin stays at the last successful commit, and on restart we replay
from the correct spot. It's safer than advancing the origin without
knowing the transaction made it to disk.

So the issue you described is not relevant for our implementation.

#12Amit Kapila
amit.kapila16@gmail.com
In reply to: Doruk Yilmaz (#11)
Re: [Patch] add new parameter to pg_replication_origin_session_setup

On Mon, Aug 11, 2025 at 10:41 PM Doruk Yilmaz <doruk@mixrank.com> wrote:

On Mon, Aug 11, 2025 at 9:44 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

How do you advance the origin? Did you use > pg_replication_origin_advance()? If so, you should be aware that it
can be used for initial setup; see comment in that API code...

No, we don't use pg_replication_origin_advance(). We use
pg_replication_origin_xact_setup() instead as I mentioned before.

Each worker does the following:
1. Sets up its own replication-origin session with
pg_replication_origin_session_setup() (using the master process PID).
2. Applies changes inside transactions.
3. Right before commit, calls pg_replication_origin_xact_setup(lsn,
commit_timestamp).
4. Commits only if everything succeeded, so the origin only advances
on a real commit.

That way, the origin LSN moves forward only when the transaction is
actually committed. If something fails or the process crashes, the
origin stays at the last successful commit, and on restart we replay
from the correct spot. It's safer than advancing the origin without
knowing the transaction made it to disk.

Your use looks good to me. So, maybe we can update the docs with the
dangers if the users of API doesn't follow commit order then it may
lead to data inconsistency should be sufficient. Additionally, we may
want to give an example as to how to use this API for parallel apply.

Thoughts?

--
With Regards,
Amit Kapila.

#13Doruk Yilmaz
doruk@mixrank.com
In reply to: Amit Kapila (#12)
Re: [Patch] add new parameter to pg_replication_origin_session_setup

Your use looks good to me. So, maybe we can update the docs with the
dangers if the users of API doesn't follow commit order then it may
lead to data inconsistency should be sufficient. Additionally, we may
want to give an example as to how to use this API for parallel apply.

That sounds reasonable. I’ve updated the patch and added more
information to the documentation covering the topics you mentioned.
I also added a Caution block so potential users won’t miss it. I hope
this patch meets your expectations.

Attachments:

v5-0001-pg_replication_origin_session_setup-pid-parameter.patchtext/x-patch; charset=US-ASCII; name=v5-0001-pg_replication_origin_session_setup-pid-parameter.patchDownload+33-5
#14Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: Doruk Yilmaz (#13)
RE: [Patch] add new parameter to pg_replication_origin_session_setup

Dear Doruk,

That sounds reasonable. I’ve updated the patch and added more
information to the documentation covering the topics you mentioned.
I also added a Caution block so potential users won’t miss it. I hope
this patch meets your expectations.

Can you explain more why we must extend the SQL interface? I read your use
case [1]/messages/by-id/CAMPB6wckvkKrXVPH5j8Ske2cVedkb-TRLdnOb5e74zYM1CynGw@mail.gmail.com, and looks like that a new type of background worker is introduced in
your system. If so, why doesn't the worker directly call C-lang interface
replorigin_session_setup()?
Personally considered, SQL functions are usable by unfamiliar users so that this
change may be dangerous. It is better if developers can use C APIs instead.

[1]: /messages/by-id/CAMPB6wckvkKrXVPH5j8Ske2cVedkb-TRLdnOb5e74zYM1CynGw@mail.gmail.com

Best regards,
Hayato Kuroda
FUJITSU LIMITED

#15Doruk Yilmaz
doruk@mixrank.com
In reply to: Hayato Kuroda (Fujitsu) (#14)
Re: [Patch] add new parameter to pg_replication_origin_session_setup

Dear Hayato,

Can you explain more why we must extend the SQL interface?

In our system the workers aren't background workers and we don't ship
a server-side extension; they're plain external processes (Python in
our case) talking over standard database connections. In many
deployments -especially managed Postgres- we can't load custom C code
even if we wanted to. That's why we want to expose the existing pid
knob via SQL: it lets ordinary client sessions participate in the
same, already-implemented origin coordination without maintaining a
fork or an extension.
This patch doesn't invent a new capability, it just makes the internal
behavior reachable from SQL. The new argument is optional and defaults
to the current behavior, so nothing changes for existing users. It
also keeps the feature usable from any language/runtime that
coordinates parallel apply at the application layer. And I don't
believe it is that dangerous or risky. The actual code we use in
python is not that complex that I believe a person using replication
already should be able to set it up. I don't understand why being able
to achieve parallel replication is not accessible via SQL already.

I am happy to do changes to the patch if you think there should be
more guardrails.

Thanks,
Doruk Yılmaz

#16Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: Doruk Yilmaz (#15)
RE: [Patch] add new parameter to pg_replication_origin_session_setup

Dear Doruk,

In our system the workers aren't background workers and we don't ship
a server-side extension; they're plain external processes (Python in
our case) talking over standard database connections. In many
deployments -especially managed Postgres- we can't load custom C code
even if we wanted to. That's why we want to expose the existing pid
knob via SQL: it lets ordinary client sessions participate in the
same, already-implemented origin coordination without maintaining a
fork or an extension.

So, your python process establishes two connections, for publisher (replication connection)
and subscriber (normal connection). It receives changes from the publisher,
constructs SQL statements from the received results, and sends to subscriber's
backend, is it right?
I'm not sure it is the common approach, but I see your point that you cannot
install your extensions on managed postgres.

Anyway, I still feel bit dangerous but OK if others can accept.

Regarding the patch, I want to ask one point.
```
+CREATE OR REPLACE FUNCTION
+  pg_replication_origin_session_setup(node_name text, pid integer DEFAULT 0)
+RETURNS void
+LANGUAGE INTERNAL
+STRICT VOLATILE
+AS 'pg_replication_origin_session_setup';
...
{ oid => '6006',
   descr => 'configure session to maintain replication progress tracking for the passed in origin',
   proname => 'pg_replication_origin_session_setup', provolatile => 'v',
-  proparallel => 'u', prorettype => 'void', proargtypes => 'text',
+  proparallel => 'u', prorettype => 'void', proargtypes => 'text int4',
   prosrc => 'pg_replication_origin_session_setup' },
```

Is there a rule which attribute is clarified and others are not?
For example, VOLATILE is specified on both side, STRICT is written only in the
system_functions.sql, and PARALLEL UNSAFE is set on pg_proc.dat.

Best regards,
Hayato Kuroda
FUJITSU LIMITED

#17Doruk Yilmaz
doruk@mixrank.com
In reply to: Hayato Kuroda (Fujitsu) (#16)
Re: [Patch] add new parameter to pg_replication_origin_session_setup

Dear Hayato,

So, your python process establishes two connections, for publisher (replication connection)
and subscriber (normal connection). It receives changes from the publisher,
constructs SQL statements from the received results, and sends to subscriber's
backend, is it right?

Actually, it's a bit simpler than that - there are no two connections.
Our program reads changes from JSONL files rather than directly from a
publisher connection.
We have multiple Python processes, each with a single database
connection to the subscriber,
reading from these files and applying changes in parallel.

Is there a rule which attribute is clarified and others are not?
For example, VOLATILE is specified on both side, STRICT is written only in the
system_functions.sql, and PARALLEL UNSAFE is set on pg_proc.dat.

In pg_proc.dat, I believe the STRICT, IMMUTABLE, and PARALLEL SAFE are
the defaults (check out pg_proc.h).
So in pg_proc.dat, the ones that are specified are the ones that
aren't defaults,
there is provolatile => 'v' (for VOLATILE) and proparallel => 'u' (for
UNSAFE), but no prostrict since it's already true by default.
In system_functions.sql, I went with being explicit about all the
attributes for clarity as it is the code declaration.
If you want, I can also make the pg_proc.dat explicit.

Thanks,
Doruk Yılmaz

#18Amit Kapila
amit.kapila16@gmail.com
In reply to: Doruk Yilmaz (#17)
Re: [Patch] add new parameter to pg_replication_origin_session_setup

On Wed, Sep 3, 2025 at 6:13 PM Doruk Yilmaz <doruk@mixrank.com> wrote:

Dear Hayato,

So, your python process establishes two connections, for publisher (replication connection)
and subscriber (normal connection). It receives changes from the publisher,
constructs SQL statements from the received results, and sends to subscriber's
backend, is it right?

Actually, it's a bit simpler than that - there are no two connections.
Our program reads changes from JSONL files rather than directly from a
publisher connection.
We have multiple Python processes, each with a single database
connection to the subscriber,
reading from these files and applying changes in parallel.

Is there a rule which attribute is clarified and others are not?
For example, VOLATILE is specified on both side, STRICT is written only in the
system_functions.sql, and PARALLEL UNSAFE is set on pg_proc.dat.

In pg_proc.dat, I believe the STRICT, IMMUTABLE, and PARALLEL SAFE are
the defaults (check out pg_proc.h).
So in pg_proc.dat, the ones that are specified are the ones that
aren't defaults,
there is provolatile => 'v' (for VOLATILE) and proparallel => 'u' (for
UNSAFE), but no prostrict since it's already true by default.
In system_functions.sql, I went with being explicit about all the
attributes for clarity as it is the code declaration.

Then why didn't you specified PARALLEL UNSAFE as well?

BTW, yesterday a new thread started with the same requirement [1]/messages/by-id/CAE2gYzyTSNvHY1+iWUwykaLETSuAZsCWyryokjP6rG46ZvRgQA@mail.gmail.com -- With Regards, Amit Kapila.. It
uses a slightly different way to define the new function. do you have
any opinion on it?

[1]: /messages/by-id/CAE2gYzyTSNvHY1+iWUwykaLETSuAZsCWyryokjP6rG46ZvRgQA@mail.gmail.com -- With Regards, Amit Kapila.
--
With Regards,
Amit Kapila.

#19Doruk Yilmaz
doruk@mixrank.com
In reply to: Amit Kapila (#18)
Re: [Patch] add new parameter to pg_replication_origin_session_setup

Then why didn't you specified PARALLEL UNSAFE as well?

You are correct, I missed marking the function as PARALLEL UNSAFE.
I’ve attached a revised patch with the correct annotation.

BTW, yesterday a new thread started with the same requirement [1]. It
uses a slightly different way to define the new function. do you have
any opinion on it?

I don’t think introducing a separate function is a good idea. It’s
effectively the same behavior, technical debt, and maintenance
overhead without a clear benefit.
Our patch keeps a single function with a default parameter, so it’s
not a breaking change. So I believe our approach is preferable.
But I would say that, the fact that another patch is proposing the
same capability indicates there’s broader demand for this change.

Attachments:

v6-0001-pg_replication_origin_session_setup-pid-parameter.patchtext/x-patch; charset=US-ASCII; name=v6-0001-pg_replication_origin_session_setup-pid-parameter.patchDownload+33-5
#20Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: Doruk Yilmaz (#19)
RE: [Patch] add new parameter to pg_replication_origin_session_setup

Dear Doruk,

Thanks for updating the patch and sorry for being late.
The new patch looks good to me.

Best regards,
Hayato Kuroda
FUJITSU LIMITED

#21Doruk Yilmaz
doruk@mixrank.com
In reply to: Hayato Kuroda (Fujitsu) (#20)
#22Amit Kapila
amit.kapila16@gmail.com
In reply to: Hayato Kuroda (Fujitsu) (#20)
#23Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: Amit Kapila (#22)
#24Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: Hayato Kuroda (Fujitsu) (#23)
#25Amit Kapila
amit.kapila16@gmail.com
In reply to: Hayato Kuroda (Fujitsu) (#24)
#26Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: Amit Kapila (#25)
#27Zhijie Hou (Fujitsu)
houzj.fnst@fujitsu.com
In reply to: Hayato Kuroda (Fujitsu) (#26)
#28Amit Kapila
amit.kapila16@gmail.com
In reply to: Zhijie Hou (Fujitsu) (#27)
#29shveta malik
shveta.malik@gmail.com
In reply to: Zhijie Hou (Fujitsu) (#27)
#30shveta malik
shveta.malik@gmail.com
In reply to: shveta malik (#29)
#31Amit Kapila
amit.kapila16@gmail.com
In reply to: shveta malik (#30)
#32Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: Amit Kapila (#31)
#33shveta malik
shveta.malik@gmail.com
In reply to: Hayato Kuroda (Fujitsu) (#32)
#34Amit Kapila
amit.kapila16@gmail.com
In reply to: shveta malik (#33)
#35Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: Amit Kapila (#34)
#36shveta malik
shveta.malik@gmail.com
In reply to: Hayato Kuroda (Fujitsu) (#35)
#37Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: shveta malik (#36)
#38shveta malik
shveta.malik@gmail.com
In reply to: Hayato Kuroda (Fujitsu) (#37)
#39Amit Kapila
amit.kapila16@gmail.com
In reply to: Hayato Kuroda (Fujitsu) (#37)
#40Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Amit Kapila (#39)
#41shveta malik
shveta.malik@gmail.com
In reply to: Heikki Linnakangas (#40)
#42Amit Kapila
amit.kapila16@gmail.com
In reply to: Heikki Linnakangas (#40)
#43Amit Kapila
amit.kapila16@gmail.com
In reply to: shveta malik (#41)
#44shveta malik
shveta.malik@gmail.com
In reply to: Amit Kapila (#43)
#45Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#42)
#46Michael Paquier
michael@paquier.xyz
In reply to: Amit Kapila (#45)
#47Amit Kapila
amit.kapila16@gmail.com
In reply to: Michael Paquier (#46)
#48Amit Kapila
amit.kapila16@gmail.com
In reply to: shveta malik (#44)
#49shveta malik
shveta.malik@gmail.com
In reply to: Amit Kapila (#48)
#50Amit Kapila
amit.kapila16@gmail.com
In reply to: shveta malik (#49)