Add support for COPY TO in tablesync for partitioned tables.

Started by Ajin Cherian2 months ago7 messages
#1Ajin Cherian
itsajin@gmail.com
1 attachment(s)

Hi all,

After patch [1]https://github.com/postgres/postgres/commit/4bea91f21f61d01bd40a4191a4a8c82d0959fffe was committed, the COPY TO command is now supported
for partitioned tables. This change updates the tablesync logic to use
COPY TO for partitioned tables as well. This change will only be
invoked when the publication is configured with
publish_via_partition_root = true;
otherwise, partitioned tables will continue to be published using
their underlying partitions by default and publishing using partitions
already use COPY TO.

[1]: https://github.com/postgres/postgres/commit/4bea91f21f61d01bd40a4191a4a8c82d0959fffe

regards,
Ajin Cherian
Fujitsu Australia

Attachments:

v1-0001-Support-for-COPY-TO-for-partitioned-tables-in-tab.patchapplication/octet-stream; name=v1-0001-Support-for-COPY-TO-for-partitioned-tables-in-tab.patchDownload
From 370517cad57eb36cd93e1b5ef5135561732b1505 Mon Sep 17 00:00:00 2001
From: Ajin Cherian <itsajin@gmail.com>
Date: Tue, 11 Nov 2025 13:43:16 +1100
Subject: [PATCH v1] Support for COPY TO for partitioned tables in tablesync

This commit modifies tablesync logic to use COPY TO for partitioned tables. Performance tests
show it's faster than the COPY (SELECT ...) TO variant as it avoids the overheads of
query processing and sending results to the COPY TO command.
---
 src/backend/replication/logical/tablesync.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index dcc6124cc73..fa8e3bf969a 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -1068,8 +1068,9 @@ copy_table(Relation rel)
 	/* Start copy on the publisher. */
 	initStringInfo(&cmd);
 
-	/* Regular table with no row filter or generated columns */
-	if (lrel.relkind == RELKIND_RELATION && qual == NIL && !gencol_published)
+	/* Regular or partitioned table with no row filter or generated columns */
+	if ((lrel.relkind == RELKIND_RELATION || lrel.relkind == RELKIND_PARTITIONED_TABLE)
+		&& qual == NIL && !gencol_published)
 	{
 		appendStringInfo(&cmd, "COPY %s",
 						 quote_qualified_identifier(lrel.nspname, lrel.relname));
-- 
2.47.3

#2Amit Kapila
amit.kapila16@gmail.com
In reply to: Ajin Cherian (#1)
Re: Add support for COPY TO in tablesync for partitioned tables.

On Tue, Nov 11, 2025 at 8:39 AM Ajin Cherian <itsajin@gmail.com> wrote:

After patch [1] was committed, the COPY TO command is now supported
for partitioned tables. This change updates the tablesync logic to use
COPY TO for partitioned tables as well.

In the commit message, you mentioned: "Performance tests show it's
faster than the COPY (SELECT ...) TO variant as it avoids the
overheads of query processing and sending results to the COPY TO
command.". Can you share the performance data to substantiate this
point?

--
With Regards,
Amit Kapila.

#3Ajin Cherian
itsajin@gmail.com
In reply to: Amit Kapila (#2)
Re: Add support for COPY TO in tablesync for partitioned tables.

On Tue, Nov 11, 2025 at 2:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

In the commit message, you mentioned: "Performance tests show it's
faster than the COPY (SELECT ...) TO variant as it avoids the
overheads of query processing and sending results to the COPY TO
command.". Can you share the performance data to substantiate this
point?

This was based on the tests done in the original thread [1]/messages/by-id/174219852967.294107.6195385625494034792.pgcf@coridan.postgresql.org and [2]/messages/by-id/CALdSSPi5GUx1XtVTEOmvZ73MDM9HrpzE7L_Dp55z30wfp7KMvw@mail.gmail.com

[1]: /messages/by-id/174219852967.294107.6195385625494034792.pgcf@coridan.postgresql.org
[2]: /messages/by-id/CALdSSPi5GUx1XtVTEOmvZ73MDM9HrpzE7L_Dp55z30wfp7KMvw@mail.gmail.com

regards,
Ajin Cherian
Fujitsu Australia

#4Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Ajin Cherian (#3)
Re: Add support for COPY TO in tablesync for partitioned tables.

On Mon, Nov 10, 2025 at 7:37 PM Ajin Cherian <itsajin@gmail.com> wrote:

On Tue, Nov 11, 2025 at 2:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

In the commit message, you mentioned: "Performance tests show it's
faster than the COPY (SELECT ...) TO variant as it avoids the
overheads of query processing and sending results to the COPY TO
command.". Can you share the performance data to substantiate this
point?

This was based on the tests done in the original thread [1] and [2]

Thank you for working on this item. I think it's a good follow-up
patch for commit 4bea91f.

Have you conducted any performance tests with logical replication
setup? I've measured normal COPY TO cases but I think it would be
worth checking how much the performance increase we can see in logical
replication setup too.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#5Ajin Cherian
itsajin@gmail.com
In reply to: Masahiko Sawada (#4)
2 attachment(s)
Re: Add support for COPY TO in tablesync for partitioned tables.

On Thu, Nov 13, 2025 at 8:49 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Mon, Nov 10, 2025 at 7:37 PM Ajin Cherian <itsajin@gmail.com> wrote:

On Tue, Nov 11, 2025 at 2:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

In the commit message, you mentioned: "Performance tests show it's
faster than the COPY (SELECT ...) TO variant as it avoids the
overheads of query processing and sending results to the COPY TO
command.". Can you share the performance data to substantiate this
point?

This was based on the tests done in the original thread [1] and [2]

Thank you for working on this item. I think it's a good follow-up
patch for commit 4bea91f.

Have you conducted any performance tests with logical replication
setup? I've measured normal COPY TO cases but I think it would be
worth checking how much the performance increase we can see in logical
replication setup too.

Thanks for your interest in this patch.
I've tested the same setup as mentioned in [1]/messages/by-id/174219852967.294107.6195385625494034792.pgcf@coridan.postgresql.org but with 10 tables and
500 records each and measuring the total time it would take for all
the tablesync workers to finish sync (from log timings).
On the average:
Without patch
Tablesync time: 185.4 ms
Average COPY command times: 1.4168 ms

With patch
Tablesync time: 172.2 ms (7% improvement)
Average COPY command times: 0.633 ms

The improvement in performance is smaller as the table size increases.
There is better improvement for smaller tables.
Attaching my test scripts as well.

regards,
Ajin Cherian
Fujitsu Australia

[1]: /messages/by-id/174219852967.294107.6195385625494034792.pgcf@coridan.postgresql.org

Attachments:

log_analyzer_functions.shapplication/octet-stream; name=log_analyzer_functions.shDownload
test-tablesync-perf-v4.shapplication/octet-stream; name=test-tablesync-perf-v4.shDownload
#6Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Ajin Cherian (#5)
Re: Add support for COPY TO in tablesync for partitioned tables.

On Thu, Nov 13, 2025 at 7:17 PM Ajin Cherian <itsajin@gmail.com> wrote:

On Thu, Nov 13, 2025 at 8:49 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Mon, Nov 10, 2025 at 7:37 PM Ajin Cherian <itsajin@gmail.com> wrote:

On Tue, Nov 11, 2025 at 2:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

In the commit message, you mentioned: "Performance tests show it's
faster than the COPY (SELECT ...) TO variant as it avoids the
overheads of query processing and sending results to the COPY TO
command.". Can you share the performance data to substantiate this
point?

This was based on the tests done in the original thread [1] and [2]

Thank you for working on this item. I think it's a good follow-up
patch for commit 4bea91f.

Have you conducted any performance tests with logical replication
setup? I've measured normal COPY TO cases but I think it would be
worth checking how much the performance increase we can see in logical
replication setup too.

Thanks for your interest in this patch.
I've tested the same setup as mentioned in [1] but with 10 tables and
500 records each and measuring the total time it would take for all
the tablesync workers to finish sync (from log timings).
On the average:
Without patch
Tablesync time: 185.4 ms
Average COPY command times: 1.4168 ms

With patch
Tablesync time: 172.2 ms (7% improvement)
Average COPY command times: 0.633 ms

The improvement in performance is smaller as the table size increases.
There is better improvement for smaller tables.
Attaching my test scripts as well.

Thank you for the test! I've also done some performance tests and got
similar results.

The patch is pretty simple and looks good to me. I'll push the patch,
barring objections.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#7Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Masahiko Sawada (#6)
Re: Add support for COPY TO in tablesync for partitioned tables.

On Wed, Nov 19, 2025 at 2:56 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Thu, Nov 13, 2025 at 7:17 PM Ajin Cherian <itsajin@gmail.com> wrote:

On Thu, Nov 13, 2025 at 8:49 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Mon, Nov 10, 2025 at 7:37 PM Ajin Cherian <itsajin@gmail.com> wrote:

On Tue, Nov 11, 2025 at 2:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

In the commit message, you mentioned: "Performance tests show it's
faster than the COPY (SELECT ...) TO variant as it avoids the
overheads of query processing and sending results to the COPY TO
command.". Can you share the performance data to substantiate this
point?

This was based on the tests done in the original thread [1] and [2]

Thank you for working on this item. I think it's a good follow-up
patch for commit 4bea91f.

Have you conducted any performance tests with logical replication
setup? I've measured normal COPY TO cases but I think it would be
worth checking how much the performance increase we can see in logical
replication setup too.

Thanks for your interest in this patch.
I've tested the same setup as mentioned in [1] but with 10 tables and
500 records each and measuring the total time it would take for all
the tablesync workers to finish sync (from log timings).
On the average:
Without patch
Tablesync time: 185.4 ms
Average COPY command times: 1.4168 ms

With patch
Tablesync time: 172.2 ms (7% improvement)
Average COPY command times: 0.633 ms

The improvement in performance is smaller as the table size increases.
There is better improvement for smaller tables.
Attaching my test scripts as well.

Thank you for the test! I've also done some performance tests and got
similar results.

The patch is pretty simple and looks good to me. I'll push the patch,
barring objections.

Pushed.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com