Prevent COPY FREEZE on Foreign tables

Started by Sami Imseih11 months ago14 messages
#1Sami Imseih
samimseih@gmail.com
1 attachment(s)

Hi,

I was looking at COPY FREEZE and I found that it's
possible to run this command on a foreign table,
This really does not make sense as this
optimization cannot be applied to a remote table and it
can give a user a false impression that it was.

"""
postgres=# begin;
BEGIN
postgres=*# create foreign table t1 (id int) server r1;
CREATE FOREIGN TABLE
postgres=*# copy t1 FROM '/tmp/copy_data' freeze;
COPY 999999

-- on the foreign server

postgres=# SELECT count(*), all_visible, all_frozen, pd_all_visible
FROM pg_visibility('t1'::regclass)
group by all_visible, all_frozen, pd_all_visible;

count | all_visible | all_frozen | pd_all_visible
-------+-------------+------------+----------------
5 | f | f | f
(1 row)

"""

The other issue here is that one can only use COPY FREEZE
on a foreign table only if the foreign table is created in the
transaction. A truncate will not work, making the error
message wrong.

"""
postgres=# begin;
BEGIN
postgres=*# truncate table foreign_table_1;
TRUNCATE TABLE
postgres=*# copy foreign_table_1 FROM 'copy_data' freeze;
ERROR: cannot perform COPY FREEZE because the table was not created
or truncated in the current subtransaction
postgres=!#
"""

I think we should just block Foreign tables as we do with
partition tables. Attached patch does that.

I was also looking at why we block a parent from COPY FREEZE[1]https://github.com/postgres/postgres/blob/master/src/backend/commands/copyfrom.c#L727-L735, but
the comments do not convince me this is a good idea. I think there
are good cases to allow this considering there is a common use case in
which a single
COPY command can load a large amount of data, making the overhead to check the
partitions worth the value of the FREEZE optimization. I will probably
start a separate thread for this.

Regards,

Sami Imseih
Amazon Web Services (AWS)

[1]: https://github.com/postgres/postgres/blob/master/src/backend/commands/copyfrom.c#L727-L735

Attachments:

v1-0001-Disallow-Foreign-Tables-with-COPY-FREEZE.patchapplication/octet-stream; name=v1-0001-Disallow-Foreign-Tables-with-COPY-FREEZE.patchDownload
From 3c2fe32558b6e8ab480e2a137cb92ea710fa7fcd Mon Sep 17 00:00:00 2001
From: "Sami Imseih (AWS)"
 <simseih@dev-dsk-simseih-1d-3940b79e.us-east-1.amazon.com>
Date: Mon, 3 Feb 2025 16:19:31 +0000
Subject: [PATCH v1 1/2] Disallow Foreign Tables with COPY FREEZE

Foreign Tables cannot take advantage of the COPY FREEZE
optimization, so it should just be disallowed.
---
 doc/src/sgml/ref/copy.sgml         |  2 +-
 src/backend/commands/copyfrom.c    | 11 +++++++++++
 src/test/regress/expected/copy.out |  8 ++++++++
 src/test/regress/sql/copy.sql      | 11 +++++++++++
 4 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 8394402f09..090ce32876 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -237,7 +237,7 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
       or truncated in the current subtransaction, there are no cursors
       open and there are no older snapshots held by this transaction.  It is
       currently not possible to perform a <command>COPY FREEZE</command> on
-      a partitioned table.
+      a partitioned or foreign table.
       This option is only allowed in <command>COPY FROM</command>.
      </para>
      <para>
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 0cbd05f560..0783f06bf7 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -740,6 +740,17 @@ CopyFrom(CopyFromState cstate)
 					 errmsg("cannot perform COPY FREEZE on a partitioned table")));
 		}
 
+		/*
+		 * Raise an error on foreign tables as it's not possible to apply
+		 * the COPY FREEZE optimization to a remote relation.
+		 */
+		if (cstate->rel->rd_rel->relkind == RELKIND_FOREIGN_TABLE)
+		{
+			ereport(ERROR,
+					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+					 errmsg("cannot perform COPY FREEZE on a foreign table")));
+		}
+
 		/*
 		 * Tolerate one registration for the benefit of FirstXactSnapshot.
 		 * Scan-bearing queries generally create at least two registrations,
diff --git a/src/test/regress/expected/copy.out b/src/test/regress/expected/copy.out
index f554d42c84..9036358ee6 100644
--- a/src/test/regress/expected/copy.out
+++ b/src/test/regress/expected/copy.out
@@ -325,3 +325,11 @@ SELECT tableoid::regclass, id % 2 = 0 is_even, count(*) from parted_si GROUP BY
 (2 rows)
 
 DROP TABLE parted_si;
+-- Ensure COPY FREEZE errors for foreign tables.
+BEGIN;
+CREATE FOREIGN DATA WRAPPER dummy;
+CREATE SERVER dummy_server FOREIGN DATA WRAPPER dummy;
+CREATE FOREIGN TABLE test_foreign_table (a int) SERVER dummy_server;
+copy test_foreign_table from stdin with (header match, freeze on);
+ERROR:  cannot perform COPY FREEZE on a foreign table
+ROLLBACK;
diff --git a/src/test/regress/sql/copy.sql b/src/test/regress/sql/copy.sql
index f1699b66b0..687eaf79f6 100644
--- a/src/test/regress/sql/copy.sql
+++ b/src/test/regress/sql/copy.sql
@@ -348,3 +348,14 @@ COPY parted_si(id, data) FROM :'filename';
 SELECT tableoid::regclass, id % 2 = 0 is_even, count(*) from parted_si GROUP BY 1, 2 ORDER BY 1;
 
 DROP TABLE parted_si;
+
+-- Ensure COPY FREEZE errors for foreign tables.
+BEGIN;
+CREATE FOREIGN DATA WRAPPER dummy;
+CREATE SERVER dummy_server FOREIGN DATA WRAPPER dummy;
+CREATE FOREIGN TABLE test_foreign_table (a int) SERVER dummy_server;
+copy test_foreign_table from stdin with (header match, freeze on);
+a
+1
+\.
+ROLLBACK;
-- 
2.47.1

#2Nathan Bossart
nathandbossart@gmail.com
In reply to: Sami Imseih (#1)
Re: Prevent COPY FREEZE on Foreign tables

On Mon, Feb 03, 2025 at 02:18:21PM -0600, Sami Imseih wrote:

I think we should just block Foreign tables as we do with
partition tables. Attached patch does that.

Unless there's some existing way to specify a FREEZE option in the FDW API
(I assume there isn't), I agree with this.

I was also looking at why we block a parent from COPY FREEZE[1], but
the comments do not convince me this is a good idea. I think there
are good cases to allow this considering there is a common use case in
which a single
COPY command can load a large amount of data, making the overhead to check the
partitions worth the value of the FREEZE optimization. I will probably
start a separate thread for this.

Commit 5c9a551 and its thread [0]/messages/by-id/CAKJS1f9BbJ+FY3TbdCiap3qXHTFOiwtays9s36-oQkkM9_R5bg@mail.gmail.com appear to have some additional details
about this.

[0]: /messages/by-id/CAKJS1f9BbJ+FY3TbdCiap3qXHTFOiwtays9s36-oQkkM9_R5bg@mail.gmail.com

--
nathan

#3Sami Imseih
samimseih@gmail.com
In reply to: Nathan Bossart (#2)
Re: Prevent COPY FREEZE on Foreign tables

Unless there's some existing way to specify a FREEZE option in the FDW API
(I assume there isn't), I agree with this.

I am not aware of any way to do this either as postgres_fdw is simply
using libpq.

I was also looking at why we block a parent from COPY FREEZE[1], but

Commit 5c9a551 and its thread [0] appear to have some additional details
about this.

[0] /messages/by-id/CAKJS1f9BbJ+FY3TbdCiap3qXHTFOiwtays9s36-oQkkM9_R5bg@mail.gmail.com

Thanks for the thread. I will go through it.
I did not too much investigation here yet but I can imagine this might
be a worthwhile
to see the possibility to remove this limitation on a partitioned table.

Regards,

Sami

#4Zhang Mingli
zmlpostgres@gmail.com
In reply to: Sami Imseih (#1)
Re: Prevent COPY FREEZE on Foreign tables

Hi,

Zhang Mingli
www.hashdata.xyz
On Feb 4, 2025 at 04:18 +0800, Sami Imseih <samimseih@gmail.com>, wrote:

This really does not make sense as this
optimization cannot be applied to a remote table and it
can give a user a false impression that it was.

+1,

```
+ /*
+ * Raise an error on foreign tables as it's not possible to apply
+ * the COPY FREEZE optimization to a remote relation.
+ */
+ if (cstate->rel->rd_rel->relkind == RELKIND_FOREIGN_TABLE)
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("cannot perform COPY FREEZE on a foreign table")));
+ }
+
```
Instead of throwing an error, how about we turn that into a warning?
This way, if someone is batch-importing data for multiple tables, it won’t interrupt their script that generates a COPY for each table.
Is it better to give them a heads-up without making them modify their commands right away?
#5Sami Imseih
samimseih@gmail.com
In reply to: Zhang Mingli (#4)
Re: Prevent COPY FREEZE on Foreign tables

Thanks for the feedback!

Instead of throwing an error, how about we turn that into a warning?
This way, if someone is batch-importing data for multiple tables, it won’t interrupt their script that generates a COPY for each table.
Is it better to give them a heads-up without making them modify their commands right away?

Hmm, I prefer we error out the transaction as it can be difficult
to detect a warning and the user will assume that the
transaction completed. Also, we currently error out when
copy freeze is on a parent partition, so I rather we keep
this behavior consistent.

Regards,

Sami

Show quoted text

On Mon, Feb 3, 2025 at 7:31 PM Zhang Mingli <zmlpostgres@gmail.com> wrote:

Hi,

Zhang Mingli
www.hashdata.xyz
On Feb 4, 2025 at 04:18 +0800, Sami Imseih <samimseih@gmail.com>, wrote:

This really does not make sense as this
optimization cannot be applied to a remote table and it
can give a user a false impression that it was.

+1,

```
+ /*
+ * Raise an error on foreign tables as it's not possible to apply
+ * the COPY FREEZE optimization to a remote relation.
+ */
+ if (cstate->rel->rd_rel->relkind == RELKIND_FOREIGN_TABLE)
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("cannot perform COPY FREEZE on a foreign table")));
+ }
+
```
Instead of throwing an error, how about we turn that into a warning?
This way, if someone is batch-importing data for multiple tables, it won’t interrupt their script that generates a COPY for each table.
Is it better to give them a heads-up without making them modify their commands right away?
#6Sami Imseih
samimseih@gmail.com
In reply to: Sami Imseih (#5)
Re: Prevent COPY FREEZE on Foreign tables

my apologies for the top post in the last reply. I hit send too fast :)

I also created a CF entry for this https://commitfest.postgresql.org/52/5544/

I don't think we will need to backpatch, unless someone has a different
opinion about this.

Regards,

Sami

#7Nathan Bossart
nathandbossart@gmail.com
In reply to: Sami Imseih (#5)
Re: Prevent COPY FREEZE on Foreign tables

On Wed, Feb 05, 2025 at 01:05:32PM -0600, Sami Imseih wrote:

Instead of throwing an error, how about we turn that into a warning?
This way, if someone is batch-importing data for multiple tables, it won�t interrupt their script that generates a COPY for each table.
Is it better to give them a heads-up without making them modify their commands right away?

Hmm, I prefer we error out the transaction as it can be difficult
to detect a warning and the user will assume that the
transaction completed. Also, we currently error out when
copy freeze is on a parent partition, so I rather we keep
this behavior consistent.

Yeah, I'd rather error out than expect users to respond to warnings to the
effect of "hey, you told us to do something, but we did something else that
isn't what you asked us to do." That both retains the broken feature and
adds more noise, neither of which seems desirable.

--
nathan

#8Sami Imseih
samimseih@gmail.com
In reply to: Nathan Bossart (#7)
Re: Prevent COPY FREEZE on Foreign tables

Yeah, I'd rather error out than expect users to respond to warnings to the
effect of "hey, you told us to do something, but we did something else that
isn't what you asked us to do." That both retains the broken feature and
adds more noise, neither of which seems desirable.

I agree.

Sami

#9Nathan Bossart
nathandbossart@gmail.com
In reply to: Sami Imseih (#6)
Re: Prevent COPY FREEZE on Foreign tables

On Wed, Feb 05, 2025 at 01:11:44PM -0600, Sami Imseih wrote:

I don't think we will need to backpatch, unless someone has a different
opinion about this.

I think this falls into the category of a bug, so the only reason I can see
for not back-patching it is that it could lead to somewhat widespread
breakage for existing scripts, etc. which are arguably kind-of working
today. That seems unlikely, but it'd be good to hear some other opinions
on the matter.

--
nathan

#10Sami Imseih
samimseih@gmail.com
In reply to: Nathan Bossart (#9)
Re: Prevent COPY FREEZE on Foreign tables

so the only reason I can see
for not back-patching it is that it could lead to somewhat widespread
breakage for existing scripts, etc. which are arguably kind-of working
today.

That is my thought for not backpatching. It's not breaking the COPY
command, it's just not applying an optimization.

Let's see what others think.

Sami

#11Zhang Mingli
zmlpostgres@gmail.com
In reply to: Sami Imseih (#8)
Re: Prevent COPY FREEZE on Foreign tables

On Feb 6, 2025 at 03:27 +0800, Sami Imseih <samimseih@gmail.com>, wrote:

Yeah, I'd rather error out than expect users to respond to warnings to the
effect of "hey, you told us to do something, but we did something else that
isn't what you asked us to do." That both retains the broken feature and
adds more noise, neither of which seems desirable.

I agree.

Sami

Hi,

I understand your perspective, and I'm okay with it.
Since partitioning is currently unsupported in PostgreSQL, returning an error makes sense.
My reason for the warning on foreign tables is: that mainly to inform users that the behavior is unchanged (data copied), they just need to correct their commands.

When COPY FREEZE for partitioned tables becomes supported, we’ll need to ensure that none of the partitions are foreign tables.
If we find any, the FREEZE operation won’t be applicable. However, that's a consideration for the future.

LGTM.

--
Zhang Mingli
HashData

#12Nathan Bossart
nathandbossart@gmail.com
In reply to: Sami Imseih (#10)
Re: Prevent COPY FREEZE on Foreign tables

On Wed, Feb 05, 2025 at 01:36:57PM -0600, Sami Imseih wrote:

so the only reason I can see
for not back-patching it is that it could lead to somewhat widespread
breakage for existing scripts, etc. which are arguably kind-of working
today.

That is my thought for not backpatching. It's not breaking the COPY
command, it's just not applying an optimization.

Let's see what others think.

Given we don't have any other reports or opinions, I'm probably just going
to apply the patch to v18. That leaves the door open to back-patch later.
The next minor release is next week, anyway, and this doesn't seem urgent.

I did find one other thread that mentions this behavior [0]/messages/by-id/20181219023814.GA6577@paquier.xyz. They floated
the idea of blocking COPY FREEZE on foreign tables, but the topic fizzled
out in indecision. *shrug*

[0]: /messages/by-id/20181219023814.GA6577@paquier.xyz

--
nathan

#13Sami Imseih
samimseih@gmail.com
In reply to: Nathan Bossart (#12)
Re: Prevent COPY FREEZE on Foreign tables

Given we don't have any other reports or opinions, I'm probably just going
to apply the patch to v18.

That makes sense. Thanks!

Sami

#14Nathan Bossart
nathandbossart@gmail.com
In reply to: Sami Imseih (#13)
Re: Prevent COPY FREEZE on Foreign tables

On Thu, Feb 06, 2025 at 10:26:09AM -0600, Sami Imseih wrote:

That makes sense. Thanks!

Committed.

--
nathan