Prevent COPY FREEZE on Foreign tables

Started by Sami Imseihabout 1 year ago14 messages
Jump to latest
#1Sami Imseih
samimseih@gmail.com

Hi,

I was looking at COPY FREEZE and I found that it's
possible to run this command on a foreign table,
This really does not make sense as this
optimization cannot be applied to a remote table and it
can give a user a false impression that it was.

"""
postgres=# begin;
BEGIN
postgres=*# create foreign table t1 (id int) server r1;
CREATE FOREIGN TABLE
postgres=*# copy t1 FROM '/tmp/copy_data' freeze;
COPY 999999

-- on the foreign server

postgres=# SELECT count(*), all_visible, all_frozen, pd_all_visible
FROM pg_visibility('t1'::regclass)
group by all_visible, all_frozen, pd_all_visible;

count | all_visible | all_frozen | pd_all_visible
-------+-------------+------------+----------------
5 | f | f | f
(1 row)

"""

The other issue here is that one can only use COPY FREEZE
on a foreign table only if the foreign table is created in the
transaction. A truncate will not work, making the error
message wrong.

"""
postgres=# begin;
BEGIN
postgres=*# truncate table foreign_table_1;
TRUNCATE TABLE
postgres=*# copy foreign_table_1 FROM 'copy_data' freeze;
ERROR: cannot perform COPY FREEZE because the table was not created
or truncated in the current subtransaction
postgres=!#
"""

I think we should just block Foreign tables as we do with
partition tables. Attached patch does that.

I was also looking at why we block a parent from COPY FREEZE[1]https://github.com/postgres/postgres/blob/master/src/backend/commands/copyfrom.c#L727-L735, but
the comments do not convince me this is a good idea. I think there
are good cases to allow this considering there is a common use case in
which a single
COPY command can load a large amount of data, making the overhead to check the
partitions worth the value of the FREEZE optimization. I will probably
start a separate thread for this.

Regards,

Sami Imseih
Amazon Web Services (AWS)

[1]: https://github.com/postgres/postgres/blob/master/src/backend/commands/copyfrom.c#L727-L735

Attachments:

v1-0001-Disallow-Foreign-Tables-with-COPY-FREEZE.patchapplication/octet-stream; name=v1-0001-Disallow-Foreign-Tables-with-COPY-FREEZE.patchDownload+31-2
#2Nathan Bossart
nathandbossart@gmail.com
In reply to: Sami Imseih (#1)
Re: Prevent COPY FREEZE on Foreign tables

On Mon, Feb 03, 2025 at 02:18:21PM -0600, Sami Imseih wrote:

I think we should just block Foreign tables as we do with
partition tables. Attached patch does that.

Unless there's some existing way to specify a FREEZE option in the FDW API
(I assume there isn't), I agree with this.

I was also looking at why we block a parent from COPY FREEZE[1], but
the comments do not convince me this is a good idea. I think there
are good cases to allow this considering there is a common use case in
which a single
COPY command can load a large amount of data, making the overhead to check the
partitions worth the value of the FREEZE optimization. I will probably
start a separate thread for this.

Commit 5c9a551 and its thread [0]/messages/by-id/CAKJS1f9BbJ+FY3TbdCiap3qXHTFOiwtays9s36-oQkkM9_R5bg@mail.gmail.com appear to have some additional details
about this.

[0]: /messages/by-id/CAKJS1f9BbJ+FY3TbdCiap3qXHTFOiwtays9s36-oQkkM9_R5bg@mail.gmail.com

--
nathan

#3Sami Imseih
samimseih@gmail.com
In reply to: Nathan Bossart (#2)
Re: Prevent COPY FREEZE on Foreign tables

Unless there's some existing way to specify a FREEZE option in the FDW API
(I assume there isn't), I agree with this.

I am not aware of any way to do this either as postgres_fdw is simply
using libpq.

I was also looking at why we block a parent from COPY FREEZE[1], but

Commit 5c9a551 and its thread [0] appear to have some additional details
about this.

[0] /messages/by-id/CAKJS1f9BbJ+FY3TbdCiap3qXHTFOiwtays9s36-oQkkM9_R5bg@mail.gmail.com

Thanks for the thread. I will go through it.
I did not too much investigation here yet but I can imagine this might
be a worthwhile
to see the possibility to remove this limitation on a partitioned table.

Regards,

Sami

#4Zhang Mingli
zmlpostgres@gmail.com
In reply to: Sami Imseih (#1)
Re: Prevent COPY FREEZE on Foreign tables

Hi,

Zhang Mingli
www.hashdata.xyz
On Feb 4, 2025 at 04:18 +0800, Sami Imseih <samimseih@gmail.com>, wrote:

This really does not make sense as this
optimization cannot be applied to a remote table and it
can give a user a false impression that it was.

+1,

```
+ /*
+ * Raise an error on foreign tables as it's not possible to apply
+ * the COPY FREEZE optimization to a remote relation.
+ */
+ if (cstate->rel->rd_rel->relkind == RELKIND_FOREIGN_TABLE)
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("cannot perform COPY FREEZE on a foreign table")));
+ }
+
```
Instead of throwing an error, how about we turn that into a warning?
This way, if someone is batch-importing data for multiple tables, it won’t interrupt their script that generates a COPY for each table.
Is it better to give them a heads-up without making them modify their commands right away?
#5Sami Imseih
samimseih@gmail.com
In reply to: Zhang Mingli (#4)
Re: Prevent COPY FREEZE on Foreign tables

Thanks for the feedback!

Instead of throwing an error, how about we turn that into a warning?
This way, if someone is batch-importing data for multiple tables, it won’t interrupt their script that generates a COPY for each table.
Is it better to give them a heads-up without making them modify their commands right away?

Hmm, I prefer we error out the transaction as it can be difficult
to detect a warning and the user will assume that the
transaction completed. Also, we currently error out when
copy freeze is on a parent partition, so I rather we keep
this behavior consistent.

Regards,

Sami

Show quoted text

On Mon, Feb 3, 2025 at 7:31 PM Zhang Mingli <zmlpostgres@gmail.com> wrote:

Hi,

Zhang Mingli
www.hashdata.xyz
On Feb 4, 2025 at 04:18 +0800, Sami Imseih <samimseih@gmail.com>, wrote:

This really does not make sense as this
optimization cannot be applied to a remote table and it
can give a user a false impression that it was.

+1,

```
+ /*
+ * Raise an error on foreign tables as it's not possible to apply
+ * the COPY FREEZE optimization to a remote relation.
+ */
+ if (cstate->rel->rd_rel->relkind == RELKIND_FOREIGN_TABLE)
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("cannot perform COPY FREEZE on a foreign table")));
+ }
+
```
Instead of throwing an error, how about we turn that into a warning?
This way, if someone is batch-importing data for multiple tables, it won’t interrupt their script that generates a COPY for each table.
Is it better to give them a heads-up without making them modify their commands right away?
#6Sami Imseih
samimseih@gmail.com
In reply to: Sami Imseih (#5)
Re: Prevent COPY FREEZE on Foreign tables

my apologies for the top post in the last reply. I hit send too fast :)

I also created a CF entry for this https://commitfest.postgresql.org/52/5544/

I don't think we will need to backpatch, unless someone has a different
opinion about this.

Regards,

Sami

#7Nathan Bossart
nathandbossart@gmail.com
In reply to: Sami Imseih (#5)
Re: Prevent COPY FREEZE on Foreign tables

On Wed, Feb 05, 2025 at 01:05:32PM -0600, Sami Imseih wrote:

Instead of throwing an error, how about we turn that into a warning?
This way, if someone is batch-importing data for multiple tables, it won�t interrupt their script that generates a COPY for each table.
Is it better to give them a heads-up without making them modify their commands right away?

Hmm, I prefer we error out the transaction as it can be difficult
to detect a warning and the user will assume that the
transaction completed. Also, we currently error out when
copy freeze is on a parent partition, so I rather we keep
this behavior consistent.

Yeah, I'd rather error out than expect users to respond to warnings to the
effect of "hey, you told us to do something, but we did something else that
isn't what you asked us to do." That both retains the broken feature and
adds more noise, neither of which seems desirable.

--
nathan

#8Sami Imseih
samimseih@gmail.com
In reply to: Nathan Bossart (#7)
Re: Prevent COPY FREEZE on Foreign tables

Yeah, I'd rather error out than expect users to respond to warnings to the
effect of "hey, you told us to do something, but we did something else that
isn't what you asked us to do." That both retains the broken feature and
adds more noise, neither of which seems desirable.

I agree.

Sami

#9Nathan Bossart
nathandbossart@gmail.com
In reply to: Sami Imseih (#6)
Re: Prevent COPY FREEZE on Foreign tables

On Wed, Feb 05, 2025 at 01:11:44PM -0600, Sami Imseih wrote:

I don't think we will need to backpatch, unless someone has a different
opinion about this.

I think this falls into the category of a bug, so the only reason I can see
for not back-patching it is that it could lead to somewhat widespread
breakage for existing scripts, etc. which are arguably kind-of working
today. That seems unlikely, but it'd be good to hear some other opinions
on the matter.

--
nathan

#10Sami Imseih
samimseih@gmail.com
In reply to: Nathan Bossart (#9)
Re: Prevent COPY FREEZE on Foreign tables

so the only reason I can see
for not back-patching it is that it could lead to somewhat widespread
breakage for existing scripts, etc. which are arguably kind-of working
today.

That is my thought for not backpatching. It's not breaking the COPY
command, it's just not applying an optimization.

Let's see what others think.

Sami

#11Zhang Mingli
zmlpostgres@gmail.com
In reply to: Sami Imseih (#8)
Re: Prevent COPY FREEZE on Foreign tables

On Feb 6, 2025 at 03:27 +0800, Sami Imseih <samimseih@gmail.com>, wrote:

Yeah, I'd rather error out than expect users to respond to warnings to the
effect of "hey, you told us to do something, but we did something else that
isn't what you asked us to do." That both retains the broken feature and
adds more noise, neither of which seems desirable.

I agree.

Sami

Hi,

I understand your perspective, and I'm okay with it.
Since partitioning is currently unsupported in PostgreSQL, returning an error makes sense.
My reason for the warning on foreign tables is: that mainly to inform users that the behavior is unchanged (data copied), they just need to correct their commands.

When COPY FREEZE for partitioned tables becomes supported, we’ll need to ensure that none of the partitions are foreign tables.
If we find any, the FREEZE operation won’t be applicable. However, that's a consideration for the future.

LGTM.

--
Zhang Mingli
HashData

#12Nathan Bossart
nathandbossart@gmail.com
In reply to: Sami Imseih (#10)
Re: Prevent COPY FREEZE on Foreign tables

On Wed, Feb 05, 2025 at 01:36:57PM -0600, Sami Imseih wrote:

so the only reason I can see
for not back-patching it is that it could lead to somewhat widespread
breakage for existing scripts, etc. which are arguably kind-of working
today.

That is my thought for not backpatching. It's not breaking the COPY
command, it's just not applying an optimization.

Let's see what others think.

Given we don't have any other reports or opinions, I'm probably just going
to apply the patch to v18. That leaves the door open to back-patch later.
The next minor release is next week, anyway, and this doesn't seem urgent.

I did find one other thread that mentions this behavior [0]/messages/by-id/20181219023814.GA6577@paquier.xyz. They floated
the idea of blocking COPY FREEZE on foreign tables, but the topic fizzled
out in indecision. *shrug*

[0]: /messages/by-id/20181219023814.GA6577@paquier.xyz

--
nathan

#13Sami Imseih
samimseih@gmail.com
In reply to: Nathan Bossart (#12)
Re: Prevent COPY FREEZE on Foreign tables

Given we don't have any other reports or opinions, I'm probably just going
to apply the patch to v18.

That makes sense. Thanks!

Sami

#14Nathan Bossart
nathandbossart@gmail.com
In reply to: Sami Imseih (#13)
Re: Prevent COPY FREEZE on Foreign tables

On Thu, Feb 06, 2025 at 10:26:09AM -0600, Sami Imseih wrote:

That makes sense. Thanks!

Committed.

--
nathan