Regression in COPY FROM caused by 9f8377f7a2

Started by Laurenz Albe over 2 years ago, 9 messages
#1 Laurenz Albe
laurenz.albe@cybertec.at

In v16 and later, the following fails:

CREATE TABLE boom (t character varying(5) DEFAULT 'a long string');

COPY boom FROM STDIN;
ERROR: value too long for type character varying(5)

In PostgreSQL v15 and earlier, the COPY statement succeeds.

The error is thrown in BeginCopyFrom, at line 1578 (HEAD):

defexpr = expression_planner(defexpr);

Bisecting shows that the regression was introduced by commit 9f8377f7a2,
which introduced DEFAULT values for COPY FROM.

The table definition is clearly silly, so I am not sure if that
regression is worth fixing. On the other hand, it is not cool if
something that worked without an error in v15 starts to fail later on.

Yours,
Laurenz Albe

#2 Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Laurenz Albe (#1)
1 attachment(s)
Re: Regression in COPY FROM caused by 9f8377f7a2

On Mon, 2023-09-25 at 09:54 +0200, Laurenz Albe wrote:

In v16 and later, the following fails:

CREATE TABLE boom (t character varying(5) DEFAULT 'a long string');

COPY boom FROM STDIN;
ERROR:  value too long for type character varying(5)

In PostgreSQL v15 and earlier, the COPY statement succeeds.

The error is thrown in BeginCopyFrom in line 1578 (HEAD)

  defexpr = expression_planner(defexpr);

Bisecting shows that the regression was introduced by commit 9f8377f7a2,
which introduced DEFAULT values for COPY FROM.

I suggest the attached fix, which evaluates default values only if
the DEFAULT option was specified or if the column does not appear in
the column list of COPY.
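
To make the condition concrete, here is a minimal standalone sketch of the decision described above. It is not the patch itself: the function and parameter names are made up for illustration and do not exist in PostgreSQL. Only the boolean logic follows the description.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Illustrative only: these functions and parameter names are hypothetical
 * and are not PostgreSQL identifiers.
 */

/* v16 before the fix: every non-generated column has its default evaluated. */
bool
needs_default_before_fix(bool is_generated)
{
	return !is_generated;
}

/*
 * Proposed behavior: evaluate the default only if the DEFAULT option was
 * given or the column is missing from the COPY column list.  Generated
 * columns never need a default.
 */
bool
needs_default_after_fix(bool in_column_list, bool default_option_given,
						bool is_generated)
{
	return (default_option_given || !in_column_list) && !is_generated;
}

/* Walk through the cases discussed in this thread. */
void
check_examples(void)
{
	/* COPY boom FROM STDIN: column is in the column list, no DEFAULT option */
	assert(needs_default_before_fix(false));
	assert(!needs_default_after_fix(true, false, false));

	/* column left out of the column list: its default is still needed */
	assert(needs_default_after_fix(false, false, false));

	/* DEFAULT option given: any column may need its default */
	assert(needs_default_after_fix(true, true, false));

	/* generated columns are skipped in every case */
	assert(!needs_default_after_fix(false, true, true));
}
```

With the first case no longer evaluating the default, the overlong default of "boom" would not be touched by a plain COPY, which is the v15 behavior.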

Yours,
Laurenz Albe

Attachments:

0001-Evaluate-defaults-in-COPY-FROM-only-if-necessary.patch (text/x-patch; charset=UTF-8)
From 4af982c56df57a1a0f04340d394c297559fbabb5 Mon Sep 17 00:00:00 2001
From: Laurenz Albe <laurenz.albe@cybertec.at>
Date: Mon, 25 Sep 2023 10:56:15 +0200
Subject: [PATCH] Evaluate defaults in COPY FROM only if necessary

Since commit 9f8377f7a2, we evaluate the column default values in
COPY FROM for all columns except generated ones, because they could
be needed if the input value matches the DEFAULT option.

This can cause a surprising regression:

  CREATE TABLE boom (t character varying(5) DEFAULT 'a long string');
  COPY boom FROM STDIN;
  ERROR:  value too long for type character varying(5)

This worked before 9f8377f7a2, since default values were only
evaluated for columns that were not specified in the column list.

To fix, fetch the default values only if the DEFAULT option was
specified or for columns not specified in the column list.
---
 src/backend/commands/copyfrom.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 70871ed819..320b764aa9 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -1567,7 +1567,14 @@ BeginCopyFrom(ParseState *pstate,
 		/* Get default info if available */
 		defexprs[attnum - 1] = NULL;
 
-		if (!att->attgenerated)
+		/*
+		 * We need the default values only for columns that do not appear in the
+		 * column list or if the DEFAULT option was given.  We also don't need
+		 * it for generated columns.
+		 */
+		if ((!list_member_int(cstate->attnumlist, attnum) ||
+			 cstate->opts.default_print != NULL) &&
+			!att->attgenerated)
 		{
 			Expr	   *defexpr = (Expr *) build_column_default(cstate->rel,
 																attnum);
-- 
2.41.0

#3 Andrew Dunstan
andrew@dunslane.net
In reply to: Laurenz Albe (#2)
Re: Regression in COPY FROM caused by 9f8377f7a2

On 2023-09-25 Mo 04:59, Laurenz Albe wrote:

On Mon, 2023-09-25 at 09:54 +0200, Laurenz Albe wrote:

In v16 and later, the following fails:

CREATE TABLE boom (t character varying(5) DEFAULT 'a long string');

COPY boom FROM STDIN;
ERROR:  value too long for type character varying(5)

In PostgreSQL v15 and earlier, the COPY statement succeeds.

The error is thrown in BeginCopyFrom in line 1578 (HEAD)

  defexpr = expression_planner(defexpr);

Bisecting shows that the regression was introduced by commit 9f8377f7a2,
which introduced DEFAULT values for COPY FROM.

Oops :-(

I suggest the attached fix, which evaluates default values only if
the DEFAULT option was specified or if the column does not appear in
the column list of COPY.

Patch looks reasonable, haven't tested yet.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#4 Andrew Dunstan
andrew@dunslane.net
In reply to: Andrew Dunstan (#3)
Re: Regression in COPY FROM caused by 9f8377f7a2

On 2023-09-25 Mo 11:06, Andrew Dunstan wrote:

On 2023-09-25 Mo 04:59, Laurenz Albe wrote:

On Mon, 2023-09-25 at 09:54 +0200, Laurenz Albe wrote:

In v16 and later, the following fails:

CREATE TABLE boom (t character varying(5) DEFAULT 'a long string');

COPY boom FROM STDIN;
ERROR:  value too long for type character varying(5)

In PostgreSQL v15 and earlier, the COPY statement succeeds.

The error is thrown in BeginCopyFrom in line 1578 (HEAD)

  defexpr = expression_planner(defexpr);

Bisecting shows that the regression was introduced by commit 9f8377f7a2,
which introduced DEFAULT values for COPY FROM.

Thinking about this a little more, wouldn't it be better if we checked
at the time we set the default that the value is actually valid for the
given column? This is only one manifestation of a problem you could run
into given this table definition.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#5 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#4)
Re: Regression in COPY FROM caused by 9f8377f7a2

Andrew Dunstan <andrew@dunslane.net> writes:

On 2023-09-25 Mo 11:06, Andrew Dunstan wrote:

On 2023-09-25 Mo 04:59, Laurenz Albe wrote:

CREATE TABLE boom (t character varying(5) DEFAULT 'a long string');

Thinking about this a little more, wouldn't it be better if we checked
at the time we set the default that the value is actually valid for the
given column? This is only one manifestation of a problem you could run
into given this table definition.

I dunno, it seems at least possible that someone would do this
deliberately as a means of preventing the column from being defaulted.
In any case, the current behavior has stood for a very long time and
no one has complained that an error should be thrown sooner.

regards, tom lane

#6 Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Tom Lane (#5)
Re: Regression in COPY FROM caused by 9f8377f7a2

On Mon, 2023-09-25 at 17:49 -0400, Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

On 2023-09-25 Mo 11:06, Andrew Dunstan wrote:

On 2023-09-25 Mo 04:59, Laurenz Albe wrote:

CREATE TABLE boom (t character varying(5) DEFAULT 'a long string');

Thinking about this a little more, wouldn't it be better if we checked
at the time we set the default that the value is actually valid for the
given column? This is only one manifestation of a problem you could run
into given this table definition.

I dunno, it seems at least possible that someone would do this
deliberately as a means of preventing the column from being defaulted.
In any case, the current behavior has stood for a very long time and
no one has complained that an error should be thrown sooner.

Moreover, this makes restoring a pg_dump from v15 to v16 fail, which
should never happen. This is how I got that bug report.

Yours,
Laurenz Albe

#7 Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Laurenz Albe (#6)
1 attachment(s)
Re: Regression in COPY FROM caused by 9f8377f7a2

Here is an improved version of the patch with regression tests.

Yours,
Laurenz Albe

Attachments:

0001-Evaluate-defaults-in-COPY-FROM-only-if-necessary.V2.patch (text/x-patch; charset=UTF-8)
From 71744ada1e2c8cfdbb57e03018572a1af623b09e Mon Sep 17 00:00:00 2001
From: Laurenz Albe <laurenz.albe@cybertec.at>
Date: Tue, 26 Sep 2023 10:09:49 +0200
Subject: [PATCH] Evaluate defaults in COPY FROM only if necessary

Since commit 9f8377f7a2, we evaluate the column default values in
COPY FROM for all columns except generated ones, because they could
be needed if the input value matches the DEFAULT option.

This can cause a surprising regression:

  CREATE TABLE boom (t character varying(5) DEFAULT 'a long string');
  COPY boom FROM STDIN;
  ERROR:  value too long for type character varying(5)

This worked before 9f8377f7a2, since default values were only
evaluated for columns that were not specified in the column list.

To fix, fetch the default values only if the DEFAULT option was
specified or for columns not specified in the column list.
---
 src/backend/commands/copyfrom.c    |  9 ++++++++-
 src/test/regress/expected/copy.out | 17 +++++++++++++++++
 src/test/regress/sql/copy.sql      | 15 +++++++++++++++
 3 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 70871ed819..3f3e631dee 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -1567,7 +1567,14 @@ BeginCopyFrom(ParseState *pstate,
 		/* Get default info if available */
 		defexprs[attnum - 1] = NULL;
 
-		if (!att->attgenerated)
+		/*
+		 * We need the default values only for columns that do not appear in the
+		 * column list.  But if the DEFAULT option was given, we may need all
+		 * column default values.  We never need defaults for generated columns.
+		 */
+		if ((cstate->opts.default_print != NULL ||
+			 !list_member_int(cstate->attnumlist, attnum)) &&
+			!att->attgenerated)
 		{
 			Expr	   *defexpr = (Expr *) build_column_default(cstate->rel,
 																attnum);
diff --git a/src/test/regress/expected/copy.out b/src/test/regress/expected/copy.out
index 8a8bf43fde..a5912c13a8 100644
--- a/src/test/regress/expected/copy.out
+++ b/src/test/regress/expected/copy.out
@@ -240,3 +240,20 @@ SELECT * FROM header_copytest ORDER BY a;
 (5 rows)
 
 drop table header_copytest;
+-- test COPY with overlong column defaults
+create temp table oversized_column_default (
+    col1 varchar(5) DEFAULT 'more than 5 chars',
+    col2 varchar(5));
+-- normal COPY should work
+copy oversized_column_default from stdin;
+-- error if the column is excluded
+copy oversized_column_default (col2) from stdin;
+ERROR:  value too long for type character varying(5)
+\.
+invalid command \.
+-- error if the DEFAULT option is given
+copy oversized_column_default from stdin (default '');
+ERROR:  value too long for type character varying(5)
+\.
+invalid command \.
+drop table oversized_column_default;
diff --git a/src/test/regress/sql/copy.sql b/src/test/regress/sql/copy.sql
index f9da7b1508..7fdb26d14f 100644
--- a/src/test/regress/sql/copy.sql
+++ b/src/test/regress/sql/copy.sql
@@ -268,3 +268,18 @@ a	c	b
 
 SELECT * FROM header_copytest ORDER BY a;
 drop table header_copytest;
+
+-- test COPY with overlong column defaults
+create temp table oversized_column_default (
+    col1 varchar(5) DEFAULT 'more than 5 chars',
+    col2 varchar(5));
+-- normal COPY should work
+copy oversized_column_default from stdin;
+\.
+-- error if the column is excluded
+copy oversized_column_default (col2) from stdin;
+\.
+-- error if the DEFAULT option is given
+copy oversized_column_default from stdin (default '');
+\.
+drop table oversized_column_default;
-- 
2.41.0

#8 Andrew Dunstan
andrew@dunslane.net
In reply to: Laurenz Albe (#7)
Re: Regression in COPY FROM caused by 9f8377f7a2

On 2023-09-26 Tu 04:11, Laurenz Albe wrote:

Here is an improved version of the patch with regression tests.

Thanks, pushed.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#9 Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Andrew Dunstan (#8)
Re: Regression in COPY FROM caused by 9f8377f7a2

On Sun, 2023-10-01 at 10:55 -0400, Andrew Dunstan wrote:

Thanks, pushed.

Thanks for taking care of that.

Yours,
Laurenz Albe