Add new error_action COPY ON_ERROR "log"

Started by torikoshia almost 2 years ago, 58 messages
#1 torikoshia
torikoshia@oss.nttdata.com
1 attachment(s)

Hi,

As described in 9e2d870119, COPY ON_ERROR is expected to gain more
"error_action" values.
(Note that the option name was changed by b725b7eec.)

I'd like to add a new option "log", which skips soft errors and writes the
information that would otherwise have been raised as errors to the PostgreSQL log.

I think this option has the following advantages:

1) We can tell which lines of the input data were not loaded, and
why.

Example:

=# copy t1 from stdin with (on_error log);
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF
signal.

1
2
3
z
\.

LOG: invalid input syntax for type integer: "z"
NOTICE: 1 row was skipped due to data type incompatibility
COPY 3

=# \! tail data/log/postgresql*.log
LOG: 22P02: invalid input syntax for type integer: "z"
CONTEXT: COPY t1, line 4, column i: "z"
LOCATION: pg_strtoint32_safe, numutils.c:620
STATEMENT: copy t1 from stdin with (on_error log);

2) Easier maintenance than storing error information in tables or
proprietary log files.
For example, when a large number of soft errors occur, some
mechanism is needed to keep the destination data from growing without
bound, but with this option we can leave that to PostgreSQL's log rotation.

A patch is attached.
It is basically derived from a previous discussion[1], which covered both
"ignore" and "log" handling of soft errors.

As shown in the example above, the log output sent to the client does not
contain CONTEXT, so I'm a little concerned that the client cannot see which
line of the input data had a problem without looking at the server log.

What do you think?

[1]: /messages/by-id/c0fb57b82b150953f26a5c7e340412e8@oss.nttdata.com

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation

Attachments:

v1-0001-Add-new-error_action-log-to-ON_ERROR.patch (text/x-diff)
From 04e643facfea4b4e8dd174d22fbe5e008747a91a Mon Sep 17 00:00:00 2001
From: Atsushi Torikoshi <torikoshia@oss.nttdata.com>
Date: Fri, 26 Jan 2024 01:17:59 +0900
Subject: [PATCH v1] Add new error_action "log" to ON_ERROR option

Currently ON_ERROR option only has "ignore" to skip malformed data and
there are no ways to know where and why COPY skipped them.

"log" skips malformed data as well as "ignore", but it logs information that
should have resulted in errors to PostgreSQL log.


---
 doc/src/sgml/ref/copy.sgml          |  8 ++++++--
 src/backend/commands/copy.c         |  4 +++-
 src/backend/commands/copyfrom.c     | 24 ++++++++++++++++++++----
 src/include/commands/copy.h         |  1 +
 src/test/regress/expected/copy2.out | 14 +++++++++-----
 src/test/regress/sql/copy2.sql      |  9 +++++++++
 6 files changed, 48 insertions(+), 12 deletions(-)

diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 21a5c4a052..9662c90a8b 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -380,12 +380,16 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
      <para>
       Specifies which <replaceable class="parameter">
       error_action</replaceable> to perform when there is malformed data in the input.
-      Currently, only <literal>stop</literal> (default) and <literal>ignore</literal>
-      values are supported.
+      Currently, only <literal>stop</literal> (default), <literal>ignore</literal>
+      and <literal>log</literal> values are supported.
       If the <literal>stop</literal> value is specified,
       <command>COPY</command> stops operation at the first error.
       If the <literal>ignore</literal> value is specified,
       <command>COPY</command> skips malformed data and continues copying data.
+      If the <literal>log</literal> value is specified,
+      <command>COPY</command> behaves the same as <literal>ignore</literal>, exept that
+      it logs information that should have resulted in errors to PostgreSQL log at
+      <literal>INFO</literal> level.
       The option is allowed only in <command>COPY FROM</command>.
       Only <literal>stop</literal> value is allowed when
       using <literal>binary</literal> format.
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index cc0786c6f4..812ca63350 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -415,13 +415,15 @@ defGetCopyOnErrorChoice(DefElem *def, ParseState *pstate, bool is_from)
 		return COPY_ON_ERROR_STOP;
 
 	/*
-	 * Allow "stop", or "ignore" values.
+	 * Allow "stop", "ignore" or "log" values.
 	 */
 	sval = defGetString(def);
 	if (pg_strcasecmp(sval, "stop") == 0)
 		return COPY_ON_ERROR_STOP;
 	if (pg_strcasecmp(sval, "ignore") == 0)
 		return COPY_ON_ERROR_IGNORE;
+	if (pg_strcasecmp(sval, "log") == 0)
+		return COPY_ON_ERROR_LOG;
 
 	ereport(ERROR,
 			(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 1fe70b9133..7886bd5353 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -1013,6 +1013,23 @@ CopyFrom(CopyFromState cstate)
 				 */
 				cstate->escontext->error_occurred = false;
 
+			else if (cstate->opts.on_error == COPY_ON_ERROR_LOG)
+			{
+				/* Adjust elevel so we don't jump out */
+				cstate->escontext->error_data->elevel = LOG;
+
+				/*
+				 * Despite the name, this won't raise an error since elevel is
+				 * LOG now.
+				 */
+				ThrowErrorData(cstate->escontext->error_data);
+
+				/* Initialize escontext in preparation for next soft error */
+				cstate->escontext->error_occurred = false;
+				cstate->escontext->details_wanted = true;
+				memset(cstate->escontext->error_data, 0, sizeof(ErrorData));
+			}
+
 			/* Report that this tuple was skipped by the ON_ERROR clause */
 			pgstat_progress_update_param(PROGRESS_COPY_TUPLES_SKIPPED,
 										 ++skipped);
@@ -1462,12 +1479,11 @@ BeginCopyFrom(ParseState *pstate,
 		cstate->escontext->type = T_ErrorSaveContext;
 		cstate->escontext->error_occurred = false;
 
-		/*
-		 * Currently we only support COPY_ON_ERROR_IGNORE. We'll add other
-		 * options later
-		 */
+		/* Error Details are required except when "ignore" is specified */
 		if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
 			cstate->escontext->details_wanted = false;
+		else
+			cstate->escontext->details_wanted = true;
 	}
 	else
 		cstate->escontext = NULL;
diff --git a/src/include/commands/copy.h b/src/include/commands/copy.h
index b3da3cb0be..c61ac2445f 100644
--- a/src/include/commands/copy.h
+++ b/src/include/commands/copy.h
@@ -38,6 +38,7 @@ typedef enum CopyOnErrorChoice
 {
 	COPY_ON_ERROR_STOP = 0,		/* immediately throw errors, default */
 	COPY_ON_ERROR_IGNORE,		/* ignore errors */
+	COPY_ON_ERROR_LOG,			/* save error to PostgreSQL log */
 } CopyOnErrorChoice;
 
 /*
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index 25c401ce34..dc3ac2b494 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -731,12 +731,16 @@ ERROR:  invalid input syntax for type integer: "a"
 CONTEXT:  COPY check_ign_err, line 2, column n: "a"
 COPY check_ign_err FROM STDIN WITH (on_error ignore);
 NOTICE:  4 rows were skipped due to data type incompatibility
+COPY check_ign_err FROM STDIN WITH (on_error log);
+NOTICE:  4 rows were skipped due to data type incompatibility
 SELECT * FROM check_ign_err;
- n |  m  | k 
----+-----+---
- 1 | {1} | 1
- 5 | {5} | 5
-(2 rows)
+ n  |  m   | k  
+----+------+----
+  1 | {1}  |  1
+  5 | {5}  |  5
+  6 | {6}  |  6
+ 10 | {10} | 10
+(4 rows)
 
 -- test datatype error that can't be handled as soft: should fail
 CREATE TABLE hard_err(foo widget);
diff --git a/src/test/regress/sql/copy2.sql b/src/test/regress/sql/copy2.sql
index b5e549e856..54e1bc7f91 100644
--- a/src/test/regress/sql/copy2.sql
+++ b/src/test/regress/sql/copy2.sql
@@ -516,6 +516,15 @@ a	{2}	2
 
 5	{5}	5
 \.
+
+COPY check_ign_err FROM STDIN WITH (on_error log);
+6	{6}	6
+a	{7}	7
+8	{8}	8888888888
+9	{a, 9}	9
+
+10	{10}	10
+\.
 SELECT * FROM check_ign_err;
 
 -- test datatype error that can't be handled as soft: should fail

base-commit: 66ea94e8e606529bb334515f388c62314956739e
-- 
2.39.2

#2 jian he
jian.universality@gmail.com
In reply to: torikoshia (#1)
Re: Add new error_action COPY ON_ERROR "log"

On Fri, Jan 26, 2024 at 12:42 AM torikoshia <torikoshia@oss.nttdata.com> wrote:

Hi,

As described in 9e2d870119, COPY ON_ERROR is expected to have more
"error_action".
(Note that option name was changed by b725b7eec)

I'd like to have a new option "log", which skips soft errors and logs
information that should have resulted in errors to PostgreSQL log.

I think this option has some advantages like below:

1) We can know which number of line input data was not loaded and
reason.

Example:

=# copy t1 from stdin with (on_error log);
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF
signal.

1
2
3
z
\.

LOG: invalid input syntax for type integer: "z"
NOTICE: 1 row was skipped due to data type incompatibility
COPY 3

=# \! tail data/log/postgresql*.log
LOG: 22P02: invalid input syntax for type integer: "z"
CONTEXT: COPY t1, line 4, column i: "z"
LOCATION: pg_strtoint32_safe, numutils.c:620
STATEMENT: copy t1 from stdin with (on_error log);

2) Easier maintenance than storing error information in tables or
proprietary log files.
For example, in case a large number of soft errors occur, some
mechanisms are needed to prevent an infinite increase in the size of the
destination data, but we can left it to PostgreSQL's log rotation.

Attached a patch.
This basically comes from previous discussion[1] which did both "ignore"
and "log" soft error.

As shown in the example above, the log output to the client does not
contain CONTEXT, so I'm a little concerned that client cannot see what
line of the input data had a problem without looking at the server log.

What do you think?

I have doubts about the following part:
If the <literal>log</literal> value is specified,
<command>COPY</command> behaves the same as
<literal>ignore</literal>, exept that
it logs information that should have resulted in errors to
PostgreSQL log at
<literal>INFO</literal> level.

I think it does something like this:
when an error happens, cstate->escontext->error_data->elevel will be ERROR;
you manually change cstate->escontext->error_data->elevel to LOG,
and then you call ThrowErrorData.

But that is not related to the `<literal>INFO</literal> level`, is it?
My log_min_messages is at the default, warning.
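
To make the mechanism described above concrete, here is a minimal, self-contained sketch of the same idea outside PostgreSQL: an input function saves the error instead of raising it, the caller downgrades the saved severity, reports it, and resets the context for the next row. Every name in it (SketchErrorContext, sketch_parse_int, sketch_copy_log and so on) is hypothetical and only loosely modeled on ErrorSaveContext, ErrorData and ThrowErrorData; it is not the patch's code.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical severities; the numeric values are illustrative only. */
typedef enum { SKETCH_LOG = 1, SKETCH_ERROR = 2 } SketchLevel;

/* Stand-in for the saved error details (loosely modeled on ErrorData). */
typedef struct
{
	SketchLevel elevel;
	char		message[128];
} SketchErrorData;

/* Stand-in for the soft-error context (loosely modeled on ErrorSaveContext). */
typedef struct
{
	bool		error_occurred;
	bool		details_wanted;
	SketchErrorData error_data;
} SketchErrorContext;

/* Input function: on bad input, save the error instead of raising it. */
bool
sketch_parse_int(const char *s, int *out, SketchErrorContext *ctx)
{
	char	   *end;
	long		v;

	errno = 0;
	v = strtol(s, &end, 10);
	if (end != s && *end == '\0' && errno == 0 && v >= INT_MIN && v <= INT_MAX)
	{
		*out = (int) v;
		return true;
	}

	ctx->error_occurred = true;
	if (ctx->details_wanted)
	{
		ctx->error_data.elevel = SKETCH_ERROR;
		snprintf(ctx->error_data.message, sizeof(ctx->error_data.message),
				 "could not parse \"%s\" as integer", s);
	}
	return false;
}

/* Report a saved error; returns false if the caller would have to abort. */
bool
sketch_report(const SketchErrorData *edata, FILE *log)
{
	fprintf(log, "%s:  %s\n",
			edata->elevel == SKETCH_ERROR ? "ERROR" : "LOG", edata->message);
	return edata->elevel < SKETCH_ERROR;
}

/* Load rows, skipping and logging malformed ones; returns rows loaded. */
int
sketch_copy_log(const char *const *rows, int nrows, FILE *log, int *skipped)
{
	SketchErrorContext ctx = {false, true, {SKETCH_ERROR, ""}};
	int			loaded = 0;

	*skipped = 0;
	for (int i = 0; i < nrows; i++)
	{
		int			value;

		if (sketch_parse_int(rows[i], &value, &ctx))
		{
			loaded++;
			continue;
		}

		/* Downgrade the saved error so that reporting it does not abort. */
		ctx.error_data.elevel = SKETCH_LOG;
		sketch_report(&ctx.error_data, log);

		/* Reset the context in preparation for the next soft error. */
		ctx.error_occurred = false;
		memset(&ctx.error_data, 0, sizeof(ctx.error_data));
		(*skipped)++;
	}
	return loaded;
}
```

Calling sketch_copy_log with the rows "1", "2", "3", "z" loads three rows and reports one skipped row at the downgraded level, which mirrors the psql example in the first message.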

#3 David G. Johnston
david.g.johnston@gmail.com
In reply to: torikoshia (#1)
Re: Add new error_action COPY ON_ERROR "log"

On Thu, Jan 25, 2024 at 9:42 AM torikoshia <torikoshia@oss.nttdata.com>
wrote:

Hi,

As described in 9e2d870119, COPY ON_ERROR is expected to have more
"error_action".
(Note that option name was changed by b725b7eec)

I'd like to have a new option "log", which skips soft errors and logs
information that should have resulted in errors to PostgreSQL log.

Seems like an easy win but largely unhelpful in the typical case. I
suppose ETL routines using this feature may be running on their machine
under root or "postgres" but in a system where they are not this very
useful information is inaccessible to them. I suppose the DBA could set up
an extractor to send these specific log lines elsewhere but that seems like
enough hassle to disfavor this approach and favor one that can place the
soft error data and feedback into user-specified tables in the same
database. Setting up temporary tables or unlogged tables probably is going
to be a more acceptable methodology than trying to get to the log files.

David J.

#4 torikoshia
torikoshia@oss.nttdata.com
In reply to: David G. Johnston (#3)
1 attachment(s)
Re: Add new error_action COPY ON_ERROR "log"

On Fri, Jan 26, 2024 at 10:44 PM jian he <jian.universality@gmail.com>
wrote:

I doubt the following part:
If the <literal>log</literal> value is specified,
<command>COPY</command> behaves the same as
<literal>ignore</literal>, exept that
it logs information that should have resulted in errors to
PostgreSQL log at
<literal>INFO</literal> level.

I think it does something like:
When an error happens, cstate->escontext->error_data->elevel will be
ERROR
you manually change the cstate->escontext->error_data->elevel to LOG,
then you call ThrowErrorData.

but it's not related to `<literal>INFO</literal> level`?
my log_min_messages is default, warning.

Thanks!

Modified them to NOTICE in accordance with the following summary
message:

NOTICE: x row was skipped due to data type incompatibility
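
The choice of level matters because the server log and the client filter messages differently. The sketch below is a simplified model of the documented severity orderings: for log_min_messages, LOG ranks above ERROR; for client_min_messages, LOG ranks below NOTICE; INFO is always sent to the client. The enum and function names are hypothetical and the ranks are not PostgreSQL's real elevel values; the defaults assumed in the usage note are log_min_messages = warning and client_min_messages = notice.

```c
#include <stdbool.h>

/* Hypothetical level names for this sketch; not PostgreSQL's elevel values. */
typedef enum
{
	SK_DEBUG,
	SK_INFO,
	SK_NOTICE,
	SK_WARNING,
	SK_ERROR,
	SK_LOG
} SketchLevel;

/* Rank in the log_min_messages ordering, where LOG sorts above ERROR. */
int
sketch_server_rank(SketchLevel level)
{
	switch (level)
	{
		case SK_DEBUG:
			return 0;
		case SK_INFO:
			return 1;
		case SK_NOTICE:
			return 2;
		case SK_WARNING:
			return 3;
		case SK_ERROR:
			return 4;
		case SK_LOG:
			return 5;
	}
	return 0;
}

/* Rank in the client_min_messages ordering, where LOG sorts below NOTICE. */
int
sketch_client_rank(SketchLevel level)
{
	switch (level)
	{
		case SK_DEBUG:
			return 0;
		case SK_LOG:
			return 1;
		case SK_INFO:			/* not a valid threshold; ranked with NOTICE */
		case SK_NOTICE:
			return 2;
		case SK_WARNING:
			return 3;
		case SK_ERROR:
			return 4;
	}
	return 0;
}

/* Would a message at this level be written to the server log? */
bool
sketch_goes_to_server_log(SketchLevel level, SketchLevel log_min_messages)
{
	return sketch_server_rank(level) >= sketch_server_rank(log_min_messages);
}

/* Would a message at this level be sent to the client? */
bool
sketch_goes_to_client(SketchLevel level, SketchLevel client_min_messages)
{
	if (level == SK_INFO)
		return true;			/* INFO is always sent to the client */
	return sketch_client_rank(level) >= sketch_client_rank(client_min_messages);
}
```

Under the assumed defaults, this model sends a LOG message to the server log but not to the client, and a NOTICE message to the client but not to the server log, so a per-row NOTICE only reaches the server log if log_min_messages is lowered to notice or below.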

On 2024-01-27 00:43, David G. Johnston wrote:

On Thu, Jan 25, 2024 at 9:42 AM torikoshia
<torikoshia@oss.nttdata.com> wrote:

Hi,

As described in 9e2d870119, COPY ON_ERROR is expected to have more
"error_action".
(Note that option name was changed by b725b7eec)

I'd like to have a new option "log", which skips soft errors and
logs
information that should have resulted in errors to PostgreSQL log.

Seems like an easy win but largely unhelpful in the typical case. I
suppose ETL routines using this feature may be running on their
machine under root or "postgres" but in a system where they are not
this very useful information is inaccessible to them. I suppose the
DBA could set up an extractor to send these specific log lines
elsewhere but that seems like enough hassle to disfavor this approach
and favor one that can place the soft error data and feedback into
user-specified tables in the same database. Setting up temporary
tables or unlogged tables probably is going to be a more acceptable
methodology than trying to get to the log files.

David J.

I agree that quite a few people would prefer to store error information in
tables, and there have already been suggestions[1].

OTOH not everyone thinks saving error information in tables is the best idea[2].

I think it would be desirable for ON_ERROR to be in a form that allows
the user to choose where to store error information from among some
options, such as table, log and file.

"ON_ERROR log" would be useful at least in the case of 'running on their
machine under root or "postgres"' as you pointed out.

[1]: /messages/by-id/CACJufxEkkqnozdnvNMGxVAA94KZaCPkYw_Cx4JKG9ueNaZma_A@mail.gmail.com

[2]: /messages/by-id/20231109002600.fuihn34bjqqgmbjm@awork3.anarazel.de

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation

Attachments:

v2-0001-Add-new-error_action-log-to-ON_ERROR-option.patch (text/x-diff)
From 5f44cc7525641302842a3d67c14ebb09615bf67b Mon Sep 17 00:00:00 2001
From: Atsushi Torikoshi <torikoshia@oss.nttdata.com>
Date: Mon, 29 Jan 2024 12:02:32 +0900
Subject: [PATCH v2] Add new error_action "log" to ON_ERROR option

Currently ON_ERROR option only has "ignore" to skip malformed data and
there are no ways to know where and why COPY skipped them.

"log" skips malformed data as well as "ignore", but it logs information that
should have resulted in errors to PostgreSQL log.
---
 doc/src/sgml/ref/copy.sgml          |  9 +++++++--
 src/backend/commands/copy.c         |  4 +++-
 src/backend/commands/copyfrom.c     | 24 ++++++++++++++++++++----
 src/include/commands/copy.h         |  1 +
 src/test/regress/expected/copy2.out | 18 +++++++++++++-----
 src/test/regress/sql/copy2.sql      |  9 +++++++++
 6 files changed, 53 insertions(+), 12 deletions(-)

diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 21a5c4a052..3d949f04a4 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -380,12 +380,17 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
      <para>
       Specifies which <replaceable class="parameter">
       error_action</replaceable> to perform when there is malformed data in the input.
-      Currently, only <literal>stop</literal> (default) and <literal>ignore</literal>
-      values are supported.
+      Currently, only <literal>stop</literal> (default), <literal>ignore</literal>
+      and <literal>log</literal> values are supported.
       If the <literal>stop</literal> value is specified,
       <command>COPY</command> stops operation at the first error.
       If the <literal>ignore</literal> value is specified,
       <command>COPY</command> skips malformed data and continues copying data.
+      If the <literal>log</literal> value is specified,
+      <command>COPY</command> behaves the same as <literal>ignore</literal>,
+      except that it logs information that should have resulted in errors to
+      <productname>PostgreSQL</productname> log at <literal>NOTICE</literal>
+      level.
       The option is allowed only in <command>COPY FROM</command>.
       Only <literal>stop</literal> value is allowed when
       using <literal>binary</literal> format.
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index cc0786c6f4..812ca63350 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -415,13 +415,15 @@ defGetCopyOnErrorChoice(DefElem *def, ParseState *pstate, bool is_from)
 		return COPY_ON_ERROR_STOP;
 
 	/*
-	 * Allow "stop", or "ignore" values.
+	 * Allow "stop", "ignore" or "log" values.
 	 */
 	sval = defGetString(def);
 	if (pg_strcasecmp(sval, "stop") == 0)
 		return COPY_ON_ERROR_STOP;
 	if (pg_strcasecmp(sval, "ignore") == 0)
 		return COPY_ON_ERROR_IGNORE;
+	if (pg_strcasecmp(sval, "log") == 0)
+		return COPY_ON_ERROR_LOG;
 
 	ereport(ERROR,
 			(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 1fe70b9133..f2438023c8 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -1013,6 +1013,23 @@ CopyFrom(CopyFromState cstate)
 				 */
 				cstate->escontext->error_occurred = false;
 
+			else if (cstate->opts.on_error == COPY_ON_ERROR_LOG)
+			{
+				/* Adjust elevel so we don't jump out */
+				cstate->escontext->error_data->elevel = NOTICE;
+
+				/*
+				 * Despite the name, this won't raise an error since elevel is
+				 * NOTICE now.
+				 */
+				ThrowErrorData(cstate->escontext->error_data);
+
+				/* Initialize escontext in preparation for next soft error */
+				cstate->escontext->error_occurred = false;
+				cstate->escontext->details_wanted = true;
+				memset(cstate->escontext->error_data, 0, sizeof(ErrorData));
+			}
+
 			/* Report that this tuple was skipped by the ON_ERROR clause */
 			pgstat_progress_update_param(PROGRESS_COPY_TUPLES_SKIPPED,
 										 ++skipped);
@@ -1462,12 +1479,11 @@ BeginCopyFrom(ParseState *pstate,
 		cstate->escontext->type = T_ErrorSaveContext;
 		cstate->escontext->error_occurred = false;
 
-		/*
-		 * Currently we only support COPY_ON_ERROR_IGNORE. We'll add other
-		 * options later
-		 */
+		/* Error Details are required except when "ignore" is specified */
 		if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
 			cstate->escontext->details_wanted = false;
+		else
+			cstate->escontext->details_wanted = true;
 	}
 	else
 		cstate->escontext = NULL;
diff --git a/src/include/commands/copy.h b/src/include/commands/copy.h
index b3da3cb0be..c61ac2445f 100644
--- a/src/include/commands/copy.h
+++ b/src/include/commands/copy.h
@@ -38,6 +38,7 @@ typedef enum CopyOnErrorChoice
 {
 	COPY_ON_ERROR_STOP = 0,		/* immediately throw errors, default */
 	COPY_ON_ERROR_IGNORE,		/* ignore errors */
+	COPY_ON_ERROR_LOG,			/* save error to PostgreSQL log */
 } CopyOnErrorChoice;
 
 /*
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index 25c401ce34..6905d77de6 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -731,12 +731,20 @@ ERROR:  invalid input syntax for type integer: "a"
 CONTEXT:  COPY check_ign_err, line 2, column n: "a"
 COPY check_ign_err FROM STDIN WITH (on_error ignore);
 NOTICE:  4 rows were skipped due to data type incompatibility
+COPY check_ign_err FROM STDIN WITH (on_error log);
+NOTICE:  invalid input syntax for type integer: "a"
+NOTICE:  value "8888888888" is out of range for type integer
+NOTICE:  invalid input syntax for type integer: "a"
+NOTICE:  invalid input syntax for type integer: ""
+NOTICE:  4 rows were skipped due to data type incompatibility
 SELECT * FROM check_ign_err;
- n |  m  | k 
----+-----+---
- 1 | {1} | 1
- 5 | {5} | 5
-(2 rows)
+ n  |  m   | k  
+----+------+----
+  1 | {1}  |  1
+  5 | {5}  |  5
+  6 | {6}  |  6
+ 10 | {10} | 10
+(4 rows)
 
 -- test datatype error that can't be handled as soft: should fail
 CREATE TABLE hard_err(foo widget);
diff --git a/src/test/regress/sql/copy2.sql b/src/test/regress/sql/copy2.sql
index b5e549e856..54e1bc7f91 100644
--- a/src/test/regress/sql/copy2.sql
+++ b/src/test/regress/sql/copy2.sql
@@ -516,6 +516,15 @@ a	{2}	2
 
 5	{5}	5
 \.
+
+COPY check_ign_err FROM STDIN WITH (on_error log);
+6	{6}	6
+a	{7}	7
+8	{8}	8888888888
+9	{a, 9}	9
+
+10	{10}	10
+\.
 SELECT * FROM check_ign_err;
 
 -- test datatype error that can't be handled as soft: should fail

base-commit: 08e6344fd6423210b339e92c069bb979ba4e7cd6
-- 
2.39.2

#5 Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: torikoshia (#4)
Re: Add new error_action COPY ON_ERROR "log"

On Mon, Jan 29, 2024 at 8:41 AM torikoshia <torikoshia@oss.nttdata.com> wrote:

On 2024-01-27 00:43, David G. Johnston wrote:

I'd like to have a new option "log", which skips soft errors and
logs
information that should have resulted in errors to PostgreSQL log.

user-specified tables in the same database. Setting up temporary
tables or unlogged tables probably is going to be a more acceptable
methodology than trying to get to the log files.

OTOH not everyone thinks saving table information is the best idea[2].

The added NOTICE message gives a summary of how many rows were
skipped, but not the line numbers. It's hard for the users to find the
malformed data, especially when they are bulk-inserting from data
files of multiple GBs in size (hard to open such files in any editor
too).

I understand the value of writing the info to server log or tables of
users' choice as being discussed in a couple of other threads.
However, I'd prefer we do the simplest thing first before we go debate
server log or table - let the users know what line numbers are
containing the errors that COPY ignored something like [1] with a
simple change like [2]. It not only helps users figure out which rows
and attributes were malformed, but also helps them redirect them to
server logs with setting log_min_messages = notice [3]. In the worst
case scenario, a problem with this one NOTICE per malformed row is
that it can overload the psql session if all the rows are malformed.
I'm not sure if this is a big problem, but IMO better than a single
summary NOTICE message and simpler than writing to tables of users'
choice.
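
As a rough illustration of the worst case mentioned above, the following sketch emits one notice per skipped row plus the summary notice, so n malformed rows produce n + 1 messages. The struct and function names are hypothetical. The message wording follows the proposal, but the sketch passes the column name before the relation name, which is an assumption about the intended wording and differs from the output shown in [1].

```c
#include <stdio.h>

/* Hypothetical record of one skipped row. */
typedef struct
{
	unsigned long long lineno;	/* 1-based line number in the input */
	const char *attname;		/* column whose value could not be parsed */
} SketchBadRow;

/* Format the per-row notice into buf; returns snprintf's result. */
int
sketch_format_row_notice(char *buf, size_t buflen, const char *relname,
						 const SketchBadRow *row)
{
	return snprintf(buf, buflen,
					"NOTICE:  detected data type incompatibility at line number %llu for column %s, COPY %s",
					row->lineno, row->attname, relname);
}

/* Emit one notice per skipped row plus a summary; returns messages emitted. */
int
sketch_emit_notices(FILE *out, const char *relname,
					const SketchBadRow *rows, int nrows)
{
	char		buf[256];
	int			emitted = 0;

	for (int i = 0; i < nrows; i++)
	{
		sketch_format_row_notice(buf, sizeof(buf), relname, &rows[i]);
		fprintf(out, "%s\n", buf);
		emitted++;
	}

	if (nrows == 1)
	{
		fprintf(out, "NOTICE:  1 row was skipped due to data type incompatibility\n");
		emitted++;
	}
	else if (nrows > 1)
	{
		fprintf(out, "NOTICE:  %d rows were skipped due to data type incompatibility\n",
				nrows);
		emitted++;
	}
	return emitted;
}
```

With the four malformed rows from [1], this emits five messages; a multi-gigabyte file in which every row is malformed would emit one message per input line.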

Thoughts?

FWIW, I presented the new COPY ... ON_ERROR option feature at
Hyderabad PostgreSQL User Group meetup and it was well-received by the
audience. The audience felt it's going to be extremely helpful for
bulk-loading tasks. Thank you all for working on this feature.

[1]
postgres=# CREATE TABLE check_ign_err (n int, m int[], k int);
CREATE TABLE
postgres=# COPY check_ign_err FROM STDIN WITH (on_error ignore);
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.

1 {1} 1

a {2} 2
3 {3} 3333333333
4 {a, 4} 4

5 {5} 5
\.
NOTICE: detected data type incompatibility at line number 2 for
column check_ign_err, COPY n
NOTICE: detected data type incompatibility at line number 3 for
column check_ign_err, COPY k
NOTICE: detected data type incompatibility at line number 4 for
column check_ign_err, COPY m
NOTICE: detected data type incompatibility at line number 5 for
column check_ign_err, COPY n
NOTICE: 4 rows were skipped due to data type incompatibility
COPY 2

[2]
diff --git a/src/backend/commands/copyfromparse.c b/src/backend/commands/copyfromparse.c
index 906756362e..93ab5d4ebb 100644
--- a/src/backend/commands/copyfromparse.c
+++ b/src/backend/commands/copyfromparse.c
@@ -961,7 +961,16 @@ NextCopyFrom(CopyFromState cstate, ExprContext *econtext,
                 &values[m]))
                        {
                                cstate->num_errors++;
-                               return true;
+
+                               if (cstate->opts.on_error != COPY_ON_ERROR_STOP)
+                               {
+                                       ereport(NOTICE,
+                                                       errmsg("detected data type incompatibility at line number %llu for column %s, COPY %s",
+                                                                  (unsigned long long) cstate->cur_lineno,
+                                                                  cstate->cur_relname,
+                                                                  cstate->cur_attname));
+                                       return true;
+                               }
                        }

                        cstate->cur_attname = NULL;

[3]
2024-02-12 06:20:29.363 UTC [427892] NOTICE: detected data type incompatibility at line number 2 for column check_ign_err, COPY n
2024-02-12 06:20:29.363 UTC [427892] CONTEXT: COPY check_ign_err, line 2, column n: "a"
2024-02-12 06:20:29.363 UTC [427892] NOTICE: detected data type incompatibility at line number 3 for column check_ign_err, COPY k
2024-02-12 06:20:29.363 UTC [427892] CONTEXT: COPY check_ign_err, line 3, column k: "3333333333"
2024-02-12 06:20:29.363 UTC [427892] NOTICE: detected data type incompatibility at line number 4 for column check_ign_err, COPY m
2024-02-12 06:20:29.363 UTC [427892] CONTEXT: COPY check_ign_err, line 4, column m: "{a, 4}"
2024-02-12 06:20:29.363 UTC [427892] NOTICE: detected data type incompatibility at line number 5 for column check_ign_err, COPY n
2024-02-12 06:20:29.363 UTC [427892] CONTEXT: COPY check_ign_err, line 5, column n: ""
2024-02-12 06:20:29.363 UTC [427892] NOTICE: 4 rows were skipped due to data type incompatibility

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#6 torikoshia
torikoshia@oss.nttdata.com
In reply to: Bharath Rupireddy (#5)
Re: Add new error_action COPY ON_ERROR "log"

On 2024-02-12 15:46, Bharath Rupireddy wrote:

Thanks for your comments.

On Mon, Jan 29, 2024 at 8:41 AM torikoshia <torikoshia@oss.nttdata.com>
wrote:

On 2024-01-27 00:43, David G. Johnston wrote:

I'd like to have a new option "log", which skips soft errors and
logs
information that should have resulted in errors to PostgreSQL log.

user-specified tables in the same database. Setting up temporary
tables or unlogged tables probably is going to be a more acceptable
methodology than trying to get to the log files.

OTOH not everyone thinks saving table information is the best idea[2].

The added NOTICE message gives a summary of how many rows were
skipped, but not the line numbers. It's hard for the users to find the
malformed data, especially when they are bulk-inserting from data
files of multiple GBs in size (hard to open such files in any editor
too).

I understand the value of writing the info to server log or tables of
users' choice as being discussed in a couple of other threads.
However, I'd prefer we do the simplest thing first before we go debate
server log or table - let the users know what line numbers are
containing the errors that COPY ignored something like [1] with a
simple change like [2].

Agreed.
Unlike my patch, it hides the error information (i.e. 22P02: invalid
input syntax for type integer: ), but I feel that it's usually
sufficient to know the row number and column where the error occurred.

It not only helps users figure out which rows
and attributes were malformed, but also helps them redirect them to
server logs with setting log_min_messages = notice [3]. In the worst
case scenario, a problem with this one NOTICE per malformed row is
that it can overload the psql session if all the rows are malformed.
I'm not sure if this is a big problem, but IMO better than a single
summary NOTICE message and simpler than writing to tables of users'
choice.

Maybe we could do what you suggested as the behavior when on_error is
set to 'log'?

Thoughts?

FWIW, I presented the new COPY ... ON_ERROR option feature at
Hyderabad PostgreSQL User Group meetup and it was well-received by the
audience. The audience felt it's going to be extremely helpful for
bulk-loading tasks. Thank you all for working on this feature.

Thanks for letting us know!
I hope it will not be reverted, for the sake of the audience :)

[1]
postgres=# CREATE TABLE check_ign_err (n int, m int[], k int);
CREATE TABLE
postgres=# COPY check_ign_err FROM STDIN WITH (on_error ignore);
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF
signal.

1 {1} 1

a {2} 2
3 {3} 3333333333
4 {a, 4} 4

5 {5} 5
\.
NOTICE: detected data type incompatibility at line number 2 for
column check_ign_err, COPY n
NOTICE: detected data type incompatibility at line number 3 for
column check_ign_err, COPY k
NOTICE: detected data type incompatibility at line number 4 for
column check_ign_err, COPY m
NOTICE: detected data type incompatibility at line number 5 for
column check_ign_err, COPY n
NOTICE: 4 rows were skipped due to data type incompatibility
COPY 2

[2]
diff --git a/src/backend/commands/copyfromparse.c b/src/backend/commands/copyfromparse.c
index 906756362e..93ab5d4ebb 100644
--- a/src/backend/commands/copyfromparse.c
+++ b/src/backend/commands/copyfromparse.c
@@ -961,7 +961,16 @@ NextCopyFrom(CopyFromState cstate, ExprContext *econtext,
                                                                &values[m]))
                        {
                                cstate->num_errors++;
-                               return true;
+
+                               if (cstate->opts.on_error != COPY_ON_ERROR_STOP)
+                               {
+                                       ereport(NOTICE,
+                                               errmsg("detected data type incompatibility at line number %llu for column %s, COPY %s",
+                                                      (unsigned long long) cstate->cur_lineno,
+                                                      cstate->cur_relname,
+                                                      cstate->cur_attname));
+                                       return true;
+                               }
                        }

                        cstate->cur_attname = NULL;

[3]
2024-02-12 06:20:29.363 UTC [427892] NOTICE: detected data type
incompatibility at line number 2 for column check_ign_err, COPY n
2024-02-12 06:20:29.363 UTC [427892] CONTEXT: COPY check_ign_err,
line 2, column n: "a"
2024-02-12 06:20:29.363 UTC [427892] NOTICE: detected data type
incompatibility at line number 3 for column check_ign_err, COPY k
2024-02-12 06:20:29.363 UTC [427892] CONTEXT: COPY check_ign_err,
line 3, column k: "3333333333"
2024-02-12 06:20:29.363 UTC [427892] NOTICE: detected data type
incompatibility at line number 4 for column check_ign_err, COPY m
2024-02-12 06:20:29.363 UTC [427892] CONTEXT: COPY check_ign_err,
line 4, column m: "{a, 4}"
2024-02-12 06:20:29.363 UTC [427892] NOTICE: detected data type
incompatibility at line number 5 for column check_ign_err, COPY n
2024-02-12 06:20:29.363 UTC [427892] CONTEXT: COPY check_ign_err,
line 5, column n: ""
2024-02-12 06:20:29.363 UTC [427892] NOTICE: 4 rows were skipped due
to data type incompatibility

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation

#7Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: torikoshia (#6)
1 attachment(s)
Re: Add new error_action COPY ON_ERROR "log"

On Wed, Feb 14, 2024 at 5:04 PM torikoshia <torikoshia@oss.nttdata.com> wrote:

[....] let the users know which line numbers contain
the errors that COPY ignored, something like [1], with a
simple change like [2].

Agreed.
Unlike my patch, it hides the error information (i.e. 22P02: invalid
input syntax for type integer: ), but I feel that it's usually
sufficient to know the row number and column where the error occurred.

Right.

It not only helps users figure out which rows
and attributes were malformed, but also lets them redirect the messages to
server logs by setting log_min_messages = notice [3]. In the worst
case scenario, a problem with this one NOTICE per malformed row is
that it can overload the psql session if all the rows are malformed.
I'm not sure if this is a big problem, but IMO better than a single
summary NOTICE message and simpler than writing to tables of users'
choice.

Maybe we could do what you suggested as the behavior when on_error is
set to 'log'?

My point is: why would someone want just the summary of failures
without row and column info, especially for bulk-loading tasks? I'd
suggest doing it independently of 'log' or 'table'. I think we can
keep things simple just like the attached patch, and see how this
feature will be adopted. I'm sure we can come back and do things like
saving to 'log' or 'table' or 'separate_error_file' etc., if we
receive any firsthand feedback.

Thoughts?

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

v1-0001-Add-detailed-info-when-COPY-skips-soft-errors.patch (application/octet-stream)
From 6431db6405214d3aa02dc780946568d41eca2188 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Fri, 16 Feb 2024 08:01:15 +0000
Subject: [PATCH v1] Add detailed info when COPY skips soft errors

---
 src/backend/commands/copyfromparse.c | 12 +++++++++++-
 src/test/regress/expected/copy2.out  |  4 ++++
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/src/backend/commands/copyfromparse.c b/src/backend/commands/copyfromparse.c
index 7cacd0b752..747e173d9c 100644
--- a/src/backend/commands/copyfromparse.c
+++ b/src/backend/commands/copyfromparse.c
@@ -969,7 +969,17 @@ NextCopyFrom(CopyFromState cstate, ExprContext *econtext,
 											&values[m]))
 			{
 				cstate->num_errors++;
-				return true;
+
+				if (cstate->opts.on_error != COPY_ON_ERROR_STOP)
+				{
+					ereport(NOTICE,
+							errmsg("detected data type incompatibility at line number %llu for column %s; COPY %s",
+								   (unsigned long long) cstate->cur_lineno,
+								   cstate->cur_attname,
+								   cstate->cur_relname));
+
+					return true;
+				}
 			}
 
 			cstate->cur_attname = NULL;
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index 25c401ce34..1bf37236f0 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -730,6 +730,10 @@ COPY check_ign_err FROM STDIN WITH (on_error stop);
 ERROR:  invalid input syntax for type integer: "a"
 CONTEXT:  COPY check_ign_err, line 2, column n: "a"
 COPY check_ign_err FROM STDIN WITH (on_error ignore);
+NOTICE:  detected data type incompatibility at line number 2 for column n; COPY check_ign_err
+NOTICE:  detected data type incompatibility at line number 3 for column k; COPY check_ign_err
+NOTICE:  detected data type incompatibility at line number 4 for column m; COPY check_ign_err
+NOTICE:  detected data type incompatibility at line number 5 for column n; COPY check_ign_err
 NOTICE:  4 rows were skipped due to data type incompatibility
 SELECT * FROM check_ign_err;
  n |  m  | k 
-- 
2.34.1

#8torikoshia
torikoshia@oss.nttdata.com
In reply to: Bharath Rupireddy (#7)
Re: Add new error_action COPY ON_ERROR "log"

On 2024-02-16 17:15, Bharath Rupireddy wrote:

On Wed, Feb 14, 2024 at 5:04 PM torikoshia <torikoshia@oss.nttdata.com>
wrote:

[....] let the users know which line numbers contain
the errors that COPY ignored, something like [1], with a
simple change like [2].

Agreed.
Unlike my patch, it hides the error information (i.e. 22P02: invalid
input syntax for type integer: ), but I feel that it's usually
sufficient to know the row number and column where the error occurred.

Right.

It not only helps users figure out which rows
and attributes were malformed, but also lets them redirect the messages to
server logs by setting log_min_messages = notice [3]. In the worst
case scenario, a problem with this one NOTICE per malformed row is
that it can overload the psql session if all the rows are malformed.
I'm not sure if this is a big problem, but IMO better than a single
summary NOTICE message and simpler than writing to tables of users'
choice.

Maybe we could do what you suggested as the behavior when on_error is
set to 'log'?

My point is: why would someone want just the summary of failures
without row and column info, especially for bulk-loading tasks? I'd
suggest doing it independently of 'log' or 'table'. I think we can
keep things simple just like the attached patch, and see how this
feature will be adopted. I'm sure we can come back and do things like
saving to 'log' or 'table' or 'separate_error_file' etc., if we
receive any firsthand feedback.

Thoughts?

I may be wrong since I seldom do data loading tasks, but I agree with
you.

I'm also a little concerned about the case where there is a lot of
malformed data and it causes lots of messages, but the information is
usually valuable and if users don't need it, they can suppress it by
changing client_min_messages.

Currently both the summary of failures and the individual information
are logged at NOTICE level.
If we should assume that there are cases where only summary information
is required, it'd be useful to set a lower log level, i.e. LOG, for the
individual information.

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation

#9Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: torikoshia (#8)
1 attachment(s)
Re: Add new error_action COPY ON_ERROR "log"

On Fri, Feb 16, 2024 at 8:17 PM torikoshia <torikoshia@oss.nttdata.com> wrote:

I may be wrong since I seldom do data loading tasks, but I agree with
you.

I'm also a little concerned about the case where there is a lot of
malformed data and it causes lots of messages, but the information is
usually valuable and if users don't need it, they can suppress it by
changing client_min_messages.

Currently both the summary of failures and the individual information
are logged at NOTICE level.
If we should assume that there are cases where only summary information
is required, it'd be useful to set a lower log level, i.e. LOG, for the
individual information.

How about we emit the summary at INFO level and individual information
at NOTICE level? With this, the summary is given a different priority
than the individual info. With SET client_min_messages = WARNING; one
can still get the summary but not the individual info. Also, to get
all of these into server log, one can SET log_min_messages = INFO; or
SET log_min_messages = NOTICE;.
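
As a rough illustration of the routing described above, here is a minimal C sketch. It is a simplified model and not the elog.c code: the names MsgLevel, sent_to_client and written_to_server_log are made up for this example, and only the three levels under discussion are modeled. It assumes that INFO messages are always sent to the client regardless of client_min_messages, and that log_min_messages orders INFO below NOTICE below WARNING.

```c
#include <stdbool.h>

/*
 * Hypothetical, simplified severity levels for illustration only.
 * The order INFO < NOTICE < WARNING is the one assumed for thresholds.
 */
typedef enum
{
    MSG_INFO,
    MSG_NOTICE,
    MSG_WARNING
} MsgLevel;

/*
 * Model of the client-side decision: INFO is assumed to bypass
 * client_min_messages; every other level is compared to the threshold.
 */
static bool
sent_to_client(MsgLevel level, MsgLevel client_min_messages)
{
    if (level == MSG_INFO)
        return true;
    return level >= client_min_messages;
}

/*
 * Model of the server-log decision: a plain threshold comparison against
 * log_min_messages for the three levels modeled here.
 */
static bool
written_to_server_log(MsgLevel level, MsgLevel log_min_messages)
{
    return level >= log_min_messages;
}
```

In this model, sent_to_client(MSG_INFO, MSG_WARNING) is true while sent_to_client(MSG_NOTICE, MSG_WARNING) is false, so the summary still reaches the client and the per-row messages do not. With the threshold at WARNING, neither level reaches the server log until log_min_messages is lowered to NOTICE or INFO.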

Thoughts?

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

v2-0001-Add-detailed-info-when-COPY-skips-soft-errors.patch (application/x-patch)
From 3cdb3512ec1cffbaeae00d0e3cc41c57021fd6ee Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Sat, 17 Feb 2024 05:43:19 +0000
Subject: [PATCH v2] Add detailed info when COPY skips soft errors

This commit emits individual info like line number and column name
when COPY skips soft errors, because the summary containing the
total rows skipped isn't enough for users to know exactly which
rows in the input data are malformed.

It emits individual info and summary at NOTICE and INFO level
respectively to let users switch off individual info by changing
client_min_messages to WARNING. Also, one can get all of this
information into server logs by changing log_min_messages.
---
 src/backend/commands/copyfrom.c      |  2 +-
 src/backend/commands/copyfromparse.c | 12 +++++++++++-
 src/test/regress/expected/copy2.out  |  6 +++++-
 3 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 1fe70b9133..e11c2d1cff 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -1314,7 +1314,7 @@ CopyFrom(CopyFromState cstate)
 
 	if (cstate->opts.on_error != COPY_ON_ERROR_STOP &&
 		cstate->num_errors > 0)
-		ereport(NOTICE,
+		ereport(INFO,
 				errmsg_plural("%llu row was skipped due to data type incompatibility",
 							  "%llu rows were skipped due to data type incompatibility",
 							  (unsigned long long) cstate->num_errors,
diff --git a/src/backend/commands/copyfromparse.c b/src/backend/commands/copyfromparse.c
index 7cacd0b752..747e173d9c 100644
--- a/src/backend/commands/copyfromparse.c
+++ b/src/backend/commands/copyfromparse.c
@@ -969,7 +969,17 @@ NextCopyFrom(CopyFromState cstate, ExprContext *econtext,
 											&values[m]))
 			{
 				cstate->num_errors++;
-				return true;
+
+				if (cstate->opts.on_error != COPY_ON_ERROR_STOP)
+				{
+					ereport(NOTICE,
+							errmsg("detected data type incompatibility at line number %llu for column %s; COPY %s",
+								   (unsigned long long) cstate->cur_lineno,
+								   cstate->cur_attname,
+								   cstate->cur_relname));
+
+					return true;
+				}
 			}
 
 			cstate->cur_attname = NULL;
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index 25c401ce34..15a1da2eac 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -730,7 +730,11 @@ COPY check_ign_err FROM STDIN WITH (on_error stop);
 ERROR:  invalid input syntax for type integer: "a"
 CONTEXT:  COPY check_ign_err, line 2, column n: "a"
 COPY check_ign_err FROM STDIN WITH (on_error ignore);
-NOTICE:  4 rows were skipped due to data type incompatibility
+NOTICE:  detected data type incompatibility at line number 2 for column n; COPY check_ign_err
+NOTICE:  detected data type incompatibility at line number 3 for column k; COPY check_ign_err
+NOTICE:  detected data type incompatibility at line number 4 for column m; COPY check_ign_err
+NOTICE:  detected data type incompatibility at line number 5 for column n; COPY check_ign_err
+INFO:  4 rows were skipped due to data type incompatibility
 SELECT * FROM check_ign_err;
  n |  m  | k 
 ---+-----+---
-- 
2.34.1

#10torikoshia
torikoshia@oss.nttdata.com
In reply to: Bharath Rupireddy (#9)
Re: Add new error_action COPY ON_ERROR "log"

On 2024-02-17 15:00, Bharath Rupireddy wrote:

On Fri, Feb 16, 2024 at 8:17 PM torikoshia <torikoshia@oss.nttdata.com>
wrote:

I may be wrong since I seldom do data loading tasks, but I agree with
you.

I'm also a little concerned about the case where there is a lot of
malformed data and it causes lots of messages, but the information is
usually valuable and if users don't need it, they can suppress it by
changing client_min_messages.

Currently both the summary of failures and the individual information
are logged at NOTICE level.
If we should assume that there are cases where only summary information
is required, it'd be useful to set a lower log level, i.e. LOG, for the
individual information.

How about we emit the summary at INFO level and individual information
at NOTICE level? With this, the summary is given a different priority
than the individual info. With SET client_min_messages = WARNING; one
can still get the summary but not the individual info. Also, to get
all of these into server log, one can SET log_min_messages = INFO; or
SET log_min_messages = NOTICE;.

Thoughts?

It looks good to me.

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation

#11torikoshia
torikoshia@oss.nttdata.com
In reply to: torikoshia (#10)
Re: Add new error_action COPY ON_ERROR "log"

On 2024-02-20 17:22, torikoshia wrote:

On 2024-02-17 15:00, Bharath Rupireddy wrote:

On Fri, Feb 16, 2024 at 8:17 PM torikoshia
<torikoshia@oss.nttdata.com> wrote:

I may be wrong since I seldom do data loading tasks, but I agree with
you.

I'm also a little concerned about the case where there is a lot of
malformed data and it causes lots of messages, but the information is
usually valuable and if users don't need it, they can suppress it by
changing client_min_messages.

Currently both the summary of failures and the individual information
are logged at NOTICE level.
If we should assume that there are cases where only summary information
is required, it'd be useful to set a lower log level, i.e. LOG, for the
individual information.

How about we emit the summary at INFO level and individual information
at NOTICE level? With this, the summary is given a different priority
than the individual info. With SET client_min_messages = WARNING; one
can still get the summary but not the individual info. Also, to get
all of these into server log, one can SET log_min_messages = INFO; or
SET log_min_messages = NOTICE;.

Thoughts?

It looks good to me.

Here are comments on the v2 patch.

+               if (cstate->opts.on_error != COPY_ON_ERROR_STOP)
+               {
+                   ereport(NOTICE,

I think cstate->opts.on_error is not COPY_ON_ERROR_STOP here, since if
it is COPY_ON_ERROR_STOP, InputFunctionCallSafe() should already have
errored out.

Should it be something like "Assert(cstate->opts.on_error !=
COPY_ON_ERROR_STOP)"?
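
To make the reasoning concrete, here is a small self-contained C model of the control flow. It is only a sketch under assumptions: CopyModel, parse_int_safe and load_field are hypothetical names, setjmp/longjmp stands in for a hard error, and soft errors are assumed to be enabled only when on_error is not "stop". Under those assumptions the failure branch can only be reached in a non-stop mode, which is what the assertion documents.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <setjmp.h>
#include <stdbool.h>
#include <stdlib.h>

typedef enum { ON_ERROR_STOP, ON_ERROR_IGNORE } OnErrorChoice;  /* hypothetical */

typedef struct
{
    OnErrorChoice on_error;
    bool          use_soft_errors;  /* assumed: set only when not "stop" */
    unsigned long num_errors;
    jmp_buf       hard_error_jmp;   /* stand-in for a hard error */
} CopyModel;

static void
copy_model_init(CopyModel *cm, OnErrorChoice on_error)
{
    cm->on_error = on_error;
    cm->use_soft_errors = (on_error != ON_ERROR_STOP);
    cm->num_errors = 0;
}

/*
 * Returns true on success and false on a soft error. When soft errors are
 * disabled it does not return on bad input; it jumps out instead.
 */
static bool
parse_int_safe(CopyModel *cm, const char *text, int *out)
{
    char *end;
    long  v;

    errno = 0;
    v = strtol(text, &end, 10);
    if (end != text && *end == '\0' && errno == 0 &&
        v >= INT_MIN && v <= INT_MAX)
    {
        *out = (int) v;
        return true;
    }

    if (!cm->use_soft_errors)
        longjmp(cm->hard_error_jmp, 1);   /* hard error: caller never resumes */
    return false;                         /* soft error: caller decides */
}

/* Returns 1 if the field was loaded, 0 if skipped, -1 if the load aborted. */
static int
load_field(CopyModel *cm, const char *text, int *out)
{
    if (setjmp(cm->hard_error_jmp) != 0)
        return -1;

    if (!parse_int_safe(cm, text, out))
    {
        /* Reaching this point implies soft errors are on, so not "stop". */
        assert(cm->on_error != ON_ERROR_STOP);
        cm->num_errors++;
        return 0;
    }
    return 1;
}
```

With ON_ERROR_IGNORE, load_field() on "z" returns 0 and bumps num_errors; with ON_ERROR_STOP it returns -1 without ever touching the counter, so an if-check on the mode inside the failure branch would be dead code.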

Should the manual text below also be updated?

A NOTICE message containing the ignored row count is emitted at the end
of the COPY FROM if at least one row was discarded.

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation

#12Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: torikoshia (#11)
1 attachment(s)
Re: Add new error_action COPY ON_ERROR "log"

On Mon, Feb 26, 2024 at 5:47 PM torikoshia <torikoshia@oss.nttdata.com> wrote:

It looks good to me.

Here are comments on the v2 patch.

Thanks for looking at it.

+               if (cstate->opts.on_error != COPY_ON_ERROR_STOP)
+               {
+                   ereport(NOTICE,

I think cstate->opts.on_error is not COPY_ON_ERROR_STOP here, since if
it is COPY_ON_ERROR_STOP, InputFunctionCallSafe() should already have
errored out.

Should it be something like "Assert(cstate->opts.on_error !=
COPY_ON_ERROR_STOP)"?

Nice catch. When COPY_ON_ERROR_STOP is specified, the soft error
mechanism is not used and ereport raises a hard error, so this code is
never reached. An assertion seems a good choice to validate that the
state is what we expect. Done that way.

Should the manual text below also be updated?

A NOTICE message containing the ignored row count is emitted at the end
of the COPY FROM if at least one row was discarded.

Changed.

PSA v3 patch with the above review comments addressed.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

v3-0001-Add-detailed-info-when-COPY-skips-soft-errors.patch (application/octet-stream)
From a40adad6e24d8b4cdfc8ec26749a5bf32915716a Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Wed, 28 Feb 2024 06:24:54 +0000
Subject: [PATCH v3] Add detailed info when COPY skips soft errors

This commit emits individual info like line number and column name
when COPY skips soft errors, because the summary containing the
total rows skipped isn't enough for users to know exactly which
rows in the input data are malformed.

It emits individual info and summary at NOTICE and INFO level
respectively to let users switch off individual info by changing
client_min_messages to WARNING. Also, one can get all of this
information into server logs by changing log_min_messages.
---
 doc/src/sgml/ref/copy.sgml           | 5 ++++-
 src/backend/commands/copyfrom.c      | 2 +-
 src/backend/commands/copyfromparse.c | 8 ++++++++
 src/test/regress/expected/copy2.out  | 6 +++++-
 4 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 55764fc1f2..c633ad5aa3 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -397,7 +397,10 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
       when the <literal>FORMAT</literal> is <literal>text</literal> or <literal>csv</literal>.
      </para>
      <para>
-      A <literal>NOTICE</literal> message containing the ignored row count is emitted at the end
+      When <literal>ignore</literal> option is specified, a
+      <literal>NOTICE</literal> message containing the line number and column
+      name is emitted for each discarded row, and <literal>INFO</literal>
+      message containing the ignored row count is emitted at the end
       of the <command>COPY FROM</command> if at least one row was discarded.
      </para>
     </listitem>
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 1fe70b9133..e11c2d1cff 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -1314,7 +1314,7 @@ CopyFrom(CopyFromState cstate)
 
 	if (cstate->opts.on_error != COPY_ON_ERROR_STOP &&
 		cstate->num_errors > 0)
-		ereport(NOTICE,
+		ereport(INFO,
 				errmsg_plural("%llu row was skipped due to data type incompatibility",
 							  "%llu rows were skipped due to data type incompatibility",
 							  (unsigned long long) cstate->num_errors,
diff --git a/src/backend/commands/copyfromparse.c b/src/backend/commands/copyfromparse.c
index 7cacd0b752..12e604acfa 100644
--- a/src/backend/commands/copyfromparse.c
+++ b/src/backend/commands/copyfromparse.c
@@ -968,7 +968,15 @@ NextCopyFrom(CopyFromState cstate, ExprContext *econtext,
 											(Node *) cstate->escontext,
 											&values[m]))
 			{
+				Assert(cstate->opts.on_error != COPY_ON_ERROR_STOP);
 				cstate->num_errors++;
+
+				ereport(NOTICE,
+						errmsg("detected data type incompatibility at line number %llu for column %s; COPY %s",
+							   (unsigned long long) cstate->cur_lineno,
+							   cstate->cur_attname,
+							   cstate->cur_relname));
+
 				return true;
 			}
 
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index 25c401ce34..15a1da2eac 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -730,7 +730,11 @@ COPY check_ign_err FROM STDIN WITH (on_error stop);
 ERROR:  invalid input syntax for type integer: "a"
 CONTEXT:  COPY check_ign_err, line 2, column n: "a"
 COPY check_ign_err FROM STDIN WITH (on_error ignore);
-NOTICE:  4 rows were skipped due to data type incompatibility
+NOTICE:  detected data type incompatibility at line number 2 for column n; COPY check_ign_err
+NOTICE:  detected data type incompatibility at line number 3 for column k; COPY check_ign_err
+NOTICE:  detected data type incompatibility at line number 4 for column m; COPY check_ign_err
+NOTICE:  detected data type incompatibility at line number 5 for column n; COPY check_ign_err
+INFO:  4 rows were skipped due to data type incompatibility
 SELECT * FROM check_ign_err;
  n |  m  | k 
 ---+-----+---
-- 
2.34.1

#13Michael Paquier
michael@paquier.xyz
In reply to: Bharath Rupireddy (#12)
Re: Add new error_action COPY ON_ERROR "log"

On Wed, Feb 28, 2024 at 12:10:00PM +0530, Bharath Rupireddy wrote:

On Mon, Feb 26, 2024 at 5:47 PM torikoshia <torikoshia@oss.nttdata.com> wrote:

+               if (cstate->opts.on_error != COPY_ON_ERROR_STOP)
+               {
+                   ereport(NOTICE,

I think cstate->opts.on_error is not COPY_ON_ERROR_STOP here, since if
it is COPY_ON_ERROR_STOP, InputFunctionCallSafe() should already have
errored out.

Should it be something like "Assert(cstate->opts.on_error !=
COPY_ON_ERROR_STOP)"?

Nice catch. When COPY_ON_ERROR_STOP is specified, the soft error
mechanism is not used and ereport raises a hard error, so this code is
never reached. An assertion seems a good choice to validate that the
state is what we expect. Done that way.

Hmm. I am not really on board with this patch, which would generate
one NOTICE message each time a row is incompatible in the soft error
mode. If you have a couple of billion rows to bulk-load into the
backend and even 0.01% of them are corrupted, you could end up with
more than 100k log entries, and all systems should be careful about
the log quantity generated, especially if we use the syslogger, which
could easily become a bottleneck.
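
For a sense of scale, the arithmetic can be written out as a tiny C helper. This is only a back-of-the-envelope sketch: expected_skipped_rows is a hypothetical name, "a couple of billion" is taken to mean 2 billion rows (an assumption), and 0.01% is expressed as 100 bad rows per million.

```c
#include <stdint.h>

/*
 * Estimate how many rows would be skipped, i.e. how many per-row messages
 * would be emitted with one message per skipped row. The failure rate is
 * given in rows per million to stay in integer arithmetic.
 */
static uint64_t
expected_skipped_rows(uint64_t total_rows, uint64_t bad_per_million)
{
    uint64_t whole_millions = total_rows / 1000000;
    uint64_t remainder = total_rows % 1000000;

    return whole_millions * bad_per_million +
           remainder * bad_per_million / 1000000;
}
```

Under those assumptions, expected_skipped_rows(2000000000, 100) gives 200000, so one message per skipped row means about 200k entries for a single COPY.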

The existing ON_ERROR controls what to do on error. I think that we'd
better control the amount of information reported with a completely
separate option, an option even different than where to redirect
errors (if required, which would be either the logs, the client, a
pipe, a combination of these or even all of them).
--
Michael

#14Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Michael Paquier (#13)
Re: Add new error_action COPY ON_ERROR "log"

On Fri, Mar 1, 2024 at 10:22 AM Michael Paquier <michael@paquier.xyz> wrote:

Nice catch. When COPY_ON_ERROR_STOP is specified, the soft error
mechanism is not used and ereport raises a hard error, so this code is
never reached. An assertion seems a good choice to validate that the
state is what we expect. Done that way.

Hmm. I am not really on board with this patch, which would generate
one NOTICE message each time a row is incompatible in the soft error
mode. If you have a couple of billion rows to bulk-load into the
backend and even 0.01% of them are corrupted, you could end up with
more than 100k log entries, and all systems should be careful about
the log quantity generated, especially if we use the syslogger, which
could easily become a bottleneck.

Hm. I was having some concerns about it as mentioned upthread. But,
thanks a lot for illustrating it.

The existing ON_ERROR controls what to do on error. I think that we'd
better control the amount of information reported with a completely
separate option, an option even different than where to redirect
errors (if required, which would be either the logs, the client, a
pipe, a combination of these or even all of them).

How about an extra option to error_action ignore-with-verbose which is
similar to ignore but when specified emits one NOTICE per malformed
row? With this, one can say COPY x FROM stdin (ON_ERROR
ignore-with-verbose);.

Alternatively, we can think of adding a new option verbose altogether
which can be used for not only this but for reporting some other COPY
related info/errors etc. With this, one can say COPY x FROM stdin
(VERBOSE on, ON_ERROR ignore);.

There's also another way of having a separate GUC, but -100 from me
for it, because it not only increases the total number of GUCs by 1,
but also might set a wrong precedent of having a new GUC for controlling
command-level outputs.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#15Michael Paquier
michael@paquier.xyz
In reply to: Bharath Rupireddy (#14)
Re: Add new error_action COPY ON_ERROR "log"

On Mon, Mar 04, 2024 at 05:00:00AM +0530, Bharath Rupireddy wrote:

How about an extra option to error_action ignore-with-verbose which is
similar to ignore but when specified emits one NOTICE per malformed
row? With this, one can say COPY x FROM stdin (ON_ERROR
ignore-with-verbose);.

Alternatively, we can think of adding a new option verbose altogether
which can be used for not only this but for reporting some other COPY
related info/errors etc. With this, one can say COPY x FROM stdin
(VERBOSE on, ON_ERROR ignore);.

I would suggest a completely separate option, because that offers more
flexibility as each option has a separate meaning. My main concern in
using one option to control them all is that one may want in the
future to be able to specify more combinations of actions at query
level, especially if more modes are added to the ON_ERROR mode. One
option would prevent that.

Perhaps ERROR_VERBOSE or ERROR_VERBOSITY would be better names, but
I'm never wedded to my naming suggestions. Bad history with the
matter.

There's also another way of having a separate GUC, but -100 from me
for it, because it not only increases the total number of GUCs by 1,
but also might set a wrong precedent of having a new GUC for controlling
command-level outputs.

What does this have to do with GUCs? The ON_ERROR option does not
have one.
--
Michael

#16Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Michael Paquier (#15)
1 attachment(s)
Re: Add new error_action COPY ON_ERROR "log"

On Tue, Mar 5, 2024 at 4:48 AM Michael Paquier <michael@paquier.xyz> wrote:

On Mon, Mar 04, 2024 at 05:00:00AM +0530, Bharath Rupireddy wrote:

How about an extra option to error_action ignore-with-verbose which is
similar to ignore but when specified emits one NOTICE per malformed
row? With this, one can say COPY x FROM stdin (ON_ERROR
ignore-with-verbose);.

Alternatively, we can think of adding a new option verbose altogether
which can be used for not only this but for reporting some other COPY
related info/errors etc. With this, one can say COPY x FROM stdin
(VERBOSE on, ON_ERROR ignore);.

I would suggest a completely separate option, because that offers more
flexibility as each option has a separate meaning. My main concern in
using one option to control them all is that one may want in the
future to be able to specify more combinations of actions at query
level, especially if more modes are added to the ON_ERROR mode. One
option would prevent that.

Perhaps ERROR_VERBOSE or ERROR_VERBOSITY would be better names, but
I'm never wedded to my naming suggestions. Bad history with the
matter.

+1 for a separate option. LOG_VERBOSITY seems a better and more generic
naming choice, because ON_ERROR ignore isn't actually an error
per se, IMO.

There's also another way of having a separate GUC, but -100 from me
for it, because it not only increases the total number of GUCs by 1,
but also might set a wrong precedent of having a new GUC for controlling
command-level outputs.

What does this have to do with GUCs? The ON_ERROR option does not
have one.

My thought was to have a separate GUC for deciding log level for COPY
command messages/errors similar to log_replication_commands. But
that's a no-go for sure when compared with a separate option.

Please see the attached v4 patch. If it looks good, I can pull the
LOG_VERBOSITY changes out into 0001, with 0002 containing the
detailed messages for discarded rows.
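
The idea of keeping the two options independent can be sketched outside the backend as follows. This is a standalone illustration, not the patch code: CopyOption, CopyOpts and parse_copy_opts are hypothetical names, the Boolean parsing is a simplified stand-in that accepts only true/on/false/off, and errors are returned as strings instead of being raised with ereport.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef enum { ON_ERROR_STOP, ON_ERROR_IGNORE } OnErrorChoice;  /* hypothetical */

typedef struct { const char *name; const char *value; } CopyOption;

typedef struct
{
    OnErrorChoice on_error;       /* what to do with a malformed row */
    bool          log_verbosity;  /* whether to report each skipped row */
} CopyOpts;

/* Simplified Boolean parsing; returns false if the value is not recognized. */
static bool
parse_bool(const char *v, bool *out)
{
    if (strcmp(v, "true") == 0 || strcmp(v, "on") == 0)
        { *out = true; return true; }
    if (strcmp(v, "false") == 0 || strcmp(v, "off") == 0)
        { *out = false; return true; }
    return false;
}

/* Returns NULL on success, or a static error message on failure. */
static const char *
parse_copy_opts(const CopyOption *opts, size_t nopts, CopyOpts *out)
{
    bool   on_error_specified = false;
    bool   log_verbosity_specified = false;
    size_t i;

    out->on_error = ON_ERROR_STOP;   /* defaults */
    out->log_verbosity = false;

    for (i = 0; i < nopts; i++)
    {
        if (strcmp(opts[i].name, "on_error") == 0)
        {
            if (on_error_specified)
                return "conflicting or redundant options";
            on_error_specified = true;
            if (strcmp(opts[i].value, "stop") == 0)
                out->on_error = ON_ERROR_STOP;
            else if (strcmp(opts[i].value, "ignore") == 0)
                out->on_error = ON_ERROR_IGNORE;
            else
                return "unrecognized on_error value";
        }
        else if (strcmp(opts[i].name, "log_verbosity") == 0)
        {
            if (log_verbosity_specified)
                return "conflicting or redundant options";
            log_verbosity_specified = true;
            if (!parse_bool(opts[i].value, &out->log_verbosity))
                return "log_verbosity requires a Boolean value";
        }
        else
            return "option not recognized";
    }
    return NULL;
}
```

Because the two settings land in separate fields, a new ON_ERROR mode added later would not need a matching "-with-verbose" variant; each combination is expressed by setting the two options independently.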

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

v4-0001-Add-detailed-info-when-COPY-skips-soft-errors.patch (application/x-patch)
From 802b5bde48cc378dc69baa1e781548bb7182fb45 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Wed, 6 Mar 2024 13:17:41 +0000
Subject: [PATCH v4] Add detailed info when COPY skips soft errors

---
 doc/src/sgml/ref/copy.sgml           | 21 +++++++++++++++++++--
 src/backend/commands/copy.c          |  8 ++++++++
 src/backend/commands/copyfromparse.c | 10 ++++++++++
 src/bin/psql/tab-complete.c          |  2 +-
 src/include/commands/copy.h          |  1 +
 src/test/regress/expected/copy2.out  |  6 +++++-
 src/test/regress/sql/copy2.sql       |  2 +-
 7 files changed, 45 insertions(+), 5 deletions(-)

diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 55764fc1f2..d0e58c0f9f 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -44,6 +44,7 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
     FORCE_NOT_NULL { ( <replaceable class="parameter">column_name</replaceable> [, ...] ) | * }
     FORCE_NULL { ( <replaceable class="parameter">column_name</replaceable> [, ...] ) | * }
     ON_ERROR '<replaceable class="parameter">error_action</replaceable>'
+    LOG_VERBOSITY [ <replaceable class="parameter">boolean</replaceable> ]
     ENCODING '<replaceable class="parameter">encoding_name</replaceable>'
 </synopsis>
  </refsynopsisdiv>
@@ -397,8 +398,12 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
       when the <literal>FORMAT</literal> is <literal>text</literal> or <literal>csv</literal>.
      </para>
      <para>
-      A <literal>NOTICE</literal> message containing the ignored row count is emitted at the end
-      of the <command>COPY FROM</command> if at least one row was discarded.
+      A <literal>NOTICE</literal> message containing the ignored row count is
+      emitted at the end of the <command>COPY FROM</command> if at least one
+      row was discarded. When <literal>LOG_VERBOSITY</literal> option is set to
+      <literal>true</literal> (or equivalent Boolean value), a
+      <literal>NOTICE</literal> message containing the line number and column
+      name for each discarded row is emitted.
      </para>
     </listitem>
    </varlistentry>
@@ -415,6 +420,18 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>LOG_VERBOSITY</literal></term>
+    <listitem>
+     <para>
+      Sets the verbosity of logged messages by <command>COPY</command>
+      command. As an example, see its usage for
+      <command>COPY FROM</command> command's <literal>ON_ERROR</literal>
+      clause with <literal>ignore</literal> option.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><literal>WHERE</literal></term>
     <listitem>
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 056b6733c8..aa9fee5a71 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -454,6 +454,7 @@ ProcessCopyOptions(ParseState *pstate,
 	bool		freeze_specified = false;
 	bool		header_specified = false;
 	bool		on_error_specified = false;
+	bool		log_verbosity_specified = false;
 	ListCell   *option;
 
 	/* Support external use for option sanity checking */
@@ -613,6 +614,13 @@ ProcessCopyOptions(ParseState *pstate,
 			on_error_specified = true;
 			opts_out->on_error = defGetCopyOnErrorChoice(defel, pstate, is_from);
 		}
+		else if (strcmp(defel->defname, "log_verbosity") == 0)
+		{
+			if (log_verbosity_specified)
+				errorConflictingDefElem(defel, pstate);
+			log_verbosity_specified = true;
+			opts_out->log_verbosity = defGetBoolean(defel);
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
diff --git a/src/backend/commands/copyfromparse.c b/src/backend/commands/copyfromparse.c
index 5682d5d054..5f6be5c400 100644
--- a/src/backend/commands/copyfromparse.c
+++ b/src/backend/commands/copyfromparse.c
@@ -967,7 +967,17 @@ NextCopyFrom(CopyFromState cstate, ExprContext *econtext,
 											(Node *) cstate->escontext,
 											&values[m]))
 			{
+				Assert(cstate->opts.on_error != COPY_ON_ERROR_STOP);
+
 				cstate->num_errors++;
+
+				if (cstate->opts.log_verbosity)
+					ereport(NOTICE,
+							errmsg("detected data type incompatibility at line number %llu for column %s; COPY %s",
+								   (unsigned long long) cstate->cur_lineno,
+								   cstate->cur_attname,
+								   cstate->cur_relname));
+
 				return true;
 			}
 
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index aa1acf8523..b6d5767acc 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -2900,7 +2900,7 @@ psql_completion(const char *text, int start, int end)
 		COMPLETE_WITH("FORMAT", "FREEZE", "DELIMITER", "NULL",
 					  "HEADER", "QUOTE", "ESCAPE", "FORCE_QUOTE",
 					  "FORCE_NOT_NULL", "FORCE_NULL", "ENCODING", "DEFAULT",
-					  "ON_ERROR");
+					  "ON_ERROR", "LOG_VERBOSITY");
 
 	/* Complete COPY <sth> FROM|TO filename WITH (FORMAT */
 	else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "FORMAT"))
diff --git a/src/include/commands/copy.h b/src/include/commands/copy.h
index b3da3cb0be..e194081fad 100644
--- a/src/include/commands/copy.h
+++ b/src/include/commands/copy.h
@@ -73,6 +73,7 @@ typedef struct CopyFormatOptions
 	bool	   *force_null_flags;	/* per-column CSV FN flags */
 	bool		convert_selectively;	/* do selective binary conversion? */
 	CopyOnErrorChoice on_error; /* what to do when error happened */
+	bool		log_verbosity;	/* log more verbose messages? */
 	List	   *convert_select; /* list of column names (can be NIL) */
 } CopyFormatOptions;
 
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index 25c401ce34..07e52bcd4a 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -729,7 +729,11 @@ CREATE TABLE check_ign_err (n int, m int[], k int);
 COPY check_ign_err FROM STDIN WITH (on_error stop);
 ERROR:  invalid input syntax for type integer: "a"
 CONTEXT:  COPY check_ign_err, line 2, column n: "a"
-COPY check_ign_err FROM STDIN WITH (on_error ignore);
+COPY check_ign_err FROM STDIN WITH (on_error ignore, log_verbosity on);
+NOTICE:  detected data type incompatibility at line number 2 for column n; COPY check_ign_err
+NOTICE:  detected data type incompatibility at line number 3 for column k; COPY check_ign_err
+NOTICE:  detected data type incompatibility at line number 4 for column m; COPY check_ign_err
+NOTICE:  detected data type incompatibility at line number 5 for column n; COPY check_ign_err
 NOTICE:  4 rows were skipped due to data type incompatibility
 SELECT * FROM check_ign_err;
  n |  m  | k 
diff --git a/src/test/regress/sql/copy2.sql b/src/test/regress/sql/copy2.sql
index b5e549e856..47d131c1ce 100644
--- a/src/test/regress/sql/copy2.sql
+++ b/src/test/regress/sql/copy2.sql
@@ -508,7 +508,7 @@ a	{2}	2
 
 5	{5}	5
 \.
-COPY check_ign_err FROM STDIN WITH (on_error ignore);
+COPY check_ign_err FROM STDIN WITH (on_error ignore, log_verbosity on);
 1	{1}	1
 a	{2}	2
 3	{3}	3333333333
-- 
2.34.1

#17Michael Paquier
michael@paquier.xyz
In reply to: Bharath Rupireddy (#16)
Re: Add new error_action COPY ON_ERROR "log"

On Wed, Mar 06, 2024 at 07:32:28PM +0530, Bharath Rupireddy wrote:

Please see the attached v4 patch. If it looks good, I can pull
LOG_VERBOSITY changes out into 0001 and with 0002 containing the
detailed messages for discarded rows.

The approach looks sensible seen from here.

+    LOG_VERBOSITY [ <replaceable class="parameter">boolean</replaceable> ]
[...]
+    <term><literal>LOG_VERBOSITY</literal></term>
+    <listitem>
+     <para>
+      Sets the verbosity of logged messages by <command>COPY</command>
+      command. As an example, see its usage for
+      <command>COPY FROM</command> command's <literal>ON_ERROR</literal>
+      clause with <literal>ignore</literal> option.
+     </para>
+    </listitem>

Is a boolean the best interface for the end-user, though? Maybe
something like a "mode" value would speak more than a yes/no from the
start, say a "default" mode to emit only the last LOG and a "verbose"
for the whole set in the case of ON_ERROR? That could use an enum
from the start internally, but that's an implementation detail.
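
To make the enum idea concrete, here is a minimal standalone sketch in C. It is not taken from any patch in this thread: the names SketchLogVerbosity and sketch_parse_log_verbosity are hypothetical, and it uses POSIX strcasecmp() where backend code would more likely use pg_strcasecmp() and raise an error through ereport().

```c
#include <stddef.h>
#include <strings.h>			/* strcasecmp(), POSIX */

/* Hypothetical verbosity levels; more members can be appended later. */
typedef enum SketchLogVerbosity
{
	SKETCH_LOG_VERBOSITY_DEFAULT = 0,	/* only the final summary message */
	SKETCH_LOG_VERBOSITY_VERBOSE,		/* one message per skipped value */
	SKETCH_LOG_VERBOSITY_INVALID		/* unrecognized option value */
} SketchLogVerbosity;

/*
 * Map an option string to a verbosity level, ignoring case.
 * A missing value (NULL) is treated as the default level.
 */
static SketchLogVerbosity
sketch_parse_log_verbosity(const char *value)
{
	if (value == NULL)
		return SKETCH_LOG_VERBOSITY_DEFAULT;
	if (strcasecmp(value, "default") == 0)
		return SKETCH_LOG_VERBOSITY_DEFAULT;
	if (strcasecmp(value, "verbose") == 0)
		return SKETCH_LOG_VERBOSITY_VERBOSE;

	/* The caller is expected to report this as an invalid parameter value. */
	return SKETCH_LOG_VERBOSITY_INVALID;
}
```

With this shape, a further level would be one more enum member plus one more comparison, and the option syntax seen by the user would not change.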

Describing what gets logged in the paragraph of ON_ERROR sounds fine,
especially if in the future more logs are added depending on other
options. That's an assumption at this stage, of course.

I am adding Alexander Korotkov in CC, as the original committer of
9e2d8701194f, as I assume that he may want to chime in on this
discussion.

Torikoshi-san or others, if you have any comments about the interface,
feel free.
--
Michael

#18torikoshia
torikoshia@oss.nttdata.com
In reply to: Michael Paquier (#17)
Re: Add new error_action COPY ON_ERROR "log"

On 2024-03-07 13:00, Michael Paquier wrote:

On Wed, Mar 06, 2024 at 07:32:28PM +0530, Bharath Rupireddy wrote:

Please see the attached v4 patch. If it looks good, I can pull
LOG_VERBOSITY changes out into 0001 and with 0002 containing the
detailed messages for discarded rows.

The approach looks sensible seen from here.

+    LOG_VERBOSITY [ <replaceable class="parameter">boolean</replaceable> ]
[...]
+    <term><literal>LOG_VERBOSITY</literal></term>
+    <listitem>
+     <para>
+      Sets the verbosity of logged messages by <command>COPY</command>
+      command. As an example, see its usage for
+      <command>COPY FROM</command> command's <literal>ON_ERROR</literal>
+      clause with <literal>ignore</literal> option.
+     </para>
+    </listitem>

Is a boolean the best interface for the end-user, though? Maybe
something like a "mode" value would speak more than a yes/no from the
start, say a "default" mode to emit only the last LOG and a "verbose"
for the whole set in the case of ON_ERROR? That could use an enum
from the start internally, but that's an implementation detail.

Describing what gets logged in the paragraph of ON_ERROR sounds fine,
especially if in the future more logs are added depending on other
options. That's an assumption at this stage, of course.

I am adding Alexander Korotkov in CC, as the original committer of
9e2d8701194f, as I assume that he may want to chime in this
discussion.

Torikoshi-san or others, if you have any comments about the interface,
feel free.

Thanks.

Maybe I'm being overly cautious, but I'm a little concerned about whether
the contents of this output can really be called verbose, since it reports
only the row and column where the error occurred, not the actual soft
error message.

Since the soft error mechanism can at least output the contents of soft
errors in the server log [1], it might be a good idea to use something
like a 'mode' value instead of boolean as Michael-san suggested, so that
the log output contents can be adjusted at multiple levels.

[1]: /messages/by-id/20230322175000.qbdctk7bnmifh5an@awork3.anarazel.de

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation

#19Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Michael Paquier (#17)
Re: Add new error_action COPY ON_ERROR "log"

On Thu, Mar 7, 2024 at 1:00 PM Michael Paquier <michael@paquier.xyz> wrote:

On Wed, Mar 06, 2024 at 07:32:28PM +0530, Bharath Rupireddy wrote:

Please see the attached v4 patch. If it looks good, I can pull
LOG_VERBOSITY changes out into 0001 and with 0002 containing the
detailed messages for discarded rows.

The approach looks sensible seen from here.

Looks like a good approach to me.

+    LOG_VERBOSITY [ <replaceable class="parameter">boolean</replaceable> ]
[...]
+    <term><literal>LOG_VERBOSITY</literal></term>
+    <listitem>
+     <para>
+      Sets the verbosity of logged messages by <command>COPY</command>
+      command. As an example, see its usage for
+      <command>COPY FROM</command> command's <literal>ON_ERROR</literal>
+      clause with <literal>ignore</literal> option.
+     </para>
+    </listitem>

Is a boolean the best interface for the end-user, though? Maybe
something like a "mode" value would speak more than a yes/no from the
start, say a "default" mode to emit only the last LOG and a "verbose"
for the whole set in the case of ON_ERROR? That could use an enum
from the start internally, but that's an implementation detail.

+1 for making it an enum, so that we will be able to have multiple
levels for example to get actual soft error contents.

One question I have: do we want to write multiple NOTICE messages
for one row if the row has malformed data in several columns?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#20Michael Paquier
michael@paquier.xyz
In reply to: Masahiko Sawada (#19)
Re: Add new error_action COPY ON_ERROR "log"

On Thu, Mar 07, 2024 at 03:52:41PM +0900, Masahiko Sawada wrote:

One question I have is; do we want to write multiple NOTICE messages
for one row if the row has malformed data on some columns?

Good idea. We can do that as the field strings are already parsed.
--
Michael

#21Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Michael Paquier (#17)
Re: Add new error_action COPY ON_ERROR "log"

On Thu, Mar 7, 2024 at 9:30 AM Michael Paquier <michael@paquier.xyz> wrote:

Is a boolean the best interface for the end-user, though? Maybe
something like a "mode" value would speak more than a yes/no from the
start, say a "default" mode to emit only the last LOG and a "verbose"
for the whole set in the case of ON_ERROR? That could use an enum
from the start internally, but that's an implementation detail.

I'm okay with it. But, help me understand it better. We want the
'log_verbosity' clause to have options 'default' and 'verbose', right?
And, later it can also be extended to contain all the LOG levels like
'notice', 'error', 'info', 'debugX' etc. depending on the need,
right?

One more thing: how does it sound to use both "verbosity" and "verbose"
together, as in log_verbosity verbose below? Is this okay?

COPY check_ign_err FROM STDIN WITH (on_error ignore, log_verbosity verbose);

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#22Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Michael Paquier (#20)
Re: Add new error_action COPY ON_ERROR "log"

On Thu, Mar 7, 2024 at 12:37 PM Michael Paquier <michael@paquier.xyz> wrote:

On Thu, Mar 07, 2024 at 03:52:41PM +0900, Masahiko Sawada wrote:

One question I have is; do we want to write multiple NOTICE messages
for one row if the row has malformed data on some columns?

Good idea. We can do that as the field strings are already parsed.

Nice catch. So, are you suggesting that we log one NOTICE message per
row even if multiple columns in the single row fail to parse or are
malformed?

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#23Michael Paquier
michael@paquier.xyz
In reply to: Bharath Rupireddy (#21)
Re: Add new error_action COPY ON_ERROR "log"

On Thu, Mar 07, 2024 at 12:48:12PM +0530, Bharath Rupireddy wrote:

I'm okay with it. But, help me understand it better. We want the
'log_verbosity' clause to have options 'default' and 'verbose', right?
And, later it can also be extended to contain all the LOG levels like
'notice', 'error', 'info' , 'debugX' etc. depending on the need,
right?

You could, or names that have some status like row_details, etc.

One more thing, how does it sound using both verbosity and verbose in
log_verbosity verbose something like below? Is this okay?

There's some history with this pattern, in psql at least, with \set
VERBOSITY verbose. For the patch, I would tend to choose these two,
but that's as far as my opinion goes and I am OK if other ideas gather
more votes.
--
Michael

#24Michael Paquier
michael@paquier.xyz
In reply to: Bharath Rupireddy (#22)
Re: Add new error_action COPY ON_ERROR "log"

On Thu, Mar 07, 2024 at 12:50:33PM +0530, Bharath Rupireddy wrote:

Nice catch. So, are you suggesting to log one NOTICE message per row
even if multiple columns in the single row fail to parse or are
malformed?

One NOTICE per malformed value, I guess, which would be easier to
parse particularly if the values are long (like, JSON-long).
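
As a rough illustration of "one NOTICE per malformed value, one skipped row", here is a standalone C sketch. It is only an assumption about how such a loop could look, not code from the patch: sketch_parse_int32 and sketch_check_row are hypothetical names, strtol() stands in for the soft-error-aware input functions, and fprintf(stderr, ...) stands in for ereport(NOTICE, ...).

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a soft-error-aware int4 input function. */
static bool
sketch_parse_int32(const char *s, int *out)
{
	char	   *end;
	long		v;

	assert(s != NULL && out != NULL);

	errno = 0;
	v = strtol(s, &end, 10);

	/* Reject empty input, trailing garbage, and out-of-range values. */
	if (end == s || *end != '\0' || errno == ERANGE ||
		v < INT_MIN || v > INT_MAX)
		return false;

	*out = (int) v;
	return true;
}

/*
 * Check every field of one already-split row.  Emits one message per
 * malformed field and returns the number of malformed fields, so the
 * caller can count the row as skipped exactly once when the result is > 0.
 */
static int
sketch_check_row(unsigned long lineno,
				 const char *const *fields,
				 const char *const *colnames,
				 int nfields,
				 int *values)
{
	int			nbad = 0;

	for (int i = 0; i < nfields; i++)
	{
		if (!sketch_parse_int32(fields[i], &values[i]))
		{
			fprintf(stderr,
					"NOTICE:  malformed value at line %lu for column %s\n",
					lineno, colnames[i]);
			nbad++;
		}
	}

	return nbad;
}
```

The caller would bump its skipped-row counter once whenever the returned count is nonzero, which keeps the final summary a count of rows rather than a count of bad values.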
--
Michael

#25Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Michael Paquier (#23)
2 attachment(s)
Re: Add new error_action COPY ON_ERROR "log"

On Thu, Mar 7, 2024 at 12:54 PM Michael Paquier <michael@paquier.xyz> wrote:

On Thu, Mar 07, 2024 at 12:48:12PM +0530, Bharath Rupireddy wrote:

I'm okay with it. But, help me understand it better. We want the
'log_verbosity' clause to have options 'default' and 'verbose', right?
And, later it can also be extended to contain all the LOG levels like
'notice', 'error', 'info' , 'debugX' etc. depending on the need,
right?

You could, or names that have some status like row_details, etc.

One more thing, how does it sound using both verbosity and verbose in
log_verbosity verbose something like below? Is this okay?

There's some history with this pattern in psql at least with \set
VERBOSITY verbose. For the patch, I would tend to choose these two,
but that's as far as my opinion goes and I am OK other ideas gather
more votes.

Please see the attached v5-0001 patch implementing LOG_VERBOSITY with
options 'default' and 'verbose'. v5-0002 adds the detailed info to
ON_ERROR 'ignore' option.

We have a CF entry https://commitfest.postgresql.org/47/4798/ for the
original idea proposed in this thread, that is, to have the ON_ERROR
'log' option. I'll probably start a new thread and add a new CF entry
in the next commitfest if there's no objection from anyone.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

v5-0001-Add-LOG_VERBOSITY-option-to-COPY-command.patch (application/octet-stream)
From 04327f7b6d649d74bf25e09b42a131e220f61068 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Fri, 8 Mar 2024 09:34:00 +0000
Subject: [PATCH v5 1/2] Add LOG_VERBOSITY option to COPY command

---
 doc/src/sgml/ref/copy.sgml          | 14 +++++++++++
 src/backend/commands/copy.c         | 38 +++++++++++++++++++++++++++++
 src/bin/psql/tab-complete.c         |  6 ++++-
 src/include/commands/copy.h         | 10 ++++++++
 src/test/regress/expected/copy2.out |  8 ++++++
 src/test/regress/sql/copy2.sql      |  2 ++
 src/tools/pgindent/typedefs.list    |  1 +
 7 files changed, 78 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 55764fc1f2..67ba6212fe 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -45,6 +45,7 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
     FORCE_NULL { ( <replaceable class="parameter">column_name</replaceable> [, ...] ) | * }
     ON_ERROR '<replaceable class="parameter">error_action</replaceable>'
     ENCODING '<replaceable class="parameter">encoding_name</replaceable>'
+    LOG_VERBOSITY [ <replaceable class="parameter">mode</replaceable> ]
 </synopsis>
  </refsynopsisdiv>
 
@@ -415,6 +416,19 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>LOG_VERBOSITY</literal></term>
+    <listitem>
+     <para>
+      Sets the verbosity of logged messages by <command>COPY</command> command.
+      A <replaceable class="parameter">mode</replaceable> value of
+      <literal>verbose</literal> can be used to emit more informative messages
+      by the command, while value of <literal>default</literal> (which is the
+      default) can be used to not log any additional messages.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><literal>WHERE</literal></term>
     <listitem>
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 056b6733c8..23eb8c9c79 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -428,6 +428,36 @@ defGetCopyOnErrorChoice(DefElem *def, ParseState *pstate, bool is_from)
 	return COPY_ON_ERROR_STOP;	/* keep compiler quiet */
 }
 
+/*
+ * Extract a CopyLogVerbosityChoice value from a DefElem.
+ */
+static CopyLogVerbosityChoice
+defGetCopyLogVerbosityChoice(DefElem *def, ParseState *pstate)
+{
+	char	   *sval;
+
+	/*
+	 * If no parameter value given, assume the default value.
+	 */
+	if (def->arg == NULL)
+		return COPY_LOG_VERBOSITY_DEFAULT;
+
+	/*
+	 * Allow "default", or "verbose" values.
+	 */
+	sval = defGetString(def);
+	if (pg_strcasecmp(sval, "default") == 0)
+		return COPY_LOG_VERBOSITY_DEFAULT;
+	if (pg_strcasecmp(sval, "verbose") == 0)
+		return COPY_LOG_VERBOSITY_VERBOSE;
+
+	ereport(ERROR,
+			(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+			 errmsg("COPY LOG_VERBOSITY \"%s\" not recognized", sval),
+			 parser_errposition(pstate, def->location)));
+	return COPY_LOG_VERBOSITY_DEFAULT;	/* keep compiler quiet */
+}
+
 /*
  * Process the statement option list for COPY.
  *
@@ -454,6 +484,7 @@ ProcessCopyOptions(ParseState *pstate,
 	bool		freeze_specified = false;
 	bool		header_specified = false;
 	bool		on_error_specified = false;
+	bool		log_verbosity_specified = false;
 	ListCell   *option;
 
 	/* Support external use for option sanity checking */
@@ -613,6 +644,13 @@ ProcessCopyOptions(ParseState *pstate,
 			on_error_specified = true;
 			opts_out->on_error = defGetCopyOnErrorChoice(defel, pstate, is_from);
 		}
+		else if (strcmp(defel->defname, "log_verbosity") == 0)
+		{
+			if (log_verbosity_specified)
+				errorConflictingDefElem(defel, pstate);
+			log_verbosity_specified = true;
+			opts_out->log_verbosity = defGetCopyLogVerbosityChoice(defel, pstate);
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 73133ce735..9305800340 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -2901,7 +2901,7 @@ psql_completion(const char *text, int start, int end)
 		COMPLETE_WITH("FORMAT", "FREEZE", "DELIMITER", "NULL",
 					  "HEADER", "QUOTE", "ESCAPE", "FORCE_QUOTE",
 					  "FORCE_NOT_NULL", "FORCE_NULL", "ENCODING", "DEFAULT",
-					  "ON_ERROR");
+					  "ON_ERROR", "LOG_VERBOSITY");
 
 	/* Complete COPY <sth> FROM|TO filename WITH (FORMAT */
 	else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "FORMAT"))
@@ -2911,6 +2911,10 @@ psql_completion(const char *text, int start, int end)
 	else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "ON_ERROR"))
 		COMPLETE_WITH("stop", "ignore");
 
+	/* Complete COPY <sth> FROM filename WITH (LOG_VERBOSITY */
+	else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "LOG_VERBOSITY"))
+		COMPLETE_WITH("default", "verbose");
+
 	/* Complete COPY <sth> FROM <sth> WITH (<options>) */
 	else if (Matches("COPY|\\copy", MatchAny, "FROM", MatchAny, "WITH", MatchAny))
 		COMPLETE_WITH("WHERE");
diff --git a/src/include/commands/copy.h b/src/include/commands/copy.h
index b3da3cb0be..99d183fa4d 100644
--- a/src/include/commands/copy.h
+++ b/src/include/commands/copy.h
@@ -40,6 +40,15 @@ typedef enum CopyOnErrorChoice
 	COPY_ON_ERROR_IGNORE,		/* ignore errors */
 } CopyOnErrorChoice;
 
+/*
+ * Represents verbosity of logged messages by COPY command.
+ */
+typedef enum CopyLogVerbosityChoice
+{
+	COPY_LOG_VERBOSITY_DEFAULT = 0, /* logs no additional messages, default */
+	COPY_LOG_VERBOSITY_VERBOSE, /* logs additional messages */
+} CopyLogVerbosityChoice;
+
 /*
  * A struct to hold COPY options, in a parsed form. All of these are related
  * to formatting, except for 'freeze', which doesn't really belong here, but
@@ -73,6 +82,7 @@ typedef struct CopyFormatOptions
 	bool	   *force_null_flags;	/* per-column CSV FN flags */
 	bool		convert_selectively;	/* do selective binary conversion? */
 	CopyOnErrorChoice on_error; /* what to do when error happened */
+	CopyLogVerbosityChoice log_verbosity;	/* verbosity of logged messages */
 	List	   *convert_select; /* list of column names (can be NIL) */
 } CopyFormatOptions;
 
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index 25c401ce34..62406ef827 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -81,6 +81,10 @@ COPY x from stdin (on_error ignore, on_error ignore);
 ERROR:  conflicting or redundant options
 LINE 1: COPY x from stdin (on_error ignore, on_error ignore);
                                             ^
+COPY x from stdin (log_verbosity 'default', log_verbosity 'verbose');
+ERROR:  conflicting or redundant options
+LINE 1: COPY x from stdin (log_verbosity 'default', log_verbosity 'v...
+                                                    ^
 -- incorrect options
 COPY x to stdin (format BINARY, delimiter ',');
 ERROR:  cannot specify DELIMITER in BINARY mode
@@ -108,6 +112,10 @@ COPY x to stdin (format BINARY, on_error unsupported);
 ERROR:  COPY ON_ERROR cannot be used with COPY TO
 LINE 1: COPY x to stdin (format BINARY, on_error unsupported);
                                         ^
+COPY x to stdout (log_verbosity 'unsupported');
+ERROR:  COPY LOG_VERBOSITY "unsupported" not recognized
+LINE 1: COPY x to stdout (log_verbosity 'unsupported');
+                          ^
 -- too many columns in column list: should fail
 COPY x (a, b, c, d, e, d, c) from stdin;
 ERROR:  column "d" specified more than once
diff --git a/src/test/regress/sql/copy2.sql b/src/test/regress/sql/copy2.sql
index b5e549e856..5116157cc9 100644
--- a/src/test/regress/sql/copy2.sql
+++ b/src/test/regress/sql/copy2.sql
@@ -67,6 +67,7 @@ COPY x from stdin (force_null (a), force_null (b));
 COPY x from stdin (convert_selectively (a), convert_selectively (b));
 COPY x from stdin (encoding 'sql_ascii', encoding 'sql_ascii');
 COPY x from stdin (on_error ignore, on_error ignore);
+COPY x from stdin (log_verbosity 'default', log_verbosity 'verbose');
 
 -- incorrect options
 COPY x to stdin (format BINARY, delimiter ',');
@@ -80,6 +81,7 @@ COPY x to stdin (format CSV, force_not_null(a));
 COPY x to stdout (format TEXT, force_null(a));
 COPY x to stdin (format CSV, force_null(a));
 COPY x to stdin (format BINARY, on_error unsupported);
+COPY x to stdout (log_verbosity 'unsupported');
 
 -- too many columns in column list: should fail
 COPY x (a, b, c, d, e, d, c) from stdin;
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index d3a7f75b08..b9b9f8a7f0 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -478,6 +478,7 @@ CopyFromState
 CopyFromStateData
 CopyHeaderChoice
 CopyInsertMethod
+CopyLogVerbosityChoice
 CopyMultiInsertBuffer
 CopyMultiInsertInfo
 CopyOnErrorChoice
-- 
2.34.1

v5-0002-Add-detailed-info-when-COPY-skips-soft-errors.patch (application/octet-stream)
From e72035d692fb092376f6a2aaa047a94e129fcbfc Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Fri, 8 Mar 2024 09:48:36 +0000
Subject: [PATCH v5 2/2] Add detailed info when COPY skips soft errors

---
 doc/src/sgml/ref/copy.sgml           | 12 +++++++++---
 src/backend/commands/copyfromparse.c | 10 ++++++++++
 src/test/regress/expected/copy2.out  |  7 ++++++-
 src/test/regress/sql/copy2.sql       |  4 +++-
 4 files changed, 28 insertions(+), 5 deletions(-)

diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 67ba6212fe..ef5a12e0d3 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -398,8 +398,12 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
       when the <literal>FORMAT</literal> is <literal>text</literal> or <literal>csv</literal>.
      </para>
      <para>
-      A <literal>NOTICE</literal> message containing the ignored row count is emitted at the end
-      of the <command>COPY FROM</command> if at least one row was discarded.
+      A <literal>NOTICE</literal> message containing the ignored row count is
+      emitted at the end of the <command>COPY FROM</command> if at least one
+      row was discarded. When <literal>LOG_VERBOSITY</literal> option is set to
+      <literal>verbose</literal>, a <literal>NOTICE</literal> message
+      containing the line number and column name for each discarded row is
+      emitted.
      </para>
     </listitem>
    </varlistentry>
@@ -424,7 +428,9 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
       A <replaceable class="parameter">mode</replaceable> value of
       <literal>verbose</literal> can be used to emit more informative messages
       by the command, while value of <literal>default</literal> (which is the
-      default) can be used to not log any additional messages.
+      default) can be used to not log any additional messages. As an example,
+      see its usage for <command>COPY FROM</command> command when
+      <literal>ON_ERROR</literal> option is set to <literal>ignore</literal>.
      </para>
     </listitem>
    </varlistentry>
diff --git a/src/backend/commands/copyfromparse.c b/src/backend/commands/copyfromparse.c
index 5682d5d054..a7ad6c17c8 100644
--- a/src/backend/commands/copyfromparse.c
+++ b/src/backend/commands/copyfromparse.c
@@ -967,7 +967,17 @@ NextCopyFrom(CopyFromState cstate, ExprContext *econtext,
 											(Node *) cstate->escontext,
 											&values[m]))
 			{
+				Assert(cstate->opts.on_error != COPY_ON_ERROR_STOP);
+
 				cstate->num_errors++;
+
+				if (cstate->opts.log_verbosity == COPY_LOG_VERBOSITY_VERBOSE)
+					ereport(NOTICE,
+							errmsg("detected data type incompatibility at line number %llu for column %s; COPY %s",
+								   (unsigned long long) cstate->cur_lineno,
+								   cstate->cur_attname,
+								   cstate->cur_relname));
+
 				return true;
 			}
 
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index 62406ef827..c6655000e4 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -737,7 +737,12 @@ CREATE TABLE check_ign_err (n int, m int[], k int);
 COPY check_ign_err FROM STDIN WITH (on_error stop);
 ERROR:  invalid input syntax for type integer: "a"
 CONTEXT:  COPY check_ign_err, line 2, column n: "a"
-COPY check_ign_err FROM STDIN WITH (on_error ignore);
+-- tests for options on_error and log_verbosity
+COPY check_ign_err FROM STDIN WITH (on_error ignore, log_verbosity 'verbose');
+NOTICE:  detected data type incompatibility at line number 2 for column n; COPY check_ign_err
+NOTICE:  detected data type incompatibility at line number 3 for column k; COPY check_ign_err
+NOTICE:  detected data type incompatibility at line number 4 for column m; COPY check_ign_err
+NOTICE:  detected data type incompatibility at line number 5 for column n; COPY check_ign_err
 NOTICE:  4 rows were skipped due to data type incompatibility
 SELECT * FROM check_ign_err;
  n |  m  | k 
diff --git a/src/test/regress/sql/copy2.sql b/src/test/regress/sql/copy2.sql
index 5116157cc9..b637a5b3bb 100644
--- a/src/test/regress/sql/copy2.sql
+++ b/src/test/regress/sql/copy2.sql
@@ -510,7 +510,9 @@ a	{2}	2
 
 5	{5}	5
 \.
-COPY check_ign_err FROM STDIN WITH (on_error ignore);
+
+-- tests for options on_error and log_verbosity
+COPY check_ign_err FROM STDIN WITH (on_error ignore, log_verbosity 'verbose');
 1	{1}	1
 a	{2}	2
 3	{3}	3333333333
-- 
2.34.1

#26Michael Paquier
michael@paquier.xyz
In reply to: Bharath Rupireddy (#25)
Re: Add new error_action COPY ON_ERROR "log"

On Fri, Mar 08, 2024 at 03:36:30PM +0530, Bharath Rupireddy wrote:

Please see the attached v5-0001 patch implementing LOG_VERBOSITY with
options 'default' and 'verbose'. v5-0002 adds the detailed info to
ON_ERROR 'ignore' option.

I may be reading this patch set incorrectly, but why doesn't this
patch generate one NOTICE per attribute, as suggested by Sawada-san,
incrementing num_errors once per row when the last attribute has been
processed? Also, why not have a test that checks that multiple rows
spawn more than one message each, in some distributed fashion? Did you
look at this idea?

We have a CF entry https://commitfest.postgresql.org/47/4798/ for the
original idea proposed in this thread, that is, to have the ON_ERROR
'log' option. I'll probably start a new thread and add a new CF entry
in the next commitfest if there's no objection from anyone.

Hmm. You are referring to the part where you'd want to control where
the errors are sent, right? At least, what you have here would
address point 1) mentioned in the first message of this thread.
--
Michael

#27Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Michael Paquier (#26)
Re: Add new error_action COPY ON_ERROR "log"

On Fri, Mar 8, 2024 at 4:42 PM Michael Paquier <michael@paquier.xyz> wrote:

On Fri, Mar 08, 2024 at 03:36:30PM +0530, Bharath Rupireddy wrote:

Please see the attached v5-0001 patch implementing LOG_VERBOSITY with
options 'default' and 'verbose'. v5-0002 adds the detailed info to
ON_ERROR 'ignore' option.

I may be reading this patch set incorrectly, but why doesn't this
patch generate one NOTICE per attribute, as suggested by Sawada-san,
incrementing num_errors once per row when the last attribute has been
processed? Also, why not have a test that checks that multiple rows
spawn more than more messages in some distributed fashion? Did you
look at this idea?

If one NOTICE per attribute and incrementing num_errors once per row is
implemented, the command ends up failing with ERROR: missing data for
column "m" on the row where all columns are empty. Should we treat this
ERROR as a soft error too if on_error ignore is specified? Or shall we
discuss this idea separately?

-- tests for options on_error and log_verbosity
COPY check_ign_err FROM STDIN WITH (on_error ignore, log_verbosity 'verbose');
1 {1} 1
a {2} 2
3 {3} 3333333333
4 {a, 4} 4

5 {5} 5
\.

NOTICE: detected data type incompatibility at line number 2 for
column n; COPY check_ign_err
NOTICE: detected data type incompatibility at line number 2 for
column m; COPY check_ign_err
NOTICE: detected data type incompatibility at line number 2 for
column k; COPY check_ign_err
NOTICE: detected data type incompatibility at line number 3 for
column k; COPY check_ign_err
NOTICE: detected data type incompatibility at line number 4 for
column m; COPY check_ign_err
NOTICE: detected data type incompatibility at line number 4 for
column k; COPY check_ign_err
NOTICE: detected data type incompatibility at line number 5 for
column n; COPY check_ign_err
ERROR: missing data for column "m"
CONTEXT: COPY check_ign_err, line 5: ""
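
To make the difference between the two reporting schemes discussed above concrete, here is a minimal C sketch. It is not taken from the patch set: ErrorSummary, count_first_failure and count_per_attribute are hypothetical names, and the failed[] matrix is an assumed stand-in for the result of each attribute's input conversion. It only models conversion failures on rows that have all their fields, so the "missing data" error shown above is outside its scope.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary of one COPY run; not a PostgreSQL structure. */
typedef struct ErrorSummary
{
	int			notices;		/* NOTICE messages that would be emitted */
	int			num_errors;		/* rows counted as skipped */
} ErrorSummary;

/*
 * Scheme 1: stop at the first failing attribute of a row.  One NOTICE and
 * one num_errors increment per bad row.
 *
 * failed[row * natts + att] is nonzero when that attribute's input
 * conversion failed.
 */
static ErrorSummary
count_first_failure(const int *failed, int nrows, int natts)
{
	ErrorSummary s = {0, 0};

	assert(failed != NULL && nrows >= 0 && natts > 0);
	for (int row = 0; row < nrows; row++)
	{
		for (int att = 0; att < natts; att++)
		{
			if (failed[row * natts + att])
			{
				s.notices++;
				s.num_errors++;
				break;			/* the rest of the row is not examined */
			}
		}
	}
	return s;
}

/*
 * Scheme 2: examine every attribute.  One NOTICE per failing attribute,
 * num_errors incremented once after the last attribute of a bad row.
 */
static ErrorSummary
count_per_attribute(const int *failed, int nrows, int natts)
{
	ErrorSummary s = {0, 0};

	assert(failed != NULL && nrows >= 0 && natts > 0);
	for (int row = 0; row < nrows; row++)
	{
		int			row_failed = 0;

		for (int att = 0; att < natts; att++)
		{
			if (failed[row * natts + att])
			{
				s.notices++;
				row_failed = 1;
			}
		}
		if (row_failed)
			s.num_errors++;
	}
	return s;
}
```

For three rows of three attributes where the second row fails on its first and third attributes and the third row fails on its last attribute, scheme 1 yields 2 notices and 2 skipped rows, while scheme 2 yields 3 notices and 2 skipped rows.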

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#28Michael Paquier
michael@paquier.xyz
In reply to: Bharath Rupireddy (#27)
Re: Add new error_action COPY ON_ERROR "log"

On Sat, Mar 09, 2024 at 12:01:49AM +0530, Bharath Rupireddy wrote:

If NOTICE per attribute and incrementing num_errors per row is
implemented, it ends up erroring out with ERROR: missing data for
column "m" for all-column-empty-row. Shall we treat this ERROR softly
too if on_error ignore is specified? Or shall we discuss this idea
separately?

ERROR: missing data for column "m"
CONTEXT: COPY check_ign_err, line 5: ""

Hmm. I have spent some time looking at the behavior of ON_ERROR, and
there are two tests in copy2.sql, one for the case where there is more
data than attributes and a second where there is not enough data in a
row that checks for errors.

For example, take this table:
=# create table tab (a int, b int, c int);
CREATE TABLE

This case works, even though the row clearly does not have enough
attributes. The first attribute can be parsed, but not the second one,
and this causes the remaining data of the row to be skipped:
=# copy tab from stdin (delimiter ',', on_error ignore);
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.

1,
\.

NOTICE: 00000: 1 row was skipped due to data type incompatibility
LOCATION: CopyFrom, copyfrom.c:1314
COPY 0

This case fails. The first and the second attributes can be parsed,
and the line fails because the last attribute is missing for lack of
a delimiter:
=# copy tab from stdin (delimiter ',', on_error ignore);
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.

1,1
\.

ERROR: 22P04: missing data for column "c"
CONTEXT: COPY tab, line 1: "1,1"

This brings up a weird edge case for the all-column-empty-row case you
are mentioning, once we try to get information about all the rows we
should expect. But it has the side effect of breaking a case that's
intended to work with ON_ERROR, as far as I understand, which is to
skip a row entirely on the first conversion error we find, even if
there is no data afterwards. I was a bit confused by that at first, but
I can also see why it is useful as-is on HEAD.

At the end of the day, this comes down to what is more helpful to the
user. And I'd side with what ON_ERROR does currently, which
is what your patch relies on: on the first conversion failure, give up
and skip the rest of the row because we cannot trust its contents.
That's my way of saying that I am fine with the proposal of your
patch, and that we cannot provide the full state of a row without
making the error stack of COPY more invasive.
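
The two psql sessions above can be modeled with a small C sketch. This is only an illustration under stated assumptions, not the code in copyfromparse.c: RowOutcome, parse_int and process_line are hypothetical names, all columns are assumed to be int, the delimiter is fixed to ',', and the "extra data" check for rows with too many fields is not modeled. The point it shows is the ordering: the field-count check for an attribute happens before its conversion, and the row is abandoned on the first conversion failure.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#define MAX_FIELDS 16

/* Hypothetical outcome codes for this sketch; not PostgreSQL identifiers. */
typedef enum RowOutcome
{
	ROW_LOADED,					/* every attribute parsed */
	ROW_SKIPPED,				/* soft error: conversion failed, row ignored */
	ROW_HARD_ERROR				/* missing data: raised even with on_error ignore */
} RowOutcome;

/*
 * Parse the whole string as an int.  Returns 1 on success, 0 on failure.
 * Stricter than PostgreSQL's integer input (no trailing whitespace).
 */
static int
parse_int(const char *s, int *out)
{
	char	   *end;
	long		v;

	if (*s == '\0')
		return 0;				/* an empty field is not a valid integer */
	errno = 0;
	v = strtol(s, &end, 10);
	if (errno != 0 || *end != '\0' || v < INT_MIN || v > INT_MAX)
		return 0;
	*out = (int) v;
	return 1;
}

/* Process one input line for a table with natts int columns. */
static RowOutcome
process_line(const char *line, int natts, int *values)
{
	char		buf[256];
	char	   *fields[MAX_FIELDS];
	int			nfields = 0;
	char	   *p;

	assert(natts > 0 && natts <= MAX_FIELDS);
	assert(strlen(line) < sizeof(buf));
	strcpy(buf, line);

	/* Split the line into fields first, as a separate step. */
	p = buf;
	fields[nfields++] = p;
	while ((p = strchr(p, ',')) != NULL && nfields < MAX_FIELDS)
	{
		*p++ = '\0';
		fields[nfields++] = p;
	}

	for (int i = 0; i < natts; i++)
	{
		if (i >= nfields)
			return ROW_HARD_ERROR;	/* "missing data for column" */
		if (!parse_int(fields[i], &values[i]))
			return ROW_SKIPPED;	/* first conversion failure: give up on row */
	}
	return ROW_LOADED;
}
```

With natts = 3, process_line("1,", ...) returns ROW_SKIPPED because the empty second field fails conversion before the missing third field is ever looked at, while process_line("1,1", ...) returns ROW_HARD_ERROR because both fields convert and the third one is missing.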

Perhaps we could discuss this idea of ensuring that all the attributes
are on a row in a different thread, as you say, but I am not really
convinced that there's a strong need for it at this stage as ON_ERROR
is new to v17. So it does not sound like a bad thing to let it brew
more before implementing more options and making the COPY paths more
complicated than they already are. I suspect that this may trigger
some discussion during the beta/stability phase depending on the
initial feedback. Perhaps I'm wrong, though.

+      <literal>verbose</literal>, a <literal>NOTICE</literal> message
+      containing the line number and column name for each discarded row is
+      emitted.

This should clarify that the column name refers to the attribute where
the input conversion has failed, I guess. Specifying only "column
name" without more context is a bit confusing.
--
Michael

#29Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Michael Paquier (#28)
2 attachment(s)
Re: Add new error_action COPY ON_ERROR "log"

On Mon, Mar 11, 2024 at 11:16 AM Michael Paquier <michael@paquier.xyz> wrote:

At the end of the day, this comes down to what is more helpful to the
user. And I'd side with what ON_ERROR does currently, which
is what your patch relies on: on the first conversion failure, give up
and skip the rest of the row because we cannot trust its contents.
That's my way of saying that I am fine with the proposal of your
patch, and that we cannot provide the full state of a row without
making the error stack of COPY more invasive.

+1.

+      <literal>verbose</literal>, a <literal>NOTICE</literal> message
+      containing the line number and column name for each discarded row is
+      emitted.

This should clarify that the column name refers to the attribute where
the input conversion has failed, I guess. Specifying only "column
name" without more context is a bit confusing.

Done.

Please see the attached v6 patch set.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

v6-0001-Add-LOG_VERBOSITY-option-to-COPY-command.patch (application/octet-stream)
From 78db3e788719fa1f9735f35dad8cc0f7e10a8742 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Mon, 11 Mar 2024 11:37:49 +0000
Subject: [PATCH v6 1/2] Add LOG_VERBOSITY option to COPY command

This commit adds a new option LOG_VERBOSITY to set the verbosity of
logged messages by COPY command. A value of 'verbose' can be used
to emit more informative messages by the command, while the value
of 'default' (which is the default) can be used to not log any
additional messages. More values such as 'terse', 'row_details'
etc. can be added based on the need  to the LOG_VERBOSITY option.

An upcoming commit for emitting more info on soft errors by
COPY FROM command with ON_ERROR 'ignore' uses this.

Author: Bharath Rupireddy
Reviewed-by: Michael Paquier, Masahiko Sawada
Reviewed-by: Atsushi Torikoshi
Discussion: https://www.postgresql.org/message-id/CALj2ACXNA0focNeriYRvQQaCGc4CsTuOnFbzF9LqTKNWxuJdhA%40mail.gmail.com
---
 doc/src/sgml/ref/copy.sgml          | 14 +++++++++++
 src/backend/commands/copy.c         | 38 +++++++++++++++++++++++++++++
 src/bin/psql/tab-complete.c         |  6 ++++-
 src/include/commands/copy.h         | 10 ++++++++
 src/test/regress/expected/copy2.out |  8 ++++++
 src/test/regress/sql/copy2.sql      |  2 ++
 src/tools/pgindent/typedefs.list    |  1 +
 7 files changed, 78 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 55764fc1f2..67ba6212fe 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -45,6 +45,7 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
     FORCE_NULL { ( <replaceable class="parameter">column_name</replaceable> [, ...] ) | * }
     ON_ERROR '<replaceable class="parameter">error_action</replaceable>'
     ENCODING '<replaceable class="parameter">encoding_name</replaceable>'
+    LOG_VERBOSITY [ <replaceable class="parameter">mode</replaceable> ]
 </synopsis>
  </refsynopsisdiv>
 
@@ -415,6 +416,19 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>LOG_VERBOSITY</literal></term>
+    <listitem>
+     <para>
+      Sets the verbosity of logged messages by <command>COPY</command> command.
+      A <replaceable class="parameter">mode</replaceable> value of
+      <literal>verbose</literal> can be used to emit more informative messages
+      by the command, while value of <literal>default</literal> (which is the
+      default) can be used to not log any additional messages.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><literal>WHERE</literal></term>
     <listitem>
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 056b6733c8..23eb8c9c79 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -428,6 +428,36 @@ defGetCopyOnErrorChoice(DefElem *def, ParseState *pstate, bool is_from)
 	return COPY_ON_ERROR_STOP;	/* keep compiler quiet */
 }
 
+/*
+ * Extract a CopyLogVerbosityChoice value from a DefElem.
+ */
+static CopyLogVerbosityChoice
+defGetCopyLogVerbosityChoice(DefElem *def, ParseState *pstate)
+{
+	char	   *sval;
+
+	/*
+	 * If no parameter value given, assume the default value.
+	 */
+	if (def->arg == NULL)
+		return COPY_LOG_VERBOSITY_DEFAULT;
+
+	/*
+	 * Allow "default", or "verbose" values.
+	 */
+	sval = defGetString(def);
+	if (pg_strcasecmp(sval, "default") == 0)
+		return COPY_LOG_VERBOSITY_DEFAULT;
+	if (pg_strcasecmp(sval, "verbose") == 0)
+		return COPY_LOG_VERBOSITY_VERBOSE;
+
+	ereport(ERROR,
+			(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+			 errmsg("COPY LOG_VERBOSITY \"%s\" not recognized", sval),
+			 parser_errposition(pstate, def->location)));
+	return COPY_LOG_VERBOSITY_DEFAULT;	/* keep compiler quiet */
+}
+
 /*
  * Process the statement option list for COPY.
  *
@@ -454,6 +484,7 @@ ProcessCopyOptions(ParseState *pstate,
 	bool		freeze_specified = false;
 	bool		header_specified = false;
 	bool		on_error_specified = false;
+	bool		log_verbosity_specified = false;
 	ListCell   *option;
 
 	/* Support external use for option sanity checking */
@@ -613,6 +644,13 @@ ProcessCopyOptions(ParseState *pstate,
 			on_error_specified = true;
 			opts_out->on_error = defGetCopyOnErrorChoice(defel, pstate, is_from);
 		}
+		else if (strcmp(defel->defname, "log_verbosity") == 0)
+		{
+			if (log_verbosity_specified)
+				errorConflictingDefElem(defel, pstate);
+			log_verbosity_specified = true;
+			opts_out->log_verbosity = defGetCopyLogVerbosityChoice(defel, pstate);
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 73133ce735..9305800340 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -2901,7 +2901,7 @@ psql_completion(const char *text, int start, int end)
 		COMPLETE_WITH("FORMAT", "FREEZE", "DELIMITER", "NULL",
 					  "HEADER", "QUOTE", "ESCAPE", "FORCE_QUOTE",
 					  "FORCE_NOT_NULL", "FORCE_NULL", "ENCODING", "DEFAULT",
-					  "ON_ERROR");
+					  "ON_ERROR", "LOG_VERBOSITY");
 
 	/* Complete COPY <sth> FROM|TO filename WITH (FORMAT */
 	else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "FORMAT"))
@@ -2911,6 +2911,10 @@ psql_completion(const char *text, int start, int end)
 	else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "ON_ERROR"))
 		COMPLETE_WITH("stop", "ignore");
 
+	/* Complete COPY <sth> FROM filename WITH (LOG_VERBOSITY */
+	else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "LOG_VERBOSITY"))
+		COMPLETE_WITH("default", "verbose");
+
 	/* Complete COPY <sth> FROM <sth> WITH (<options>) */
 	else if (Matches("COPY|\\copy", MatchAny, "FROM", MatchAny, "WITH", MatchAny))
 		COMPLETE_WITH("WHERE");
diff --git a/src/include/commands/copy.h b/src/include/commands/copy.h
index b3da3cb0be..99d183fa4d 100644
--- a/src/include/commands/copy.h
+++ b/src/include/commands/copy.h
@@ -40,6 +40,15 @@ typedef enum CopyOnErrorChoice
 	COPY_ON_ERROR_IGNORE,		/* ignore errors */
 } CopyOnErrorChoice;
 
+/*
+ * Represents verbosity of logged messages by COPY command.
+ */
+typedef enum CopyLogVerbosityChoice
+{
+	COPY_LOG_VERBOSITY_DEFAULT = 0, /* logs no additional messages, default */
+	COPY_LOG_VERBOSITY_VERBOSE, /* logs additional messages */
+} CopyLogVerbosityChoice;
+
 /*
  * A struct to hold COPY options, in a parsed form. All of these are related
  * to formatting, except for 'freeze', which doesn't really belong here, but
@@ -73,6 +82,7 @@ typedef struct CopyFormatOptions
 	bool	   *force_null_flags;	/* per-column CSV FN flags */
 	bool		convert_selectively;	/* do selective binary conversion? */
 	CopyOnErrorChoice on_error; /* what to do when error happened */
+	CopyLogVerbosityChoice log_verbosity;	/* verbosity of logged messages */
 	List	   *convert_select; /* list of column names (can be NIL) */
 } CopyFormatOptions;
 
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index 25c401ce34..62406ef827 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -81,6 +81,10 @@ COPY x from stdin (on_error ignore, on_error ignore);
 ERROR:  conflicting or redundant options
 LINE 1: COPY x from stdin (on_error ignore, on_error ignore);
                                             ^
+COPY x from stdin (log_verbosity 'default', log_verbosity 'verbose');
+ERROR:  conflicting or redundant options
+LINE 1: COPY x from stdin (log_verbosity 'default', log_verbosity 'v...
+                                                    ^
 -- incorrect options
 COPY x to stdin (format BINARY, delimiter ',');
 ERROR:  cannot specify DELIMITER in BINARY mode
@@ -108,6 +112,10 @@ COPY x to stdin (format BINARY, on_error unsupported);
 ERROR:  COPY ON_ERROR cannot be used with COPY TO
 LINE 1: COPY x to stdin (format BINARY, on_error unsupported);
                                         ^
+COPY x to stdout (log_verbosity 'unsupported');
+ERROR:  COPY LOG_VERBOSITY "unsupported" not recognized
+LINE 1: COPY x to stdout (log_verbosity 'unsupported');
+                          ^
 -- too many columns in column list: should fail
 COPY x (a, b, c, d, e, d, c) from stdin;
 ERROR:  column "d" specified more than once
diff --git a/src/test/regress/sql/copy2.sql b/src/test/regress/sql/copy2.sql
index b5e549e856..5116157cc9 100644
--- a/src/test/regress/sql/copy2.sql
+++ b/src/test/regress/sql/copy2.sql
@@ -67,6 +67,7 @@ COPY x from stdin (force_null (a), force_null (b));
 COPY x from stdin (convert_selectively (a), convert_selectively (b));
 COPY x from stdin (encoding 'sql_ascii', encoding 'sql_ascii');
 COPY x from stdin (on_error ignore, on_error ignore);
+COPY x from stdin (log_verbosity 'default', log_verbosity 'verbose');
 
 -- incorrect options
 COPY x to stdin (format BINARY, delimiter ',');
@@ -80,6 +81,7 @@ COPY x to stdin (format CSV, force_not_null(a));
 COPY x to stdout (format TEXT, force_null(a));
 COPY x to stdin (format CSV, force_null(a));
 COPY x to stdin (format BINARY, on_error unsupported);
+COPY x to stdout (log_verbosity 'unsupported');
 
 -- too many columns in column list: should fail
 COPY x (a, b, c, d, e, d, c) from stdin;
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index d3a7f75b08..b9b9f8a7f0 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -478,6 +478,7 @@ CopyFromState
 CopyFromStateData
 CopyHeaderChoice
 CopyInsertMethod
+CopyLogVerbosityChoice
 CopyMultiInsertBuffer
 CopyMultiInsertInfo
 CopyOnErrorChoice
-- 
2.34.1

v6-0002-Add-detailed-info-when-COPY-skips-soft-errors.patch (application/octet-stream)
From 6f21d689c73d3d7a2d80fb0e723960832d69d237 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Mon, 11 Mar 2024 11:53:44 +0000
Subject: [PATCH v6 2/2] Add detailed info when COPY skips soft errors

This commit emits individual info like line number and column name
when COPY skips soft errors. Because, the summary containing the
total rows skipped isn't enough for the users to know what exactly
are the malformed rows in the input data.

Author: Bharath Rupireddy
Reviewed-by: Michael Paquier, Masahiko Sawada
Reviewed-by: Atsushi Torikoshi
Discussion: https://www.postgresql.org/message-id/CALj2ACUk700cYhx1ATRQyRw-fBM%2BaRo6auRAitKGff7XNmYfqQ%40mail.gmail.com
---
 doc/src/sgml/ref/copy.sgml           | 12 +++++++++---
 src/backend/commands/copyfromparse.c | 10 ++++++++++
 src/test/regress/expected/copy2.out  |  7 ++++++-
 src/test/regress/sql/copy2.sql       |  4 +++-
 4 files changed, 28 insertions(+), 5 deletions(-)

diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 67ba6212fe..ed7fdc59fb 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -398,8 +398,12 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
       when the <literal>FORMAT</literal> is <literal>text</literal> or <literal>csv</literal>.
      </para>
      <para>
-      A <literal>NOTICE</literal> message containing the ignored row count is emitted at the end
-      of the <command>COPY FROM</command> if at least one row was discarded.
+      A <literal>NOTICE</literal> message containing the ignored row count is
+      emitted at the end of the <command>COPY FROM</command> if at least one
+      row was discarded. When <literal>LOG_VERBOSITY</literal> option is set to
+      <literal>verbose</literal>, a <literal>NOTICE</literal> message
+      containing the line number and column name (whose input conversion has
+      failed) is emitted for each discarded row.
      </para>
     </listitem>
    </varlistentry>
@@ -424,7 +428,9 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
       A <replaceable class="parameter">mode</replaceable> value of
       <literal>verbose</literal> can be used to emit more informative messages
       by the command, while value of <literal>default</literal> (which is the
-      default) can be used to not log any additional messages.
+      default) can be used to not log any additional messages. As an example,
+      see its usage for <command>COPY FROM</command> command when
+      <literal>ON_ERROR</literal> option is set to <literal>ignore</literal>.
      </para>
     </listitem>
    </varlistentry>
diff --git a/src/backend/commands/copyfromparse.c b/src/backend/commands/copyfromparse.c
index 5682d5d054..a7ad6c17c8 100644
--- a/src/backend/commands/copyfromparse.c
+++ b/src/backend/commands/copyfromparse.c
@@ -967,7 +967,17 @@ NextCopyFrom(CopyFromState cstate, ExprContext *econtext,
 											(Node *) cstate->escontext,
 											&values[m]))
 			{
+				Assert(cstate->opts.on_error != COPY_ON_ERROR_STOP);
+
 				cstate->num_errors++;
+
+				if (cstate->opts.log_verbosity == COPY_LOG_VERBOSITY_VERBOSE)
+					ereport(NOTICE,
+							errmsg("detected data type incompatibility at line number %llu for column %s; COPY %s",
+								   (unsigned long long) cstate->cur_lineno,
+								   cstate->cur_attname,
+								   cstate->cur_relname));
+
 				return true;
 			}
 
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index 62406ef827..c6655000e4 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -737,7 +737,12 @@ CREATE TABLE check_ign_err (n int, m int[], k int);
 COPY check_ign_err FROM STDIN WITH (on_error stop);
 ERROR:  invalid input syntax for type integer: "a"
 CONTEXT:  COPY check_ign_err, line 2, column n: "a"
-COPY check_ign_err FROM STDIN WITH (on_error ignore);
+-- tests for options on_error and log_verbosity
+COPY check_ign_err FROM STDIN WITH (on_error ignore, log_verbosity 'verbose');
+NOTICE:  detected data type incompatibility at line number 2 for column n; COPY check_ign_err
+NOTICE:  detected data type incompatibility at line number 3 for column k; COPY check_ign_err
+NOTICE:  detected data type incompatibility at line number 4 for column m; COPY check_ign_err
+NOTICE:  detected data type incompatibility at line number 5 for column n; COPY check_ign_err
 NOTICE:  4 rows were skipped due to data type incompatibility
 SELECT * FROM check_ign_err;
  n |  m  | k 
diff --git a/src/test/regress/sql/copy2.sql b/src/test/regress/sql/copy2.sql
index 5116157cc9..b637a5b3bb 100644
--- a/src/test/regress/sql/copy2.sql
+++ b/src/test/regress/sql/copy2.sql
@@ -510,7 +510,9 @@ a	{2}	2
 
 5	{5}	5
 \.
-COPY check_ign_err FROM STDIN WITH (on_error ignore);
+
+-- tests for options on_error and log_verbosity
+COPY check_ign_err FROM STDIN WITH (on_error ignore, log_verbosity 'verbose');
 1	{1}	1
 a	{2}	2
 3	{3}	3333333333
-- 
2.34.1

#30Michael Paquier
michael@paquier.xyz
In reply to: Bharath Rupireddy (#29)
Re: Add new error_action COPY ON_ERROR "log"

On Mon, Mar 11, 2024 at 06:00:00PM +0530, Bharath Rupireddy wrote:

Please see the attached v6 patch set.

I am tempted to tweak a few things in the docs, the comments and the
tests (particularly adding more patterns for tuples that fail on
conversion while it's clear that there are not enough attributes after
the incorrect values), but that looks roughly OK.

Wouldn't it be better to squash the patches together, by the way?
--
Michael

#31Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Michael Paquier (#30)
Re: Add new error_action COPY ON_ERROR "log"

On Tue, Mar 12, 2024 at 12:22 PM Michael Paquier <michael@paquier.xyz> wrote:

On Mon, Mar 11, 2024 at 06:00:00PM +0530, Bharath Rupireddy wrote:

Please see the attached v6 patch set.

I am tempted to tweak a few things in the docs, the comments and the
tests (particularly adding more patterns for tuples that fail on
conversion while it's clear that there are not enough attributes after
the incorrect values), but that looks roughly OK.

+1. But, do you want to add them now or later as a separate
patch/discussion altogether?

Wouldn't it be better to squash the patches together, by the way?

I guess not. They are two different things IMV.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#32Michael Paquier
michael@paquier.xyz
In reply to: Bharath Rupireddy (#31)
3 attachment(s)
Re: Add new error_action COPY ON_ERROR "log"

On Tue, Mar 12, 2024 at 12:54:29PM +0530, Bharath Rupireddy wrote:

+1. But, do you want to add them now or later as a separate
patch/discussion altogether?

The attached 0003 is what I had in mind:
- Simplification of the LOG message generated, with quotes applied
around the column name. I don't see much need to add the relation name
either, for consistency and because it is already known from the query.
- A few more tests.
- Some doc changes.
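
As an illustration of the first item, here is a small C sketch of what quoting the column name and dropping the relation name could look like. The 0003 patch itself is not shown in this part of the thread, so the exact wording is an assumption: the sketch simply takes the v6-0002 message text, adds quotes around the column name and removes the "; COPY <relation>" suffix. format_skip_notice is a hypothetical helper, not a PostgreSQL function.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build the per-row message with the column name
 * quoted and without the relation name.  Returns the value of snprintf().
 */
static int
format_skip_notice(char *buf, size_t buflen,
				   unsigned long long lineno, const char *attname)
{
	assert(buf != NULL && attname != NULL);
	return snprintf(buf, buflen,
					"detected data type incompatibility at line number %llu for column \"%s\"",
					lineno, attname);
}
```

Quoting the identifier keeps the message unambiguous when a column name contains spaces or mixed case.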

Wouldn't it be better to squash the patches together, by the way?

I guess not. They are two different things IMV.

Well, 0001 is sitting doing nothing because the COPY code does not
make use of it internally.
--
Michael

Attachments:

v6-0001-Add-LOG_VERBOSITY-option-to-COPY-command.patch (text/x-diff; charset=us-ascii)
From c45474726e084faf876a319485995ce84eef8293 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Mon, 11 Mar 2024 11:37:49 +0000
Subject: [PATCH v6 1/3] Add LOG_VERBOSITY option to COPY command

This commit adds a new option LOG_VERBOSITY to set the verbosity of
logged messages by COPY command. A value of 'verbose' can be used
to emit more informative messages by the command, while the value
of 'default' (which is the default) can be used to not log any
additional messages. More values such as 'terse', 'row_details'
etc. can be added based on the need  to the LOG_VERBOSITY option.

An upcoming commit for emitting more info on soft errors by
COPY FROM command with ON_ERROR 'ignore' uses this.

Author: Bharath Rupireddy
Reviewed-by: Michael Paquier, Masahiko Sawada
Reviewed-by: Atsushi Torikoshi
Discussion: https://www.postgresql.org/message-id/CALj2ACXNA0focNeriYRvQQaCGc4CsTuOnFbzF9LqTKNWxuJdhA%40mail.gmail.com
---
 src/include/commands/copy.h         | 10 ++++++++
 src/backend/commands/copy.c         | 38 +++++++++++++++++++++++++++++
 src/bin/psql/tab-complete.c         |  6 ++++-
 src/test/regress/expected/copy2.out |  8 ++++++
 src/test/regress/sql/copy2.sql      |  2 ++
 doc/src/sgml/ref/copy.sgml          | 14 +++++++++++
 src/tools/pgindent/typedefs.list    |  1 +
 7 files changed, 78 insertions(+), 1 deletion(-)

diff --git a/src/include/commands/copy.h b/src/include/commands/copy.h
index b3da3cb0be..99d183fa4d 100644
--- a/src/include/commands/copy.h
+++ b/src/include/commands/copy.h
@@ -40,6 +40,15 @@ typedef enum CopyOnErrorChoice
 	COPY_ON_ERROR_IGNORE,		/* ignore errors */
 } CopyOnErrorChoice;
 
+/*
+ * Represents verbosity of logged messages by COPY command.
+ */
+typedef enum CopyLogVerbosityChoice
+{
+	COPY_LOG_VERBOSITY_DEFAULT = 0, /* logs no additional messages, default */
+	COPY_LOG_VERBOSITY_VERBOSE, /* logs additional messages */
+} CopyLogVerbosityChoice;
+
 /*
  * A struct to hold COPY options, in a parsed form. All of these are related
  * to formatting, except for 'freeze', which doesn't really belong here, but
@@ -73,6 +82,7 @@ typedef struct CopyFormatOptions
 	bool	   *force_null_flags;	/* per-column CSV FN flags */
 	bool		convert_selectively;	/* do selective binary conversion? */
 	CopyOnErrorChoice on_error; /* what to do when error happened */
+	CopyLogVerbosityChoice log_verbosity;	/* verbosity of logged messages */
 	List	   *convert_select; /* list of column names (can be NIL) */
 } CopyFormatOptions;
 
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 056b6733c8..23eb8c9c79 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -428,6 +428,36 @@ defGetCopyOnErrorChoice(DefElem *def, ParseState *pstate, bool is_from)
 	return COPY_ON_ERROR_STOP;	/* keep compiler quiet */
 }
 
+/*
+ * Extract a CopyLogVerbosityChoice value from a DefElem.
+ */
+static CopyLogVerbosityChoice
+defGetCopyLogVerbosityChoice(DefElem *def, ParseState *pstate)
+{
+	char	   *sval;
+
+	/*
+	 * If no parameter value given, assume the default value.
+	 */
+	if (def->arg == NULL)
+		return COPY_LOG_VERBOSITY_DEFAULT;
+
+	/*
+	 * Allow "default", or "verbose" values.
+	 */
+	sval = defGetString(def);
+	if (pg_strcasecmp(sval, "default") == 0)
+		return COPY_LOG_VERBOSITY_DEFAULT;
+	if (pg_strcasecmp(sval, "verbose") == 0)
+		return COPY_LOG_VERBOSITY_VERBOSE;
+
+	ereport(ERROR,
+			(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+			 errmsg("COPY LOG_VERBOSITY \"%s\" not recognized", sval),
+			 parser_errposition(pstate, def->location)));
+	return COPY_LOG_VERBOSITY_DEFAULT;	/* keep compiler quiet */
+}
+
 /*
  * Process the statement option list for COPY.
  *
@@ -454,6 +484,7 @@ ProcessCopyOptions(ParseState *pstate,
 	bool		freeze_specified = false;
 	bool		header_specified = false;
 	bool		on_error_specified = false;
+	bool		log_verbosity_specified = false;
 	ListCell   *option;
 
 	/* Support external use for option sanity checking */
@@ -613,6 +644,13 @@ ProcessCopyOptions(ParseState *pstate,
 			on_error_specified = true;
 			opts_out->on_error = defGetCopyOnErrorChoice(defel, pstate, is_from);
 		}
+		else if (strcmp(defel->defname, "log_verbosity") == 0)
+		{
+			if (log_verbosity_specified)
+				errorConflictingDefElem(defel, pstate);
+			log_verbosity_specified = true;
+			opts_out->log_verbosity = defGetCopyLogVerbosityChoice(defel, pstate);
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 73133ce735..9305800340 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -2901,7 +2901,7 @@ psql_completion(const char *text, int start, int end)
 		COMPLETE_WITH("FORMAT", "FREEZE", "DELIMITER", "NULL",
 					  "HEADER", "QUOTE", "ESCAPE", "FORCE_QUOTE",
 					  "FORCE_NOT_NULL", "FORCE_NULL", "ENCODING", "DEFAULT",
-					  "ON_ERROR");
+					  "ON_ERROR", "LOG_VERBOSITY");
 
 	/* Complete COPY <sth> FROM|TO filename WITH (FORMAT */
 	else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "FORMAT"))
@@ -2911,6 +2911,10 @@ psql_completion(const char *text, int start, int end)
 	else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "ON_ERROR"))
 		COMPLETE_WITH("stop", "ignore");
 
+	/* Complete COPY <sth> FROM filename WITH (LOG_VERBOSITY */
+	else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "LOG_VERBOSITY"))
+		COMPLETE_WITH("default", "verbose");
+
 	/* Complete COPY <sth> FROM <sth> WITH (<options>) */
 	else if (Matches("COPY|\\copy", MatchAny, "FROM", MatchAny, "WITH", MatchAny))
 		COMPLETE_WITH("WHERE");
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index 25c401ce34..62406ef827 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -81,6 +81,10 @@ COPY x from stdin (on_error ignore, on_error ignore);
 ERROR:  conflicting or redundant options
 LINE 1: COPY x from stdin (on_error ignore, on_error ignore);
                                             ^
+COPY x from stdin (log_verbosity 'default', log_verbosity 'verbose');
+ERROR:  conflicting or redundant options
+LINE 1: COPY x from stdin (log_verbosity 'default', log_verbosity 'v...
+                                                    ^
 -- incorrect options
 COPY x to stdin (format BINARY, delimiter ',');
 ERROR:  cannot specify DELIMITER in BINARY mode
@@ -108,6 +112,10 @@ COPY x to stdin (format BINARY, on_error unsupported);
 ERROR:  COPY ON_ERROR cannot be used with COPY TO
 LINE 1: COPY x to stdin (format BINARY, on_error unsupported);
                                         ^
+COPY x to stdout (log_verbosity 'unsupported');
+ERROR:  COPY LOG_VERBOSITY "unsupported" not recognized
+LINE 1: COPY x to stdout (log_verbosity 'unsupported');
+                          ^
 -- too many columns in column list: should fail
 COPY x (a, b, c, d, e, d, c) from stdin;
 ERROR:  column "d" specified more than once
diff --git a/src/test/regress/sql/copy2.sql b/src/test/regress/sql/copy2.sql
index b5e549e856..5116157cc9 100644
--- a/src/test/regress/sql/copy2.sql
+++ b/src/test/regress/sql/copy2.sql
@@ -67,6 +67,7 @@ COPY x from stdin (force_null (a), force_null (b));
 COPY x from stdin (convert_selectively (a), convert_selectively (b));
 COPY x from stdin (encoding 'sql_ascii', encoding 'sql_ascii');
 COPY x from stdin (on_error ignore, on_error ignore);
+COPY x from stdin (log_verbosity 'default', log_verbosity 'verbose');
 
 -- incorrect options
 COPY x to stdin (format BINARY, delimiter ',');
@@ -80,6 +81,7 @@ COPY x to stdin (format CSV, force_not_null(a));
 COPY x to stdout (format TEXT, force_null(a));
 COPY x to stdin (format CSV, force_null(a));
 COPY x to stdin (format BINARY, on_error unsupported);
+COPY x to stdout (log_verbosity 'unsupported');
 
 -- too many columns in column list: should fail
 COPY x (a, b, c, d, e, d, c) from stdin;
diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 55764fc1f2..67ba6212fe 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -45,6 +45,7 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
     FORCE_NULL { ( <replaceable class="parameter">column_name</replaceable> [, ...] ) | * }
     ON_ERROR '<replaceable class="parameter">error_action</replaceable>'
     ENCODING '<replaceable class="parameter">encoding_name</replaceable>'
+    LOG_VERBOSITY [ <replaceable class="parameter">mode</replaceable> ]
 </synopsis>
  </refsynopsisdiv>
 
@@ -415,6 +416,19 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>LOG_VERBOSITY</literal></term>
+    <listitem>
+     <para>
+      Sets the verbosity of logged messages by <command>COPY</command> command.
+      A <replaceable class="parameter">mode</replaceable> value of
+      <literal>verbose</literal> can be used to emit more informative messages
+      by the command, while value of <literal>default</literal> (which is the
+      default) can be used to not log any additional messages.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><literal>WHERE</literal></term>
     <listitem>
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index aa7a25b8f8..549378c8ad 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -479,6 +479,7 @@ CopyFromState
 CopyFromStateData
 CopyHeaderChoice
 CopyInsertMethod
+CopyLogVerbosityChoice
 CopyMultiInsertBuffer
 CopyMultiInsertInfo
 CopyOnErrorChoice
-- 
2.43.0

v6-0002-Add-detailed-info-when-COPY-skips-soft-errors.patch (text/x-diff; charset=us-ascii)
From 1f087fb0cd740e927c6a475996430e2bb12de845 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Mon, 11 Mar 2024 11:53:44 +0000
Subject: [PATCH v6 2/3] Add detailed info when COPY skips soft errors

This commit emits per-row details, such as the line number and column
name, when COPY skips soft errors. The summary containing the total
number of rows skipped is not enough for users to identify exactly
which rows in the input data are malformed.

Author: Bharath Rupireddy
Reviewed-by: Michael Paquier, Masahiko Sawada
Reviewed-by: Atsushi Torikoshi
Discussion: https://www.postgresql.org/message-id/CALj2ACUk700cYhx1ATRQyRw-fBM%2BaRo6auRAitKGff7XNmYfqQ%40mail.gmail.com
---
 src/backend/commands/copyfromparse.c | 10 ++++++++++
 src/test/regress/expected/copy2.out  |  7 ++++++-
 src/test/regress/sql/copy2.sql       |  4 +++-
 doc/src/sgml/ref/copy.sgml           | 12 +++++++++---
 4 files changed, 28 insertions(+), 5 deletions(-)

diff --git a/src/backend/commands/copyfromparse.c b/src/backend/commands/copyfromparse.c
index 5682d5d054..a7ad6c17c8 100644
--- a/src/backend/commands/copyfromparse.c
+++ b/src/backend/commands/copyfromparse.c
@@ -967,7 +967,17 @@ NextCopyFrom(CopyFromState cstate, ExprContext *econtext,
 											(Node *) cstate->escontext,
 											&values[m]))
 			{
+				Assert(cstate->opts.on_error != COPY_ON_ERROR_STOP);
+
 				cstate->num_errors++;
+
+				if (cstate->opts.log_verbosity == COPY_LOG_VERBOSITY_VERBOSE)
+					ereport(NOTICE,
+							errmsg("detected data type incompatibility at line number %llu for column %s; COPY %s",
+								   (unsigned long long) cstate->cur_lineno,
+								   cstate->cur_attname,
+								   cstate->cur_relname));
+
 				return true;
 			}
 
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index 62406ef827..c6655000e4 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -737,7 +737,12 @@ CREATE TABLE check_ign_err (n int, m int[], k int);
 COPY check_ign_err FROM STDIN WITH (on_error stop);
 ERROR:  invalid input syntax for type integer: "a"
 CONTEXT:  COPY check_ign_err, line 2, column n: "a"
-COPY check_ign_err FROM STDIN WITH (on_error ignore);
+-- tests for options on_error and log_verbosity
+COPY check_ign_err FROM STDIN WITH (on_error ignore, log_verbosity 'verbose');
+NOTICE:  detected data type incompatibility at line number 2 for column n; COPY check_ign_err
+NOTICE:  detected data type incompatibility at line number 3 for column k; COPY check_ign_err
+NOTICE:  detected data type incompatibility at line number 4 for column m; COPY check_ign_err
+NOTICE:  detected data type incompatibility at line number 5 for column n; COPY check_ign_err
 NOTICE:  4 rows were skipped due to data type incompatibility
 SELECT * FROM check_ign_err;
  n |  m  | k 
diff --git a/src/test/regress/sql/copy2.sql b/src/test/regress/sql/copy2.sql
index 5116157cc9..b637a5b3bb 100644
--- a/src/test/regress/sql/copy2.sql
+++ b/src/test/regress/sql/copy2.sql
@@ -510,7 +510,9 @@ a	{2}	2
 
 5	{5}	5
 \.
-COPY check_ign_err FROM STDIN WITH (on_error ignore);
+
+-- tests for options on_error and log_verbosity
+COPY check_ign_err FROM STDIN WITH (on_error ignore, log_verbosity 'verbose');
 1	{1}	1
 a	{2}	2
 3	{3}	3333333333
diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 67ba6212fe..ed7fdc59fb 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -398,8 +398,12 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
       when the <literal>FORMAT</literal> is <literal>text</literal> or <literal>csv</literal>.
      </para>
      <para>
-      A <literal>NOTICE</literal> message containing the ignored row count is emitted at the end
-      of the <command>COPY FROM</command> if at least one row was discarded.
+      A <literal>NOTICE</literal> message containing the ignored row count is
+      emitted at the end of the <command>COPY FROM</command> if at least one
+      row was discarded. When <literal>LOG_VERBOSITY</literal> option is set to
+      <literal>verbose</literal>, a <literal>NOTICE</literal> message
+      containing the line number and column name (whose input conversion has
+      failed) is emitted for each discarded row.
      </para>
     </listitem>
    </varlistentry>
@@ -424,7 +428,9 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
       A <replaceable class="parameter">mode</replaceable> value of
       <literal>verbose</literal> can be used to emit more informative messages
       by the command, while value of <literal>default</literal> (which is the
-      default) can be used to not log any additional messages.
+      default) can be used to not log any additional messages. As an example,
+      see its usage for <command>COPY FROM</command> command when
+      <literal>ON_ERROR</literal> option is set to <literal>ignore</literal>.
      </para>
     </listitem>
    </varlistentry>
-- 
2.43.0

v6-0003-Some-docs-and-comments-updates.patch (text/x-diff; charset=us-ascii)
From 96368b5af3c4186a065770cec96fcb0ea60dafd0 Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Wed, 13 Mar 2024 08:45:20 +0900
Subject: [PATCH v6 3/3] Some docs and comments updates

---
 src/backend/commands/copyfromparse.c |  6 ++----
 src/test/regress/expected/copy2.out  | 15 +++++++++------
 src/test/regress/sql/copy2.sql       |  3 +++
 doc/src/sgml/ref/copy.sgml           | 18 ++++++++++--------
 4 files changed, 24 insertions(+), 18 deletions(-)

diff --git a/src/backend/commands/copyfromparse.c b/src/backend/commands/copyfromparse.c
index a7ad6c17c8..44f615de28 100644
--- a/src/backend/commands/copyfromparse.c
+++ b/src/backend/commands/copyfromparse.c
@@ -973,11 +973,9 @@ NextCopyFrom(CopyFromState cstate, ExprContext *econtext,
 
 				if (cstate->opts.log_verbosity == COPY_LOG_VERBOSITY_VERBOSE)
 					ereport(NOTICE,
-							errmsg("detected data type incompatibility at line number %llu for column %s; COPY %s",
+							errmsg("data type incompatibility at line %llu for column \"%s\"",
 								   (unsigned long long) cstate->cur_lineno,
-								   cstate->cur_attname,
-								   cstate->cur_relname));
-
+								   cstate->cur_attname));
 				return true;
 			}
 
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index c6655000e4..af669fedbe 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -739,17 +739,20 @@ ERROR:  invalid input syntax for type integer: "a"
 CONTEXT:  COPY check_ign_err, line 2, column n: "a"
 -- tests for options on_error and log_verbosity
 COPY check_ign_err FROM STDIN WITH (on_error ignore, log_verbosity 'verbose');
-NOTICE:  detected data type incompatibility at line number 2 for column n; COPY check_ign_err
-NOTICE:  detected data type incompatibility at line number 3 for column k; COPY check_ign_err
-NOTICE:  detected data type incompatibility at line number 4 for column m; COPY check_ign_err
-NOTICE:  detected data type incompatibility at line number 5 for column n; COPY check_ign_err
-NOTICE:  4 rows were skipped due to data type incompatibility
+NOTICE:  data type incompatibility at line 2 for column "n"
+NOTICE:  data type incompatibility at line 3 for column "k"
+NOTICE:  data type incompatibility at line 4 for column "m"
+NOTICE:  data type incompatibility at line 5 for column "n"
+NOTICE:  data type incompatibility at line 7 for column "m"
+NOTICE:  data type incompatibility at line 8 for column "k"
+NOTICE:  6 rows were skipped due to data type incompatibility
 SELECT * FROM check_ign_err;
  n |  m  | k 
 ---+-----+---
  1 | {1} | 1
  5 | {5} | 5
-(2 rows)
+ 8 | {8} | 8
+(3 rows)
 
 -- test datatype error that can't be handled as soft: should fail
 CREATE TABLE hard_err(foo widget);
diff --git a/src/test/regress/sql/copy2.sql b/src/test/regress/sql/copy2.sql
index b637a5b3bb..4fb736535d 100644
--- a/src/test/regress/sql/copy2.sql
+++ b/src/test/regress/sql/copy2.sql
@@ -519,6 +519,9 @@ a	{2}	2
 4	{a, 4}	4
 
 5	{5}	5
+6	a
+7	{7}	a
+8	{8}	8
 \.
 SELECT * FROM check_ign_err;
 
diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index ed7fdc59fb..7d594d275e 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -402,8 +402,8 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
       emitted at the end of the <command>COPY FROM</command> if at least one
       row was discarded. When <literal>LOG_VERBOSITY</literal> option is set to
       <literal>verbose</literal>, a <literal>NOTICE</literal> message
-      containing the line number and column name (whose input conversion has
-      failed) is emitted for each discarded row.
+      containing the line of the input file and the column name whose input
+      conversion has failed is emitted for each discarded row.
      </para>
     </listitem>
    </varlistentry>
@@ -424,13 +424,15 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
     <term><literal>LOG_VERBOSITY</literal></term>
     <listitem>
      <para>
-      Sets the verbosity of logged messages by <command>COPY</command> command.
+      Sets the verbosity of some of the messages logged by a
+      <command>COPY</command> command.
       A <replaceable class="parameter">mode</replaceable> value of
-      <literal>verbose</literal> can be used to emit more informative messages
-      by the command, while value of <literal>default</literal> (which is the
-      default) can be used to not log any additional messages. As an example,
-      see its usage for <command>COPY FROM</command> command when
-      <literal>ON_ERROR</literal> option is set to <literal>ignore</literal>.
+      <literal>verbose</literal> can be used to emit more informative messages.
+      <literal>default</literal> will not log any additional messages.
+     </para>
+     <para>
+      This is currently used in <command>COPY FROM</command> command when
+      <literal>ON_ERROR</literal> is set to <literal>ignore</literal>.
      </para>
     </listitem>
    </varlistentry>
-- 
2.43.0

#33 Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Michael Paquier (#32)
2 attachment(s)
Re: Add new error_action COPY ON_ERROR "log"

On Wed, Mar 13, 2024 at 5:16 AM Michael Paquier <michael@paquier.xyz> wrote:

On Tue, Mar 12, 2024 at 12:54:29PM +0530, Bharath Rupireddy wrote:

+1. But, do you want to add them now or later as a separate
patch/discussion altogether?

The attached 0003 is what I had in mind:
- Simplification of the LOG generated with quotes applied around the
column name, don't see much need to add the relation name, either, for
consistency and because the knowledge is known in the query.
- A few more tests.
- Some doc changes.

LGTM. So, I've merged those changes into 0001 and 0002.

Wouldn't it be better to squash the patches together, by the way?

I guess not. They are two different things IMV.

Well, 0001 is sitting doing nothing because the COPY code does not
make use of it internally.

Yes. That's why I left a note in the commit message that a future
commit will use it.
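
For readers skimming the patches, the option handling that 0001 adds can be sketched outside the server roughly as below. This is only an illustration, not the patch code: the names sketch_parse_log_verbosity and SketchLogVerbosity are hypothetical, POSIX strcasecmp stands in for pg_strcasecmp, and an INVALID return value stands in for the ereport(ERROR) that the patch raises on an unrecognized value.

```c
#include <stddef.h>
#include <strings.h>			/* strcasecmp (POSIX), standing in for pg_strcasecmp */

/* Hypothetical stand-in for CopyLogVerbosityChoice from the patch. */
typedef enum SketchLogVerbosity
{
	SKETCH_LOG_VERBOSITY_DEFAULT = 0,	/* no additional messages */
	SKETCH_LOG_VERBOSITY_VERBOSE,		/* additional messages */
	SKETCH_LOG_VERBOSITY_INVALID		/* the patch raises an ERROR instead */
} SketchLogVerbosity;

/*
 * Map an option string to a verbosity level.  A missing value (NULL) is
 * treated as the default, and matching is case-insensitive, mirroring
 * what the patch does in defGetCopyLogVerbosityChoice().
 */
SketchLogVerbosity
sketch_parse_log_verbosity(const char *sval)
{
	if (sval == NULL)
		return SKETCH_LOG_VERBOSITY_DEFAULT;
	if (strcasecmp(sval, "default") == 0)
		return SKETCH_LOG_VERBOSITY_DEFAULT;
	if (strcasecmp(sval, "verbose") == 0)
		return SKETCH_LOG_VERBOSITY_VERBOSE;
	return SKETCH_LOG_VERBOSITY_INVALID;
}
```

On its own this only records the chosen level, which is the point made above: nothing consumes it until 0002 checks the value in NextCopyFrom().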

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

v7-0001-Add-LOG_VERBOSITY-option-to-COPY-command.patch (application/x-patch)
From 77b9f28121d6531f40d96f7c00ecdb860550b67f Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Wed, 13 Mar 2024 02:20:42 +0000
Subject: [PATCH v7 1/2] Add LOG_VERBOSITY option to COPY command

This commit adds a new option LOG_VERBOSITY to set the verbosity of
messages logged by the COPY command. A value of 'verbose' can be used
to emit more informative messages, while the value 'default' (which is
the default) logs no additional messages. More values such as 'terse',
'row_details', etc. can be added to the LOG_VERBOSITY option as
needed.

An upcoming commit for emitting more info on soft errors by
COPY FROM command with ON_ERROR 'ignore' uses this.

Author: Bharath Rupireddy
Reviewed-by: Michael Paquier, Masahiko Sawada
Reviewed-by: Atsushi Torikoshi
Discussion: https://www.postgresql.org/message-id/CALj2ACXNA0focNeriYRvQQaCGc4CsTuOnFbzF9LqTKNWxuJdhA%40mail.gmail.com
---
 doc/src/sgml/ref/copy.sgml          | 14 +++++++++++
 src/backend/commands/copy.c         | 38 +++++++++++++++++++++++++++++
 src/bin/psql/tab-complete.c         |  6 ++++-
 src/include/commands/copy.h         | 10 ++++++++
 src/test/regress/expected/copy2.out |  8 ++++++
 src/test/regress/sql/copy2.sql      |  2 ++
 src/tools/pgindent/typedefs.list    |  1 +
 7 files changed, 78 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 55764fc1f2..eba9b8f64e 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -45,6 +45,7 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
     FORCE_NULL { ( <replaceable class="parameter">column_name</replaceable> [, ...] ) | * }
     ON_ERROR '<replaceable class="parameter">error_action</replaceable>'
     ENCODING '<replaceable class="parameter">encoding_name</replaceable>'
+    LOG_VERBOSITY [ <replaceable class="parameter">mode</replaceable> ]
 </synopsis>
  </refsynopsisdiv>
 
@@ -415,6 +416,19 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>LOG_VERBOSITY</literal></term>
+    <listitem>
+     <para>
+      Sets the verbosity of some of the messages logged by a
+      <command>COPY</command> command.
+      A <replaceable class="parameter">mode</replaceable> value of
+      <literal>verbose</literal> can be used to emit more informative messages.
+      <literal>default</literal> will not log any additional messages.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><literal>WHERE</literal></term>
     <listitem>
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 056b6733c8..23eb8c9c79 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -428,6 +428,36 @@ defGetCopyOnErrorChoice(DefElem *def, ParseState *pstate, bool is_from)
 	return COPY_ON_ERROR_STOP;	/* keep compiler quiet */
 }
 
+/*
+ * Extract a CopyLogVerbosityChoice value from a DefElem.
+ */
+static CopyLogVerbosityChoice
+defGetCopyLogVerbosityChoice(DefElem *def, ParseState *pstate)
+{
+	char	   *sval;
+
+	/*
+	 * If no parameter value given, assume the default value.
+	 */
+	if (def->arg == NULL)
+		return COPY_LOG_VERBOSITY_DEFAULT;
+
+	/*
+	 * Allow "default", or "verbose" values.
+	 */
+	sval = defGetString(def);
+	if (pg_strcasecmp(sval, "default") == 0)
+		return COPY_LOG_VERBOSITY_DEFAULT;
+	if (pg_strcasecmp(sval, "verbose") == 0)
+		return COPY_LOG_VERBOSITY_VERBOSE;
+
+	ereport(ERROR,
+			(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+			 errmsg("COPY LOG_VERBOSITY \"%s\" not recognized", sval),
+			 parser_errposition(pstate, def->location)));
+	return COPY_LOG_VERBOSITY_DEFAULT;	/* keep compiler quiet */
+}
+
 /*
  * Process the statement option list for COPY.
  *
@@ -454,6 +484,7 @@ ProcessCopyOptions(ParseState *pstate,
 	bool		freeze_specified = false;
 	bool		header_specified = false;
 	bool		on_error_specified = false;
+	bool		log_verbosity_specified = false;
 	ListCell   *option;
 
 	/* Support external use for option sanity checking */
@@ -613,6 +644,13 @@ ProcessCopyOptions(ParseState *pstate,
 			on_error_specified = true;
 			opts_out->on_error = defGetCopyOnErrorChoice(defel, pstate, is_from);
 		}
+		else if (strcmp(defel->defname, "log_verbosity") == 0)
+		{
+			if (log_verbosity_specified)
+				errorConflictingDefElem(defel, pstate);
+			log_verbosity_specified = true;
+			opts_out->log_verbosity = defGetCopyLogVerbosityChoice(defel, pstate);
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 73133ce735..9305800340 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -2901,7 +2901,7 @@ psql_completion(const char *text, int start, int end)
 		COMPLETE_WITH("FORMAT", "FREEZE", "DELIMITER", "NULL",
 					  "HEADER", "QUOTE", "ESCAPE", "FORCE_QUOTE",
 					  "FORCE_NOT_NULL", "FORCE_NULL", "ENCODING", "DEFAULT",
-					  "ON_ERROR");
+					  "ON_ERROR", "LOG_VERBOSITY");
 
 	/* Complete COPY <sth> FROM|TO filename WITH (FORMAT */
 	else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "FORMAT"))
@@ -2911,6 +2911,10 @@ psql_completion(const char *text, int start, int end)
 	else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "ON_ERROR"))
 		COMPLETE_WITH("stop", "ignore");
 
+	/* Complete COPY <sth> FROM|TO filename WITH (LOG_VERBOSITY */
+	else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "LOG_VERBOSITY"))
+		COMPLETE_WITH("default", "verbose");
+
 	/* Complete COPY <sth> FROM <sth> WITH (<options>) */
 	else if (Matches("COPY|\\copy", MatchAny, "FROM", MatchAny, "WITH", MatchAny))
 		COMPLETE_WITH("WHERE");
diff --git a/src/include/commands/copy.h b/src/include/commands/copy.h
index b3da3cb0be..99d183fa4d 100644
--- a/src/include/commands/copy.h
+++ b/src/include/commands/copy.h
@@ -40,6 +40,15 @@ typedef enum CopyOnErrorChoice
 	COPY_ON_ERROR_IGNORE,		/* ignore errors */
 } CopyOnErrorChoice;
 
+/*
+ * Represents verbosity of logged messages by COPY command.
+ */
+typedef enum CopyLogVerbosityChoice
+{
+	COPY_LOG_VERBOSITY_DEFAULT = 0, /* logs no additional messages, default */
+	COPY_LOG_VERBOSITY_VERBOSE, /* logs additional messages */
+} CopyLogVerbosityChoice;
+
 /*
  * A struct to hold COPY options, in a parsed form. All of these are related
  * to formatting, except for 'freeze', which doesn't really belong here, but
@@ -73,6 +82,7 @@ typedef struct CopyFormatOptions
 	bool	   *force_null_flags;	/* per-column CSV FN flags */
 	bool		convert_selectively;	/* do selective binary conversion? */
 	CopyOnErrorChoice on_error; /* what to do when error happened */
+	CopyLogVerbosityChoice log_verbosity;	/* verbosity of logged messages */
 	List	   *convert_select; /* list of column names (can be NIL) */
 } CopyFormatOptions;
 
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index 25c401ce34..62406ef827 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -81,6 +81,10 @@ COPY x from stdin (on_error ignore, on_error ignore);
 ERROR:  conflicting or redundant options
 LINE 1: COPY x from stdin (on_error ignore, on_error ignore);
                                             ^
+COPY x from stdin (log_verbosity 'default', log_verbosity 'verbose');
+ERROR:  conflicting or redundant options
+LINE 1: COPY x from stdin (log_verbosity 'default', log_verbosity 'v...
+                                                    ^
 -- incorrect options
 COPY x to stdin (format BINARY, delimiter ',');
 ERROR:  cannot specify DELIMITER in BINARY mode
@@ -108,6 +112,10 @@ COPY x to stdin (format BINARY, on_error unsupported);
 ERROR:  COPY ON_ERROR cannot be used with COPY TO
 LINE 1: COPY x to stdin (format BINARY, on_error unsupported);
                                         ^
+COPY x to stdout (log_verbosity 'unsupported');
+ERROR:  COPY LOG_VERBOSITY "unsupported" not recognized
+LINE 1: COPY x to stdout (log_verbosity 'unsupported');
+                          ^
 -- too many columns in column list: should fail
 COPY x (a, b, c, d, e, d, c) from stdin;
 ERROR:  column "d" specified more than once
diff --git a/src/test/regress/sql/copy2.sql b/src/test/regress/sql/copy2.sql
index b5e549e856..5116157cc9 100644
--- a/src/test/regress/sql/copy2.sql
+++ b/src/test/regress/sql/copy2.sql
@@ -67,6 +67,7 @@ COPY x from stdin (force_null (a), force_null (b));
 COPY x from stdin (convert_selectively (a), convert_selectively (b));
 COPY x from stdin (encoding 'sql_ascii', encoding 'sql_ascii');
 COPY x from stdin (on_error ignore, on_error ignore);
+COPY x from stdin (log_verbosity 'default', log_verbosity 'verbose');
 
 -- incorrect options
 COPY x to stdin (format BINARY, delimiter ',');
@@ -80,6 +81,7 @@ COPY x to stdin (format CSV, force_not_null(a));
 COPY x to stdout (format TEXT, force_null(a));
 COPY x to stdin (format CSV, force_null(a));
 COPY x to stdin (format BINARY, on_error unsupported);
+COPY x to stdout (log_verbosity 'unsupported');
 
 -- too many columns in column list: should fail
 COPY x (a, b, c, d, e, d, c) from stdin;
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index aa7a25b8f8..549378c8ad 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -479,6 +479,7 @@ CopyFromState
 CopyFromStateData
 CopyHeaderChoice
 CopyInsertMethod
+CopyLogVerbosityChoice
 CopyMultiInsertBuffer
 CopyMultiInsertInfo
 CopyOnErrorChoice
-- 
2.34.1

v7-0002-Add-detailed-info-when-COPY-skips-soft-errors.patch (application/x-patch)
From badf71c273c4a496bcf14701d973f768f24fa7fd Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Wed, 13 Mar 2024 02:30:46 +0000
Subject: [PATCH v7 2/2] Add detailed info when COPY skips soft errors

This commit emits per-row details, such as the line number and column
name, when COPY skips soft errors. The summary containing the total
number of rows skipped is not enough for users to identify exactly
which rows in the input data are malformed.

Author: Bharath Rupireddy
Reviewed-by: Michael Paquier, Masahiko Sawada
Reviewed-by: Atsushi Torikoshi
Discussion: https://www.postgresql.org/message-id/CALj2ACUk700cYhx1ATRQyRw-fBM%2BaRo6auRAitKGff7XNmYfqQ%40mail.gmail.com
---
 doc/src/sgml/ref/copy.sgml           | 12 ++++++++++--
 src/backend/commands/copyfromparse.c |  9 +++++++++
 src/test/regress/expected/copy2.out  | 14 +++++++++++---
 src/test/regress/sql/copy2.sql       |  7 ++++++-
 4 files changed, 36 insertions(+), 6 deletions(-)

diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index eba9b8f64e..bdd6580721 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -398,8 +398,12 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
       when the <literal>FORMAT</literal> is <literal>text</literal> or <literal>csv</literal>.
      </para>
      <para>
-      A <literal>NOTICE</literal> message containing the ignored row count is emitted at the end
-      of the <command>COPY FROM</command> if at least one row was discarded.
+      A <literal>NOTICE</literal> message containing the ignored row count is
+      emitted at the end of the <command>COPY FROM</command> if at least one
+      row was discarded. When <literal>LOG_VERBOSITY</literal> option is set to
+      <literal>verbose</literal>, a <literal>NOTICE</literal> message
+      containing the line of the input file and the column name whose input
+      conversion has failed is emitted for each discarded row.
      </para>
     </listitem>
    </varlistentry>
@@ -426,6 +430,10 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
       <literal>verbose</literal> can be used to emit more informative messages.
       <literal>default</literal> will not log any additional messages.
      </para>
+     <para>
+      This is currently used in <command>COPY FROM</command> command when
+      <literal>ON_ERROR</literal> is set to <literal>ignore</literal>.
+      </para>
     </listitem>
    </varlistentry>
 
diff --git a/src/backend/commands/copyfromparse.c b/src/backend/commands/copyfromparse.c
index 5682d5d054..e4a89eef13 100644
--- a/src/backend/commands/copyfromparse.c
+++ b/src/backend/commands/copyfromparse.c
@@ -967,7 +967,16 @@ NextCopyFrom(CopyFromState cstate, ExprContext *econtext,
 											(Node *) cstate->escontext,
 											&values[m]))
 			{
+				Assert(cstate->opts.on_error != COPY_ON_ERROR_STOP);
+
 				cstate->num_errors++;
+
+				if (cstate->opts.log_verbosity == COPY_LOG_VERBOSITY_VERBOSE)
+					ereport(NOTICE,
+							errmsg("data type incompatibility at line %llu for column \"%s\"",
+								   (unsigned long long) cstate->cur_lineno,
+								   cstate->cur_attname));
+
 				return true;
 			}
 
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index 62406ef827..af669fedbe 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -737,14 +737,22 @@ CREATE TABLE check_ign_err (n int, m int[], k int);
 COPY check_ign_err FROM STDIN WITH (on_error stop);
 ERROR:  invalid input syntax for type integer: "a"
 CONTEXT:  COPY check_ign_err, line 2, column n: "a"
-COPY check_ign_err FROM STDIN WITH (on_error ignore);
-NOTICE:  4 rows were skipped due to data type incompatibility
+-- tests for options on_error and log_verbosity
+COPY check_ign_err FROM STDIN WITH (on_error ignore, log_verbosity 'verbose');
+NOTICE:  data type incompatibility at line 2 for column "n"
+NOTICE:  data type incompatibility at line 3 for column "k"
+NOTICE:  data type incompatibility at line 4 for column "m"
+NOTICE:  data type incompatibility at line 5 for column "n"
+NOTICE:  data type incompatibility at line 7 for column "m"
+NOTICE:  data type incompatibility at line 8 for column "k"
+NOTICE:  6 rows were skipped due to data type incompatibility
 SELECT * FROM check_ign_err;
  n |  m  | k 
 ---+-----+---
  1 | {1} | 1
  5 | {5} | 5
-(2 rows)
+ 8 | {8} | 8
+(3 rows)
 
 -- test datatype error that can't be handled as soft: should fail
 CREATE TABLE hard_err(foo widget);
diff --git a/src/test/regress/sql/copy2.sql b/src/test/regress/sql/copy2.sql
index 5116157cc9..4fb736535d 100644
--- a/src/test/regress/sql/copy2.sql
+++ b/src/test/regress/sql/copy2.sql
@@ -510,13 +510,18 @@ a	{2}	2
 
 5	{5}	5
 \.
-COPY check_ign_err FROM STDIN WITH (on_error ignore);
+
+-- tests for options on_error and log_verbosity
+COPY check_ign_err FROM STDIN WITH (on_error ignore, log_verbosity 'verbose');
 1	{1}	1
 a	{2}	2
 3	{3}	3333333333
 4	{a, 4}	4
 
 5	{5}	5
+6	a
+7	{7}	a
+8	{8}	8
 \.
 SELECT * FROM check_ign_err;
 
-- 
2.34.1

#34 Michael Paquier
michael@paquier.xyz
In reply to: Bharath Rupireddy (#33)
Re: Add new error_action COPY ON_ERROR "log"

On Wed, Mar 13, 2024 at 08:08:42AM +0530, Bharath Rupireddy wrote:

On Wed, Mar 13, 2024 at 5:16 AM Michael Paquier <michael@paquier.xyz> wrote:

The attached 0003 is what I had in mind:
- Simplification of the LOG generated with quotes applied around the
column name, don't see much need to add the relation name, either, for
consistency and because the knowledge is known in the query.
- A few more tests.
- Some doc changes.

LGTM. So, I've merged those changes into 0001 and 0002.

I've applied the extra tests for now, as this was really confusing.

Hmm. This NOTICE is really bugging me. It is true that the clients
would get more information, but the information is duplicated on the
server side because the error context provides the same information as
the NOTICE itself:
NOTICE: data type incompatibility at line 1 for column "a"
CONTEXT: COPY aa, line 1, column a: "a"
STATEMENT: copy aa from stdin with (on_error ignore, log_verbosity verbose);
--
Michael

#35 Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Michael Paquier (#34)
2 attachment(s)
Re: Add new error_action COPY ON_ERROR "log"

On Wed, Mar 13, 2024 at 11:09 AM Michael Paquier <michael@paquier.xyz> wrote:

Hmm. This NOTICE is really bugging me. It is true that the clients
would get more information, but the information is duplicated on the
server side because the error context provides the same information as
the NOTICE itself:
NOTICE: data type incompatibility at line 1 for column "a"
CONTEXT: COPY aa, line 1, column a: "a"
STATEMENT: copy aa from stdin with (on_error ignore, log_verbosity verbose);

Yes, if wanted, clients can also get the CONTEXT - for instance, using
'\set SHOW_CONTEXT always' in psql.

I think we can enhance the NOTICE message to include the column value
(just like CONTEXT message is showing) and leverage relname_only to
emit only the relation name in the CONTEXT message.

/*
* We suppress error context information other than the relation name,
* if one of the operations below fails.
*/
Assert(!cstate->relname_only);
cstate->relname_only = true;

I'm attaching the v8 patch set implementing the above idea. With this,
[1] is sent to the client and [2] is written to the server log. This
approach not only reduces the duplicate info in the NOTICE and CONTEXT
messages, but also makes it easy for users to get all the necessary
info in the NOTICE message without having to set extra parameters to
get CONTEXT message.
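
To make the proposed split concrete, here is a small standalone sketch of the two strings involved: the NOTICE carrying the line number, column name and offending value, and the CONTEXT reduced to the relation name. It is only an approximation written for this thread. The function names are hypothetical, plain snprintf stands in for ereport()/errcontext(), and the format strings are copied from the sample output in [1] and [2].

```c
#include <stddef.h>
#include <stdio.h>

/* Returns 0 if snprintf wrote the whole message, -1 on error or truncation. */
static int
sketch_fits(int n, size_t len)
{
	return (n >= 0 && (size_t) n < len) ? 0 : -1;
}

/*
 * Build the per-row NOTICE text: line number, column name and the input
 * value that failed conversion.
 */
int
sketch_format_notice(char *buf, size_t len, unsigned long long lineno,
					 const char *attname, const char *value)
{
	int			n = snprintf(buf, len,
							 "data type incompatibility at line %llu for column %s: \"%s\"",
							 lineno, attname, value);

	return sketch_fits(n, len);
}

/*
 * Build the CONTEXT text when only the relation name is reported, which
 * is the effect of setting relname_only in the idea above.
 */
int
sketch_format_context(char *buf, size_t len, const char *relname)
{
	int			n = snprintf(buf, len, "COPY %s", relname);

	return sketch_fits(n, len);
}
```

With this split the line, column and value appear only in the NOTICE, so a client that never asks for CONTEXT still sees everything needed to locate the bad row.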

Another idea is to move even the table name to NOTICE message and hide
the context with errhidecontext when we emit the new NOTICE messages.

Thoughts?

[1]:
NOTICE: data type incompatibility at line 2 for column n: "a"
NOTICE: data type incompatibility at line 3 for column k: "3333333333"
NOTICE: data type incompatibility at line 4 for column m: "{a, 4}"
NOTICE: data type incompatibility at line 5 for column n: ""
NOTICE: data type incompatibility at line 7 for column m: "a"
NOTICE: data type incompatibility at line 8 for column k: "a"
NOTICE: 6 rows were skipped due to data type incompatibility
COPY 3

[2]: 2024-03-13 13:49:14.138 UTC [1330270] NOTICE: data type incompatibility at line 2 for column n: "a" 2024-03-13 13:49:14.138 UTC [1330270] CONTEXT: COPY check_ign_err 2024-03-13 13:49:14.138 UTC [1330270] NOTICE: data type incompatibility at line 3 for column k: "3333333333" 2024-03-13 13:49:14.138 UTC [1330270] CONTEXT: COPY check_ign_err 2024-03-13 13:49:14.138 UTC [1330270] NOTICE: data type incompatibility at line 4 for column m: "{a, 4}" 2024-03-13 13:49:14.138 UTC [1330270] CONTEXT: COPY check_ign_err 2024-03-13 13:49:14.138 UTC [1330270] NOTICE: data type incompatibility at line 5 for column n: "" 2024-03-13 13:49:14.138 UTC [1330270] CONTEXT: COPY check_ign_err 2024-03-13 13:49:14.138 UTC [1330270] NOTICE: data type incompatibility at line 7 for column m: "a" 2024-03-13 13:49:14.138 UTC [1330270] CONTEXT: COPY check_ign_err 2024-03-13 13:49:14.138 UTC [1330270] NOTICE: data type incompatibility at line 8 for column k: "a" 2024-03-13 13:49:14.138 UTC [1330270] CONTEXT: COPY check_ign_err 2024-03-13 13:49:14.138 UTC [1330270] NOTICE: 6 rows were skipped due to data type incompatibility
2024-03-13 13:49:14.138 UTC [1330270] NOTICE: data type incompatibility at line 2 for column n: "a"
2024-03-13 13:49:14.138 UTC [1330270] CONTEXT: COPY check_ign_err
2024-03-13 13:49:14.138 UTC [1330270] NOTICE: data type incompatibility at line 3 for column k: "3333333333"
2024-03-13 13:49:14.138 UTC [1330270] CONTEXT: COPY check_ign_err
2024-03-13 13:49:14.138 UTC [1330270] NOTICE: data type incompatibility at line 4 for column m: "{a, 4}"
2024-03-13 13:49:14.138 UTC [1330270] CONTEXT: COPY check_ign_err
2024-03-13 13:49:14.138 UTC [1330270] NOTICE: data type incompatibility at line 5 for column n: ""
2024-03-13 13:49:14.138 UTC [1330270] CONTEXT: COPY check_ign_err
2024-03-13 13:49:14.138 UTC [1330270] NOTICE: data type incompatibility at line 7 for column m: "a"
2024-03-13 13:49:14.138 UTC [1330270] CONTEXT: COPY check_ign_err
2024-03-13 13:49:14.138 UTC [1330270] NOTICE: data type incompatibility at line 8 for column k: "a"
2024-03-13 13:49:14.138 UTC [1330270] CONTEXT: COPY check_ign_err
2024-03-13 13:49:14.138 UTC [1330270] NOTICE: 6 rows were skipped due to data type incompatibility

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

v8-0001-Add-LOG_VERBOSITY-option-to-COPY-command.patchapplication/octet-stream; name=v8-0001-Add-LOG_VERBOSITY-option-to-COPY-command.patchDownload
From b7929fc1de6e26ddc9b519a0bf1a2f12e3535213 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Wed, 13 Mar 2024 11:49:25 +0000
Subject: [PATCH v8 1/2] Add LOG_VERBOSITY option to COPY command

This commit adds a new option LOG_VERBOSITY to set the verbosity of
logged messages by COPY command. A value of 'verbose' can be used
to emit more informative messages by the command, while the value
of 'default' (which is the default) can be used to not log any
additional messages. More values such as 'terse', 'row_details',
etc. can be added to the LOG_VERBOSITY option as needed.

An upcoming commit for emitting more info on soft errors by
COPY FROM command with ON_ERROR 'ignore' uses this.

Author: Bharath Rupireddy
Reviewed-by: Michael Paquier, Masahiko Sawada
Reviewed-by: Atsushi Torikoshi
Discussion: https://www.postgresql.org/message-id/CALj2ACXNA0focNeriYRvQQaCGc4CsTuOnFbzF9LqTKNWxuJdhA%40mail.gmail.com
---
 doc/src/sgml/ref/copy.sgml          | 14 +++++++++++
 src/backend/commands/copy.c         | 38 +++++++++++++++++++++++++++++
 src/bin/psql/tab-complete.c         |  6 ++++-
 src/include/commands/copy.h         | 10 ++++++++
 src/test/regress/expected/copy2.out |  8 ++++++
 src/test/regress/sql/copy2.sql      |  2 ++
 src/tools/pgindent/typedefs.list    |  1 +
 7 files changed, 78 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 55764fc1f2..eba9b8f64e 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -45,6 +45,7 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
     FORCE_NULL { ( <replaceable class="parameter">column_name</replaceable> [, ...] ) | * }
     ON_ERROR '<replaceable class="parameter">error_action</replaceable>'
     ENCODING '<replaceable class="parameter">encoding_name</replaceable>'
+    LOG_VERBOSITY [ <replaceable class="parameter">mode</replaceable> ]
 </synopsis>
  </refsynopsisdiv>
 
@@ -415,6 +416,19 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>LOG_VERBOSITY</literal></term>
+    <listitem>
+     <para>
+      Sets the verbosity of some of the messages logged by a
+      <command>COPY</command> command.
+      A <replaceable class="parameter">mode</replaceable> value of
+      <literal>verbose</literal> can be used to emit more informative messages.
+      <literal>default</literal> will not log any additional messages.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><literal>WHERE</literal></term>
     <listitem>
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 056b6733c8..23eb8c9c79 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -428,6 +428,36 @@ defGetCopyOnErrorChoice(DefElem *def, ParseState *pstate, bool is_from)
 	return COPY_ON_ERROR_STOP;	/* keep compiler quiet */
 }
 
+/*
+ * Extract a CopyLogVerbosityChoice value from a DefElem.
+ */
+static CopyLogVerbosityChoice
+defGetCopyLogVerbosityChoice(DefElem *def, ParseState *pstate)
+{
+	char	   *sval;
+
+	/*
+	 * If no parameter value given, assume the default value.
+	 */
+	if (def->arg == NULL)
+		return COPY_LOG_VERBOSITY_DEFAULT;
+
+	/*
+	 * Allow "default", or "verbose" values.
+	 */
+	sval = defGetString(def);
+	if (pg_strcasecmp(sval, "default") == 0)
+		return COPY_LOG_VERBOSITY_DEFAULT;
+	if (pg_strcasecmp(sval, "verbose") == 0)
+		return COPY_LOG_VERBOSITY_VERBOSE;
+
+	ereport(ERROR,
+			(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+			 errmsg("COPY LOG_VERBOSITY \"%s\" not recognized", sval),
+			 parser_errposition(pstate, def->location)));
+	return COPY_LOG_VERBOSITY_DEFAULT;	/* keep compiler quiet */
+}
+
 /*
  * Process the statement option list for COPY.
  *
@@ -454,6 +484,7 @@ ProcessCopyOptions(ParseState *pstate,
 	bool		freeze_specified = false;
 	bool		header_specified = false;
 	bool		on_error_specified = false;
+	bool		log_verbosity_specified = false;
 	ListCell   *option;
 
 	/* Support external use for option sanity checking */
@@ -613,6 +644,13 @@ ProcessCopyOptions(ParseState *pstate,
 			on_error_specified = true;
 			opts_out->on_error = defGetCopyOnErrorChoice(defel, pstate, is_from);
 		}
+		else if (strcmp(defel->defname, "log_verbosity") == 0)
+		{
+			if (log_verbosity_specified)
+				errorConflictingDefElem(defel, pstate);
+			log_verbosity_specified = true;
+			opts_out->log_verbosity = defGetCopyLogVerbosityChoice(defel, pstate);
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 73133ce735..9305800340 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -2901,7 +2901,7 @@ psql_completion(const char *text, int start, int end)
 		COMPLETE_WITH("FORMAT", "FREEZE", "DELIMITER", "NULL",
 					  "HEADER", "QUOTE", "ESCAPE", "FORCE_QUOTE",
 					  "FORCE_NOT_NULL", "FORCE_NULL", "ENCODING", "DEFAULT",
-					  "ON_ERROR");
+					  "ON_ERROR", "LOG_VERBOSITY");
 
 	/* Complete COPY <sth> FROM|TO filename WITH (FORMAT */
 	else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "FORMAT"))
@@ -2911,6 +2911,10 @@ psql_completion(const char *text, int start, int end)
 	else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "ON_ERROR"))
 		COMPLETE_WITH("stop", "ignore");
 
+	/* Complete COPY <sth> FROM filename WITH (LOG_VERBOSITY */
+	else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "LOG_VERBOSITY"))
+		COMPLETE_WITH("default", "verbose");
+
 	/* Complete COPY <sth> FROM <sth> WITH (<options>) */
 	else if (Matches("COPY|\\copy", MatchAny, "FROM", MatchAny, "WITH", MatchAny))
 		COMPLETE_WITH("WHERE");
diff --git a/src/include/commands/copy.h b/src/include/commands/copy.h
index b3da3cb0be..99d183fa4d 100644
--- a/src/include/commands/copy.h
+++ b/src/include/commands/copy.h
@@ -40,6 +40,15 @@ typedef enum CopyOnErrorChoice
 	COPY_ON_ERROR_IGNORE,		/* ignore errors */
 } CopyOnErrorChoice;
 
+/*
+ * Represents verbosity of logged messages by COPY command.
+ */
+typedef enum CopyLogVerbosityChoice
+{
+	COPY_LOG_VERBOSITY_DEFAULT = 0, /* logs no additional messages, default */
+	COPY_LOG_VERBOSITY_VERBOSE, /* logs additional messages */
+} CopyLogVerbosityChoice;
+
 /*
  * A struct to hold COPY options, in a parsed form. All of these are related
  * to formatting, except for 'freeze', which doesn't really belong here, but
@@ -73,6 +82,7 @@ typedef struct CopyFormatOptions
 	bool	   *force_null_flags;	/* per-column CSV FN flags */
 	bool		convert_selectively;	/* do selective binary conversion? */
 	CopyOnErrorChoice on_error; /* what to do when error happened */
+	CopyLogVerbosityChoice log_verbosity;	/* verbosity of logged messages */
 	List	   *convert_select; /* list of column names (can be NIL) */
 } CopyFormatOptions;
 
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index f98c2d1c4e..bb37a2ac70 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -81,6 +81,10 @@ COPY x from stdin (on_error ignore, on_error ignore);
 ERROR:  conflicting or redundant options
 LINE 1: COPY x from stdin (on_error ignore, on_error ignore);
                                             ^
+COPY x from stdin (log_verbosity 'default', log_verbosity 'verbose');
+ERROR:  conflicting or redundant options
+LINE 1: COPY x from stdin (log_verbosity 'default', log_verbosity 'v...
+                                                    ^
 -- incorrect options
 COPY x to stdin (format BINARY, delimiter ',');
 ERROR:  cannot specify DELIMITER in BINARY mode
@@ -108,6 +112,10 @@ COPY x to stdin (format BINARY, on_error unsupported);
 ERROR:  COPY ON_ERROR cannot be used with COPY TO
 LINE 1: COPY x to stdin (format BINARY, on_error unsupported);
                                         ^
+COPY x to stdout (log_verbosity 'unsupported');
+ERROR:  COPY LOG_VERBOSITY "unsupported" not recognized
+LINE 1: COPY x to stdout (log_verbosity 'unsupported');
+                          ^
 -- too many columns in column list: should fail
 COPY x (a, b, c, d, e, d, c) from stdin;
 ERROR:  column "d" specified more than once
diff --git a/src/test/regress/sql/copy2.sql b/src/test/regress/sql/copy2.sql
index afaaa37e52..4cd3ae577d 100644
--- a/src/test/regress/sql/copy2.sql
+++ b/src/test/regress/sql/copy2.sql
@@ -67,6 +67,7 @@ COPY x from stdin (force_null (a), force_null (b));
 COPY x from stdin (convert_selectively (a), convert_selectively (b));
 COPY x from stdin (encoding 'sql_ascii', encoding 'sql_ascii');
 COPY x from stdin (on_error ignore, on_error ignore);
+COPY x from stdin (log_verbosity 'default', log_verbosity 'verbose');
 
 -- incorrect options
 COPY x to stdin (format BINARY, delimiter ',');
@@ -80,6 +81,7 @@ COPY x to stdin (format CSV, force_not_null(a));
 COPY x to stdout (format TEXT, force_null(a));
 COPY x to stdin (format CSV, force_null(a));
 COPY x to stdin (format BINARY, on_error unsupported);
+COPY x to stdout (log_verbosity 'unsupported');
 
 -- too many columns in column list: should fail
 COPY x (a, b, c, d, e, d, c) from stdin;
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index aa7a25b8f8..549378c8ad 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -479,6 +479,7 @@ CopyFromState
 CopyFromStateData
 CopyHeaderChoice
 CopyInsertMethod
+CopyLogVerbosityChoice
 CopyMultiInsertBuffer
 CopyMultiInsertInfo
 CopyOnErrorChoice
-- 
2.34.1

v8-0002-Add-detailed-info-when-COPY-skips-soft-errors.patchapplication/octet-stream; name=v8-0002-Add-detailed-info-when-COPY-skips-soft-errors.patchDownload
From 415842714f5b3a3f09f6af4a193398801d8bdf8c Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Wed, 13 Mar 2024 14:00:38 +0000
Subject: [PATCH v8 2/2] Add detailed info when COPY skips soft errors

This commit emits individual info like line number and column name
when COPY skips soft errors, because the summary containing the
total number of rows skipped isn't enough for users to know exactly
which rows in the input data are malformed.

Author: Bharath Rupireddy
Reviewed-by: Michael Paquier, Masahiko Sawada
Reviewed-by: Atsushi Torikoshi
Discussion: https://www.postgresql.org/message-id/CALj2ACUk700cYhx1ATRQyRw-fBM%2BaRo6auRAitKGff7XNmYfqQ%40mail.gmail.com
---
 doc/src/sgml/ref/copy.sgml           | 12 ++++++++--
 src/backend/commands/copyfrom.c      |  4 +---
 src/backend/commands/copyfromparse.c | 35 ++++++++++++++++++++++++++++
 src/include/commands/copy.h          |  1 +
 src/test/regress/expected/copy2.out  | 18 +++++++++++++-
 src/test/regress/sql/copy2.sql       |  9 ++++++-
 6 files changed, 72 insertions(+), 7 deletions(-)

diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index eba9b8f64e..bdd6580721 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -398,8 +398,12 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
       when the <literal>FORMAT</literal> is <literal>text</literal> or <literal>csv</literal>.
      </para>
      <para>
-      A <literal>NOTICE</literal> message containing the ignored row count is emitted at the end
-      of the <command>COPY FROM</command> if at least one row was discarded.
+      A <literal>NOTICE</literal> message containing the ignored row count is
+      emitted at the end of the <command>COPY FROM</command> if at least one
+      row was discarded. When <literal>LOG_VERBOSITY</literal> option is set to
+      <literal>verbose</literal>, a <literal>NOTICE</literal> message
+      containing the line of the input file and the column name whose input
+      conversion has failed is emitted for each discarded row.
      </para>
     </listitem>
    </varlistentry>
@@ -426,6 +430,10 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
       <literal>verbose</literal> can be used to emit more informative messages.
       <literal>default</literal> will not log any additional messages.
      </para>
+     <para>
+      This is currently used in <command>COPY FROM</command> command when
+      <literal>ON_ERROR</literal> is set to <literal>ignore</literal>.
+     </para>
     </listitem>
    </varlistentry>
 
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 8908a440e1..fc5bc86ac7 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -101,8 +101,6 @@ typedef struct CopyMultiInsertInfo
 
 
 /* non-export function prototypes */
-static char *limit_printout_length(const char *str);
-
 static void ClosePipeFromProgram(CopyFromState cstate);
 
 /*
@@ -189,7 +187,7 @@ CopyFromErrorCallback(void *arg)
  *
  * Returns a pstrdup'd copy of the input.
  */
-static char *
+char *
 limit_printout_length(const char *str)
 {
 #define MAX_COPY_DATA_DISPLAY 100
diff --git a/src/backend/commands/copyfromparse.c b/src/backend/commands/copyfromparse.c
index 5682d5d054..01ab1de9bd 100644
--- a/src/backend/commands/copyfromparse.c
+++ b/src/backend/commands/copyfromparse.c
@@ -967,7 +967,42 @@ NextCopyFrom(CopyFromState cstate, ExprContext *econtext,
 											(Node *) cstate->escontext,
 											&values[m]))
 			{
+				Assert(cstate->opts.on_error != COPY_ON_ERROR_STOP);
+
 				cstate->num_errors++;
+
+				if (cstate->opts.log_verbosity == COPY_LOG_VERBOSITY_VERBOSE)
+				{
+					/*
+					 * Since we emit line number and column info in the below
+					 * notice message, we suppress error context information
+					 * other than the relation name.
+					 */
+					Assert(!cstate->relname_only);
+					cstate->relname_only = true;
+
+					if (cstate->cur_attval)
+					{
+						char	   *attval;
+
+						attval = limit_printout_length(cstate->cur_attval);
+						ereport(NOTICE,
+								errmsg("data type incompatibility at line %llu for column %s: \"%s\"",
+									   (unsigned long long) cstate->cur_lineno,
+									   cstate->cur_attname,
+									   attval));
+						pfree(attval);
+					}
+					else
+						ereport(NOTICE,
+								errmsg("data type incompatibility at line %llu for column %s: null input",
+									   (unsigned long long) cstate->cur_lineno,
+									   cstate->cur_attname));
+
+					/* reset relname_only */
+					cstate->relname_only = false;
+				}
+
 				return true;
 			}
 
diff --git a/src/include/commands/copy.h b/src/include/commands/copy.h
index 99d183fa4d..9c539772a5 100644
--- a/src/include/commands/copy.h
+++ b/src/include/commands/copy.h
@@ -107,6 +107,7 @@ extern bool NextCopyFrom(CopyFromState cstate, ExprContext *econtext,
 extern bool NextCopyFromRawFields(CopyFromState cstate,
 								  char ***fields, int *nfields);
 extern void CopyFromErrorCallback(void *arg);
+extern char *limit_printout_length(const char *str);
 
 extern uint64 CopyFrom(CopyFromState cstate);
 
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index bb37a2ac70..832b8b210f 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -737,8 +737,24 @@ CREATE TABLE check_ign_err (n int, m int[], k int);
 COPY check_ign_err FROM STDIN WITH (on_error stop);
 ERROR:  invalid input syntax for type integer: "a"
 CONTEXT:  COPY check_ign_err, line 2, column n: "a"
-COPY check_ign_err FROM STDIN WITH (on_error ignore);
+-- want context for notices
+\set SHOW_CONTEXT always
+COPY check_ign_err FROM STDIN WITH (on_error ignore, log_verbosity verbose);
+NOTICE:  data type incompatibility at line 2 for column n: "a"
+CONTEXT:  COPY check_ign_err
+NOTICE:  data type incompatibility at line 3 for column k: "3333333333"
+CONTEXT:  COPY check_ign_err
+NOTICE:  data type incompatibility at line 4 for column m: "{a, 4}"
+CONTEXT:  COPY check_ign_err
+NOTICE:  data type incompatibility at line 5 for column n: ""
+CONTEXT:  COPY check_ign_err
+NOTICE:  data type incompatibility at line 7 for column m: "a"
+CONTEXT:  COPY check_ign_err
+NOTICE:  data type incompatibility at line 8 for column k: "a"
+CONTEXT:  COPY check_ign_err
 NOTICE:  6 rows were skipped due to data type incompatibility
+-- reset context choice
+\set SHOW_CONTEXT errors
 SELECT * FROM check_ign_err;
  n |  m  | k 
 ---+-----+---
diff --git a/src/test/regress/sql/copy2.sql b/src/test/regress/sql/copy2.sql
index 4cd3ae577d..d290bea265 100644
--- a/src/test/regress/sql/copy2.sql
+++ b/src/test/regress/sql/copy2.sql
@@ -510,7 +510,11 @@ a	{2}	2
 
 5	{5}	5
 \.
-COPY check_ign_err FROM STDIN WITH (on_error ignore);
+
+-- want context for notices
+\set SHOW_CONTEXT always
+
+COPY check_ign_err FROM STDIN WITH (on_error ignore, log_verbosity verbose);
 1	{1}	1
 a	{2}	2
 3	{3}	3333333333
@@ -521,6 +525,9 @@ a	{2}	2
 7	{7}	a
 8	{8}	8
 \.
+-- reset context choice
+\set SHOW_CONTEXT errors
+
 SELECT * FROM check_ign_err;
 
 -- test datatype error that can't be handled as soft: should fail
-- 
2.34.1

#36Michael Paquier
michael@paquier.xyz
In reply to: Bharath Rupireddy (#35)
Re: Add new error_action COPY ON_ERROR "log"

On Wed, Mar 13, 2024 at 07:32:37PM +0530, Bharath Rupireddy wrote:

I'm attaching the v8 patch set implementing the above idea. With this,
[1] is sent to the client, [2] is sent to the server log. This
approach not only reduces the duplicate info in the NOTICE and CONTEXT
messages, but also makes it easy for users to get all the necessary
info in the NOTICE message without having to set extra parameters to
get CONTEXT message.

In terms of eliminating the information duplication while allowing
clients to know the value, the attribute and the line involved in the
conversion failure, the approach of tweaking the context information
has merits, I guess.

How do others feel about this kind of tuning?
--
Michael

#37Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Michael Paquier (#36)
Re: Add new error_action COPY ON_ERROR "log"

At Fri, 15 Mar 2024 16:57:25 +0900, Michael Paquier <michael@paquier.xyz> wrote in

On Wed, Mar 13, 2024 at 07:32:37PM +0530, Bharath Rupireddy wrote:

I'm attaching the v8 patch set implementing the above idea. With this,
[1] is sent to the client, [2] is sent to the server log. This
approach not only reduces the duplicate info in the NOTICE and CONTEXT
messages, but also makes it easy for users to get all the necessary
info in the NOTICE message without having to set extra parameters to
get CONTEXT message.

In terms of eliminating the information duplication while allowing
clients to know the value, the attribute and the line involved in the
conversion failure, the approach of tweaking the context information
has merits, I guess.

How do others feel about this kind of tuning?

If required, I think that we have already included the minimum
information necessary for the primary diagnosis, including locations,
within the primary messages. Here is an example:

LOG: 00000: invalid record length at 0/18049F8: expected at least 24, got 0

And I believe that CONTEXT, if it exists, is augmentation information
to the primary messages. The objective of the precedent for the use of
relname_only was somewhat different, but this use also seems legit.

In short, I think the distribution between message types (primary and
context) is fine as it is in the latest patch.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#38Michael Paquier
michael@paquier.xyz
In reply to: Kyotaro Horiguchi (#37)
Re: Add new error_action COPY ON_ERROR "log"

On Mon, Mar 18, 2024 at 12:05:17PM +0900, Kyotaro Horiguchi wrote:

And I believe that CONTEXT, if it exists, is augmentation information
to the primary messages. The objective of the precedent for the use of
relname_only was somewhat different, but this use also seems legit.

In short, I think the distribution between message types (primary and
context) is fine as it is in the latest patch.

Okay, thanks for the feedback.
--
Michael

#39Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Bharath Rupireddy (#35)
Re: Add new error_action COPY ON_ERROR "log"

On Wed, Mar 13, 2024 at 11:02 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

On Wed, Mar 13, 2024 at 11:09 AM Michael Paquier <michael@paquier.xyz> wrote:

Hmm. This NOTICE is really bugging me. It is true that the clients
would get more information, but the information is duplicated on the
server side because the error context provides the same information as
the NOTICE itself:
NOTICE: data type incompatibility at line 1 for column "a"
CONTEXT: COPY aa, line 1, column a: "a"
STATEMENT: copy aa from stdin with (on_error ignore, log_verbosity verbose);

Yes, if wanted, clients can also get the CONTEXT - for instance, using
'\set SHOW_CONTEXT always' in psql.

I think we can enhance the NOTICE message to include the column value
(just like CONTEXT message is showing) and leverage relname_only to
emit only the relation name in the CONTEXT message.

/*
* We suppress error context information other than the relation name,
* if one of the operations below fails.
*/
Assert(!cstate->relname_only);
cstate->relname_only = true;
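
As a rough illustration of the relname_only idea above, here is a
standalone sketch (not the PostgreSQL source). DemoCopyState and
demo_context are hypothetical names standing in for the COPY state and
its error context callback; the two output shapes are taken from the
CONTEXT lines quoted in this thread. When the flag is set, the context
shrinks to the relation name, so line, column and value appear only in
the primary NOTICE text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the few COPY state fields used here. */
typedef struct DemoCopyState
{
	const char *rel_name;		/* target relation */
	unsigned long long cur_lineno;	/* input line being processed */
	const char *cur_attname;	/* column being converted */
	const char *cur_attval;		/* raw input value for that column */
	bool		relname_only;	/* suppress everything but the relation? */
} DemoCopyState;

/*
 * Build the CONTEXT text for a message raised while processing a row.
 * With relname_only set, only the relation name is reported.
 */
static void
demo_context(const DemoCopyState *st, char *buf, size_t buflen)
{
	assert(buf != NULL && buflen > 0);

	if (st->relname_only)
		snprintf(buf, buflen, "COPY %s", st->rel_name);
	else
		snprintf(buf, buflen, "COPY %s, line %llu, column %s: \"%s\"",
				 st->rel_name, st->cur_lineno,
				 st->cur_attname, st->cur_attval);
}
```

With relname_only false, a state of (check_ign_err, line 2, column n,
value a) yields the full form `COPY check_ign_err, line 2, column n: "a"`;
with it set to true the same state yields just `COPY check_ign_err`.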

I'm attaching the v8 patch set implementing the above idea. With this,
[1] is sent to the client, [2] is sent to the server log. This
approach not only reduces the duplicate info in the NOTICE and CONTEXT
messages, but also makes it easy for users to get all the necessary
info in the NOTICE message without having to set extra parameters to
get CONTEXT message.

Another idea is to move even the table name to NOTICE message and hide
the context with errhidecontext when we emit the new NOTICE messages.

Thoughts?

The current approach, eliminating the duplicated information in
CONTEXT, seems good to me.

One question about the latest (v8) patch:

+                   else
+                       ereport(NOTICE,
+                               errmsg("data type incompatibility at line %llu for column %s: null input",
+                                      (unsigned long long) cstate->cur_lineno,
+                                      cstate->cur_attname));
+

How can we reach this path? It seems we don't cover this path by the tests.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#40Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Masahiko Sawada (#39)
2 attachment(s)
Re: Add new error_action COPY ON_ERROR "log"

On Mon, Mar 25, 2024 at 10:42 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

The current approach, eliminating the duplicated information in
CONTEXT, seems good to me.

Thanks for looking into it.

One question about the latest (v8) patch:

+                   else
+                       ereport(NOTICE,
+                               errmsg("data type incompatibility at line %llu for column %s: null input",
+                                      (unsigned long long) cstate->cur_lineno,
+                                      cstate->cur_attname));
+

How can we reach this path? It seems we don't cover this path by the tests.

Tests don't cover that part, but it can be hit with something like
[1].

Note the use of domain to provide an indirect way of providing null
constraint check. Otherwise, COPY FROM fails early in
CopyFrom->ExecConstraints if the NOT NULL constraint is directly
provided next to the column in the table [2].
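
For readers following along, the two branches of the verbose NOTICE
(a value present versus null input) can be sketched in standalone C as
below. This is only an illustration under stated assumptions:
demo_skip_notice and DEMO_MAX_DISPLAY are hypothetical names, the
message strings are copied from the v8 patch, and the truncation here
is a plain byte cut at 100 characters followed by "...", which is a
simplification of what limit_printout_length does in the server (it is
not multibyte-aware).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed display limit, mirroring MAX_COPY_DATA_DISPLAY in the patch context. */
#define DEMO_MAX_DISPLAY 100

/*
 * Format the per-row message for a skipped row. attval may be NULL,
 * which corresponds to the "null input" branch discussed above.
 */
static void
demo_skip_notice(char *buf, size_t buflen, unsigned long long lineno,
				 const char *attname, const char *attval)
{
	assert(buf != NULL && buflen > 0);

	if (attval == NULL)
		snprintf(buf, buflen,
				 "data type incompatibility at line %llu for column %s: null input",
				 lineno, attname);
	else if (strlen(attval) <= DEMO_MAX_DISPLAY)
		snprintf(buf, buflen,
				 "data type incompatibility at line %llu for column %s: \"%s\"",
				 lineno, attname, attval);
	else
		/* Over-long value: show only the first DEMO_MAX_DISPLAY bytes. */
		snprintf(buf, buflen,
				 "data type incompatibility at line %llu for column %s: \"%.*s...\"",
				 lineno, attname, DEMO_MAX_DISPLAY, attval);
}
```

Calling it with a NULL value for column l on line 2 produces the
"null input" form, which is the branch the domain-based example [1]
is meant to reach.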

Please see the attached v9 patch set.

[1]:
create domain dcheck_ign_err2 varchar(15) NOT NULL;
CREATE TABLE check_ign_err2 (n int, m int[], k int, l dcheck_ign_err2);
COPY check_ign_err2 FROM STDIN WITH (on_error ignore, log_verbosity verbose);
1 {1} 1 'foo'
2 {2} 2 \N
\.

[2]:
CREATE TABLE check_ign_err3 (n int, m int[], k int, l varchar(15) NOT NULL);
postgres=# COPY check_ign_err3 FROM STDIN WITH (on_error ignore,
log_verbosity verbose);
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.

1 {1} 1 'foo'
2 {2} 2 \N
\.

ERROR: null value in column "l" of relation "check_ign_err3" violates
not-null constraint
DETAIL: Failing row contains (2, {2}, 2, null).
CONTEXT: COPY check_ign_err3, line 2: "2 {2} 2 \N"

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

v9-0001-Add-LOG_VERBOSITY-option-to-COPY-command.patchapplication/x-patch; name=v9-0001-Add-LOG_VERBOSITY-option-to-COPY-command.patchDownload
From bf0c1a166025c6ed1b88233eb8fc20df881d95ca Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Mon, 25 Mar 2024 10:55:00 +0000
Subject: [PATCH v9 1/2] Add LOG_VERBOSITY option to COPY command

This commit adds a new option LOG_VERBOSITY to set the verbosity of
logged messages by COPY command. A value of 'verbose' can be used
to emit more informative messages by the command, while the value
of 'default' (which is the default) can be used to not log any
additional messages. More values such as 'terse', 'row_details',
etc. can be added to the LOG_VERBOSITY option as needed.

An upcoming commit for emitting more info on soft errors by
COPY FROM command with ON_ERROR 'ignore' uses this.

Author: Bharath Rupireddy
Reviewed-by: Michael Paquier, Masahiko Sawada
Reviewed-by: Atsushi Torikoshi
Discussion: https://www.postgresql.org/message-id/CALj2ACXNA0focNeriYRvQQaCGc4CsTuOnFbzF9LqTKNWxuJdhA%40mail.gmail.com
---
 doc/src/sgml/ref/copy.sgml          | 14 +++++++++++
 src/backend/commands/copy.c         | 38 +++++++++++++++++++++++++++++
 src/bin/psql/tab-complete.c         |  6 ++++-
 src/include/commands/copy.h         | 10 ++++++++
 src/test/regress/expected/copy2.out |  8 ++++++
 src/test/regress/sql/copy2.sql      |  2 ++
 src/tools/pgindent/typedefs.list    |  1 +
 7 files changed, 78 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 6c83e30ed0..4c307efb54 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -45,6 +45,7 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
     FORCE_NULL { ( <replaceable class="parameter">column_name</replaceable> [, ...] ) | * }
     ON_ERROR '<replaceable class="parameter">error_action</replaceable>'
     ENCODING '<replaceable class="parameter">encoding_name</replaceable>'
+    LOG_VERBOSITY [ <replaceable class="parameter">mode</replaceable> ]
 </synopsis>
  </refsynopsisdiv>
 
@@ -418,6 +419,19 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>LOG_VERBOSITY</literal></term>
+    <listitem>
+     <para>
+      Sets the verbosity of some of the messages logged by a
+      <command>COPY</command> command.
+      A <replaceable class="parameter">mode</replaceable> value of
+      <literal>verbose</literal> can be used to emit more informative messages.
+      <literal>default</literal> will not log any additional messages.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><literal>WHERE</literal></term>
     <listitem>
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 28cf8b040a..67d5c3f7d0 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -422,6 +422,36 @@ defGetCopyOnErrorChoice(DefElem *def, ParseState *pstate, bool is_from)
 	return COPY_ON_ERROR_STOP;	/* keep compiler quiet */
 }
 
+/*
+ * Extract a CopyLogVerbosityChoice value from a DefElem.
+ */
+static CopyLogVerbosityChoice
+defGetCopyLogVerbosityChoice(DefElem *def, ParseState *pstate)
+{
+	char	   *sval;
+
+	/*
+	 * If no parameter value given, assume the default value.
+	 */
+	if (def->arg == NULL)
+		return COPY_LOG_VERBOSITY_DEFAULT;
+
+	/*
+	 * Allow "default", or "verbose" values.
+	 */
+	sval = defGetString(def);
+	if (pg_strcasecmp(sval, "default") == 0)
+		return COPY_LOG_VERBOSITY_DEFAULT;
+	if (pg_strcasecmp(sval, "verbose") == 0)
+		return COPY_LOG_VERBOSITY_VERBOSE;
+
+	ereport(ERROR,
+			(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+			 errmsg("COPY LOG_VERBOSITY \"%s\" not recognized", sval),
+			 parser_errposition(pstate, def->location)));
+	return COPY_LOG_VERBOSITY_DEFAULT;	/* keep compiler quiet */
+}
+
 /*
  * Process the statement option list for COPY.
  *
@@ -448,6 +478,7 @@ ProcessCopyOptions(ParseState *pstate,
 	bool		freeze_specified = false;
 	bool		header_specified = false;
 	bool		on_error_specified = false;
+	bool		log_verbosity_specified = false;
 	ListCell   *option;
 
 	/* Support external use for option sanity checking */
@@ -607,6 +638,13 @@ ProcessCopyOptions(ParseState *pstate,
 			on_error_specified = true;
 			opts_out->on_error = defGetCopyOnErrorChoice(defel, pstate, is_from);
 		}
+		else if (strcmp(defel->defname, "log_verbosity") == 0)
+		{
+			if (log_verbosity_specified)
+				errorConflictingDefElem(defel, pstate);
+			log_verbosity_specified = true;
+			opts_out->log_verbosity = defGetCopyLogVerbosityChoice(defel, pstate);
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 56d723de8a..bfbb3899ad 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -2903,7 +2903,7 @@ psql_completion(const char *text, int start, int end)
 		COMPLETE_WITH("FORMAT", "FREEZE", "DELIMITER", "NULL",
 					  "HEADER", "QUOTE", "ESCAPE", "FORCE_QUOTE",
 					  "FORCE_NOT_NULL", "FORCE_NULL", "ENCODING", "DEFAULT",
-					  "ON_ERROR");
+					  "ON_ERROR", "LOG_VERBOSITY");
 
 	/* Complete COPY <sth> FROM|TO filename WITH (FORMAT */
 	else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "FORMAT"))
@@ -2913,6 +2913,10 @@ psql_completion(const char *text, int start, int end)
 	else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "ON_ERROR"))
 		COMPLETE_WITH("stop", "ignore");
 
+	/* Complete COPY <sth> FROM filename WITH (LOG_VERBOSITY */
+	else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "LOG_VERBOSITY"))
+		COMPLETE_WITH("default", "verbose");
+
 	/* Complete COPY <sth> FROM <sth> WITH (<options>) */
 	else if (Matches("COPY|\\copy", MatchAny, "FROM", MatchAny, "WITH", MatchAny))
 		COMPLETE_WITH("WHERE");
diff --git a/src/include/commands/copy.h b/src/include/commands/copy.h
index b3da3cb0be..99d183fa4d 100644
--- a/src/include/commands/copy.h
+++ b/src/include/commands/copy.h
@@ -40,6 +40,15 @@ typedef enum CopyOnErrorChoice
 	COPY_ON_ERROR_IGNORE,		/* ignore errors */
 } CopyOnErrorChoice;
 
+/*
+ * Represents verbosity of logged messages by COPY command.
+ */
+typedef enum CopyLogVerbosityChoice
+{
+	COPY_LOG_VERBOSITY_DEFAULT = 0, /* logs no additional messages, default */
+	COPY_LOG_VERBOSITY_VERBOSE, /* logs additional messages */
+} CopyLogVerbosityChoice;
+
 /*
  * A struct to hold COPY options, in a parsed form. All of these are related
  * to formatting, except for 'freeze', which doesn't really belong here, but
@@ -73,6 +82,7 @@ typedef struct CopyFormatOptions
 	bool	   *force_null_flags;	/* per-column CSV FN flags */
 	bool		convert_selectively;	/* do selective binary conversion? */
 	CopyOnErrorChoice on_error; /* what to do when error happened */
+	CopyLogVerbosityChoice log_verbosity;	/* verbosity of logged messages */
 	List	   *convert_select; /* list of column names (can be NIL) */
 } CopyFormatOptions;
 
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index f98c2d1c4e..bb37a2ac70 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -81,6 +81,10 @@ COPY x from stdin (on_error ignore, on_error ignore);
 ERROR:  conflicting or redundant options
 LINE 1: COPY x from stdin (on_error ignore, on_error ignore);
                                             ^
+COPY x from stdin (log_verbosity 'default', log_verbosity 'verbose');
+ERROR:  conflicting or redundant options
+LINE 1: COPY x from stdin (log_verbosity 'default', log_verbosity 'v...
+                                                    ^
 -- incorrect options
 COPY x to stdin (format BINARY, delimiter ',');
 ERROR:  cannot specify DELIMITER in BINARY mode
@@ -108,6 +112,10 @@ COPY x to stdin (format BINARY, on_error unsupported);
 ERROR:  COPY ON_ERROR cannot be used with COPY TO
 LINE 1: COPY x to stdin (format BINARY, on_error unsupported);
                                         ^
+COPY x to stdout (log_verbosity 'unsupported');
+ERROR:  COPY LOG_VERBOSITY "unsupported" not recognized
+LINE 1: COPY x to stdout (log_verbosity 'unsupported');
+                          ^
 -- too many columns in column list: should fail
 COPY x (a, b, c, d, e, d, c) from stdin;
 ERROR:  column "d" specified more than once
diff --git a/src/test/regress/sql/copy2.sql b/src/test/regress/sql/copy2.sql
index afaaa37e52..4cd3ae577d 100644
--- a/src/test/regress/sql/copy2.sql
+++ b/src/test/regress/sql/copy2.sql
@@ -67,6 +67,7 @@ COPY x from stdin (force_null (a), force_null (b));
 COPY x from stdin (convert_selectively (a), convert_selectively (b));
 COPY x from stdin (encoding 'sql_ascii', encoding 'sql_ascii');
 COPY x from stdin (on_error ignore, on_error ignore);
+COPY x from stdin (log_verbosity 'default', log_verbosity 'verbose');
 
 -- incorrect options
 COPY x to stdin (format BINARY, delimiter ',');
@@ -80,6 +81,7 @@ COPY x to stdin (format CSV, force_not_null(a));
 COPY x to stdout (format TEXT, force_null(a));
 COPY x to stdin (format CSV, force_null(a));
 COPY x to stdin (format BINARY, on_error unsupported);
+COPY x to stdout (log_verbosity 'unsupported');
 
 -- too many columns in column list: should fail
 COPY x (a, b, c, d, e, d, c) from stdin;
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index e2a0525dd4..2651a1f199 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -480,6 +480,7 @@ CopyFromState
 CopyFromStateData
 CopyHeaderChoice
 CopyInsertMethod
+CopyLogVerbosityChoice
 CopyMultiInsertBuffer
 CopyMultiInsertInfo
 CopyOnErrorChoice
-- 
2.34.1

v9-0002-Add-detailed-info-when-COPY-skips-soft-errors.patch (application/x-patch)
From 7b6fd050469cf26afa4ede08120b123b620a551a Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Mon, 25 Mar 2024 11:00:59 +0000
Subject: [PATCH v9 2/2] Add detailed info when COPY skips soft errors

This commit emits individual info like line number and column name
when COPY skips soft errors, because the summary containing the
total rows skipped isn't enough for users to know exactly which
rows in the input data are malformed.

Author: Bharath Rupireddy
Reviewed-by: Michael Paquier, Masahiko Sawada
Reviewed-by: Atsushi Torikoshi
Discussion: https://www.postgresql.org/message-id/CALj2ACUk700cYhx1ATRQyRw-fBM%2BaRo6auRAitKGff7XNmYfqQ%40mail.gmail.com
---
 doc/src/sgml/ref/copy.sgml           | 12 ++++++++--
 src/backend/commands/copyfrom.c      |  4 +---
 src/backend/commands/copyfromparse.c | 35 ++++++++++++++++++++++++++++
 src/include/commands/copy.h          |  1 +
 src/test/regress/expected/copy2.out  | 33 +++++++++++++++++++++++++-
 src/test/regress/sql/copy2.sql       | 22 ++++++++++++++++-
 6 files changed, 100 insertions(+), 7 deletions(-)

diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 4c307efb54..ecbbf5f94a 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -401,8 +401,12 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
       when the <literal>FORMAT</literal> is <literal>text</literal> or <literal>csv</literal>.
      </para>
      <para>
-      A <literal>NOTICE</literal> message containing the ignored row count is emitted at the end
-      of the <command>COPY FROM</command> if at least one row was discarded.
+      A <literal>NOTICE</literal> message containing the ignored row count is
+      emitted at the end of the <command>COPY FROM</command> if at least one
+      row was discarded. When <literal>LOG_VERBOSITY</literal> option is set to
+      <literal>verbose</literal>, a <literal>NOTICE</literal> message
+      containing the line of the input file and the column name whose input
+      conversion has failed is emitted for each discarded row.
      </para>
     </listitem>
    </varlistentry>
@@ -429,6 +433,10 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
       <literal>verbose</literal> can be used to emit more informative messages.
       <literal>default</literal> will not log any additional messages.
      </para>
+     <para>
+      This is currently used in <command>COPY FROM</command> command when
+      <literal>ON_ERROR</literal> is set to <literal>ignore</literal>.
+      </para>
     </listitem>
    </varlistentry>
 
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 8908a440e1..fc5bc86ac7 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -101,8 +101,6 @@ typedef struct CopyMultiInsertInfo
 
 
 /* non-export function prototypes */
-static char *limit_printout_length(const char *str);
-
 static void ClosePipeFromProgram(CopyFromState cstate);
 
 /*
@@ -189,7 +187,7 @@ CopyFromErrorCallback(void *arg)
  *
  * Returns a pstrdup'd copy of the input.
  */
-static char *
+char *
 limit_printout_length(const char *str)
 {
 #define MAX_COPY_DATA_DISPLAY 100
diff --git a/src/backend/commands/copyfromparse.c b/src/backend/commands/copyfromparse.c
index 5682d5d054..01ab1de9bd 100644
--- a/src/backend/commands/copyfromparse.c
+++ b/src/backend/commands/copyfromparse.c
@@ -967,7 +967,42 @@ NextCopyFrom(CopyFromState cstate, ExprContext *econtext,
 											(Node *) cstate->escontext,
 											&values[m]))
 			{
+				Assert(cstate->opts.on_error != COPY_ON_ERROR_STOP);
+
 				cstate->num_errors++;
+
+				if (cstate->opts.log_verbosity == COPY_LOG_VERBOSITY_VERBOSE)
+				{
+					/*
+					 * Since we emit line number and column info in the below
+					 * notice message, we suppress error context information
+					 * other than the relation name.
+					 */
+					Assert(!cstate->relname_only);
+					cstate->relname_only = true;
+
+					if (cstate->cur_attval)
+					{
+						char	   *attval;
+
+						attval = limit_printout_length(cstate->cur_attval);
+						ereport(NOTICE,
+								errmsg("data type incompatibility at line %llu for column %s: \"%s\"",
+									   (unsigned long long) cstate->cur_lineno,
+									   cstate->cur_attname,
+									   attval));
+						pfree(attval);
+					}
+					else
+						ereport(NOTICE,
+								errmsg("data type incompatibility at line %llu for column %s: null input",
+									   (unsigned long long) cstate->cur_lineno,
+									   cstate->cur_attname));
+
+					/* reset relname_only */
+					cstate->relname_only = false;
+				}
+
 				return true;
 			}
 
diff --git a/src/include/commands/copy.h b/src/include/commands/copy.h
index 99d183fa4d..9c539772a5 100644
--- a/src/include/commands/copy.h
+++ b/src/include/commands/copy.h
@@ -107,6 +107,7 @@ extern bool NextCopyFrom(CopyFromState cstate, ExprContext *econtext,
 extern bool NextCopyFromRawFields(CopyFromState cstate,
 								  char ***fields, int *nfields);
 extern void CopyFromErrorCallback(void *arg);
+extern char *limit_printout_length(const char *str);
 
 extern uint64 CopyFrom(CopyFromState cstate);
 
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index bb37a2ac70..d80d45dce4 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -737,8 +737,31 @@ CREATE TABLE check_ign_err (n int, m int[], k int);
 COPY check_ign_err FROM STDIN WITH (on_error stop);
 ERROR:  invalid input syntax for type integer: "a"
 CONTEXT:  COPY check_ign_err, line 2, column n: "a"
-COPY check_ign_err FROM STDIN WITH (on_error ignore);
+-- want context for notices
+\set SHOW_CONTEXT always
+COPY check_ign_err FROM STDIN WITH (on_error ignore, log_verbosity verbose);
+NOTICE:  data type incompatibility at line 2 for column n: "a"
+CONTEXT:  COPY check_ign_err
+NOTICE:  data type incompatibility at line 3 for column k: "3333333333"
+CONTEXT:  COPY check_ign_err
+NOTICE:  data type incompatibility at line 4 for column m: "{a, 4}"
+CONTEXT:  COPY check_ign_err
+NOTICE:  data type incompatibility at line 5 for column n: ""
+CONTEXT:  COPY check_ign_err
+NOTICE:  data type incompatibility at line 7 for column m: "a"
+CONTEXT:  COPY check_ign_err
+NOTICE:  data type incompatibility at line 8 for column k: "a"
+CONTEXT:  COPY check_ign_err
 NOTICE:  6 rows were skipped due to data type incompatibility
+-- tests for on_error option with log_verbosity and null constraint via domain
+CREATE DOMAIN dcheck_ign_err2 varchar(15) NOT NULL;
+CREATE TABLE check_ign_err2 (n int, m int[], k int, l dcheck_ign_err2);
+COPY check_ign_err2 FROM STDIN WITH (on_error ignore, log_verbosity verbose);
+NOTICE:  data type incompatibility at line 2 for column l: null input
+CONTEXT:  COPY check_ign_err2
+NOTICE:  1 row was skipped due to data type incompatibility
+-- reset context choice
+\set SHOW_CONTEXT errors
 SELECT * FROM check_ign_err;
  n |  m  | k 
 ---+-----+---
@@ -747,6 +770,12 @@ SELECT * FROM check_ign_err;
  8 | {8} | 8
 (3 rows)
 
+SELECT * FROM check_ign_err2;
+ n |  m  | k |   l   
+---+-----+---+-------
+ 1 | {1} | 1 | 'foo'
+(1 row)
+
 -- test datatype error that can't be handled as soft: should fail
 CREATE TABLE hard_err(foo widget);
 COPY hard_err FROM STDIN WITH (on_error ignore);
@@ -775,6 +804,8 @@ DROP VIEW instead_of_insert_tbl_view;
 DROP VIEW instead_of_insert_tbl_view_2;
 DROP FUNCTION fun_instead_of_insert_tbl();
 DROP TABLE check_ign_err;
+DROP TABLE check_ign_err2;
+DROP DOMAIN dcheck_ign_err2;
 DROP TABLE hard_err;
 --
 -- COPY FROM ... DEFAULT
diff --git a/src/test/regress/sql/copy2.sql b/src/test/regress/sql/copy2.sql
index 4cd3ae577d..624d531fd6 100644
--- a/src/test/regress/sql/copy2.sql
+++ b/src/test/regress/sql/copy2.sql
@@ -510,7 +510,11 @@ a	{2}	2
 
 5	{5}	5
 \.
-COPY check_ign_err FROM STDIN WITH (on_error ignore);
+
+-- want context for notices
+\set SHOW_CONTEXT always
+
+COPY check_ign_err FROM STDIN WITH (on_error ignore, log_verbosity verbose);
 1	{1}	1
 a	{2}	2
 3	{3}	3333333333
@@ -521,8 +525,22 @@ a	{2}	2
 7	{7}	a
 8	{8}	8
 \.
+
+-- tests for on_error option with log_verbosity and null constraint via domain
+CREATE DOMAIN dcheck_ign_err2 varchar(15) NOT NULL;
+CREATE TABLE check_ign_err2 (n int, m int[], k int, l dcheck_ign_err2);
+COPY check_ign_err2 FROM STDIN WITH (on_error ignore, log_verbosity verbose);
+1	{1}	1	'foo'
+2	{2}	2	\N
+\.
+
+-- reset context choice
+\set SHOW_CONTEXT errors
+
 SELECT * FROM check_ign_err;
 
+SELECT * FROM check_ign_err2;
+
 -- test datatype error that can't be handled as soft: should fail
 CREATE TABLE hard_err(foo widget);
 COPY hard_err FROM STDIN WITH (on_error ignore);
@@ -554,6 +572,8 @@ DROP VIEW instead_of_insert_tbl_view;
 DROP VIEW instead_of_insert_tbl_view_2;
 DROP FUNCTION fun_instead_of_insert_tbl();
 DROP TABLE check_ign_err;
+DROP TABLE check_ign_err2;
+DROP DOMAIN dcheck_ign_err2;
 DROP TABLE hard_err;
 
 --
-- 
2.34.1

#41Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Bharath Rupireddy (#40)
Re: Add new error_action COPY ON_ERROR "log"

On Mon, Mar 25, 2024 at 8:21 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

On Mon, Mar 25, 2024 at 10:42 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

The current approach, eliminating the duplicated information in
CONTEXT, seems good to me.

Thanks for looking into it.

One question about the latest (v8) patch:

+                   else
+                       ereport(NOTICE,
+                               errmsg("data type incompatibility at line %llu for column %s: null input",
+                                      (unsigned long long) cstate->cur_lineno,
+                                      cstate->cur_attname));
+

How can we reach this path? It seems we don't cover this path by the tests.

Tests don't cover that part, but it can be hit with something like
[1]. I've added a test for this.

Note the use of a domain as an indirect way of providing the null
constraint check. Otherwise, COPY FROM fails early in
CopyFrom->ExecConstraints if the NOT NULL constraint is specified
directly on the table column [2].
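
As a rough illustration of the two branches discussed here, the following standalone sketch mimics how the NOTICE text is chosen depending on whether the attribute value is NULL. The function name format_skip_notice is hypothetical, and it uses snprintf() instead of ereport(), so it is only a model of the patch's logic, not the server code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for the NOTICE formatting added to NextCopyFrom().
 * Writes the message into buf and returns the snprintf() result.
 * attval == NULL models the "null input" branch, which the thread says
 * is reached when a NOT NULL domain rejects a \N field.
 */
static int
format_skip_notice(char *buf, size_t buflen,
                   unsigned long long lineno,
                   const char *attname, const char *attval)
{
    assert(buf != NULL && attname != NULL);

    if (attval != NULL)
        return snprintf(buf, buflen,
                        "data type incompatibility at line %llu for column %s: \"%s\"",
                        lineno, attname, attval);

    /* No raw value to show, so say so explicitly. */
    return snprintf(buf, buflen,
                    "data type incompatibility at line %llu for column %s: null input",
                    lineno, attname);
}
```

Calling it with a NULL attval for line 2, column l yields the same text as the expected regression output for check_ign_err2.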

Please see the attached v9 patch set.

Thank you for updating the patch! The patch mostly looks good to me.
Here are some minor comments:

---
/* non-export function prototypes */
-static char *limit_printout_length(const char *str);
-
static void ClosePipeFromProgram(CopyFromState cstate);

Now that we have only one function we should replace "prototypes" with
"prototype".

---
+                       ereport(NOTICE,
+                               errmsg("data type incompatibility at line %llu for column %s: \"%s\"",
+                                      (unsigned long long) cstate->cur_lineno,
+                                      cstate->cur_attname,
+                                      attval));

I guess it would be better to make the log message clearer to convey
what we did for the malformed row. For example, how about something
like "skipping row due to data type incompatibility at line %llu for
column %s: \"%s\""?

---
 extern void CopyFromErrorCallback(void *arg);
+extern char *limit_printout_length(const char *str);

I don't disagree with exposing the limit_printout_length() function
but I think it's better to rename it for consistency with other
exposed COPY command functions. Only this function is snake-case. How
about CopyLimitPrintoutLength() or similar?
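
For readers who have not looked at the helper being exposed, here is a minimal standalone sketch of what a printout limiter of this kind does. It assumes the 100-byte MAX_COPY_DATA_DISPLAY limit visible in the patch context, uses malloc() rather than the server's memory contexts, and clips on raw bytes, so it ignores multibyte encodings that server code would need to handle. The name limit_printout_sketch is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_COPY_DATA_DISPLAY 100

/*
 * Simplified, byte-based sketch of a printout limiter.
 * Returns a newly allocated copy of str, truncated to
 * MAX_COPY_DATA_DISPLAY bytes with "..." appended when truncation
 * happened. The caller releases the result with free().
 */
static char *
limit_printout_sketch(const char *str)
{
    size_t      slen;
    size_t      len;
    char       *res;

    assert(str != NULL);
    slen = strlen(str);

    /* Keep at most MAX_COPY_DATA_DISPLAY bytes of the input. */
    len = (slen <= MAX_COPY_DATA_DISPLAY) ? slen : MAX_COPY_DATA_DISPLAY;

    /* Room for the kept bytes, an optional "...", and the terminator. */
    res = malloc(len + 4);
    if (res == NULL)
        return NULL;

    memcpy(res, str, len);
    if (slen > MAX_COPY_DATA_DISPLAY)
        strcpy(res + len, "...");   /* mark that the input was truncated */
    else
        res[len] = '\0';

    return res;
}
```

Because the result is always a fresh copy, the caller can free it unconditionally, which matches the pfree(attval) pattern in the patch.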

FWIW I'm going to merge the two patches before the push.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#42Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Masahiko Sawada (#41)
1 attachment(s)
Re: Add new error_action COPY ON_ERROR "log"

On Tue, Mar 26, 2024 at 7:16 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Please see the attached v9 patch set.

Thank you for updating the patch! The patch mostly looks good to me.
Here are some minor comments:

Thanks for looking into this.

---
/* non-export function prototypes */
-static char *limit_printout_length(const char *str);
-
static void ClosePipeFromProgram(CopyFromState cstate);

Now that we have only one function we should replace "prototypes" with
"prototype".

Well, no. We might add a few more later (you never know). A quick look
at the variables declared under /* GUCs */ comments tells me that the
plural form is used there even when just one GUC is defined
(xlogprefetcher.c, for instance).

---
+                       ereport(NOTICE,
+                               errmsg("data type incompatibility at line %llu for column %s: \"%s\"",
+                                      (unsigned long long) cstate->cur_lineno,
+                                      cstate->cur_attname,
+                                      attval));

I guess it would be better to make the log message clearer to convey
what we did for the malformed row. For example, how about something
like "skipping row due to data type incompatibility at line %llu for
column %s: \"%s\""?

The summary message which gets printed at the end says that "NOTICE:
6 rows were skipped due to data type incompatibility". Isn't this
enough? If someone is using ON_ERROR 'ignore', it's quite natural that
such rows get skipped softly and the summary message can help them,
no?

---
extern void CopyFromErrorCallback(void *arg);
+extern char *limit_printout_length(const char *str);

I don't disagree with exposing the limit_printout_length() function
but I think it's better to rename it for consistency with other
exposed COPY command functions. Only this function is snake-case. How
about CopyLimitPrintoutLength() or similar?

WFM. Although its implementation is not related to COPY code, COPY is
the sole user of it right now, so I'm fine with it. Done that.

FWIW I'm going to merge the two patches before the push.

Done that.

Please see the attached v10 patch.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

v10-0001-Add-detailed-info-when-COPY-skips-soft-errors.patch (application/octet-stream)
From 4562a1032cb6f26ed44c08a54411b7d9d769223f Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Tue, 26 Mar 2024 03:19:57 +0000
Subject: [PATCH v10] Add detailed info when COPY skips soft errors

This commit emits individual info like line number and column name
when COPY skips soft errors, because the summary containing the
total rows skipped isn't enough for users to know exactly which
rows in the input data are malformed.

This commit also adds a new option LOG_VERBOSITY to control the
verbosity of logged messages when COPY command skips soft errors.
This option, if required, can also be extended to control other
COPY-related log messages. A value of 'verbose' can be used to emit
more informative messages by the command, while the value of
'default' (which is the default) can be used to not log any
additional messages. More values such as 'terse', 'row_details',
etc. can be added to the LOG_VERBOSITY option based on need.
To see the individual info added by this commit when COPY skips
soft errors, one needs to set LOG_VERBOSITY to 'verbose'.

Author: Bharath Rupireddy
Reviewed-by: Michael Paquier, Masahiko Sawada
Reviewed-by: Atsushi Torikoshi
Discussion: https://www.postgresql.org/message-id/CALj2ACUk700cYhx1ATRQyRw-fBM%2BaRo6auRAitKGff7XNmYfqQ%40mail.gmail.com
---
 doc/src/sgml/ref/copy.sgml           | 26 ++++++++++++++++--
 src/backend/commands/copy.c          | 38 ++++++++++++++++++++++++++
 src/backend/commands/copyfrom.c      | 10 +++----
 src/backend/commands/copyfromparse.c | 35 ++++++++++++++++++++++++
 src/bin/psql/tab-complete.c          |  6 +++-
 src/include/commands/copy.h          | 11 ++++++++
 src/test/regress/expected/copy2.out  | 41 +++++++++++++++++++++++++++-
 src/test/regress/sql/copy2.sql       | 24 +++++++++++++++-
 src/tools/pgindent/typedefs.list     |  1 +
 9 files changed, 181 insertions(+), 11 deletions(-)

diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 6c83e30ed0..ecbbf5f94a 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -45,6 +45,7 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
     FORCE_NULL { ( <replaceable class="parameter">column_name</replaceable> [, ...] ) | * }
     ON_ERROR '<replaceable class="parameter">error_action</replaceable>'
     ENCODING '<replaceable class="parameter">encoding_name</replaceable>'
+    LOG_VERBOSITY [ <replaceable class="parameter">mode</replaceable> ]
 </synopsis>
  </refsynopsisdiv>
 
@@ -400,8 +401,12 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
       when the <literal>FORMAT</literal> is <literal>text</literal> or <literal>csv</literal>.
      </para>
      <para>
-      A <literal>NOTICE</literal> message containing the ignored row count is emitted at the end
-      of the <command>COPY FROM</command> if at least one row was discarded.
+      A <literal>NOTICE</literal> message containing the ignored row count is
+      emitted at the end of the <command>COPY FROM</command> if at least one
+      row was discarded. When <literal>LOG_VERBOSITY</literal> option is set to
+      <literal>verbose</literal>, a <literal>NOTICE</literal> message
+      containing the line of the input file and the column name whose input
+      conversion has failed is emitted for each discarded row.
      </para>
     </listitem>
    </varlistentry>
@@ -418,6 +423,23 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>LOG_VERBOSITY</literal></term>
+    <listitem>
+     <para>
+      Sets the verbosity of some of the messages logged by a
+      <command>COPY</command> command.
+      A <replaceable class="parameter">mode</replaceable> value of
+      <literal>verbose</literal> can be used to emit more informative messages.
+      <literal>default</literal> will not log any additional messages.
+     </para>
+     <para>
+      This is currently used in <command>COPY FROM</command> command when
+      <literal>ON_ERROR</literal> is set to <literal>ignore</literal>.
+      </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><literal>WHERE</literal></term>
     <listitem>
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 28cf8b040a..67d5c3f7d0 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -422,6 +422,36 @@ defGetCopyOnErrorChoice(DefElem *def, ParseState *pstate, bool is_from)
 	return COPY_ON_ERROR_STOP;	/* keep compiler quiet */
 }
 
+/*
+ * Extract a CopyLogVerbosityChoice value from a DefElem.
+ */
+static CopyLogVerbosityChoice
+defGetCopyLogVerbosityChoice(DefElem *def, ParseState *pstate)
+{
+	char	   *sval;
+
+	/*
+	 * If no parameter value given, assume the default value.
+	 */
+	if (def->arg == NULL)
+		return COPY_LOG_VERBOSITY_DEFAULT;
+
+	/*
+	 * Allow "default", or "verbose" values.
+	 */
+	sval = defGetString(def);
+	if (pg_strcasecmp(sval, "default") == 0)
+		return COPY_LOG_VERBOSITY_DEFAULT;
+	if (pg_strcasecmp(sval, "verbose") == 0)
+		return COPY_LOG_VERBOSITY_VERBOSE;
+
+	ereport(ERROR,
+			(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+			 errmsg("COPY LOG_VERBOSITY \"%s\" not recognized", sval),
+			 parser_errposition(pstate, def->location)));
+	return COPY_LOG_VERBOSITY_DEFAULT;	/* keep compiler quiet */
+}
+
 /*
  * Process the statement option list for COPY.
  *
@@ -448,6 +478,7 @@ ProcessCopyOptions(ParseState *pstate,
 	bool		freeze_specified = false;
 	bool		header_specified = false;
 	bool		on_error_specified = false;
+	bool		log_verbosity_specified = false;
 	ListCell   *option;
 
 	/* Support external use for option sanity checking */
@@ -607,6 +638,13 @@ ProcessCopyOptions(ParseState *pstate,
 			on_error_specified = true;
 			opts_out->on_error = defGetCopyOnErrorChoice(defel, pstate, is_from);
 		}
+		else if (strcmp(defel->defname, "log_verbosity") == 0)
+		{
+			if (log_verbosity_specified)
+				errorConflictingDefElem(defel, pstate);
+			log_verbosity_specified = true;
+			opts_out->log_verbosity = defGetCopyLogVerbosityChoice(defel, pstate);
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 8908a440e1..06bc14636d 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -101,8 +101,6 @@ typedef struct CopyMultiInsertInfo
 
 
 /* non-export function prototypes */
-static char *limit_printout_length(const char *str);
-
 static void ClosePipeFromProgram(CopyFromState cstate);
 
 /*
@@ -141,7 +139,7 @@ CopyFromErrorCallback(void *arg)
 			/* error is relevant to a particular column */
 			char	   *attval;
 
-			attval = limit_printout_length(cstate->cur_attval);
+			attval = CopyLimitPrintoutLength(cstate->cur_attval);
 			errcontext("COPY %s, line %llu, column %s: \"%s\"",
 					   cstate->cur_relname,
 					   (unsigned long long) cstate->cur_lineno,
@@ -168,7 +166,7 @@ CopyFromErrorCallback(void *arg)
 			{
 				char	   *lineval;
 
-				lineval = limit_printout_length(cstate->line_buf.data);
+				lineval = CopyLimitPrintoutLength(cstate->line_buf.data);
 				errcontext("COPY %s, line %llu: \"%s\"",
 						   cstate->cur_relname,
 						   (unsigned long long) cstate->cur_lineno, lineval);
@@ -189,8 +187,8 @@ CopyFromErrorCallback(void *arg)
  *
  * Returns a pstrdup'd copy of the input.
  */
-static char *
-limit_printout_length(const char *str)
+char *
+CopyLimitPrintoutLength(const char *str)
 {
 #define MAX_COPY_DATA_DISPLAY 100
 
diff --git a/src/backend/commands/copyfromparse.c b/src/backend/commands/copyfromparse.c
index 5682d5d054..bf14a63306 100644
--- a/src/backend/commands/copyfromparse.c
+++ b/src/backend/commands/copyfromparse.c
@@ -967,7 +967,42 @@ NextCopyFrom(CopyFromState cstate, ExprContext *econtext,
 											(Node *) cstate->escontext,
 											&values[m]))
 			{
+				Assert(cstate->opts.on_error != COPY_ON_ERROR_STOP);
+
 				cstate->num_errors++;
+
+				if (cstate->opts.log_verbosity == COPY_LOG_VERBOSITY_VERBOSE)
+				{
+					/*
+					 * Since we emit line number and column info in the below
+					 * notice message, we suppress error context information
+					 * other than the relation name.
+					 */
+					Assert(!cstate->relname_only);
+					cstate->relname_only = true;
+
+					if (cstate->cur_attval)
+					{
+						char	   *attval;
+
+						attval = CopyLimitPrintoutLength(cstate->cur_attval);
+						ereport(NOTICE,
+								errmsg("data type incompatibility at line %llu for column %s: \"%s\"",
+									   (unsigned long long) cstate->cur_lineno,
+									   cstate->cur_attname,
+									   attval));
+						pfree(attval);
+					}
+					else
+						ereport(NOTICE,
+								errmsg("data type incompatibility at line %llu for column %s: null input",
+									   (unsigned long long) cstate->cur_lineno,
+									   cstate->cur_attname));
+
+					/* reset relname_only */
+					cstate->relname_only = false;
+				}
+
 				return true;
 			}
 
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 56d723de8a..bfbb3899ad 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -2903,7 +2903,7 @@ psql_completion(const char *text, int start, int end)
 		COMPLETE_WITH("FORMAT", "FREEZE", "DELIMITER", "NULL",
 					  "HEADER", "QUOTE", "ESCAPE", "FORCE_QUOTE",
 					  "FORCE_NOT_NULL", "FORCE_NULL", "ENCODING", "DEFAULT",
-					  "ON_ERROR");
+					  "ON_ERROR", "LOG_VERBOSITY");
 
 	/* Complete COPY <sth> FROM|TO filename WITH (FORMAT */
 	else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "FORMAT"))
@@ -2913,6 +2913,10 @@ psql_completion(const char *text, int start, int end)
 	else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "ON_ERROR"))
 		COMPLETE_WITH("stop", "ignore");
 
+	/* Complete COPY <sth> FROM filename WITH (LOG_VERBOSITY */
+	else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "LOG_VERBOSITY"))
+		COMPLETE_WITH("default", "verbose");
+
 	/* Complete COPY <sth> FROM <sth> WITH (<options>) */
 	else if (Matches("COPY|\\copy", MatchAny, "FROM", MatchAny, "WITH", MatchAny))
 		COMPLETE_WITH("WHERE");
diff --git a/src/include/commands/copy.h b/src/include/commands/copy.h
index b3da3cb0be..141fd48dc1 100644
--- a/src/include/commands/copy.h
+++ b/src/include/commands/copy.h
@@ -40,6 +40,15 @@ typedef enum CopyOnErrorChoice
 	COPY_ON_ERROR_IGNORE,		/* ignore errors */
 } CopyOnErrorChoice;
 
+/*
+ * Represents verbosity of logged messages by COPY command.
+ */
+typedef enum CopyLogVerbosityChoice
+{
+	COPY_LOG_VERBOSITY_DEFAULT = 0, /* logs no additional messages, default */
+	COPY_LOG_VERBOSITY_VERBOSE, /* logs additional messages */
+} CopyLogVerbosityChoice;
+
 /*
  * A struct to hold COPY options, in a parsed form. All of these are related
  * to formatting, except for 'freeze', which doesn't really belong here, but
@@ -73,6 +82,7 @@ typedef struct CopyFormatOptions
 	bool	   *force_null_flags;	/* per-column CSV FN flags */
 	bool		convert_selectively;	/* do selective binary conversion? */
 	CopyOnErrorChoice on_error; /* what to do when error happened */
+	CopyLogVerbosityChoice log_verbosity;	/* verbosity of logged messages */
 	List	   *convert_select; /* list of column names (can be NIL) */
 } CopyFormatOptions;
 
@@ -97,6 +107,7 @@ extern bool NextCopyFrom(CopyFromState cstate, ExprContext *econtext,
 extern bool NextCopyFromRawFields(CopyFromState cstate,
 								  char ***fields, int *nfields);
 extern void CopyFromErrorCallback(void *arg);
+extern char *CopyLimitPrintoutLength(const char *str);
 
 extern uint64 CopyFrom(CopyFromState cstate);
 
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index f98c2d1c4e..d80d45dce4 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -81,6 +81,10 @@ COPY x from stdin (on_error ignore, on_error ignore);
 ERROR:  conflicting or redundant options
 LINE 1: COPY x from stdin (on_error ignore, on_error ignore);
                                             ^
+COPY x from stdin (log_verbosity 'default', log_verbosity 'verbose');
+ERROR:  conflicting or redundant options
+LINE 1: COPY x from stdin (log_verbosity 'default', log_verbosity 'v...
+                                                    ^
 -- incorrect options
 COPY x to stdin (format BINARY, delimiter ',');
 ERROR:  cannot specify DELIMITER in BINARY mode
@@ -108,6 +112,10 @@ COPY x to stdin (format BINARY, on_error unsupported);
 ERROR:  COPY ON_ERROR cannot be used with COPY TO
 LINE 1: COPY x to stdin (format BINARY, on_error unsupported);
                                         ^
+COPY x to stdout (log_verbosity 'unsupported');
+ERROR:  COPY LOG_VERBOSITY "unsupported" not recognized
+LINE 1: COPY x to stdout (log_verbosity 'unsupported');
+                          ^
 -- too many columns in column list: should fail
 COPY x (a, b, c, d, e, d, c) from stdin;
 ERROR:  column "d" specified more than once
@@ -729,8 +737,31 @@ CREATE TABLE check_ign_err (n int, m int[], k int);
 COPY check_ign_err FROM STDIN WITH (on_error stop);
 ERROR:  invalid input syntax for type integer: "a"
 CONTEXT:  COPY check_ign_err, line 2, column n: "a"
-COPY check_ign_err FROM STDIN WITH (on_error ignore);
+-- want context for notices
+\set SHOW_CONTEXT always
+COPY check_ign_err FROM STDIN WITH (on_error ignore, log_verbosity verbose);
+NOTICE:  data type incompatibility at line 2 for column n: "a"
+CONTEXT:  COPY check_ign_err
+NOTICE:  data type incompatibility at line 3 for column k: "3333333333"
+CONTEXT:  COPY check_ign_err
+NOTICE:  data type incompatibility at line 4 for column m: "{a, 4}"
+CONTEXT:  COPY check_ign_err
+NOTICE:  data type incompatibility at line 5 for column n: ""
+CONTEXT:  COPY check_ign_err
+NOTICE:  data type incompatibility at line 7 for column m: "a"
+CONTEXT:  COPY check_ign_err
+NOTICE:  data type incompatibility at line 8 for column k: "a"
+CONTEXT:  COPY check_ign_err
 NOTICE:  6 rows were skipped due to data type incompatibility
+-- tests for on_error option with log_verbosity and null constraint via domain
+CREATE DOMAIN dcheck_ign_err2 varchar(15) NOT NULL;
+CREATE TABLE check_ign_err2 (n int, m int[], k int, l dcheck_ign_err2);
+COPY check_ign_err2 FROM STDIN WITH (on_error ignore, log_verbosity verbose);
+NOTICE:  data type incompatibility at line 2 for column l: null input
+CONTEXT:  COPY check_ign_err2
+NOTICE:  1 row was skipped due to data type incompatibility
+-- reset context choice
+\set SHOW_CONTEXT errors
 SELECT * FROM check_ign_err;
  n |  m  | k 
 ---+-----+---
@@ -739,6 +770,12 @@ SELECT * FROM check_ign_err;
  8 | {8} | 8
 (3 rows)
 
+SELECT * FROM check_ign_err2;
+ n |  m  | k |   l   
+---+-----+---+-------
+ 1 | {1} | 1 | 'foo'
+(1 row)
+
 -- test datatype error that can't be handled as soft: should fail
 CREATE TABLE hard_err(foo widget);
 COPY hard_err FROM STDIN WITH (on_error ignore);
@@ -767,6 +804,8 @@ DROP VIEW instead_of_insert_tbl_view;
 DROP VIEW instead_of_insert_tbl_view_2;
 DROP FUNCTION fun_instead_of_insert_tbl();
 DROP TABLE check_ign_err;
+DROP TABLE check_ign_err2;
+DROP DOMAIN dcheck_ign_err2;
 DROP TABLE hard_err;
 --
 -- COPY FROM ... DEFAULT
diff --git a/src/test/regress/sql/copy2.sql b/src/test/regress/sql/copy2.sql
index afaaa37e52..624d531fd6 100644
--- a/src/test/regress/sql/copy2.sql
+++ b/src/test/regress/sql/copy2.sql
@@ -67,6 +67,7 @@ COPY x from stdin (force_null (a), force_null (b));
 COPY x from stdin (convert_selectively (a), convert_selectively (b));
 COPY x from stdin (encoding 'sql_ascii', encoding 'sql_ascii');
 COPY x from stdin (on_error ignore, on_error ignore);
+COPY x from stdin (log_verbosity 'default', log_verbosity 'verbose');
 
 -- incorrect options
 COPY x to stdin (format BINARY, delimiter ',');
@@ -80,6 +81,7 @@ COPY x to stdin (format CSV, force_not_null(a));
 COPY x to stdout (format TEXT, force_null(a));
 COPY x to stdin (format CSV, force_null(a));
 COPY x to stdin (format BINARY, on_error unsupported);
+COPY x to stdout (log_verbosity 'unsupported');
 
 -- too many columns in column list: should fail
 COPY x (a, b, c, d, e, d, c) from stdin;
@@ -508,7 +510,11 @@ a	{2}	2
 
 5	{5}	5
 \.
-COPY check_ign_err FROM STDIN WITH (on_error ignore);
+
+-- want context for notices
+\set SHOW_CONTEXT always
+
+COPY check_ign_err FROM STDIN WITH (on_error ignore, log_verbosity verbose);
 1	{1}	1
 a	{2}	2
 3	{3}	3333333333
@@ -519,8 +525,22 @@ a	{2}	2
 7	{7}	a
 8	{8}	8
 \.
+
+-- tests for on_error option with log_verbosity and null constraint via domain
+CREATE DOMAIN dcheck_ign_err2 varchar(15) NOT NULL;
+CREATE TABLE check_ign_err2 (n int, m int[], k int, l dcheck_ign_err2);
+COPY check_ign_err2 FROM STDIN WITH (on_error ignore, log_verbosity verbose);
+1	{1}	1	'foo'
+2	{2}	2	\N
+\.
+
+-- reset context choice
+\set SHOW_CONTEXT errors
+
 SELECT * FROM check_ign_err;
 
+SELECT * FROM check_ign_err2;
+
 -- test datatype error that can't be handled as soft: should fail
 CREATE TABLE hard_err(foo widget);
 COPY hard_err FROM STDIN WITH (on_error ignore);
@@ -552,6 +572,8 @@ DROP VIEW instead_of_insert_tbl_view;
 DROP VIEW instead_of_insert_tbl_view_2;
 DROP FUNCTION fun_instead_of_insert_tbl();
 DROP TABLE check_ign_err;
+DROP TABLE check_ign_err2;
+DROP DOMAIN dcheck_ign_err2;
 DROP TABLE hard_err;
 
 --
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 4679660837..20724ebf33 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -480,6 +480,7 @@ CopyFromState
 CopyFromStateData
 CopyHeaderChoice
 CopyInsertMethod
+CopyLogVerbosityChoice
 CopyMultiInsertBuffer
 CopyMultiInsertInfo
 CopyOnErrorChoice
-- 
2.34.1

#43Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Bharath Rupireddy (#42)
Re: Add new error_action COPY ON_ERROR "log"

On Tue, Mar 26, 2024 at 12:23 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

On Tue, Mar 26, 2024 at 7:16 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Please see the attached v9 patch set.

Thank you for updating the patch! The patch mostly looks good to me.
Here are some minor comments:

Thanks for looking into this.

---
/* non-export function prototypes */
-static char *limit_printout_length(const char *str);
-
static void ClosePipeFromProgram(CopyFromState cstate);

Now that we have only one function we should replace "prototypes" with
"prototype".

Well no. We might add a few more (never know). A quick look around the
GUCs under /* GUCs */ tells me that the plural form is used there
even when just one GUC is defined (xlogprefetcher.c for instance).

Understood.

---
+    ereport(NOTICE,
+            errmsg("data type incompatibility at line %llu for column %s: \"%s\"",
+                   (unsigned long long) cstate->cur_lineno,
+                   cstate->cur_attname,
+                   attval));

I guess it would be better to make the log message clearer to convey
what we did for the malformed row. For example, how about something
like "skipping row due to data type incompatibility at line %llu for
column %s: \"%s\""?

The summary message which gets printed at the end says that "NOTICE:
6 rows were skipped due to data type incompatibility". Isn't this
enough? If someone is using ON_ERROR 'ignore', it's quite natural that
such rows get skipped softly and the summary message can help them,
no?

I think that in the main log message we should mention what happened
(or is happening) or what we did (or are doing). If the message "data
type incompatibility ..." was in the DETAIL message with the main
message saying something like "skipping row at line %llu for column
%s: ...", it would make sense to me. But the current message does not
seem clear to me, nor consistent with other NOTICE messages. Also, the
last summary line would not be written if the user cancelled, and
someone other than the person who used ON_ERROR 'ignore' might check
the server logs later.
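
To make the two layouts concrete, here is a minimal standalone sketch in
plain C. It uses snprintf rather than the backend's ereport machinery, and
the helper names, the separate detail buffer, and the exact DETAIL wording
are all hypothetical, chosen only to illustrate the idea; the "..." parts of
the suggestion above are filled in as an assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: the wording currently in the patch. It names the
 * problem but does not say what COPY did with the row.
 */
static int
format_current_notice(char *buf, size_t len, unsigned long long lineno,
                      const char *attname, const char *attval)
{
    assert(buf != NULL && attname != NULL && attval != NULL);
    return snprintf(buf, len,
                    "data type incompatibility at line %llu for column %s: \"%s\"",
                    lineno, attname, attval);
}

/*
 * Hypothetical helper: one reading of the suggested layout. The primary
 * message states the action taken, and the reason moves to a detail line.
 * The detail text used here is an assumption, not taken from any patch.
 */
static void
format_split_notice(char *msg, size_t msglen,
                    char *detail, size_t detaillen,
                    unsigned long long lineno,
                    const char *attname, const char *attval)
{
    assert(msg != NULL && detail != NULL);
    assert(attname != NULL && attval != NULL);
    snprintf(msg, msglen,
             "skipping row at line %llu for column %s: \"%s\"",
             lineno, attname, attval);
    snprintf(detail, detaillen, "data type incompatibility");
}
```

With the second layout, a reader who sees only the primary line in the
server log can still tell that the row was skipped, even if the final
summary line was never written.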

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#44Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Masahiko Sawada (#43)
1 attachment(s)
Re: Add new error_action COPY ON_ERROR "log"

On Tue, Mar 26, 2024 at 9:56 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

errmsg("data type incompatibility at line %llu for column %s: \"%s\"",

I guess it would be better to make the log message clearer to convey
what we did for the malformed row. For example, how about something
like "skipping row due to data type incompatibility at line %llu for
column %s: \"%s\""?

The summary message which gets printed at the end says that "NOTICE:
6 rows were skipped due to data type incompatibility". Isn't this
enough? If someone is using ON_ERROR 'ignore', it's quite natural that
such rows get skipped softly and the summary message can help them,
no?

I think that in the main log message we should mention what happened
(or is happening) or what we did (or are doing). If the message "data
type incompatibility ..." was in the DETAIL message with the main
message saying something like "skipping row at line %llu for column
%s: ...", it would make sense to me. But the current message does not
seem clear to me, nor consistent with other NOTICE messages. Also, the
last summary line would not be written if the user cancelled, and
someone other than the person who used ON_ERROR 'ignore' might check
the server logs later.

Agree. I changed the NOTICE message to what you've suggested. Thanks.
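
For reference, the resulting message shapes can be sketched outside the
backend as below. This is only an illustration under stated assumptions:
limit_printout is a simplified, byte-based stand-in for
CopyLimitPrintoutLength (it ignores multibyte boundaries and uses malloc
instead of palloc), and format_skip_notice is a hypothetical name that
replaces the ereport(NOTICE, ...) calls with snprintf.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_DISPLAY 100         /* mirrors MAX_COPY_DATA_DISPLAY */

/*
 * Simplified stand-in for CopyLimitPrintoutLength(): returns a malloc'd
 * copy of str, cut to MAX_DISPLAY bytes with "..." appended if it was
 * longer. Unlike the real function, this does not respect multibyte
 * character boundaries.
 */
static char *
limit_printout(const char *str)
{
    size_t      slen = strlen(str);
    size_t      keep = (slen > MAX_DISPLAY) ? MAX_DISPLAY : slen;
    char       *res = malloc(keep + 4);

    if (res == NULL)
        abort();
    memcpy(res, str, keep);
    strcpy(res + keep, (slen > MAX_DISPLAY) ? "..." : "");
    return res;
}

/*
 * Hypothetical helper: builds the per-row notice text. A NULL attval
 * takes the "null input" branch, as in the patch.
 */
static int
format_skip_notice(char *buf, size_t len, unsigned long long lineno,
                   const char *attname, const char *attval)
{
    int         n;

    assert(buf != NULL && attname != NULL);
    if (attval != NULL)
    {
        char       *shown = limit_printout(attval);

        n = snprintf(buf, len,
                     "skipping row due to data type incompatibility at line %llu for column %s: \"%s\"",
                     lineno, attname, shown);
        free(shown);
    }
    else
        n = snprintf(buf, len,
                     "skipping row due to data type incompatibility at line %llu for column %s: null input",
                     lineno, attname);
    return n;
}
```

Calling format_skip_notice with line 2, column "l" and a NULL value yields
the same text as the domain NOT NULL case in the regression test.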

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

v11-0001-Add-detailed-info-when-COPY-skips-soft-errors.patch (application/x-patch)
From da8c02dace865ea9b02f19968056f25069d8aa91 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Tue, 26 Mar 2024 05:51:55 +0000
Subject: [PATCH v11] Add detailed info when COPY skips soft errors

This commit emits per-row details, such as the line number and column
name, when COPY skips soft errors, because the summary containing the
total number of skipped rows isn't enough for users to identify
exactly which rows in the input data are malformed.

This commit also adds a new option LOG_VERBOSITY to control the
verbosity of messages logged when the COPY command skips soft errors.
If required, this option can also be extended to control other
COPY-related log messages. A value of 'verbose' emits more
informative messages, while 'default' (which is the default) logs no
additional messages. More values such as 'terse', 'row_details',
etc. can be added to the LOG_VERBOSITY option as needed. To see the
per-row details added by this commit when COPY skips soft errors, set
LOG_VERBOSITY to 'verbose'.

Author: Bharath Rupireddy
Reviewed-by: Michael Paquier, Masahiko Sawada
Reviewed-by: Atsushi Torikoshi
Discussion: https://www.postgresql.org/message-id/CALj2ACUk700cYhx1ATRQyRw-fBM%2BaRo6auRAitKGff7XNmYfqQ%40mail.gmail.com
---
 doc/src/sgml/ref/copy.sgml           | 26 ++++++++++++++++--
 src/backend/commands/copy.c          | 38 ++++++++++++++++++++++++++
 src/backend/commands/copyfrom.c      | 10 +++----
 src/backend/commands/copyfromparse.c | 35 ++++++++++++++++++++++++
 src/bin/psql/tab-complete.c          |  6 +++-
 src/include/commands/copy.h          | 11 ++++++++
 src/test/regress/expected/copy2.out  | 41 +++++++++++++++++++++++++++-
 src/test/regress/sql/copy2.sql       | 24 +++++++++++++++-
 src/tools/pgindent/typedefs.list     |  1 +
 9 files changed, 181 insertions(+), 11 deletions(-)

diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 6c83e30ed0..ecbbf5f94a 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -45,6 +45,7 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
     FORCE_NULL { ( <replaceable class="parameter">column_name</replaceable> [, ...] ) | * }
     ON_ERROR '<replaceable class="parameter">error_action</replaceable>'
     ENCODING '<replaceable class="parameter">encoding_name</replaceable>'
+    LOG_VERBOSITY [ <replaceable class="parameter">mode</replaceable> ]
 </synopsis>
  </refsynopsisdiv>
 
@@ -400,8 +401,12 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
       when the <literal>FORMAT</literal> is <literal>text</literal> or <literal>csv</literal>.
      </para>
      <para>
-      A <literal>NOTICE</literal> message containing the ignored row count is emitted at the end
-      of the <command>COPY FROM</command> if at least one row was discarded.
+      A <literal>NOTICE</literal> message containing the ignored row count is
+      emitted at the end of the <command>COPY FROM</command> if at least one
+      row was discarded. When <literal>LOG_VERBOSITY</literal> option is set to
+      <literal>verbose</literal>, a <literal>NOTICE</literal> message
+      containing the line of the input file and the column name whose input
+      conversion has failed is emitted for each discarded row.
      </para>
     </listitem>
    </varlistentry>
@@ -418,6 +423,23 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>LOG_VERBOSITY</literal></term>
+    <listitem>
+     <para>
+      Sets the verbosity of some of the messages logged by a
+      <command>COPY</command> command.
+      A <replaceable class="parameter">mode</replaceable> value of
+      <literal>verbose</literal> can be used to emit more informative messages.
+      <literal>default</literal> will not log any additional messages.
+     </para>
+     <para>
+      This is currently used in <command>COPY FROM</command> command when
+      <literal>ON_ERROR</literal> is set to <literal>ignore</literal>.
+      </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><literal>WHERE</literal></term>
     <listitem>
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 28cf8b040a..67d5c3f7d0 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -422,6 +422,36 @@ defGetCopyOnErrorChoice(DefElem *def, ParseState *pstate, bool is_from)
 	return COPY_ON_ERROR_STOP;	/* keep compiler quiet */
 }
 
+/*
+ * Extract a CopyLogVerbosityChoice value from a DefElem.
+ */
+static CopyLogVerbosityChoice
+defGetCopyLogVerbosityChoice(DefElem *def, ParseState *pstate)
+{
+	char	   *sval;
+
+	/*
+	 * If no parameter value given, assume the default value.
+	 */
+	if (def->arg == NULL)
+		return COPY_LOG_VERBOSITY_DEFAULT;
+
+	/*
+	 * Allow "default" or "verbose" values.
+	 */
+	sval = defGetString(def);
+	if (pg_strcasecmp(sval, "default") == 0)
+		return COPY_LOG_VERBOSITY_DEFAULT;
+	if (pg_strcasecmp(sval, "verbose") == 0)
+		return COPY_LOG_VERBOSITY_VERBOSE;
+
+	ereport(ERROR,
+			(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+			 errmsg("COPY LOG_VERBOSITY \"%s\" not recognized", sval),
+			 parser_errposition(pstate, def->location)));
+	return COPY_LOG_VERBOSITY_DEFAULT;	/* keep compiler quiet */
+}
+
 /*
  * Process the statement option list for COPY.
  *
@@ -448,6 +478,7 @@ ProcessCopyOptions(ParseState *pstate,
 	bool		freeze_specified = false;
 	bool		header_specified = false;
 	bool		on_error_specified = false;
+	bool		log_verbosity_specified = false;
 	ListCell   *option;
 
 	/* Support external use for option sanity checking */
@@ -607,6 +638,13 @@ ProcessCopyOptions(ParseState *pstate,
 			on_error_specified = true;
 			opts_out->on_error = defGetCopyOnErrorChoice(defel, pstate, is_from);
 		}
+		else if (strcmp(defel->defname, "log_verbosity") == 0)
+		{
+			if (log_verbosity_specified)
+				errorConflictingDefElem(defel, pstate);
+			log_verbosity_specified = true;
+			opts_out->log_verbosity = defGetCopyLogVerbosityChoice(defel, pstate);
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 8908a440e1..06bc14636d 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -101,8 +101,6 @@ typedef struct CopyMultiInsertInfo
 
 
 /* non-export function prototypes */
-static char *limit_printout_length(const char *str);
-
 static void ClosePipeFromProgram(CopyFromState cstate);
 
 /*
@@ -141,7 +139,7 @@ CopyFromErrorCallback(void *arg)
 			/* error is relevant to a particular column */
 			char	   *attval;
 
-			attval = limit_printout_length(cstate->cur_attval);
+			attval = CopyLimitPrintoutLength(cstate->cur_attval);
 			errcontext("COPY %s, line %llu, column %s: \"%s\"",
 					   cstate->cur_relname,
 					   (unsigned long long) cstate->cur_lineno,
@@ -168,7 +166,7 @@ CopyFromErrorCallback(void *arg)
 			{
 				char	   *lineval;
 
-				lineval = limit_printout_length(cstate->line_buf.data);
+				lineval = CopyLimitPrintoutLength(cstate->line_buf.data);
 				errcontext("COPY %s, line %llu: \"%s\"",
 						   cstate->cur_relname,
 						   (unsigned long long) cstate->cur_lineno, lineval);
@@ -189,8 +187,8 @@ CopyFromErrorCallback(void *arg)
  *
  * Returns a pstrdup'd copy of the input.
  */
-static char *
-limit_printout_length(const char *str)
+char *
+CopyLimitPrintoutLength(const char *str)
 {
 #define MAX_COPY_DATA_DISPLAY 100
 
diff --git a/src/backend/commands/copyfromparse.c b/src/backend/commands/copyfromparse.c
index 5682d5d054..7ddd27f5c6 100644
--- a/src/backend/commands/copyfromparse.c
+++ b/src/backend/commands/copyfromparse.c
@@ -967,7 +967,42 @@ NextCopyFrom(CopyFromState cstate, ExprContext *econtext,
 											(Node *) cstate->escontext,
 											&values[m]))
 			{
+				Assert(cstate->opts.on_error != COPY_ON_ERROR_STOP);
+
 				cstate->num_errors++;
+
+				if (cstate->opts.log_verbosity == COPY_LOG_VERBOSITY_VERBOSE)
+				{
+					/*
+					 * Since we emit line number and column info in the below
+					 * notice message, we suppress error context information
+					 * other than the relation name.
+					 */
+					Assert(!cstate->relname_only);
+					cstate->relname_only = true;
+
+					if (cstate->cur_attval)
+					{
+						char	   *attval;
+
+						attval = CopyLimitPrintoutLength(cstate->cur_attval);
+						ereport(NOTICE,
+								errmsg("skipping row due to data type incompatibility at line %llu for column %s: \"%s\"",
+									   (unsigned long long) cstate->cur_lineno,
+									   cstate->cur_attname,
+									   attval));
+						pfree(attval);
+					}
+					else
+						ereport(NOTICE,
+								errmsg("skipping row due to data type incompatibility at line %llu for column %s: null input",
+									   (unsigned long long) cstate->cur_lineno,
+									   cstate->cur_attname));
+
+					/* reset relname_only */
+					cstate->relname_only = false;
+				}
+
 				return true;
 			}
 
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 56d723de8a..bfbb3899ad 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -2903,7 +2903,7 @@ psql_completion(const char *text, int start, int end)
 		COMPLETE_WITH("FORMAT", "FREEZE", "DELIMITER", "NULL",
 					  "HEADER", "QUOTE", "ESCAPE", "FORCE_QUOTE",
 					  "FORCE_NOT_NULL", "FORCE_NULL", "ENCODING", "DEFAULT",
-					  "ON_ERROR");
+					  "ON_ERROR", "LOG_VERBOSITY");
 
 	/* Complete COPY <sth> FROM|TO filename WITH (FORMAT */
 	else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "FORMAT"))
@@ -2913,6 +2913,10 @@ psql_completion(const char *text, int start, int end)
 	else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "ON_ERROR"))
 		COMPLETE_WITH("stop", "ignore");
 
+	/* Complete COPY <sth> FROM filename WITH (LOG_VERBOSITY */
+	else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "LOG_VERBOSITY"))
+		COMPLETE_WITH("default", "verbose");
+
 	/* Complete COPY <sth> FROM <sth> WITH (<options>) */
 	else if (Matches("COPY|\\copy", MatchAny, "FROM", MatchAny, "WITH", MatchAny))
 		COMPLETE_WITH("WHERE");
diff --git a/src/include/commands/copy.h b/src/include/commands/copy.h
index b3da3cb0be..141fd48dc1 100644
--- a/src/include/commands/copy.h
+++ b/src/include/commands/copy.h
@@ -40,6 +40,15 @@ typedef enum CopyOnErrorChoice
 	COPY_ON_ERROR_IGNORE,		/* ignore errors */
 } CopyOnErrorChoice;
 
+/*
+ * Represents verbosity of logged messages by COPY command.
+ */
+typedef enum CopyLogVerbosityChoice
+{
+	COPY_LOG_VERBOSITY_DEFAULT = 0, /* logs no additional messages, default */
+	COPY_LOG_VERBOSITY_VERBOSE, /* logs additional messages */
+} CopyLogVerbosityChoice;
+
 /*
  * A struct to hold COPY options, in a parsed form. All of these are related
  * to formatting, except for 'freeze', which doesn't really belong here, but
@@ -73,6 +82,7 @@ typedef struct CopyFormatOptions
 	bool	   *force_null_flags;	/* per-column CSV FN flags */
 	bool		convert_selectively;	/* do selective binary conversion? */
 	CopyOnErrorChoice on_error; /* what to do when error happened */
+	CopyLogVerbosityChoice log_verbosity;	/* verbosity of logged messages */
 	List	   *convert_select; /* list of column names (can be NIL) */
 } CopyFormatOptions;
 
@@ -97,6 +107,7 @@ extern bool NextCopyFrom(CopyFromState cstate, ExprContext *econtext,
 extern bool NextCopyFromRawFields(CopyFromState cstate,
 								  char ***fields, int *nfields);
 extern void CopyFromErrorCallback(void *arg);
+extern char *CopyLimitPrintoutLength(const char *str);
 
 extern uint64 CopyFrom(CopyFromState cstate);
 
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index f98c2d1c4e..0735551539 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -81,6 +81,10 @@ COPY x from stdin (on_error ignore, on_error ignore);
 ERROR:  conflicting or redundant options
 LINE 1: COPY x from stdin (on_error ignore, on_error ignore);
                                             ^
+COPY x from stdin (log_verbosity 'default', log_verbosity 'verbose');
+ERROR:  conflicting or redundant options
+LINE 1: COPY x from stdin (log_verbosity 'default', log_verbosity 'v...
+                                                    ^
 -- incorrect options
 COPY x to stdin (format BINARY, delimiter ',');
 ERROR:  cannot specify DELIMITER in BINARY mode
@@ -108,6 +112,10 @@ COPY x to stdin (format BINARY, on_error unsupported);
 ERROR:  COPY ON_ERROR cannot be used with COPY TO
 LINE 1: COPY x to stdin (format BINARY, on_error unsupported);
                                         ^
+COPY x to stdout (log_verbosity 'unsupported');
+ERROR:  COPY LOG_VERBOSITY "unsupported" not recognized
+LINE 1: COPY x to stdout (log_verbosity 'unsupported');
+                          ^
 -- too many columns in column list: should fail
 COPY x (a, b, c, d, e, d, c) from stdin;
 ERROR:  column "d" specified more than once
@@ -729,8 +737,31 @@ CREATE TABLE check_ign_err (n int, m int[], k int);
 COPY check_ign_err FROM STDIN WITH (on_error stop);
 ERROR:  invalid input syntax for type integer: "a"
 CONTEXT:  COPY check_ign_err, line 2, column n: "a"
-COPY check_ign_err FROM STDIN WITH (on_error ignore);
+-- want context for notices
+\set SHOW_CONTEXT always
+COPY check_ign_err FROM STDIN WITH (on_error ignore, log_verbosity verbose);
+NOTICE:  skipping row due to data type incompatibility at line 2 for column n: "a"
+CONTEXT:  COPY check_ign_err
+NOTICE:  skipping row due to data type incompatibility at line 3 for column k: "3333333333"
+CONTEXT:  COPY check_ign_err
+NOTICE:  skipping row due to data type incompatibility at line 4 for column m: "{a, 4}"
+CONTEXT:  COPY check_ign_err
+NOTICE:  skipping row due to data type incompatibility at line 5 for column n: ""
+CONTEXT:  COPY check_ign_err
+NOTICE:  skipping row due to data type incompatibility at line 7 for column m: "a"
+CONTEXT:  COPY check_ign_err
+NOTICE:  skipping row due to data type incompatibility at line 8 for column k: "a"
+CONTEXT:  COPY check_ign_err
 NOTICE:  6 rows were skipped due to data type incompatibility
+-- tests for on_error option with log_verbosity and null constraint via domain
+CREATE DOMAIN dcheck_ign_err2 varchar(15) NOT NULL;
+CREATE TABLE check_ign_err2 (n int, m int[], k int, l dcheck_ign_err2);
+COPY check_ign_err2 FROM STDIN WITH (on_error ignore, log_verbosity verbose);
+NOTICE:  skipping row due to data type incompatibility at line 2 for column l: null input
+CONTEXT:  COPY check_ign_err2
+NOTICE:  1 row was skipped due to data type incompatibility
+-- reset context choice
+\set SHOW_CONTEXT errors
 SELECT * FROM check_ign_err;
  n |  m  | k 
 ---+-----+---
@@ -739,6 +770,12 @@ SELECT * FROM check_ign_err;
  8 | {8} | 8
 (3 rows)
 
+SELECT * FROM check_ign_err2;
+ n |  m  | k |   l   
+---+-----+---+-------
+ 1 | {1} | 1 | 'foo'
+(1 row)
+
 -- test datatype error that can't be handled as soft: should fail
 CREATE TABLE hard_err(foo widget);
 COPY hard_err FROM STDIN WITH (on_error ignore);
@@ -767,6 +804,8 @@ DROP VIEW instead_of_insert_tbl_view;
 DROP VIEW instead_of_insert_tbl_view_2;
 DROP FUNCTION fun_instead_of_insert_tbl();
 DROP TABLE check_ign_err;
+DROP TABLE check_ign_err2;
+DROP DOMAIN dcheck_ign_err2;
 DROP TABLE hard_err;
 --
 -- COPY FROM ... DEFAULT
diff --git a/src/test/regress/sql/copy2.sql b/src/test/regress/sql/copy2.sql
index afaaa37e52..624d531fd6 100644
--- a/src/test/regress/sql/copy2.sql
+++ b/src/test/regress/sql/copy2.sql
@@ -67,6 +67,7 @@ COPY x from stdin (force_null (a), force_null (b));
 COPY x from stdin (convert_selectively (a), convert_selectively (b));
 COPY x from stdin (encoding 'sql_ascii', encoding 'sql_ascii');
 COPY x from stdin (on_error ignore, on_error ignore);
+COPY x from stdin (log_verbosity 'default', log_verbosity 'verbose');
 
 -- incorrect options
 COPY x to stdin (format BINARY, delimiter ',');
@@ -80,6 +81,7 @@ COPY x to stdin (format CSV, force_not_null(a));
 COPY x to stdout (format TEXT, force_null(a));
 COPY x to stdin (format CSV, force_null(a));
 COPY x to stdin (format BINARY, on_error unsupported);
+COPY x to stdout (log_verbosity 'unsupported');
 
 -- too many columns in column list: should fail
 COPY x (a, b, c, d, e, d, c) from stdin;
@@ -508,7 +510,11 @@ a	{2}	2
 
 5	{5}	5
 \.
-COPY check_ign_err FROM STDIN WITH (on_error ignore);
+
+-- want context for notices
+\set SHOW_CONTEXT always
+
+COPY check_ign_err FROM STDIN WITH (on_error ignore, log_verbosity verbose);
 1	{1}	1
 a	{2}	2
 3	{3}	3333333333
@@ -519,8 +525,22 @@ a	{2}	2
 7	{7}	a
 8	{8}	8
 \.
+
+-- tests for on_error option with log_verbosity and null constraint via domain
+CREATE DOMAIN dcheck_ign_err2 varchar(15) NOT NULL;
+CREATE TABLE check_ign_err2 (n int, m int[], k int, l dcheck_ign_err2);
+COPY check_ign_err2 FROM STDIN WITH (on_error ignore, log_verbosity verbose);
+1	{1}	1	'foo'
+2	{2}	2	\N
+\.
+
+-- reset context choice
+\set SHOW_CONTEXT errors
+
 SELECT * FROM check_ign_err;
 
+SELECT * FROM check_ign_err2;
+
 -- test datatype error that can't be handled as soft: should fail
 CREATE TABLE hard_err(foo widget);
 COPY hard_err FROM STDIN WITH (on_error ignore);
@@ -552,6 +572,8 @@ DROP VIEW instead_of_insert_tbl_view;
 DROP VIEW instead_of_insert_tbl_view_2;
 DROP FUNCTION fun_instead_of_insert_tbl();
 DROP TABLE check_ign_err;
+DROP TABLE check_ign_err2;
+DROP DOMAIN dcheck_ign_err2;
 DROP TABLE hard_err;
 
 --
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 4679660837..20724ebf33 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -480,6 +480,7 @@ CopyFromState
 CopyFromStateData
 CopyHeaderChoice
 CopyInsertMethod
+CopyLogVerbosityChoice
 CopyMultiInsertBuffer
 CopyMultiInsertInfo
 CopyOnErrorChoice
-- 
2.34.1

#45Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Bharath Rupireddy (#44)
1 attachment(s)
Re: Add new error_action COPY ON_ERROR "log"

On Tue, Mar 26, 2024 at 3:04 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

On Tue, Mar 26, 2024 at 9:56 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

errmsg("data type incompatibility at line %llu for column %s: \"%s\"",

I guess it would be better to make the log message clearer to convey
what we did for the malformed row. For example, how about something
like "skipping row due to data type incompatibility at line %llu for
column %s: \"%s\""?

The summary message which gets printed at the end says that "NOTICE:
6 rows were skipped due to data type incompatibility". Isn't this
enough? If someone is using ON_ERROR 'ignore', it's quite natural that
such rows get skipped softly and the summary message can help them,
no?

I think that in the main log message we should mention what happened
(or is happening) or what we did (or are doing). If the message "data
type incompatibility ..." was in the DETAIL message with the main
message saying something like "skipping row at line %llu for column
%s: ...", it would make sense to me. But the current message does not
seem clear to me, nor consistent with other NOTICE messages. Also, the
last summary line would not be written if the user cancelled, and
someone other than the person who used ON_ERROR 'ignore' might check
the server logs later.

Agree. I changed the NOTICE message to what you've suggested. Thanks.

Thank you for updating the patch! Looks good to me.

Please find the attached patch. I've made some changes to the
documentation and the commit message. I'll push it, barring any
objections.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments:

v12-0001-Add-new-COPY-option-LOG_VERBOSITY.patch (application/octet-stream)
From ac18004de5cdcf32da16d2064546186d1dd52215 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Tue, 26 Mar 2024 05:51:55 +0000
Subject: [PATCH v12] Add new COPY option LOG_VERBOSITY.

This commit adds a new COPY option LOG_VERBOSITY, which controls the
amount of messages emitted during processing. Valid values are
'default' and 'verbose'.

This is currently used in COPY FROM when ON_ERROR option is set to
ignore. If 'verbose' is specified, additional information for each
discarded row is emitted.

Author: Bharath Rupireddy
Reviewed-by: Michael Paquier, Atsushi Torikoshi, Masahiko Sawada
Discussion: https://www.postgresql.org/message-id/CALj2ACUk700cYhx1ATRQyRw-fBM%2BaRo6auRAitKGff7XNmYfqQ%40mail.gmail.com
---
 doc/src/sgml/ref/copy.sgml           | 25 +++++++++++++++--
 src/backend/commands/copy.c          | 38 ++++++++++++++++++++++++++
 src/backend/commands/copyfrom.c      | 10 +++----
 src/backend/commands/copyfromparse.c | 35 ++++++++++++++++++++++++
 src/bin/psql/tab-complete.c          |  6 +++-
 src/include/commands/copy.h          | 11 ++++++++
 src/test/regress/expected/copy2.out  | 41 +++++++++++++++++++++++++++-
 src/test/regress/sql/copy2.sql       | 24 +++++++++++++++-
 src/tools/pgindent/typedefs.list     |  1 +
 9 files changed, 180 insertions(+), 11 deletions(-)

diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 6c83e30ed0..12ae49ce9f 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -45,6 +45,7 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
     FORCE_NULL { ( <replaceable class="parameter">column_name</replaceable> [, ...] ) | * }
     ON_ERROR '<replaceable class="parameter">error_action</replaceable>'
     ENCODING '<replaceable class="parameter">encoding_name</replaceable>'
+    LOG_VERBOSITY [ <replaceable class="parameter">mode</replaceable> ]
 </synopsis>
  </refsynopsisdiv>
 
@@ -400,8 +401,12 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
       when the <literal>FORMAT</literal> is <literal>text</literal> or <literal>csv</literal>.
      </para>
      <para>
-      A <literal>NOTICE</literal> message containing the ignored row count is emitted at the end
-      of the <command>COPY FROM</command> if at least one row was discarded.
+      A <literal>NOTICE</literal> message containing the ignored row count is
+      emitted at the end of the <command>COPY FROM</command> if at least one
+      row was discarded. When <literal>LOG_VERBOSITY</literal> option is set to
+      <literal>verbose</literal>, a <literal>NOTICE</literal> message
+      containing the line of the input file and the column name whose input
+      conversion has failed is emitted for each discarded row.
      </para>
     </listitem>
    </varlistentry>
@@ -418,6 +423,22 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>LOG_VERBOSITY</literal></term>
+    <listitem>
+     <para>
+      Specify the amount of messages emitted by a <command>COPY</command>
+      command: <literal>default</literal> or <literal>verbose</literal>. If
+      <literal>verbose</literal> is specified, additional messages are emitted
+      during processing.
+     </para>
+     <para>
+      This is currently used in <command>COPY FROM</command> command when
+      <literal>ON_ERROR</literal> option is set to <literal>ignore</literal>.
+      </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><literal>WHERE</literal></term>
     <listitem>
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 28cf8b040a..67d5c3f7d0 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -422,6 +422,36 @@ defGetCopyOnErrorChoice(DefElem *def, ParseState *pstate, bool is_from)
 	return COPY_ON_ERROR_STOP;	/* keep compiler quiet */
 }
 
+/*
+ * Extract a CopyLogVerbosityChoice value from a DefElem.
+ */
+static CopyLogVerbosityChoice
+defGetCopyLogVerbosityChoice(DefElem *def, ParseState *pstate)
+{
+	char	   *sval;
+
+	/*
+	 * If no parameter value given, assume the default value.
+	 */
+	if (def->arg == NULL)
+		return COPY_LOG_VERBOSITY_DEFAULT;
+
+	/*
+	 * Allow "default", or "verbose" values.
+	 */
+	sval = defGetString(def);
+	if (pg_strcasecmp(sval, "default") == 0)
+		return COPY_LOG_VERBOSITY_DEFAULT;
+	if (pg_strcasecmp(sval, "verbose") == 0)
+		return COPY_LOG_VERBOSITY_VERBOSE;
+
+	ereport(ERROR,
+			(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+			 errmsg("COPY LOG_VERBOSITY \"%s\" not recognized", sval),
+			 parser_errposition(pstate, def->location)));
+	return COPY_LOG_VERBOSITY_DEFAULT;	/* keep compiler quiet */
+}
+
 /*
  * Process the statement option list for COPY.
  *
@@ -448,6 +478,7 @@ ProcessCopyOptions(ParseState *pstate,
 	bool		freeze_specified = false;
 	bool		header_specified = false;
 	bool		on_error_specified = false;
+	bool		log_verbosity_specified = false;
 	ListCell   *option;
 
 	/* Support external use for option sanity checking */
@@ -607,6 +638,13 @@ ProcessCopyOptions(ParseState *pstate,
 			on_error_specified = true;
 			opts_out->on_error = defGetCopyOnErrorChoice(defel, pstate, is_from);
 		}
+		else if (strcmp(defel->defname, "log_verbosity") == 0)
+		{
+			if (log_verbosity_specified)
+				errorConflictingDefElem(defel, pstate);
+			log_verbosity_specified = true;
+			opts_out->log_verbosity = defGetCopyLogVerbosityChoice(defel, pstate);
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 8908a440e1..06bc14636d 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -101,8 +101,6 @@ typedef struct CopyMultiInsertInfo
 
 
 /* non-export function prototypes */
-static char *limit_printout_length(const char *str);
-
 static void ClosePipeFromProgram(CopyFromState cstate);
 
 /*
@@ -141,7 +139,7 @@ CopyFromErrorCallback(void *arg)
 			/* error is relevant to a particular column */
 			char	   *attval;
 
-			attval = limit_printout_length(cstate->cur_attval);
+			attval = CopyLimitPrintoutLength(cstate->cur_attval);
 			errcontext("COPY %s, line %llu, column %s: \"%s\"",
 					   cstate->cur_relname,
 					   (unsigned long long) cstate->cur_lineno,
@@ -168,7 +166,7 @@ CopyFromErrorCallback(void *arg)
 			{
 				char	   *lineval;
 
-				lineval = limit_printout_length(cstate->line_buf.data);
+				lineval = CopyLimitPrintoutLength(cstate->line_buf.data);
 				errcontext("COPY %s, line %llu: \"%s\"",
 						   cstate->cur_relname,
 						   (unsigned long long) cstate->cur_lineno, lineval);
@@ -189,8 +187,8 @@ CopyFromErrorCallback(void *arg)
  *
  * Returns a pstrdup'd copy of the input.
  */
-static char *
-limit_printout_length(const char *str)
+char *
+CopyLimitPrintoutLength(const char *str)
 {
 #define MAX_COPY_DATA_DISPLAY 100
 
diff --git a/src/backend/commands/copyfromparse.c b/src/backend/commands/copyfromparse.c
index 5682d5d054..7ddd27f5c6 100644
--- a/src/backend/commands/copyfromparse.c
+++ b/src/backend/commands/copyfromparse.c
@@ -967,7 +967,42 @@ NextCopyFrom(CopyFromState cstate, ExprContext *econtext,
 											(Node *) cstate->escontext,
 											&values[m]))
 			{
+				Assert(cstate->opts.on_error != COPY_ON_ERROR_STOP);
+
 				cstate->num_errors++;
+
+				if (cstate->opts.log_verbosity == COPY_LOG_VERBOSITY_VERBOSE)
+				{
+					/*
+					 * Since we emit line number and column info in the below
+					 * notice message, we suppress error context information
+					 * other than the relation name.
+					 */
+					Assert(!cstate->relname_only);
+					cstate->relname_only = true;
+
+					if (cstate->cur_attval)
+					{
+						char	   *attval;
+
+						attval = CopyLimitPrintoutLength(cstate->cur_attval);
+						ereport(NOTICE,
+								errmsg("skipping row due to data type incompatibility at line %llu for column %s: \"%s\"",
+									   (unsigned long long) cstate->cur_lineno,
+									   cstate->cur_attname,
+									   attval));
+						pfree(attval);
+					}
+					else
+						ereport(NOTICE,
+								errmsg("skipping row due to data type incompatibility at line %llu for column %s: null input",
+									   (unsigned long long) cstate->cur_lineno,
+									   cstate->cur_attname));
+
+					/* reset relname_only */
+					cstate->relname_only = false;
+				}
+
 				return true;
 			}
 
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 56d723de8a..bfbb3899ad 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -2903,7 +2903,7 @@ psql_completion(const char *text, int start, int end)
 		COMPLETE_WITH("FORMAT", "FREEZE", "DELIMITER", "NULL",
 					  "HEADER", "QUOTE", "ESCAPE", "FORCE_QUOTE",
 					  "FORCE_NOT_NULL", "FORCE_NULL", "ENCODING", "DEFAULT",
-					  "ON_ERROR");
+					  "ON_ERROR", "LOG_VERBOSITY");
 
 	/* Complete COPY <sth> FROM|TO filename WITH (FORMAT */
 	else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "FORMAT"))
@@ -2913,6 +2913,10 @@ psql_completion(const char *text, int start, int end)
 	else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "ON_ERROR"))
 		COMPLETE_WITH("stop", "ignore");
 
+	/* Complete COPY <sth> FROM filename WITH (LOG_VERBOSITY */
+	else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "LOG_VERBOSITY"))
+		COMPLETE_WITH("default", "verbose");
+
 	/* Complete COPY <sth> FROM <sth> WITH (<options>) */
 	else if (Matches("COPY|\\copy", MatchAny, "FROM", MatchAny, "WITH", MatchAny))
 		COMPLETE_WITH("WHERE");
diff --git a/src/include/commands/copy.h b/src/include/commands/copy.h
index b3da3cb0be..141fd48dc1 100644
--- a/src/include/commands/copy.h
+++ b/src/include/commands/copy.h
@@ -40,6 +40,15 @@ typedef enum CopyOnErrorChoice
 	COPY_ON_ERROR_IGNORE,		/* ignore errors */
 } CopyOnErrorChoice;
 
+/*
+ * Represents verbosity of logged messages by COPY command.
+ */
+typedef enum CopyLogVerbosityChoice
+{
+	COPY_LOG_VERBOSITY_DEFAULT = 0, /* logs no additional messages, default */
+	COPY_LOG_VERBOSITY_VERBOSE, /* logs additional messages */
+} CopyLogVerbosityChoice;
+
 /*
  * A struct to hold COPY options, in a parsed form. All of these are related
  * to formatting, except for 'freeze', which doesn't really belong here, but
@@ -73,6 +82,7 @@ typedef struct CopyFormatOptions
 	bool	   *force_null_flags;	/* per-column CSV FN flags */
 	bool		convert_selectively;	/* do selective binary conversion? */
 	CopyOnErrorChoice on_error; /* what to do when error happened */
+	CopyLogVerbosityChoice log_verbosity;	/* verbosity of logged messages */
 	List	   *convert_select; /* list of column names (can be NIL) */
 } CopyFormatOptions;
 
@@ -97,6 +107,7 @@ extern bool NextCopyFrom(CopyFromState cstate, ExprContext *econtext,
 extern bool NextCopyFromRawFields(CopyFromState cstate,
 								  char ***fields, int *nfields);
 extern void CopyFromErrorCallback(void *arg);
+extern char *CopyLimitPrintoutLength(const char *str);
 
 extern uint64 CopyFrom(CopyFromState cstate);
 
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index f98c2d1c4e..0735551539 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -81,6 +81,10 @@ COPY x from stdin (on_error ignore, on_error ignore);
 ERROR:  conflicting or redundant options
 LINE 1: COPY x from stdin (on_error ignore, on_error ignore);
                                             ^
+COPY x from stdin (log_verbosity 'default', log_verbosity 'verbose');
+ERROR:  conflicting or redundant options
+LINE 1: COPY x from stdin (log_verbosity 'default', log_verbosity 'v...
+                                                    ^
 -- incorrect options
 COPY x to stdin (format BINARY, delimiter ',');
 ERROR:  cannot specify DELIMITER in BINARY mode
@@ -108,6 +112,10 @@ COPY x to stdin (format BINARY, on_error unsupported);
 ERROR:  COPY ON_ERROR cannot be used with COPY TO
 LINE 1: COPY x to stdin (format BINARY, on_error unsupported);
                                         ^
+COPY x to stdout (log_verbosity 'unsupported');
+ERROR:  COPY LOG_VERBOSITY "unsupported" not recognized
+LINE 1: COPY x to stdout (log_verbosity 'unsupported');
+                          ^
 -- too many columns in column list: should fail
 COPY x (a, b, c, d, e, d, c) from stdin;
 ERROR:  column "d" specified more than once
@@ -729,8 +737,31 @@ CREATE TABLE check_ign_err (n int, m int[], k int);
 COPY check_ign_err FROM STDIN WITH (on_error stop);
 ERROR:  invalid input syntax for type integer: "a"
 CONTEXT:  COPY check_ign_err, line 2, column n: "a"
-COPY check_ign_err FROM STDIN WITH (on_error ignore);
+-- want context for notices
+\set SHOW_CONTEXT always
+COPY check_ign_err FROM STDIN WITH (on_error ignore, log_verbosity verbose);
+NOTICE:  skipping row due to data type incompatibility at line 2 for column n: "a"
+CONTEXT:  COPY check_ign_err
+NOTICE:  skipping row due to data type incompatibility at line 3 for column k: "3333333333"
+CONTEXT:  COPY check_ign_err
+NOTICE:  skipping row due to data type incompatibility at line 4 for column m: "{a, 4}"
+CONTEXT:  COPY check_ign_err
+NOTICE:  skipping row due to data type incompatibility at line 5 for column n: ""
+CONTEXT:  COPY check_ign_err
+NOTICE:  skipping row due to data type incompatibility at line 7 for column m: "a"
+CONTEXT:  COPY check_ign_err
+NOTICE:  skipping row due to data type incompatibility at line 8 for column k: "a"
+CONTEXT:  COPY check_ign_err
 NOTICE:  6 rows were skipped due to data type incompatibility
+-- tests for on_error option with log_verbosity and null constraint via domain
+CREATE DOMAIN dcheck_ign_err2 varchar(15) NOT NULL;
+CREATE TABLE check_ign_err2 (n int, m int[], k int, l dcheck_ign_err2);
+COPY check_ign_err2 FROM STDIN WITH (on_error ignore, log_verbosity verbose);
+NOTICE:  skipping row due to data type incompatibility at line 2 for column l: null input
+CONTEXT:  COPY check_ign_err2
+NOTICE:  1 row was skipped due to data type incompatibility
+-- reset context choice
+\set SHOW_CONTEXT errors
 SELECT * FROM check_ign_err;
  n |  m  | k 
 ---+-----+---
@@ -739,6 +770,12 @@ SELECT * FROM check_ign_err;
  8 | {8} | 8
 (3 rows)
 
+SELECT * FROM check_ign_err2;
+ n |  m  | k |   l   
+---+-----+---+-------
+ 1 | {1} | 1 | 'foo'
+(1 row)
+
 -- test datatype error that can't be handled as soft: should fail
 CREATE TABLE hard_err(foo widget);
 COPY hard_err FROM STDIN WITH (on_error ignore);
@@ -767,6 +804,8 @@ DROP VIEW instead_of_insert_tbl_view;
 DROP VIEW instead_of_insert_tbl_view_2;
 DROP FUNCTION fun_instead_of_insert_tbl();
 DROP TABLE check_ign_err;
+DROP TABLE check_ign_err2;
+DROP DOMAIN dcheck_ign_err2;
 DROP TABLE hard_err;
 --
 -- COPY FROM ... DEFAULT
diff --git a/src/test/regress/sql/copy2.sql b/src/test/regress/sql/copy2.sql
index afaaa37e52..624d531fd6 100644
--- a/src/test/regress/sql/copy2.sql
+++ b/src/test/regress/sql/copy2.sql
@@ -67,6 +67,7 @@ COPY x from stdin (force_null (a), force_null (b));
 COPY x from stdin (convert_selectively (a), convert_selectively (b));
 COPY x from stdin (encoding 'sql_ascii', encoding 'sql_ascii');
 COPY x from stdin (on_error ignore, on_error ignore);
+COPY x from stdin (log_verbosity 'default', log_verbosity 'verbose');
 
 -- incorrect options
 COPY x to stdin (format BINARY, delimiter ',');
@@ -80,6 +81,7 @@ COPY x to stdin (format CSV, force_not_null(a));
 COPY x to stdout (format TEXT, force_null(a));
 COPY x to stdin (format CSV, force_null(a));
 COPY x to stdin (format BINARY, on_error unsupported);
+COPY x to stdout (log_verbosity 'unsupported');
 
 -- too many columns in column list: should fail
 COPY x (a, b, c, d, e, d, c) from stdin;
@@ -508,7 +510,11 @@ a	{2}	2
 
 5	{5}	5
 \.
-COPY check_ign_err FROM STDIN WITH (on_error ignore);
+
+-- want context for notices
+\set SHOW_CONTEXT always
+
+COPY check_ign_err FROM STDIN WITH (on_error ignore, log_verbosity verbose);
 1	{1}	1
 a	{2}	2
 3	{3}	3333333333
@@ -519,8 +525,22 @@ a	{2}	2
 7	{7}	a
 8	{8}	8
 \.
+
+-- tests for on_error option with log_verbosity and null constraint via domain
+CREATE DOMAIN dcheck_ign_err2 varchar(15) NOT NULL;
+CREATE TABLE check_ign_err2 (n int, m int[], k int, l dcheck_ign_err2);
+COPY check_ign_err2 FROM STDIN WITH (on_error ignore, log_verbosity verbose);
+1	{1}	1	'foo'
+2	{2}	2	\N
+\.
+
+-- reset context choice
+\set SHOW_CONTEXT errors
+
 SELECT * FROM check_ign_err;
 
+SELECT * FROM check_ign_err2;
+
 -- test datatype error that can't be handled as soft: should fail
 CREATE TABLE hard_err(foo widget);
 COPY hard_err FROM STDIN WITH (on_error ignore);
@@ -552,6 +572,8 @@ DROP VIEW instead_of_insert_tbl_view;
 DROP VIEW instead_of_insert_tbl_view_2;
 DROP FUNCTION fun_instead_of_insert_tbl();
 DROP TABLE check_ign_err;
+DROP TABLE check_ign_err2;
+DROP DOMAIN dcheck_ign_err2;
 DROP TABLE hard_err;
 
 --
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 4679660837..20724ebf33 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -480,6 +480,7 @@ CopyFromState
 CopyFromStateData
 CopyHeaderChoice
 CopyInsertMethod
+CopyLogVerbosityChoice
 CopyMultiInsertBuffer
 CopyMultiInsertInfo
 CopyOnErrorChoice
-- 
2.39.3

#46Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Masahiko Sawada (#45)
Re: Add new error_action COPY ON_ERROR "log"

On Tue, Mar 26, 2024 at 1:46 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Thank you for updating the patch! Looks good to me.

Please find the attached patch. I've made some changes for the
documentation and the commit message. I'll push it, barring any
objections.

Thanks. v12 patch LGTM.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#47Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Bharath Rupireddy (#46)
Re: Add new error_action COPY ON_ERROR "log"

On Tue, Mar 26, 2024 at 6:36 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

On Tue, Mar 26, 2024 at 1:46 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Thank you for updating the patch! Looks good to me.

Please find the attached patch. I've made some changes for the
documentation and the commit message. I'll push it, barring any
objections.

Thanks. v12 patch LGTM.

While testing the new option, I realized that the tab-completion
completes the DEFAULT value, but the value doesn't work without single quotes:

postgres(1:2179134)=# copy test from '/tmp/dump.data' with
(log_verbosity default );
ERROR: syntax error at or near "default"
LINE 1: ...py test from '/tmp/dump.data' with (log_verbosity default );
^
postgres(1:2179134)=# copy test from '/tmp/dump.data' with
(log_verbosity 'default' );
COPY 0

Whereas VERBOSE works even without single-quotes:

postgres(1:2179134)=# copy test from '/tmp/dump.data' with
(log_verbosity verbose );
COPY 0

postgres(1:2179134)=# copy test from '/tmp/dump.data' with
(log_verbosity 'verbose' );
COPY 0

This could confuse users. It happens because DEFAULT is a reserved
keyword, and the COPY option grammar doesn't accept reserved keywords as
option values.

I think that there are two options to handle it:

1. change COPY grammar to accept DEFAULT as an option value.
2. change tab-completion to complete 'DEFAULT' instead of DEFAULT,
and update the doc too.

As for the documentation, we can add single-quotes as follows:

ENCODING '<replaceable class="parameter">encoding_name</replaceable>'
+ LOG_VERBOSITY [ '<replaceable class="parameter">mode</replaceable>' ]

I initially thought option (2) was better, but there seems to be no
precedent for completing a single-quoted string other than file names.
So option (1) could be clearer.

What do you think?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#48Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Masahiko Sawada (#47)
1 attachment(s)
Re: Add new error_action COPY ON_ERROR "log"

On Wed, Mar 27, 2024 at 7:42 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I think that there are two options to handle it:

1. change COPY grammar to accept DEFAULT as an option value.
2. change tab-completion to complete 'DEFAULT' instead of DEFAULT,
and update the doc too.

As for the documentation, we can add single-quotes as follows:

ENCODING '<replaceable class="parameter">encoding_name</replaceable>'
+ LOG_VERBOSITY [ '<replaceable class="parameter">mode</replaceable>' ]

I initially thought option (2) was better, but there seems to be no
precedent for completing a single-quoted string other than file names.
So option (1) could be clearer.

What do you think?

There is another option: change log_verbosity to {none, verbose} or
{none, skip_row_info} (discussed here
/messages/by-id/Zelrqq-pkfkvsjPn@paquier.xyz
that we might extend this option to other use cases in the future). I tend
to agree with you that log_verbosity should accept default without
quotes, just to be consistent with other commands that allow that [1].
And thanks for quickly identifying where to change gram.y.
With that change, I have updated all the newly added tests to use
log_verbosity default without quotes.

FWIW, a recent commit [2] did the same. Therefore, I don't see a
problem supporting it that way for COPY log_verbosity.

Please find the attached v13 patch with the change.

[1]:
column_compression:
    COMPRESSION ColId { $$ = $2; }
    | COMPRESSION DEFAULT { $$ = pstrdup("default"); }
    ;

column_storage:
    STORAGE ColId { $$ = $2; }
    | STORAGE DEFAULT { $$ = pstrdup("default"); }
    ;

[2]:
commit b9424d014e195386a83b0f1fe9f5a8e5727e46ea
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: Thu Nov 10 18:20:49 2022 -0500

Support writing "CREATE/ALTER TABLE ... SET STORAGE DEFAULT".

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

v13-0001-Add-new-COPY-option-LOG_VERBOSITY.patch (application/x-patch)
From 26941c91ec7c2cfe92bf31eb7dc999f60137b809 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Wed, 27 Mar 2024 17:36:40 +0000
Subject: [PATCH v13] Add new COPY option LOG_VERBOSITY.

This commit adds a new COPY option LOG_VERBOSITY, which controls the
amount of messages emitted during processing. Valid values are
'default' and 'verbose'.

This is currently used in COPY FROM when ON_ERROR option is set to
ignore. If 'verbose' is specified, additional information for each
discarded row is emitted.

Author: Bharath Rupireddy
Reviewed-by: Michael Paquier, Atsushi Torikoshi, Masahiko Sawada
Discussion: https://www.postgresql.org/message-id/CALj2ACUk700cYhx1ATRQyRw-fBM%2BaRo6auRAitKGff7XNmYfqQ%40mail.gmail.com
---
 doc/src/sgml/ref/copy.sgml           | 25 +++++++++++++++--
 src/backend/commands/copy.c          | 38 ++++++++++++++++++++++++++
 src/backend/commands/copyfrom.c      | 10 +++----
 src/backend/commands/copyfromparse.c | 35 ++++++++++++++++++++++++
 src/backend/parser/gram.y            |  1 +
 src/bin/psql/tab-complete.c          |  6 +++-
 src/include/commands/copy.h          | 11 ++++++++
 src/test/regress/expected/copy2.out  | 41 +++++++++++++++++++++++++++-
 src/test/regress/sql/copy2.sql       | 24 +++++++++++++++-
 src/tools/pgindent/typedefs.list     |  1 +
 10 files changed, 181 insertions(+), 11 deletions(-)

diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 6c83e30ed0..12ae49ce9f 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -45,6 +45,7 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
     FORCE_NULL { ( <replaceable class="parameter">column_name</replaceable> [, ...] ) | * }
     ON_ERROR '<replaceable class="parameter">error_action</replaceable>'
     ENCODING '<replaceable class="parameter">encoding_name</replaceable>'
+    LOG_VERBOSITY [ <replaceable class="parameter">mode</replaceable> ]
 </synopsis>
  </refsynopsisdiv>
 
@@ -400,8 +401,12 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
       when the <literal>FORMAT</literal> is <literal>text</literal> or <literal>csv</literal>.
      </para>
      <para>
-      A <literal>NOTICE</literal> message containing the ignored row count is emitted at the end
-      of the <command>COPY FROM</command> if at least one row was discarded.
+      A <literal>NOTICE</literal> message containing the ignored row count is
+      emitted at the end of the <command>COPY FROM</command> if at least one
+      row was discarded. When <literal>LOG_VERBOSITY</literal> option is set to
+      <literal>verbose</literal>, a <literal>NOTICE</literal> message
+      containing the line of the input file and the column name whose input
+      conversion has failed is emitted for each discarded row.
      </para>
     </listitem>
    </varlistentry>
@@ -418,6 +423,22 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>LOG_VERBOSITY</literal></term>
+    <listitem>
+     <para>
+      Specify the amount of messages emitted by a <command>COPY</command>
+      command: <literal>default</literal> or <literal>verbose</literal>. If
+      <literal>verbose</literal> is specified, additional messages are emitted
+      during processing.
+     </para>
+     <para>
+      This is currently used in <command>COPY FROM</command> command when
+      <literal>ON_ERROR</literal> option is set to <literal>ignore</literal>.
+      </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><literal>WHERE</literal></term>
     <listitem>
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 28cf8b040a..67d5c3f7d0 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -422,6 +422,36 @@ defGetCopyOnErrorChoice(DefElem *def, ParseState *pstate, bool is_from)
 	return COPY_ON_ERROR_STOP;	/* keep compiler quiet */
 }
 
+/*
+ * Extract a CopyLogVerbosityChoice value from a DefElem.
+ */
+static CopyLogVerbosityChoice
+defGetCopyLogVerbosityChoice(DefElem *def, ParseState *pstate)
+{
+	char	   *sval;
+
+	/*
+	 * If no parameter value given, assume the default value.
+	 */
+	if (def->arg == NULL)
+		return COPY_LOG_VERBOSITY_DEFAULT;
+
+	/*
+	 * Allow "default", or "verbose" values.
+	 */
+	sval = defGetString(def);
+	if (pg_strcasecmp(sval, "default") == 0)
+		return COPY_LOG_VERBOSITY_DEFAULT;
+	if (pg_strcasecmp(sval, "verbose") == 0)
+		return COPY_LOG_VERBOSITY_VERBOSE;
+
+	ereport(ERROR,
+			(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+			 errmsg("COPY LOG_VERBOSITY \"%s\" not recognized", sval),
+			 parser_errposition(pstate, def->location)));
+	return COPY_LOG_VERBOSITY_DEFAULT;	/* keep compiler quiet */
+}
+
 /*
  * Process the statement option list for COPY.
  *
@@ -448,6 +478,7 @@ ProcessCopyOptions(ParseState *pstate,
 	bool		freeze_specified = false;
 	bool		header_specified = false;
 	bool		on_error_specified = false;
+	bool		log_verbosity_specified = false;
 	ListCell   *option;
 
 	/* Support external use for option sanity checking */
@@ -607,6 +638,13 @@ ProcessCopyOptions(ParseState *pstate,
 			on_error_specified = true;
 			opts_out->on_error = defGetCopyOnErrorChoice(defel, pstate, is_from);
 		}
+		else if (strcmp(defel->defname, "log_verbosity") == 0)
+		{
+			if (log_verbosity_specified)
+				errorConflictingDefElem(defel, pstate);
+			log_verbosity_specified = true;
+			opts_out->log_verbosity = defGetCopyLogVerbosityChoice(defel, pstate);
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 8908a440e1..06bc14636d 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -101,8 +101,6 @@ typedef struct CopyMultiInsertInfo
 
 
 /* non-export function prototypes */
-static char *limit_printout_length(const char *str);
-
 static void ClosePipeFromProgram(CopyFromState cstate);
 
 /*
@@ -141,7 +139,7 @@ CopyFromErrorCallback(void *arg)
 			/* error is relevant to a particular column */
 			char	   *attval;
 
-			attval = limit_printout_length(cstate->cur_attval);
+			attval = CopyLimitPrintoutLength(cstate->cur_attval);
 			errcontext("COPY %s, line %llu, column %s: \"%s\"",
 					   cstate->cur_relname,
 					   (unsigned long long) cstate->cur_lineno,
@@ -168,7 +166,7 @@ CopyFromErrorCallback(void *arg)
 			{
 				char	   *lineval;
 
-				lineval = limit_printout_length(cstate->line_buf.data);
+				lineval = CopyLimitPrintoutLength(cstate->line_buf.data);
 				errcontext("COPY %s, line %llu: \"%s\"",
 						   cstate->cur_relname,
 						   (unsigned long long) cstate->cur_lineno, lineval);
@@ -189,8 +187,8 @@ CopyFromErrorCallback(void *arg)
  *
  * Returns a pstrdup'd copy of the input.
  */
-static char *
-limit_printout_length(const char *str)
+char *
+CopyLimitPrintoutLength(const char *str)
 {
 #define MAX_COPY_DATA_DISPLAY 100
 
diff --git a/src/backend/commands/copyfromparse.c b/src/backend/commands/copyfromparse.c
index 5682d5d054..7ddd27f5c6 100644
--- a/src/backend/commands/copyfromparse.c
+++ b/src/backend/commands/copyfromparse.c
@@ -967,7 +967,42 @@ NextCopyFrom(CopyFromState cstate, ExprContext *econtext,
 											(Node *) cstate->escontext,
 											&values[m]))
 			{
+				Assert(cstate->opts.on_error != COPY_ON_ERROR_STOP);
+
 				cstate->num_errors++;
+
+				if (cstate->opts.log_verbosity == COPY_LOG_VERBOSITY_VERBOSE)
+				{
+					/*
+					 * Since we emit line number and column info in the below
+					 * notice message, we suppress error context information
+					 * other than the relation name.
+					 */
+					Assert(!cstate->relname_only);
+					cstate->relname_only = true;
+
+					if (cstate->cur_attval)
+					{
+						char	   *attval;
+
+						attval = CopyLimitPrintoutLength(cstate->cur_attval);
+						ereport(NOTICE,
+								errmsg("skipping row due to data type incompatibility at line %llu for column %s: \"%s\"",
+									   (unsigned long long) cstate->cur_lineno,
+									   cstate->cur_attname,
+									   attval));
+						pfree(attval);
+					}
+					else
+						ereport(NOTICE,
+								errmsg("skipping row due to data type incompatibility at line %llu for column %s: null input",
+									   (unsigned long long) cstate->cur_lineno,
+									   cstate->cur_attname));
+
+					/* reset relname_only */
+					cstate->relname_only = false;
+				}
+
 				return true;
 			}
 
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c1b0cff1c9..a13f285970 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -3528,6 +3528,7 @@ copy_generic_opt_arg:
 			opt_boolean_or_string			{ $$ = (Node *) makeString($1); }
 			| NumericOnly					{ $$ = (Node *) $1; }
 			| '*'							{ $$ = (Node *) makeNode(A_Star); }
+			| DEFAULT                       { $$ = (Node *) makeString("default"); }
 			| '(' copy_generic_opt_arg_list ')'		{ $$ = (Node *) $2; }
 			| /* EMPTY */					{ $$ = NULL; }
 		;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 56d723de8a..bfbb3899ad 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -2903,7 +2903,7 @@ psql_completion(const char *text, int start, int end)
 		COMPLETE_WITH("FORMAT", "FREEZE", "DELIMITER", "NULL",
 					  "HEADER", "QUOTE", "ESCAPE", "FORCE_QUOTE",
 					  "FORCE_NOT_NULL", "FORCE_NULL", "ENCODING", "DEFAULT",
-					  "ON_ERROR");
+					  "ON_ERROR", "LOG_VERBOSITY");
 
 	/* Complete COPY <sth> FROM|TO filename WITH (FORMAT */
 	else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "FORMAT"))
@@ -2913,6 +2913,10 @@ psql_completion(const char *text, int start, int end)
 	else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "ON_ERROR"))
 		COMPLETE_WITH("stop", "ignore");
 
+	/* Complete COPY <sth> FROM filename WITH (LOG_VERBOSITY */
+	else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "LOG_VERBOSITY"))
+		COMPLETE_WITH("default", "verbose");
+
 	/* Complete COPY <sth> FROM <sth> WITH (<options>) */
 	else if (Matches("COPY|\\copy", MatchAny, "FROM", MatchAny, "WITH", MatchAny))
 		COMPLETE_WITH("WHERE");
diff --git a/src/include/commands/copy.h b/src/include/commands/copy.h
index b3da3cb0be..141fd48dc1 100644
--- a/src/include/commands/copy.h
+++ b/src/include/commands/copy.h
@@ -40,6 +40,15 @@ typedef enum CopyOnErrorChoice
 	COPY_ON_ERROR_IGNORE,		/* ignore errors */
 } CopyOnErrorChoice;
 
+/*
+ * Represents verbosity of logged messages by COPY command.
+ */
+typedef enum CopyLogVerbosityChoice
+{
+	COPY_LOG_VERBOSITY_DEFAULT = 0, /* logs no additional messages, default */
+	COPY_LOG_VERBOSITY_VERBOSE, /* logs additional messages */
+} CopyLogVerbosityChoice;
+
 /*
  * A struct to hold COPY options, in a parsed form. All of these are related
  * to formatting, except for 'freeze', which doesn't really belong here, but
@@ -73,6 +82,7 @@ typedef struct CopyFormatOptions
 	bool	   *force_null_flags;	/* per-column CSV FN flags */
 	bool		convert_selectively;	/* do selective binary conversion? */
 	CopyOnErrorChoice on_error; /* what to do when error happened */
+	CopyLogVerbosityChoice log_verbosity;	/* verbosity of logged messages */
 	List	   *convert_select; /* list of column names (can be NIL) */
 } CopyFormatOptions;
 
@@ -97,6 +107,7 @@ extern bool NextCopyFrom(CopyFromState cstate, ExprContext *econtext,
 extern bool NextCopyFromRawFields(CopyFromState cstate,
 								  char ***fields, int *nfields);
 extern void CopyFromErrorCallback(void *arg);
+extern char *CopyLimitPrintoutLength(const char *str);
 
 extern uint64 CopyFrom(CopyFromState cstate);
 
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index f98c2d1c4e..931542f268 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -81,6 +81,10 @@ COPY x from stdin (on_error ignore, on_error ignore);
 ERROR:  conflicting or redundant options
 LINE 1: COPY x from stdin (on_error ignore, on_error ignore);
                                             ^
+COPY x from stdin (log_verbosity default, log_verbosity verbose);
+ERROR:  conflicting or redundant options
+LINE 1: COPY x from stdin (log_verbosity default, log_verbosity verb...
+                                                  ^
 -- incorrect options
 COPY x to stdin (format BINARY, delimiter ',');
 ERROR:  cannot specify DELIMITER in BINARY mode
@@ -108,6 +112,10 @@ COPY x to stdin (format BINARY, on_error unsupported);
 ERROR:  COPY ON_ERROR cannot be used with COPY TO
 LINE 1: COPY x to stdin (format BINARY, on_error unsupported);
                                         ^
+COPY x to stdout (log_verbosity unsupported);
+ERROR:  COPY LOG_VERBOSITY "unsupported" not recognized
+LINE 1: COPY x to stdout (log_verbosity unsupported);
+                          ^
 -- too many columns in column list: should fail
 COPY x (a, b, c, d, e, d, c) from stdin;
 ERROR:  column "d" specified more than once
@@ -729,8 +737,31 @@ CREATE TABLE check_ign_err (n int, m int[], k int);
 COPY check_ign_err FROM STDIN WITH (on_error stop);
 ERROR:  invalid input syntax for type integer: "a"
 CONTEXT:  COPY check_ign_err, line 2, column n: "a"
-COPY check_ign_err FROM STDIN WITH (on_error ignore);
+-- want context for notices
+\set SHOW_CONTEXT always
+COPY check_ign_err FROM STDIN WITH (on_error ignore, log_verbosity verbose);
+NOTICE:  skipping row due to data type incompatibility at line 2 for column n: "a"
+CONTEXT:  COPY check_ign_err
+NOTICE:  skipping row due to data type incompatibility at line 3 for column k: "3333333333"
+CONTEXT:  COPY check_ign_err
+NOTICE:  skipping row due to data type incompatibility at line 4 for column m: "{a, 4}"
+CONTEXT:  COPY check_ign_err
+NOTICE:  skipping row due to data type incompatibility at line 5 for column n: ""
+CONTEXT:  COPY check_ign_err
+NOTICE:  skipping row due to data type incompatibility at line 7 for column m: "a"
+CONTEXT:  COPY check_ign_err
+NOTICE:  skipping row due to data type incompatibility at line 8 for column k: "a"
+CONTEXT:  COPY check_ign_err
 NOTICE:  6 rows were skipped due to data type incompatibility
+-- tests for on_error option with log_verbosity and null constraint via domain
+CREATE DOMAIN dcheck_ign_err2 varchar(15) NOT NULL;
+CREATE TABLE check_ign_err2 (n int, m int[], k int, l dcheck_ign_err2);
+COPY check_ign_err2 FROM STDIN WITH (on_error ignore, log_verbosity verbose);
+NOTICE:  skipping row due to data type incompatibility at line 2 for column l: null input
+CONTEXT:  COPY check_ign_err2
+NOTICE:  1 row was skipped due to data type incompatibility
+-- reset context choice
+\set SHOW_CONTEXT errors
 SELECT * FROM check_ign_err;
  n |  m  | k 
 ---+-----+---
@@ -739,6 +770,12 @@ SELECT * FROM check_ign_err;
  8 | {8} | 8
 (3 rows)
 
+SELECT * FROM check_ign_err2;
+ n |  m  | k |   l   
+---+-----+---+-------
+ 1 | {1} | 1 | 'foo'
+(1 row)
+
 -- test datatype error that can't be handled as soft: should fail
 CREATE TABLE hard_err(foo widget);
 COPY hard_err FROM STDIN WITH (on_error ignore);
@@ -767,6 +804,8 @@ DROP VIEW instead_of_insert_tbl_view;
 DROP VIEW instead_of_insert_tbl_view_2;
 DROP FUNCTION fun_instead_of_insert_tbl();
 DROP TABLE check_ign_err;
+DROP TABLE check_ign_err2;
+DROP DOMAIN dcheck_ign_err2;
 DROP TABLE hard_err;
 --
 -- COPY FROM ... DEFAULT
diff --git a/src/test/regress/sql/copy2.sql b/src/test/regress/sql/copy2.sql
index afaaa37e52..8b14962194 100644
--- a/src/test/regress/sql/copy2.sql
+++ b/src/test/regress/sql/copy2.sql
@@ -67,6 +67,7 @@ COPY x from stdin (force_null (a), force_null (b));
 COPY x from stdin (convert_selectively (a), convert_selectively (b));
 COPY x from stdin (encoding 'sql_ascii', encoding 'sql_ascii');
 COPY x from stdin (on_error ignore, on_error ignore);
+COPY x from stdin (log_verbosity default, log_verbosity verbose);
 
 -- incorrect options
 COPY x to stdin (format BINARY, delimiter ',');
@@ -80,6 +81,7 @@ COPY x to stdin (format CSV, force_not_null(a));
 COPY x to stdout (format TEXT, force_null(a));
 COPY x to stdin (format CSV, force_null(a));
 COPY x to stdin (format BINARY, on_error unsupported);
+COPY x to stdout (log_verbosity unsupported);
 
 -- too many columns in column list: should fail
 COPY x (a, b, c, d, e, d, c) from stdin;
@@ -508,7 +510,11 @@ a	{2}	2
 
 5	{5}	5
 \.
-COPY check_ign_err FROM STDIN WITH (on_error ignore);
+
+-- want context for notices
+\set SHOW_CONTEXT always
+
+COPY check_ign_err FROM STDIN WITH (on_error ignore, log_verbosity verbose);
 1	{1}	1
 a	{2}	2
 3	{3}	3333333333
@@ -519,8 +525,22 @@ a	{2}	2
 7	{7}	a
 8	{8}	8
 \.
+
+-- tests for on_error option with log_verbosity and null constraint via domain
+CREATE DOMAIN dcheck_ign_err2 varchar(15) NOT NULL;
+CREATE TABLE check_ign_err2 (n int, m int[], k int, l dcheck_ign_err2);
+COPY check_ign_err2 FROM STDIN WITH (on_error ignore, log_verbosity verbose);
+1	{1}	1	'foo'
+2	{2}	2	\N
+\.
+
+-- reset context choice
+\set SHOW_CONTEXT errors
+
 SELECT * FROM check_ign_err;
 
+SELECT * FROM check_ign_err2;
+
 -- test datatype error that can't be handled as soft: should fail
 CREATE TABLE hard_err(foo widget);
 COPY hard_err FROM STDIN WITH (on_error ignore);
@@ -552,6 +572,8 @@ DROP VIEW instead_of_insert_tbl_view;
 DROP VIEW instead_of_insert_tbl_view_2;
 DROP FUNCTION fun_instead_of_insert_tbl();
 DROP TABLE check_ign_err;
+DROP TABLE check_ign_err2;
+DROP DOMAIN dcheck_ign_err2;
 DROP TABLE hard_err;
 
 --
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index cfa9d5aaea..585efc1412 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -480,6 +480,7 @@ CopyFromState
 CopyFromStateData
 CopyHeaderChoice
 CopyInsertMethod
+CopyLogVerbosityChoice
 CopyMultiInsertBuffer
 CopyMultiInsertInfo
 CopyOnErrorChoice
-- 
2.34.1

#49Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Bharath Rupireddy (#48)
Re: Add new error_action COPY ON_ERROR "log"

On Thu, Mar 28, 2024 at 2:49 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

On Wed, Mar 27, 2024 at 7:42 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I think that there are two options to handle it:

1. change COPY grammar to accept DEFAULT as an option value.
2. change tab-completion to complete 'DEFAULT' instead of DEFAULT,
and update the doc too.

As for the documentation, we can add single-quotes as follows:

ENCODING '<replaceable class="parameter">encoding_name</replaceable>'
+ LOG_VERBOSITY [ '<replaceable class="parameter">mode</replaceable>' ]

I thought option (2) was better, but there seems to be no precedent for
completing a single-quoted string other than file names. So option (1)
could be clearer.

What do you think?

There is another option: change log_verbosity to {none, verbose} or
{none, skip_row_info} (it was discussed here
/messages/by-id/Zelrqq-pkfkvsjPn@paquier.xyz
that we might extend this option to other use-cases in the future). I tend
to agree with you on allowing log_verbosity to be set to default without
quotes, just to be consistent with other commands that allow that [1].
And thanks for quickly identifying where to change in gram.y.
With that change, I have now changed all the newly added tests to use
log_verbosity default without quotes.

FWIW, a recent commit [2] did the same. Therefore, I don't see a
problem supporting it that way for COPY log_verbosity.

Please find the attached v13 patch with the change.

Thank you for updating the patch quickly, and sharing the reference.

I think {default, verbose} is a good start and more consistent with
other similar features. We can add other modes later.

Regarding the syntax change, since the copy_generic_opt_arg rule is only
used for COPY option syntax, the change doesn't affect other syntaxes. I've
confirmed that tab-completion works fine.

I'll push the patch, barring any objections.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#50torikoshia
torikoshia@oss.nttdata.com
In reply to: Masahiko Sawada (#45)
Re: Add new error_action COPY ON_ERROR "log"

On 2024-03-26 17:16, Masahiko Sawada wrote:

On Tue, Mar 26, 2024 at 3:04 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

On Tue, Mar 26, 2024 at 9:56 AM Masahiko Sawada
<sawada.mshk@gmail.com> wrote:

errmsg("data type incompatibility at line %llu for column %s: \"%s\"",

I guess it would be better to make the log message clearer to convey
what we did for the malformed row. For example, how about something
like "skipping row due to data type incompatibility at line %llu for
column %s: \"%s\""?

The summary message which gets printed at the end says that "NOTICE:
6 rows were skipped due to data type incompatibility". Isn't this
enough? If someone is using ON_ERROR 'ignore', it's quite natural that
such rows get skipped softly and the summary message can help them,
no?

I think that in the main log message we should mention what happened
(or is happening) or what we did (or are doing). If the message "data
type incompatibility ..." was in the DETAIL message with the main
message saying something like "skipping row at line %llu for column
%s: ...", it would make sense to me. But the current message seems not
to be clear to me and consistent with other NOTICE messages. Also, the
last summary line would not be written if the user cancelled, and
someone other than the person who used ON_ERROR 'ignore' might check the
server logs later.

Agree. I changed the NOTICE message to what you've suggested. Thanks.
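
For illustration only, the agreed message wording can be sketched as a small standalone C helper. This is not the patch's code: format_skip_notice is a hypothetical name, and it uses plain snprintf() instead of ereport()/errmsg(). A NULL attval stands for a NULL input value, which corresponds to the "null input" variant.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not part of the patch): builds the per-row NOTICE
 * text into buf.  attval == NULL stands for a NULL input value.
 * Returns whatever snprintf() returns.
 */
static int
format_skip_notice(char *buf, size_t buflen,
                   unsigned long long lineno,
                   const char *attname, const char *attval)
{
    if (attval != NULL)
        return snprintf(buf, buflen,
                        "skipping row due to data type incompatibility "
                        "at line %llu for column %s: \"%s\"",
                        lineno, attname, attval);

    /* No value to quote, so say so explicitly. */
    return snprintf(buf, buflen,
                    "skipping row due to data type incompatibility "
                    "at line %llu for column %s: null input",
                    lineno, attname);
}
```

Calling it with line 2, column n and value "a" should produce the same text as the first NOTICE in the regression test's expected output.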

Thank you for updating the patch! Looks good to me.

Please find the attached patch. I've made some changes for the
documentation and the commit message. I'll push it, barring any
objections.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Thanks!

Attached patch fixes the doc, but I'm wondering perhaps it might be
better to modify the codes to prohibit abbreviation of the value.

When seeing the query which abbreviates ON_ERROR value, I feel it's not
obvious what happens compared to other options which tolerates
abbreviation of the value such as FREEZE or HEADER.

COPY t1 FROM stdin WITH (ON_ERROR);

What do you think?

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation

#51Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: torikoshia (#50)
Re: Add new error_action COPY ON_ERROR "log"

On Thu, Mar 28, 2024 at 1:43 PM torikoshia <torikoshia@oss.nttdata.com> wrote:

Attached patch fixes the doc,

May I know which patch you are referring to? And, what do you mean by
"fixes the doc"?

but I'm wondering perhaps it might be
better to modify the codes to prohibit abbreviation of the value.

Please help me understand the meaning here.

When seeing the query which abbreviates ON_ERROR value, I feel it's not
obvious what happens compared to other options which tolerates
abbreviation of the value such as FREEZE or HEADER.

COPY t1 FROM stdin WITH (ON_ERROR);

What do you think?

So, do you mean to prohibit ON_ERROR being specified without any value
like in COPY t1 FROM stdin WITH (ON_ERROR);? If yes, I think all the
other options do allow that [1].

Even if we were to do something like this, shall we discuss this separately?

Having said that, what do you think of the v13 patch posted upthread?

[1]
postgres=# COPY t1 FROM stdin WITH (
DEFAULT ESCAPE FORCE_QUOTE HEADER QUOTE
DELIMITER FORCE_NOT_NULL FORMAT NULL
ENCODING FORCE_NULL FREEZE ON_ERROR

postgres=# COPY t1 FROM stdin WITH ( QUOTE );
ERROR: relation "t1" does not exist
postgres=# COPY t1 FROM stdin WITH ( DEFAULT );
ERROR: relation "t1" does not exist
postgres=# COPY t1 FROM stdin WITH ( ENCODING );
ERROR: relation "t1" does not exist

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#52torikoshia
torikoshia@oss.nttdata.com
In reply to: Bharath Rupireddy (#51)
Re: Add new error_action COPY ON_ERROR "log"

On 2024-03-28 17:27, Bharath Rupireddy wrote:

On Thu, Mar 28, 2024 at 1:43 PM torikoshia <torikoshia@oss.nttdata.com>
wrote:

Attached patch fixes the doc,

May I know which patch you are referring to? And, what do you mean by
"fixes the doc"?

Ugh, I replied to the wrong thread.
Sorry for making you confused and please ignore it.

but I'm wondering perhaps it might be
better to modify the codes to prohibit abbreviation of the value.

Please help me understand the meaning here.

When seeing the query which abbreviates ON_ERROR value, I feel it's
not
obvious what happens compared to other options which tolerates
abbreviation of the value such as FREEZE or HEADER.

COPY t1 FROM stdin WITH (ON_ERROR);

What do you think?

So, do you mean to prohibit ON_ERROR being specified without any value
like in COPY t1 FROM stdin WITH (ON_ERROR);? If yes, I think all the
other options do allow that [1].

Even if we were to do something like this, shall we discuss this
separately?

Having said that, what do you think of the v13 patch posted upthread?

[1]
postgres=# COPY t1 FROM stdin WITH (
DEFAULT ESCAPE FORCE_QUOTE HEADER QUOTE
DELIMITER FORCE_NOT_NULL FORMAT NULL
ENCODING FORCE_NULL FREEZE ON_ERROR

postgres=# COPY t1 FROM stdin WITH ( QUOTE );
ERROR: relation "t1" does not exist
postgres=# COPY t1 FROM stdin WITH ( DEFAULT );
ERROR: relation "t1" does not exist
postgres=# COPY t1 FROM stdin WITH ( ENCODING );
ERROR: relation "t1" does not exist

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation

#53Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Bharath Rupireddy (#51)
Re: Add new error_action COPY ON_ERROR "log"

On Thu, Mar 28, 2024 at 5:28 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

On Thu, Mar 28, 2024 at 1:43 PM torikoshia <torikoshia@oss.nttdata.com> wrote:

Attached patch fixes the doc,

May I know which patch you are referring to? And, what do you mean by
"fixes the doc"?

but I'm wondering perhaps it might be
better to modify the codes to prohibit abbreviation of the value.

Please help me understand the meaning here.

When seeing the query which abbreviates ON_ERROR value, I feel it's not
obvious what happens compared to other options which tolerates
abbreviation of the value such as FREEZE or HEADER.

COPY t1 FROM stdin WITH (ON_ERROR);

What do you think?

So, do you mean to prohibit ON_ERROR being specified without any value
like in COPY t1 FROM stdin WITH (ON_ERROR);? If yes, I think all the
other options do allow that [1].

Even if we were to do something like this, shall we discuss this separately?

Having said that, what do you think of the v13 patch posted upthread?

This topic accidentally jumped in this thread, but it made me think
that the same might be true for the LOG_VERBOSITY option. That is,
since the LOG_VERBOSITY option is an enum parameter, it might make
more sense to require the value, instead of making the value optional.
For example, the following command might not be obvious to users:

COPY test FROM stdin (ON_ERROR ignore, LOG_VERBOSITY);
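
A minimal standalone sketch of what "requiring the value" could mean for an enum-style option is below. It is not PostgreSQL code: parse_log_verbosity, Verbosity and ParseResult are hypothetical names, and a NULL argument stands in for an option written without a value.

```c
#include <stddef.h>
#include <strings.h>            /* strcasecmp (POSIX) */

/* Hypothetical stand-ins for the option's values and the parse outcome. */
typedef enum { VERBOSITY_DEFAULT = 0, VERBOSITY_VERBOSE } Verbosity;
typedef enum
{
    PARSE_OK = 0,
    PARSE_MISSING_VALUE,        /* option given without a value */
    PARSE_UNRECOGNIZED          /* value is neither default nor verbose */
} ParseResult;

/*
 * Parse the value of an enum-style option.  A NULL value is rejected
 * instead of being silently mapped to some implicit choice; *out is only
 * written on success.
 */
static ParseResult
parse_log_verbosity(const char *value, Verbosity *out)
{
    if (value == NULL)
        return PARSE_MISSING_VALUE;

    if (strcasecmp(value, "default") == 0)
    {
        *out = VERBOSITY_DEFAULT;
        return PARSE_OK;
    }
    if (strcasecmp(value, "verbose") == 0)
    {
        *out = VERBOSITY_VERBOSE;
        return PARSE_OK;
    }
    return PARSE_UNRECOGNIZED;
}
```

With this shape, a bare LOG_VERBOSITY would be reported as an error rather than leaving the reader to guess which mode was meant.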

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#54torikoshia
torikoshia@oss.nttdata.com
In reply to: torikoshia (#52)
Re: Add new error_action COPY ON_ERROR "log"

On 2024-03-28 21:36, torikoshia wrote:

On 2024-03-28 17:27, Bharath Rupireddy wrote:

On Thu, Mar 28, 2024 at 1:43 PM torikoshia
<torikoshia@oss.nttdata.com> wrote:

Attached patch fixes the doc,

May I know which patch you are referring to? And, what do you mean by
"fixes the doc"?

Ugh, I replied to the wrong thread.
Sorry for making you confused and please ignore it.

but I'm wondering perhaps it might be
better to modify the codes to prohibit abbreviation of the value.

Please help me understand the meaning here.

When seeing the query which abbreviates ON_ERROR value, I feel it's
not
obvious what happens compared to other options which tolerates
abbreviation of the value such as FREEZE or HEADER.

COPY t1 FROM stdin WITH (ON_ERROR);

What do you think?

So, do you mean to prohibit ON_ERROR being specified without any value
like in COPY t1 FROM stdin WITH (ON_ERROR);? If yes, I think all the
other options do allow that [1].

Even if we were to do something like this, shall we discuss this
separately?

Having said that, what do you think of the v13 patch posted upthread?

It looks good to me, other than what Sawada-san pointed out last.

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation

#55Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Masahiko Sawada (#53)
1 attachment(s)
Re: Add new error_action COPY ON_ERROR "log"

On Thu, Mar 28, 2024 at 6:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

That is,
since the LOG_VERBOSITY option is an enum parameter, it might make
more sense to require the value, instead of making the value optional.
For example, the following command might not be obvious to users:

COPY test FROM stdin (ON_ERROR ignore, LOG_VERBOSITY);

Agreed. Please see the attached v14 patch. The LOG_VERBOSITY now needs
a value to be specified. Note that I've not added any test for this
case as there seemed to be no such tests so far generating "ERROR:
<<option>> requires a parameter". I don't mind adding one for
LOG_VERBOSITY though.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

v14-0001-Add-new-COPY-option-LOG_VERBOSITY.patch (application/octet-stream)
From e42ba2e2636835d05c1de935c724b4e862543626 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Sat, 30 Mar 2024 11:12:08 +0000
Subject: [PATCH v14] Add new COPY option LOG_VERBOSITY.

This commit adds a new COPY option LOG_VERBOSITY, which controls the
amount of messages emitted during processing. Valid values are
'default' and 'verbose'.

This is currently used in COPY FROM when ON_ERROR option is set to
ignore. If 'verbose' is specified, additional information for each
discarded row is emitted.

Author: Bharath Rupireddy
Reviewed-by: Michael Paquier, Atsushi Torikoshi, Masahiko Sawada
Discussion: https://www.postgresql.org/message-id/CALj2ACUk700cYhx1ATRQyRw-fBM%2BaRo6auRAitKGff7XNmYfqQ%40mail.gmail.com
---
 doc/src/sgml/ref/copy.sgml           | 25 +++++++++++++++--
 src/backend/commands/copy.c          | 32 ++++++++++++++++++++++
 src/backend/commands/copyfrom.c      | 10 +++----
 src/backend/commands/copyfromparse.c | 35 ++++++++++++++++++++++++
 src/backend/parser/gram.y            |  1 +
 src/bin/psql/tab-complete.c          |  6 +++-
 src/include/commands/copy.h          | 11 ++++++++
 src/test/regress/expected/copy2.out  | 41 +++++++++++++++++++++++++++-
 src/test/regress/sql/copy2.sql       | 24 +++++++++++++++-
 src/tools/pgindent/typedefs.list     |  1 +
 10 files changed, 175 insertions(+), 11 deletions(-)

diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 6c83e30ed0..12ae49ce9f 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -45,6 +45,7 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
     FORCE_NULL { ( <replaceable class="parameter">column_name</replaceable> [, ...] ) | * }
     ON_ERROR '<replaceable class="parameter">error_action</replaceable>'
     ENCODING '<replaceable class="parameter">encoding_name</replaceable>'
+    LOG_VERBOSITY [ <replaceable class="parameter">mode</replaceable> ]
 </synopsis>
  </refsynopsisdiv>
 
@@ -400,8 +401,12 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
       when the <literal>FORMAT</literal> is <literal>text</literal> or <literal>csv</literal>.
      </para>
      <para>
-      A <literal>NOTICE</literal> message containing the ignored row count is emitted at the end
-      of the <command>COPY FROM</command> if at least one row was discarded.
+      A <literal>NOTICE</literal> message containing the ignored row count is
+      emitted at the end of the <command>COPY FROM</command> if at least one
+      row was discarded. When <literal>LOG_VERBOSITY</literal> option is set to
+      <literal>verbose</literal>, a <literal>NOTICE</literal> message
+      containing the line of the input file and the column name whose input
+      conversion has failed is emitted for each discarded row.
      </para>
     </listitem>
    </varlistentry>
@@ -418,6 +423,22 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>LOG_VERBOSITY</literal></term>
+    <listitem>
+     <para>
+      Specify the amount of messages emitted by a <command>COPY</command>
+      command: <literal>default</literal> or <literal>verbose</literal>. If
+      <literal>verbose</literal> is specified, additional messages are emitted
+      during processing.
+     </para>
+     <para>
+      This is currently used in <command>COPY FROM</command> command when
+      <literal>ON_ERROR</literal> option is set to <literal>ignore</literal>.
+      </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><literal>WHERE</literal></term>
     <listitem>
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 28cf8b040a..f75e1d700d 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -422,6 +422,30 @@ defGetCopyOnErrorChoice(DefElem *def, ParseState *pstate, bool is_from)
 	return COPY_ON_ERROR_STOP;	/* keep compiler quiet */
 }
 
+/*
+ * Extract a CopyLogVerbosityChoice value from a DefElem.
+ */
+static CopyLogVerbosityChoice
+defGetCopyLogVerbosityChoice(DefElem *def, ParseState *pstate)
+{
+	char	   *sval;
+
+	/*
+	 * Allow "default", or "verbose" values.
+	 */
+	sval = defGetString(def);
+	if (pg_strcasecmp(sval, "default") == 0)
+		return COPY_LOG_VERBOSITY_DEFAULT;
+	if (pg_strcasecmp(sval, "verbose") == 0)
+		return COPY_LOG_VERBOSITY_VERBOSE;
+
+	ereport(ERROR,
+			(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+			 errmsg("COPY LOG_VERBOSITY \"%s\" not recognized", sval),
+			 parser_errposition(pstate, def->location)));
+	return COPY_LOG_VERBOSITY_DEFAULT;	/* keep compiler quiet */
+}
+
 /*
  * Process the statement option list for COPY.
  *
@@ -448,6 +472,7 @@ ProcessCopyOptions(ParseState *pstate,
 	bool		freeze_specified = false;
 	bool		header_specified = false;
 	bool		on_error_specified = false;
+	bool		log_verbosity_specified = false;
 	ListCell   *option;
 
 	/* Support external use for option sanity checking */
@@ -607,6 +632,13 @@ ProcessCopyOptions(ParseState *pstate,
 			on_error_specified = true;
 			opts_out->on_error = defGetCopyOnErrorChoice(defel, pstate, is_from);
 		}
+		else if (strcmp(defel->defname, "log_verbosity") == 0)
+		{
+			if (log_verbosity_specified)
+				errorConflictingDefElem(defel, pstate);
+			log_verbosity_specified = true;
+			opts_out->log_verbosity = defGetCopyLogVerbosityChoice(defel, pstate);
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 8908a440e1..06bc14636d 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -101,8 +101,6 @@ typedef struct CopyMultiInsertInfo
 
 
 /* non-export function prototypes */
-static char *limit_printout_length(const char *str);
-
 static void ClosePipeFromProgram(CopyFromState cstate);
 
 /*
@@ -141,7 +139,7 @@ CopyFromErrorCallback(void *arg)
 			/* error is relevant to a particular column */
 			char	   *attval;
 
-			attval = limit_printout_length(cstate->cur_attval);
+			attval = CopyLimitPrintoutLength(cstate->cur_attval);
 			errcontext("COPY %s, line %llu, column %s: \"%s\"",
 					   cstate->cur_relname,
 					   (unsigned long long) cstate->cur_lineno,
@@ -168,7 +166,7 @@ CopyFromErrorCallback(void *arg)
 			{
 				char	   *lineval;
 
-				lineval = limit_printout_length(cstate->line_buf.data);
+				lineval = CopyLimitPrintoutLength(cstate->line_buf.data);
 				errcontext("COPY %s, line %llu: \"%s\"",
 						   cstate->cur_relname,
 						   (unsigned long long) cstate->cur_lineno, lineval);
@@ -189,8 +187,8 @@ CopyFromErrorCallback(void *arg)
  *
  * Returns a pstrdup'd copy of the input.
  */
-static char *
-limit_printout_length(const char *str)
+char *
+CopyLimitPrintoutLength(const char *str)
 {
 #define MAX_COPY_DATA_DISPLAY 100
 
diff --git a/src/backend/commands/copyfromparse.c b/src/backend/commands/copyfromparse.c
index 5682d5d054..7ddd27f5c6 100644
--- a/src/backend/commands/copyfromparse.c
+++ b/src/backend/commands/copyfromparse.c
@@ -967,7 +967,42 @@ NextCopyFrom(CopyFromState cstate, ExprContext *econtext,
 											(Node *) cstate->escontext,
 											&values[m]))
 			{
+				Assert(cstate->opts.on_error != COPY_ON_ERROR_STOP);
+
 				cstate->num_errors++;
+
+				if (cstate->opts.log_verbosity == COPY_LOG_VERBOSITY_VERBOSE)
+				{
+					/*
+					 * Since we emit line number and column info in the below
+					 * notice message, we suppress error context information
+					 * other than the relation name.
+					 */
+					Assert(!cstate->relname_only);
+					cstate->relname_only = true;
+
+					if (cstate->cur_attval)
+					{
+						char	   *attval;
+
+						attval = CopyLimitPrintoutLength(cstate->cur_attval);
+						ereport(NOTICE,
+								errmsg("skipping row due to data type incompatibility at line %llu for column %s: \"%s\"",
+									   (unsigned long long) cstate->cur_lineno,
+									   cstate->cur_attname,
+									   attval));
+						pfree(attval);
+					}
+					else
+						ereport(NOTICE,
+								errmsg("skipping row due to data type incompatibility at line %llu for column %s: null input",
+									   (unsigned long long) cstate->cur_lineno,
+									   cstate->cur_attname));
+
+					/* reset relname_only */
+					cstate->relname_only = false;
+				}
+
 				return true;
 			}
 
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 682748eb4b..f1af6147c3 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -3530,6 +3530,7 @@ copy_generic_opt_arg:
 			opt_boolean_or_string			{ $$ = (Node *) makeString($1); }
 			| NumericOnly					{ $$ = (Node *) $1; }
 			| '*'							{ $$ = (Node *) makeNode(A_Star); }
+			| DEFAULT                       { $$ = (Node *) makeString("default"); }
 			| '(' copy_generic_opt_arg_list ')'		{ $$ = (Node *) $2; }
 			| /* EMPTY */					{ $$ = NULL; }
 		;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index fc6865fc70..82eb3955ab 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -2904,7 +2904,7 @@ psql_completion(const char *text, int start, int end)
 		COMPLETE_WITH("FORMAT", "FREEZE", "DELIMITER", "NULL",
 					  "HEADER", "QUOTE", "ESCAPE", "FORCE_QUOTE",
 					  "FORCE_NOT_NULL", "FORCE_NULL", "ENCODING", "DEFAULT",
-					  "ON_ERROR");
+					  "ON_ERROR", "LOG_VERBOSITY");
 
 	/* Complete COPY <sth> FROM|TO filename WITH (FORMAT */
 	else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "FORMAT"))
@@ -2914,6 +2914,10 @@ psql_completion(const char *text, int start, int end)
 	else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "ON_ERROR"))
 		COMPLETE_WITH("stop", "ignore");
 
+	/* Complete COPY <sth> FROM filename WITH (LOG_VERBOSITY */
+	else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "LOG_VERBOSITY"))
+		COMPLETE_WITH("default", "verbose");
+
 	/* Complete COPY <sth> FROM <sth> WITH (<options>) */
 	else if (Matches("COPY|\\copy", MatchAny, "FROM", MatchAny, "WITH", MatchAny))
 		COMPLETE_WITH("WHERE");
diff --git a/src/include/commands/copy.h b/src/include/commands/copy.h
index b3da3cb0be..141fd48dc1 100644
--- a/src/include/commands/copy.h
+++ b/src/include/commands/copy.h
@@ -40,6 +40,15 @@ typedef enum CopyOnErrorChoice
 	COPY_ON_ERROR_IGNORE,		/* ignore errors */
 } CopyOnErrorChoice;
 
+/*
+ * Represents verbosity of logged messages by COPY command.
+ */
+typedef enum CopyLogVerbosityChoice
+{
+	COPY_LOG_VERBOSITY_DEFAULT = 0, /* logs no additional messages, default */
+	COPY_LOG_VERBOSITY_VERBOSE, /* logs additional messages */
+} CopyLogVerbosityChoice;
+
 /*
  * A struct to hold COPY options, in a parsed form. All of these are related
  * to formatting, except for 'freeze', which doesn't really belong here, but
@@ -73,6 +82,7 @@ typedef struct CopyFormatOptions
 	bool	   *force_null_flags;	/* per-column CSV FN flags */
 	bool		convert_selectively;	/* do selective binary conversion? */
 	CopyOnErrorChoice on_error; /* what to do when error happened */
+	CopyLogVerbosityChoice log_verbosity;	/* verbosity of logged messages */
 	List	   *convert_select; /* list of column names (can be NIL) */
 } CopyFormatOptions;
 
@@ -97,6 +107,7 @@ extern bool NextCopyFrom(CopyFromState cstate, ExprContext *econtext,
 extern bool NextCopyFromRawFields(CopyFromState cstate,
 								  char ***fields, int *nfields);
 extern void CopyFromErrorCallback(void *arg);
+extern char *CopyLimitPrintoutLength(const char *str);
 
 extern uint64 CopyFrom(CopyFromState cstate);
 
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index f98c2d1c4e..931542f268 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -81,6 +81,10 @@ COPY x from stdin (on_error ignore, on_error ignore);
 ERROR:  conflicting or redundant options
 LINE 1: COPY x from stdin (on_error ignore, on_error ignore);
                                             ^
+COPY x from stdin (log_verbosity default, log_verbosity verbose);
+ERROR:  conflicting or redundant options
+LINE 1: COPY x from stdin (log_verbosity default, log_verbosity verb...
+                                                  ^
 -- incorrect options
 COPY x to stdin (format BINARY, delimiter ',');
 ERROR:  cannot specify DELIMITER in BINARY mode
@@ -108,6 +112,10 @@ COPY x to stdin (format BINARY, on_error unsupported);
 ERROR:  COPY ON_ERROR cannot be used with COPY TO
 LINE 1: COPY x to stdin (format BINARY, on_error unsupported);
                                         ^
+COPY x to stdout (log_verbosity unsupported);
+ERROR:  COPY LOG_VERBOSITY "unsupported" not recognized
+LINE 1: COPY x to stdout (log_verbosity unsupported);
+                          ^
 -- too many columns in column list: should fail
 COPY x (a, b, c, d, e, d, c) from stdin;
 ERROR:  column "d" specified more than once
@@ -729,8 +737,31 @@ CREATE TABLE check_ign_err (n int, m int[], k int);
 COPY check_ign_err FROM STDIN WITH (on_error stop);
 ERROR:  invalid input syntax for type integer: "a"
 CONTEXT:  COPY check_ign_err, line 2, column n: "a"
-COPY check_ign_err FROM STDIN WITH (on_error ignore);
+-- want context for notices
+\set SHOW_CONTEXT always
+COPY check_ign_err FROM STDIN WITH (on_error ignore, log_verbosity verbose);
+NOTICE:  skipping row due to data type incompatibility at line 2 for column n: "a"
+CONTEXT:  COPY check_ign_err
+NOTICE:  skipping row due to data type incompatibility at line 3 for column k: "3333333333"
+CONTEXT:  COPY check_ign_err
+NOTICE:  skipping row due to data type incompatibility at line 4 for column m: "{a, 4}"
+CONTEXT:  COPY check_ign_err
+NOTICE:  skipping row due to data type incompatibility at line 5 for column n: ""
+CONTEXT:  COPY check_ign_err
+NOTICE:  skipping row due to data type incompatibility at line 7 for column m: "a"
+CONTEXT:  COPY check_ign_err
+NOTICE:  skipping row due to data type incompatibility at line 8 for column k: "a"
+CONTEXT:  COPY check_ign_err
 NOTICE:  6 rows were skipped due to data type incompatibility
+-- tests for on_error option with log_verbosity and null constraint via domain
+CREATE DOMAIN dcheck_ign_err2 varchar(15) NOT NULL;
+CREATE TABLE check_ign_err2 (n int, m int[], k int, l dcheck_ign_err2);
+COPY check_ign_err2 FROM STDIN WITH (on_error ignore, log_verbosity verbose);
+NOTICE:  skipping row due to data type incompatibility at line 2 for column l: null input
+CONTEXT:  COPY check_ign_err2
+NOTICE:  1 row was skipped due to data type incompatibility
+-- reset context choice
+\set SHOW_CONTEXT errors
 SELECT * FROM check_ign_err;
  n |  m  | k 
 ---+-----+---
@@ -739,6 +770,12 @@ SELECT * FROM check_ign_err;
  8 | {8} | 8
 (3 rows)
 
+SELECT * FROM check_ign_err2;
+ n |  m  | k |   l   
+---+-----+---+-------
+ 1 | {1} | 1 | 'foo'
+(1 row)
+
 -- test datatype error that can't be handled as soft: should fail
 CREATE TABLE hard_err(foo widget);
 COPY hard_err FROM STDIN WITH (on_error ignore);
@@ -767,6 +804,8 @@ DROP VIEW instead_of_insert_tbl_view;
 DROP VIEW instead_of_insert_tbl_view_2;
 DROP FUNCTION fun_instead_of_insert_tbl();
 DROP TABLE check_ign_err;
+DROP TABLE check_ign_err2;
+DROP DOMAIN dcheck_ign_err2;
 DROP TABLE hard_err;
 --
 -- COPY FROM ... DEFAULT
diff --git a/src/test/regress/sql/copy2.sql b/src/test/regress/sql/copy2.sql
index afaaa37e52..8b14962194 100644
--- a/src/test/regress/sql/copy2.sql
+++ b/src/test/regress/sql/copy2.sql
@@ -67,6 +67,7 @@ COPY x from stdin (force_null (a), force_null (b));
 COPY x from stdin (convert_selectively (a), convert_selectively (b));
 COPY x from stdin (encoding 'sql_ascii', encoding 'sql_ascii');
 COPY x from stdin (on_error ignore, on_error ignore);
+COPY x from stdin (log_verbosity default, log_verbosity verbose);
 
 -- incorrect options
 COPY x to stdin (format BINARY, delimiter ',');
@@ -80,6 +81,7 @@ COPY x to stdin (format CSV, force_not_null(a));
 COPY x to stdout (format TEXT, force_null(a));
 COPY x to stdin (format CSV, force_null(a));
 COPY x to stdin (format BINARY, on_error unsupported);
+COPY x to stdout (log_verbosity unsupported);
 
 -- too many columns in column list: should fail
 COPY x (a, b, c, d, e, d, c) from stdin;
@@ -508,7 +510,11 @@ a	{2}	2
 
 5	{5}	5
 \.
-COPY check_ign_err FROM STDIN WITH (on_error ignore);
+
+-- want context for notices
+\set SHOW_CONTEXT always
+
+COPY check_ign_err FROM STDIN WITH (on_error ignore, log_verbosity verbose);
 1	{1}	1
 a	{2}	2
 3	{3}	3333333333
@@ -519,8 +525,22 @@ a	{2}	2
 7	{7}	a
 8	{8}	8
 \.
+
+-- tests for on_error option with log_verbosity and null constraint via domain
+CREATE DOMAIN dcheck_ign_err2 varchar(15) NOT NULL;
+CREATE TABLE check_ign_err2 (n int, m int[], k int, l dcheck_ign_err2);
+COPY check_ign_err2 FROM STDIN WITH (on_error ignore, log_verbosity verbose);
+1	{1}	1	'foo'
+2	{2}	2	\N
+\.
+
+-- reset context choice
+\set SHOW_CONTEXT errors
+
 SELECT * FROM check_ign_err;
 
+SELECT * FROM check_ign_err2;
+
 -- test datatype error that can't be handled as soft: should fail
 CREATE TABLE hard_err(foo widget);
 COPY hard_err FROM STDIN WITH (on_error ignore);
@@ -552,6 +572,8 @@ DROP VIEW instead_of_insert_tbl_view;
 DROP VIEW instead_of_insert_tbl_view_2;
 DROP FUNCTION fun_instead_of_insert_tbl();
 DROP TABLE check_ign_err;
+DROP TABLE check_ign_err2;
+DROP DOMAIN dcheck_ign_err2;
 DROP TABLE hard_err;
 
 --
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index a8d7bed411..9add48f992 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -480,6 +480,7 @@ CopyFromState
 CopyFromStateData
 CopyHeaderChoice
 CopyInsertMethod
+CopyLogVerbosityChoice
 CopyMultiInsertBuffer
 CopyMultiInsertInfo
 CopyOnErrorChoice
-- 
2.34.1

#56 Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Bharath Rupireddy (#55)
1 attachment(s)
Re: Add new error_action COPY ON_ERROR "log"

On Sat, Mar 30, 2024 at 11:05 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

On Thu, Mar 28, 2024 at 6:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

That is,
since the LOG_VERBOSITY option is an enum parameter, it might make
more sense to require the value instead of making it optional.
For example, the meaning of the following command might not be obvious to users:

COPY test FROM stdin (ON_ERROR ignore, LOG_VERBOSITY);

Agreed. Please see the attached v14 patch.

Thank you for updating the patch!

The LOG_VERBOSITY option now requires
a value. Note that I've not added a test for this
case, as no existing test seems to generate "ERROR:
<<option>> requires a parameter". I don't mind adding one for
LOG_VERBOSITY, though.

+1

One minor point:

ENCODING '<replaceable class="parameter">encoding_name</replaceable>'
+ LOG_VERBOSITY [ <replaceable class="parameter">mode</replaceable> ]
</synopsis>

'[' and ']' are not necessary because the value is no longer optional.
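
To make the "value is required" behaviour concrete, here is a minimal standalone sketch in plain C, under stated assumptions. It is not the patch's code: parse_log_verbosity, LogVerbosity and ParseResult are hypothetical names, result codes stand in for the errors the server raises through ereport(), and POSIX strcasecmp() replaces pg_strcasecmp(). A NULL value models an option written without an argument, which in the server is rejected by defGetString() with the "requires a parameter" error.

```c
#include <assert.h>
#include <stddef.h>
#include <strings.h>			/* strcasecmp (POSIX) */

/* Hypothetical stand-in for the patch's CopyLogVerbosityChoice enum. */
typedef enum
{
	LOG_VERBOSITY_DEFAULT = 0,
	LOG_VERBOSITY_VERBOSE
} LogVerbosity;

/* Hypothetical result codes; the server reports these cases as errors. */
typedef enum
{
	PARSE_OK,
	PARSE_MISSING_VALUE,		/* option given without a value */
	PARSE_UNRECOGNIZED			/* value is neither default nor verbose */
} ParseResult;

/*
 * Map an option value to a LogVerbosity. The value is mandatory:
 * NULL is rejected rather than being treated as some implied mode.
 * *out is only written when PARSE_OK is returned.
 */
ParseResult
parse_log_verbosity(const char *value, LogVerbosity *out)
{
	if (value == NULL)
		return PARSE_MISSING_VALUE;

	if (strcasecmp(value, "default") == 0)
	{
		*out = LOG_VERBOSITY_DEFAULT;
		return PARSE_OK;
	}
	if (strcasecmp(value, "verbose") == 0)
	{
		*out = LOG_VERBOSITY_VERBOSE;
		return PARSE_OK;
	}

	return PARSE_UNRECOGNIZED;
}
```

With this shape, a bare LOG_VERBOSITY and LOG_VERBOSITY unsupported both fail explicitly, so the synopsis can list the value without brackets.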

I've attached the updated patch. I'll push it, barring any objections.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments:

v15-0001-Add-new-COPY-option-LOG_VERBOSITY.patch (application/octet-stream)
From 89868950db8d5440befccc30fcb7bdf2ae902b33 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Sat, 30 Mar 2024 11:12:08 +0000
Subject: [PATCH v15] Add new COPY option LOG_VERBOSITY.

This commit adds a new COPY option LOG_VERBOSITY, which controls the
amount of messages emitted during processing. Valid values are
'default' and 'verbose'.

This is currently used in COPY FROM when ON_ERROR option is set to
ignore. If 'verbose' is specified, additional information for each
discarded row is emitted.

Author: Bharath Rupireddy
Reviewed-by: Michael Paquier, Atsushi Torikoshi, Masahiko Sawada
Discussion: https://www.postgresql.org/message-id/CALj2ACUk700cYhx1ATRQyRw-fBM%2BaRo6auRAitKGff7XNmYfqQ%40mail.gmail.com
---
 doc/src/sgml/ref/copy.sgml           | 25 +++++++++++++++--
 src/backend/commands/copy.c          | 32 ++++++++++++++++++++++
 src/backend/commands/copyfrom.c      | 10 +++----
 src/backend/commands/copyfromparse.c | 35 ++++++++++++++++++++++++
 src/backend/parser/gram.y            |  1 +
 src/bin/psql/tab-complete.c          |  6 +++-
 src/include/commands/copy.h          | 11 ++++++++
 src/test/regress/expected/copy2.out  | 41 +++++++++++++++++++++++++++-
 src/test/regress/sql/copy2.sql       | 24 +++++++++++++++-
 src/tools/pgindent/typedefs.list     |  1 +
 10 files changed, 175 insertions(+), 11 deletions(-)

diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 6c83e30ed0..33ce7c4ea6 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -45,6 +45,7 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
     FORCE_NULL { ( <replaceable class="parameter">column_name</replaceable> [, ...] ) | * }
     ON_ERROR '<replaceable class="parameter">error_action</replaceable>'
     ENCODING '<replaceable class="parameter">encoding_name</replaceable>'
+    LOG_VERBOSITY <replaceable class="parameter">mode</replaceable>
 </synopsis>
  </refsynopsisdiv>
 
@@ -400,8 +401,12 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
       when the <literal>FORMAT</literal> is <literal>text</literal> or <literal>csv</literal>.
      </para>
      <para>
-      A <literal>NOTICE</literal> message containing the ignored row count is emitted at the end
-      of the <command>COPY FROM</command> if at least one row was discarded.
+      A <literal>NOTICE</literal> message containing the ignored row count is
+      emitted at the end of the <command>COPY FROM</command> if at least one
+      row was discarded. When <literal>LOG_VERBOSITY</literal> option is set to
+      <literal>verbose</literal>, a <literal>NOTICE</literal> message
+      containing the line of the input file and the column name whose input
+      conversion has failed is emitted for each discarded row.
      </para>
     </listitem>
    </varlistentry>
@@ -418,6 +423,22 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>LOG_VERBOSITY</literal></term>
+    <listitem>
+     <para>
+      Specify the amount of messages emitted by a <command>COPY</command>
+      command: <literal>default</literal> or <literal>verbose</literal>. If
+      <literal>verbose</literal> is specified, additional messages are emitted
+      during processing.
+     </para>
+     <para>
+      This is currently used in <command>COPY FROM</command> command when
+      <literal>ON_ERROR</literal> option is set to <literal>ignore</literal>.
+      </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><literal>WHERE</literal></term>
     <listitem>
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 28cf8b040a..f75e1d700d 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -422,6 +422,30 @@ defGetCopyOnErrorChoice(DefElem *def, ParseState *pstate, bool is_from)
 	return COPY_ON_ERROR_STOP;	/* keep compiler quiet */
 }
 
+/*
+ * Extract a CopyLogVerbosityChoice value from a DefElem.
+ */
+static CopyLogVerbosityChoice
+defGetCopyLogVerbosityChoice(DefElem *def, ParseState *pstate)
+{
+	char	   *sval;
+
+	/*
+	 * Allow "default", or "verbose" values.
+	 */
+	sval = defGetString(def);
+	if (pg_strcasecmp(sval, "default") == 0)
+		return COPY_LOG_VERBOSITY_DEFAULT;
+	if (pg_strcasecmp(sval, "verbose") == 0)
+		return COPY_LOG_VERBOSITY_VERBOSE;
+
+	ereport(ERROR,
+			(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+			 errmsg("COPY LOG_VERBOSITY \"%s\" not recognized", sval),
+			 parser_errposition(pstate, def->location)));
+	return COPY_LOG_VERBOSITY_DEFAULT;	/* keep compiler quiet */
+}
+
 /*
  * Process the statement option list for COPY.
  *
@@ -448,6 +472,7 @@ ProcessCopyOptions(ParseState *pstate,
 	bool		freeze_specified = false;
 	bool		header_specified = false;
 	bool		on_error_specified = false;
+	bool		log_verbosity_specified = false;
 	ListCell   *option;
 
 	/* Support external use for option sanity checking */
@@ -607,6 +632,13 @@ ProcessCopyOptions(ParseState *pstate,
 			on_error_specified = true;
 			opts_out->on_error = defGetCopyOnErrorChoice(defel, pstate, is_from);
 		}
+		else if (strcmp(defel->defname, "log_verbosity") == 0)
+		{
+			if (log_verbosity_specified)
+				errorConflictingDefElem(defel, pstate);
+			log_verbosity_specified = true;
+			opts_out->log_verbosity = defGetCopyLogVerbosityChoice(defel, pstate);
+		}
 		else
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index b673636977..9d2900041e 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -101,8 +101,6 @@ typedef struct CopyMultiInsertInfo
 
 
 /* non-export function prototypes */
-static char *limit_printout_length(const char *str);
-
 static void ClosePipeFromProgram(CopyFromState cstate);
 
 /*
@@ -141,7 +139,7 @@ CopyFromErrorCallback(void *arg)
 			/* error is relevant to a particular column */
 			char	   *attval;
 
-			attval = limit_printout_length(cstate->cur_attval);
+			attval = CopyLimitPrintoutLength(cstate->cur_attval);
 			errcontext("COPY %s, line %llu, column %s: \"%s\"",
 					   cstate->cur_relname,
 					   (unsigned long long) cstate->cur_lineno,
@@ -168,7 +166,7 @@ CopyFromErrorCallback(void *arg)
 			{
 				char	   *lineval;
 
-				lineval = limit_printout_length(cstate->line_buf.data);
+				lineval = CopyLimitPrintoutLength(cstate->line_buf.data);
 				errcontext("COPY %s, line %llu: \"%s\"",
 						   cstate->cur_relname,
 						   (unsigned long long) cstate->cur_lineno, lineval);
@@ -189,8 +187,8 @@ CopyFromErrorCallback(void *arg)
  *
  * Returns a pstrdup'd copy of the input.
  */
-static char *
-limit_printout_length(const char *str)
+char *
+CopyLimitPrintoutLength(const char *str)
 {
 #define MAX_COPY_DATA_DISPLAY 100
 
diff --git a/src/backend/commands/copyfromparse.c b/src/backend/commands/copyfromparse.c
index 5682d5d054..7ddd27f5c6 100644
--- a/src/backend/commands/copyfromparse.c
+++ b/src/backend/commands/copyfromparse.c
@@ -967,7 +967,42 @@ NextCopyFrom(CopyFromState cstate, ExprContext *econtext,
 											(Node *) cstate->escontext,
 											&values[m]))
 			{
+				Assert(cstate->opts.on_error != COPY_ON_ERROR_STOP);
+
 				cstate->num_errors++;
+
+				if (cstate->opts.log_verbosity == COPY_LOG_VERBOSITY_VERBOSE)
+				{
+					/*
+					 * Since we emit line number and column info in the below
+					 * notice message, we suppress error context information
+					 * other than the relation name.
+					 */
+					Assert(!cstate->relname_only);
+					cstate->relname_only = true;
+
+					if (cstate->cur_attval)
+					{
+						char	   *attval;
+
+						attval = CopyLimitPrintoutLength(cstate->cur_attval);
+						ereport(NOTICE,
+								errmsg("skipping row due to data type incompatibility at line %llu for column %s: \"%s\"",
+									   (unsigned long long) cstate->cur_lineno,
+									   cstate->cur_attname,
+									   attval));
+						pfree(attval);
+					}
+					else
+						ereport(NOTICE,
+								errmsg("skipping row due to data type incompatibility at line %llu for column %s: null input",
+									   (unsigned long long) cstate->cur_lineno,
+									   cstate->cur_attname));
+
+					/* reset relname_only */
+					cstate->relname_only = false;
+				}
+
 				return true;
 			}
 
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 682748eb4b..f1af6147c3 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -3530,6 +3530,7 @@ copy_generic_opt_arg:
 			opt_boolean_or_string			{ $$ = (Node *) makeString($1); }
 			| NumericOnly					{ $$ = (Node *) $1; }
 			| '*'							{ $$ = (Node *) makeNode(A_Star); }
+			| DEFAULT                       { $$ = (Node *) makeString("default"); }
 			| '(' copy_generic_opt_arg_list ')'		{ $$ = (Node *) $2; }
 			| /* EMPTY */					{ $$ = NULL; }
 		;
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index fc6865fc70..82eb3955ab 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -2904,7 +2904,7 @@ psql_completion(const char *text, int start, int end)
 		COMPLETE_WITH("FORMAT", "FREEZE", "DELIMITER", "NULL",
 					  "HEADER", "QUOTE", "ESCAPE", "FORCE_QUOTE",
 					  "FORCE_NOT_NULL", "FORCE_NULL", "ENCODING", "DEFAULT",
-					  "ON_ERROR");
+					  "ON_ERROR", "LOG_VERBOSITY");
 
 	/* Complete COPY <sth> FROM|TO filename WITH (FORMAT */
 	else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "FORMAT"))
@@ -2914,6 +2914,10 @@ psql_completion(const char *text, int start, int end)
 	else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "ON_ERROR"))
 		COMPLETE_WITH("stop", "ignore");
 
+	/* Complete COPY <sth> FROM filename WITH (LOG_VERBOSITY */
+	else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "LOG_VERBOSITY"))
+		COMPLETE_WITH("default", "verbose");
+
 	/* Complete COPY <sth> FROM <sth> WITH (<options>) */
 	else if (Matches("COPY|\\copy", MatchAny, "FROM", MatchAny, "WITH", MatchAny))
 		COMPLETE_WITH("WHERE");
diff --git a/src/include/commands/copy.h b/src/include/commands/copy.h
index b3da3cb0be..141fd48dc1 100644
--- a/src/include/commands/copy.h
+++ b/src/include/commands/copy.h
@@ -40,6 +40,15 @@ typedef enum CopyOnErrorChoice
 	COPY_ON_ERROR_IGNORE,		/* ignore errors */
 } CopyOnErrorChoice;
 
+/*
+ * Represents verbosity of logged messages by COPY command.
+ */
+typedef enum CopyLogVerbosityChoice
+{
+	COPY_LOG_VERBOSITY_DEFAULT = 0, /* logs no additional messages, default */
+	COPY_LOG_VERBOSITY_VERBOSE, /* logs additional messages */
+} CopyLogVerbosityChoice;
+
 /*
  * A struct to hold COPY options, in a parsed form. All of these are related
  * to formatting, except for 'freeze', which doesn't really belong here, but
@@ -73,6 +82,7 @@ typedef struct CopyFormatOptions
 	bool	   *force_null_flags;	/* per-column CSV FN flags */
 	bool		convert_selectively;	/* do selective binary conversion? */
 	CopyOnErrorChoice on_error; /* what to do when error happened */
+	CopyLogVerbosityChoice log_verbosity;	/* verbosity of logged messages */
 	List	   *convert_select; /* list of column names (can be NIL) */
 } CopyFormatOptions;
 
@@ -97,6 +107,7 @@ extern bool NextCopyFrom(CopyFromState cstate, ExprContext *econtext,
 extern bool NextCopyFromRawFields(CopyFromState cstate,
 								  char ***fields, int *nfields);
 extern void CopyFromErrorCallback(void *arg);
+extern char *CopyLimitPrintoutLength(const char *str);
 
 extern uint64 CopyFrom(CopyFromState cstate);
 
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index f98c2d1c4e..931542f268 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -81,6 +81,10 @@ COPY x from stdin (on_error ignore, on_error ignore);
 ERROR:  conflicting or redundant options
 LINE 1: COPY x from stdin (on_error ignore, on_error ignore);
                                             ^
+COPY x from stdin (log_verbosity default, log_verbosity verbose);
+ERROR:  conflicting or redundant options
+LINE 1: COPY x from stdin (log_verbosity default, log_verbosity verb...
+                                                  ^
 -- incorrect options
 COPY x to stdin (format BINARY, delimiter ',');
 ERROR:  cannot specify DELIMITER in BINARY mode
@@ -108,6 +112,10 @@ COPY x to stdin (format BINARY, on_error unsupported);
 ERROR:  COPY ON_ERROR cannot be used with COPY TO
 LINE 1: COPY x to stdin (format BINARY, on_error unsupported);
                                         ^
+COPY x to stdout (log_verbosity unsupported);
+ERROR:  COPY LOG_VERBOSITY "unsupported" not recognized
+LINE 1: COPY x to stdout (log_verbosity unsupported);
+                          ^
 -- too many columns in column list: should fail
 COPY x (a, b, c, d, e, d, c) from stdin;
 ERROR:  column "d" specified more than once
@@ -729,8 +737,31 @@ CREATE TABLE check_ign_err (n int, m int[], k int);
 COPY check_ign_err FROM STDIN WITH (on_error stop);
 ERROR:  invalid input syntax for type integer: "a"
 CONTEXT:  COPY check_ign_err, line 2, column n: "a"
-COPY check_ign_err FROM STDIN WITH (on_error ignore);
+-- want context for notices
+\set SHOW_CONTEXT always
+COPY check_ign_err FROM STDIN WITH (on_error ignore, log_verbosity verbose);
+NOTICE:  skipping row due to data type incompatibility at line 2 for column n: "a"
+CONTEXT:  COPY check_ign_err
+NOTICE:  skipping row due to data type incompatibility at line 3 for column k: "3333333333"
+CONTEXT:  COPY check_ign_err
+NOTICE:  skipping row due to data type incompatibility at line 4 for column m: "{a, 4}"
+CONTEXT:  COPY check_ign_err
+NOTICE:  skipping row due to data type incompatibility at line 5 for column n: ""
+CONTEXT:  COPY check_ign_err
+NOTICE:  skipping row due to data type incompatibility at line 7 for column m: "a"
+CONTEXT:  COPY check_ign_err
+NOTICE:  skipping row due to data type incompatibility at line 8 for column k: "a"
+CONTEXT:  COPY check_ign_err
 NOTICE:  6 rows were skipped due to data type incompatibility
+-- tests for on_error option with log_verbosity and null constraint via domain
+CREATE DOMAIN dcheck_ign_err2 varchar(15) NOT NULL;
+CREATE TABLE check_ign_err2 (n int, m int[], k int, l dcheck_ign_err2);
+COPY check_ign_err2 FROM STDIN WITH (on_error ignore, log_verbosity verbose);
+NOTICE:  skipping row due to data type incompatibility at line 2 for column l: null input
+CONTEXT:  COPY check_ign_err2
+NOTICE:  1 row was skipped due to data type incompatibility
+-- reset context choice
+\set SHOW_CONTEXT errors
 SELECT * FROM check_ign_err;
  n |  m  | k 
 ---+-----+---
@@ -739,6 +770,12 @@ SELECT * FROM check_ign_err;
  8 | {8} | 8
 (3 rows)
 
+SELECT * FROM check_ign_err2;
+ n |  m  | k |   l   
+---+-----+---+-------
+ 1 | {1} | 1 | 'foo'
+(1 row)
+
 -- test datatype error that can't be handled as soft: should fail
 CREATE TABLE hard_err(foo widget);
 COPY hard_err FROM STDIN WITH (on_error ignore);
@@ -767,6 +804,8 @@ DROP VIEW instead_of_insert_tbl_view;
 DROP VIEW instead_of_insert_tbl_view_2;
 DROP FUNCTION fun_instead_of_insert_tbl();
 DROP TABLE check_ign_err;
+DROP TABLE check_ign_err2;
+DROP DOMAIN dcheck_ign_err2;
 DROP TABLE hard_err;
 --
 -- COPY FROM ... DEFAULT
diff --git a/src/test/regress/sql/copy2.sql b/src/test/regress/sql/copy2.sql
index afaaa37e52..8b14962194 100644
--- a/src/test/regress/sql/copy2.sql
+++ b/src/test/regress/sql/copy2.sql
@@ -67,6 +67,7 @@ COPY x from stdin (force_null (a), force_null (b));
 COPY x from stdin (convert_selectively (a), convert_selectively (b));
 COPY x from stdin (encoding 'sql_ascii', encoding 'sql_ascii');
 COPY x from stdin (on_error ignore, on_error ignore);
+COPY x from stdin (log_verbosity default, log_verbosity verbose);
 
 -- incorrect options
 COPY x to stdin (format BINARY, delimiter ',');
@@ -80,6 +81,7 @@ COPY x to stdin (format CSV, force_not_null(a));
 COPY x to stdout (format TEXT, force_null(a));
 COPY x to stdin (format CSV, force_null(a));
 COPY x to stdin (format BINARY, on_error unsupported);
+COPY x to stdout (log_verbosity unsupported);
 
 -- too many columns in column list: should fail
 COPY x (a, b, c, d, e, d, c) from stdin;
@@ -508,7 +510,11 @@ a	{2}	2
 
 5	{5}	5
 \.
-COPY check_ign_err FROM STDIN WITH (on_error ignore);
+
+-- want context for notices
+\set SHOW_CONTEXT always
+
+COPY check_ign_err FROM STDIN WITH (on_error ignore, log_verbosity verbose);
 1	{1}	1
 a	{2}	2
 3	{3}	3333333333
@@ -519,8 +525,22 @@ a	{2}	2
 7	{7}	a
 8	{8}	8
 \.
+
+-- tests for on_error option with log_verbosity and null constraint via domain
+CREATE DOMAIN dcheck_ign_err2 varchar(15) NOT NULL;
+CREATE TABLE check_ign_err2 (n int, m int[], k int, l dcheck_ign_err2);
+COPY check_ign_err2 FROM STDIN WITH (on_error ignore, log_verbosity verbose);
+1	{1}	1	'foo'
+2	{2}	2	\N
+\.
+
+-- reset context choice
+\set SHOW_CONTEXT errors
+
 SELECT * FROM check_ign_err;
 
+SELECT * FROM check_ign_err2;
+
 -- test datatype error that can't be handled as soft: should fail
 CREATE TABLE hard_err(foo widget);
 COPY hard_err FROM STDIN WITH (on_error ignore);
@@ -552,6 +572,8 @@ DROP VIEW instead_of_insert_tbl_view;
 DROP VIEW instead_of_insert_tbl_view_2;
 DROP FUNCTION fun_instead_of_insert_tbl();
 DROP TABLE check_ign_err;
+DROP TABLE check_ign_err2;
+DROP DOMAIN dcheck_ign_err2;
 DROP TABLE hard_err;
 
 --
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index a8d7bed411..9add48f992 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -480,6 +480,7 @@ CopyFromState
 CopyFromStateData
 CopyHeaderChoice
 CopyInsertMethod
+CopyLogVerbosityChoice
 CopyMultiInsertBuffer
 CopyMultiInsertInfo
 CopyOnErrorChoice
-- 
2.39.3

#57 Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Masahiko Sawada (#56)
Re: Add new error_action COPY ON_ERROR "log"

On Mon, Apr 1, 2024 at 10:03 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I've attached the updated patch. I'll push it, barring any objections.

Pushed.
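
For readers who want to see the shape of the per-row message without building the server, the following is a rough standalone sketch, not the committed code. format_skip_notice and MAX_DISPLAY are hypothetical names, snprintf() stands in for ereport(NOTICE, ...), and the clipping here is byte-based, whereas the server's CopyLimitPrintoutLength() clips on multibyte character boundaries. The message text and the 100-byte limit are taken from the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Same limit as MAX_COPY_DATA_DISPLAY in the patch context. */
#define MAX_DISPLAY 100

/*
 * Hypothetical helper: write the verbose "skipping row" message into buf.
 * attval == NULL models a NULL input value. Returns snprintf()'s result.
 */
int
format_skip_notice(char *buf, size_t buflen, unsigned long long lineno,
				   const char *attname, const char *attval)
{
	size_t		len;
	const char *suffix = "";

	if (attval == NULL)
		return snprintf(buf, buflen,
						"skipping row due to data type incompatibility "
						"at line %llu for column %s: null input",
						lineno, attname);

	len = strlen(attval);
	if (len > MAX_DISPLAY)
	{
		len = MAX_DISPLAY;		/* byte-based clip; not multibyte-safe */
		suffix = "...";			/* mark that the value was truncated */
	}

	return snprintf(buf, buflen,
					"skipping row due to data type incompatibility "
					"at line %llu for column %s: \"%.*s%s\"",
					lineno, attname, (int) len, attval, suffix);
}
```

Calling it with line 2, column n and value "a" yields the same text as the first NOTICE in the regression output; the CONTEXT line is added separately by the error context callback and is not modelled here.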

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#58 Michael Paquier
michael@paquier.xyz
In reply to: Masahiko Sawada (#57)
Re: Add new error_action COPY ON_ERROR "log"

On Tue, Apr 02, 2024 at 09:53:57AM +0900, Masahiko Sawada wrote:

Pushed.

Thanks for following up with this thread.
--
Michael