[PATCH] Add error hints for invalid COPY options
Hi,
This patch improves the user experience when working with COPY commands by
adding helpful error hints for invalid options.
Currently, when users make typos in COPY option names or values, they
receive
a generic error message without guidance on what went wrong. This patch adds
two types of hints:
1. For invalid option names: suggests the closest matching valid option
using
the ClosestMatch algorithm (e.g., "foramt" → "Perhaps you meant
'format'")
2. For invalid option values: lists all valid values when the set is small
(e.g., for format, on_error, log_verbosity options)
This follows the pattern already used elsewhere in PostgreSQL for providing
helpful error hints to users.
Additionally, this patch corrects a misleading comment for the
convert_selectively option. The comment stated it was
"not-accessible-from-SQL",
but actualy it has been accessible from SQL due to PostgreSQL's generic
option parser.
The updated comment clarifies that while technically accessible, it's
intended for
internal use and not recommended for end-user use due to potential data
loss.
Best regards,
Attachments:
0001-Add-error-hints-for-invalid-COPY-options.patchapplication/octet-stream; name=0001-Add-error-hints-for-invalid-COPY-options.patchDownload
From 773c3d5cfddfc3c9551a8e0d37d58808b60dfdcc Mon Sep 17 00:00:00 2001
From: Shinya Sugamoto <sugamoto@me.com>
Date: Sat, 22 Nov 2025 22:03:21 +0900
Subject: [PATCH] Add error hints for invalid COPY options
Add error hints for invalid COPY options. Use ClosestMatch when many
valid options exist (option names), or list all values when few exist
(option values like format, on_error).
Also update the misleading comment for the convert_selectively option.
The comment previously stated it was "not-accessible-from-SQL", but
actually it has been accessible from SQL due to PostgreSQL's generic
option parser. The updated comment clarifies that while technically
accessible, it's intended for internal use by file_fdw and not
recommended for end-user use due to potential data loss.
---
src/backend/commands/copy.c | 46 +++++++++++++++++++++++++++--
src/test/regress/expected/copy2.out | 27 +++++++++++++++++
src/test/regress/sql/copy2.sql | 7 +++++
3 files changed, 77 insertions(+), 3 deletions(-)
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 28e878c3688..14bb1756746 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -38,6 +38,7 @@
#include "utils/lsyscache.h"
#include "utils/rel.h"
#include "utils/rls.h"
+#include "utils/varlena.h"
/*
* DoCopy executes the SQL COPY statement
@@ -467,6 +468,7 @@ defGetCopyOnErrorChoice(DefElem *def, ParseState *pstate, bool is_from)
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
/*- translator: first %s is the name of a COPY option, e.g. ON_ERROR */
errmsg("COPY %s \"%s\" not recognized", "ON_ERROR", sval),
+ errhint("Valid values are \"%s\" and \"%s\".", "ignore", "stop"),
parser_errposition(pstate, def->location)));
return COPY_ON_ERROR_STOP; /* keep compiler quiet */
}
@@ -525,6 +527,7 @@ defGetCopyLogVerbosityChoice(DefElem *def, ParseState *pstate)
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
/*- translator: first %s is the name of a COPY option, e.g. ON_ERROR */
errmsg("COPY %s \"%s\" not recognized", "LOG_VERBOSITY", sval),
+ errhint("Valid values are \"%s\", \"%s\", and \"%s\".", "default", "silent", "verbose"),
parser_errposition(pstate, def->location)));
return COPY_LOG_VERBOSITY_DEFAULT; /* keep compiler quiet */
}
@@ -587,6 +590,7 @@ ProcessCopyOptions(ParseState *pstate,
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("COPY format \"%s\" not recognized", fmt),
+ errhint("Valid formats are \"%s\", \"%s\", and \"%s\".", "binary", "csv", "text"),
parser_errposition(pstate, defel->location)));
}
else if (strcmp(defel->defname, "freeze") == 0)
@@ -681,9 +685,9 @@ ProcessCopyOptions(ParseState *pstate,
else if (strcmp(defel->defname, "convert_selectively") == 0)
{
/*
- * Undocumented, not-accessible-from-SQL option: convert only the
- * named columns to binary form, storing the rest as NULLs. It's
- * allowed for the column list to be NIL.
+ * Undocumented option for file_fdw internal use. Converts only
+ * the listed columns; all others become NULL. The column list
+ * can be NIL.
*/
if (opts_out->convert_selectively)
errorConflictingDefElem(defel, pstate);
@@ -731,11 +735,47 @@ ProcessCopyOptions(ParseState *pstate,
opts_out->reject_limit = defGetCopyRejectLimitOption(defel);
}
else
+ {
+ ClosestMatchState match_state;
+ const char *closest_match;
+ int i;
+ /* Valid COPY option names for closest match suggestions */
+ static const char *const valid_copy_options[] = {
+ "default",
+ "delimiter",
+ "encoding",
+ "escape",
+ "force_not_null",
+ "force_null",
+ "force_quote",
+ "format",
+ "freeze",
+ "header",
+ "log_verbosity",
+ "null",
+ "on_error",
+ "quote",
+ "reject_limit",
+ NULL
+ };
+
+ /*
+ * Unknown option specified, complain about it. Provide a hint
+ * with a valid option that looks similar, if there is one.
+ */
+ initClosestMatch(&match_state, defel->defname, 4);
+ for (i = 0; valid_copy_options[i] != NULL; i++)
+ updateClosestMatch(&match_state, valid_copy_options[i]);
+
+ closest_match = getClosestMatch(&match_state);
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("option \"%s\" not recognized",
defel->defname),
+ closest_match ?
+ errhint("Perhaps you meant \"%s\".", closest_match) : 0,
parser_errposition(pstate, defel->location)));
+ }
}
/*
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index f3fdce23459..a4b4e33248f 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -96,6 +96,7 @@ COPY x from stdin (on_error unsupported);
ERROR: COPY ON_ERROR "unsupported" not recognized
LINE 1: COPY x from stdin (on_error unsupported);
^
+HINT: Valid values are "ignore" and "stop".
COPY x from stdin (format TEXT, force_quote(a));
ERROR: COPY FORCE_QUOTE requires CSV mode
COPY x from stdin (format TEXT, force_quote *);
@@ -128,6 +129,7 @@ COPY x from stdin (log_verbosity unsupported);
ERROR: COPY LOG_VERBOSITY "unsupported" not recognized
LINE 1: COPY x from stdin (log_verbosity unsupported);
^
+HINT: Valid values are "default", "silent", and "verbose".
COPY x from stdin with (reject_limit 1);
ERROR: COPY REJECT_LIMIT requires ON_ERROR to be set to IGNORE
COPY x from stdin with (on_error ignore, reject_limit 0);
@@ -138,6 +140,31 @@ COPY x from stdin with (header 2.5);
ERROR: header requires a Boolean value, a non-negative integer, or the string "match"
COPY x to stdout with (header 2);
ERROR: cannot use multi-line header in COPY TO
+-- test error hints for invalid COPY options
+COPY x from stdin (foramt CSV); -- error, suggests "format"
+ERROR: option "foramt" not recognized
+LINE 1: COPY x from stdin (foramt CSV);
+ ^
+HINT: Perhaps you meant "format".
+COPY x from stdin (completely_invalid_option 'value'); -- error, no suggestion
+ERROR: option "completely_invalid_option" not recognized
+LINE 1: COPY x from stdin (completely_invalid_option 'value');
+ ^
+COPY x from stdin (format cvs); -- error, lists valid formats
+ERROR: COPY format "cvs" not recognized
+LINE 1: COPY x from stdin (format cvs);
+ ^
+HINT: Valid formats are "binary", "csv", and "text".
+COPY x from stdin (format CSV, on_error ignor); -- error, lists valid on_error values
+ERROR: COPY ON_ERROR "ignor" not recognized
+LINE 1: COPY x from stdin (format CSV, on_error ignor);
+ ^
+HINT: Valid values are "ignore" and "stop".
+COPY x from stdin (format CSV, log_verbosity verbos); -- error, lists valid log_verbosity values
+ERROR: COPY LOG_VERBOSITY "verbos" not recognized
+LINE 1: COPY x from stdin (format CSV, log_verbosity verbos);
+ ^
+HINT: Valid values are "default", "silent", and "verbose".
-- too many columns in column list: should fail
COPY x (a, b, c, d, e, d, c) from stdin;
ERROR: column "d" specified more than once
diff --git a/src/test/regress/sql/copy2.sql b/src/test/regress/sql/copy2.sql
index cef45868db5..b8c9d2439ef 100644
--- a/src/test/regress/sql/copy2.sql
+++ b/src/test/regress/sql/copy2.sql
@@ -94,6 +94,13 @@ COPY x from stdin with (header -1);
COPY x from stdin with (header 2.5);
COPY x to stdout with (header 2);
+-- test error hints for invalid COPY options
+COPY x from stdin (foramt CSV); -- error, suggests "format"
+COPY x from stdin (completely_invalid_option 'value'); -- error, no suggestion
+COPY x from stdin (format cvs); -- error, lists valid formats
+COPY x from stdin (format CSV, on_error ignor); -- error, lists valid on_error values
+COPY x from stdin (format CSV, log_verbosity verbos); -- error, lists valid log_verbosity values
+
-- too many columns in column list: should fail
COPY x (a, b, c, d, e, d, c) from stdin;
--
2.50.1 (Apple Git-155)
Hi,
On Sat, Nov 22, 2025 at 8:33 PM Sugamoto Shinya <shinya34892@gmail.com> wrote:
Hi,
This patch improves the user experience when working with COPY commands by
adding helpful error hints for invalid options.Currently, when users make typos in COPY option names or values, they receive
a generic error message without guidance on what went wrong. This patch adds
two types of hints:1. For invalid option names: suggests the closest matching valid option using
the ClosestMatch algorithm (e.g., "foramt" → "Perhaps you meant 'format'")2. For invalid option values: lists all valid values when the set is small
(e.g., for format, on_error, log_verbosity options)This follows the pattern already used elsewhere in PostgreSQL for providing
helpful error hints to users.
Given we have 15 COPY options now, it sounds like a reasonable idea.
One concern about the patch is that when adding a new COPY option, we
could miss updating valid_copy_options list, resulting in providing a
wrong suggestion. I think we can consider refactoring the COPY option
handling so that we check the given option is a valid name or not by
checking valid_copy_options array and then process the option value.
Additionally, this patch corrects a misleading comment for the
convert_selectively option. The comment stated it was "not-accessible-from-SQL",
but actualy it has been accessible from SQL due to PostgreSQL's generic option parser.
The updated comment clarifies that while technically accessible, it's intended for
internal use and not recommended for end-user use due to potential data loss.
Hmm, I'm not sure the proposed comment improves the clarification.
It's essentially non-accessible from SQL since we cannot provide a
valid value for convert_selectively from SQL commands.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Mon, Nov 24, 2025 at 11:56:34AM -0800, Masahiko Sawada wrote:
On Sat, Nov 22, 2025 at 8:33 PM Sugamoto Shinya <shinya34892@gmail.com> wrote:
This follows the pattern already used elsewhere in PostgreSQL for providing
helpful error hints to users.Given we have 15 COPY options now, it sounds like a reasonable idea.
One concern about the patch is that when adding a new COPY option, we
could miss updating valid_copy_options list, resulting in providing a
wrong suggestion. I think we can consider refactoring the COPY option
handling so that we check the given option is a valid name or not by
checking valid_copy_options array and then process the option value.
+1. Ideally, folks wouldn't need to update a separate list when adding new
options.
Additionally, this patch corrects a misleading comment for the
convert_selectively option. The comment stated it was "not-accessible-from-SQL",
but actualy it has been accessible from SQL due to PostgreSQL's generic option parser.
The updated comment clarifies that while technically accessible, it's intended for
internal use and not recommended for end-user use due to potential data loss.Hmm, I'm not sure the proposed comment improves the clarification.
It's essentially non-accessible from SQL since we cannot provide a
valid value for convert_selectively from SQL commands.
Yeah, I'd leave it alone, at least for this patch.
--
nathan
2025年11月25日(火) 6:50 Nathan Bossart <nathandbossart@gmail.com>:
On Mon, Nov 24, 2025 at 11:56:34AM -0800, Masahiko Sawada wrote:
On Sat, Nov 22, 2025 at 8:33 PM Sugamoto Shinya <shinya34892@gmail.com>
wrote:
This follows the pattern already used elsewhere in PostgreSQL for
providing
helpful error hints to users.
Given we have 15 COPY options now, it sounds like a reasonable idea.
One concern about the patch is that when adding a new COPY option, we
could miss updating valid_copy_options list, resulting in providing a
wrong suggestion. I think we can consider refactoring the COPY option
handling so that we check the given option is a valid name or not by
checking valid_copy_options array and then process the option value.+1. Ideally, folks wouldn't need to update a separate list when adding new
options.Additionally, this patch corrects a misleading comment for the
convert_selectively option. The comment stated it was"not-accessible-from-SQL",
but actualy it has been accessible from SQL due to PostgreSQL's generic
option parser.
The updated comment clarifies that while technically accessible, it's
intended for
internal use and not recommended for end-user use due to potential data
loss.
Hmm, I'm not sure the proposed comment improves the clarification.
It's essentially non-accessible from SQL since we cannot provide a
valid value for convert_selectively from SQL commands.Yeah, I'd leave it alone, at least for this patch.
--
nathan
Thanks for checking my proposal.
For the refactoring of the COPY options, it sounds reasonable to me. Let me
take that changes in my patch.
Regarding the option “convert_selectively”, at first i thought it should be
blocked to be used like in the lexer and parser level but later I succeeded
in specifying and using it in the SQL interface, psql , in my local
environment, and also these accept the grammar for the option. You can
verify that it’s actually accessible in any SQL interface using the
following SQLs. Please let me know if there might be something I missed.
Also I would be happy to make another thread about this matter
‘’’sql
CREATE TABLE conv_test (
a int,
b int,
c text
);
COPY conv_test FROM STDIN (
FORMAT csv,
convert_selectively (a, b)
);
— for stdin
1,2,foo
3,4,bar
—
SELECT * FROM conv_test;
— result is
a | b | c
---+---+------
1 | 2 | NULL
3 | 4 | NULL
(2 rows)
—
‘’’
Regards,
Show quoted text
On Wed, Nov 26, 2025 at 5:55 PM Sugamoto Shinya <shinya34892@gmail.com>
wrote:
2025年11月25日(火) 6:50 Nathan Bossart <nathandbossart@gmail.com>:
On Mon, Nov 24, 2025 at 11:56:34AM -0800, Masahiko Sawada wrote:
On Sat, Nov 22, 2025 at 8:33 PM Sugamoto Shinya <shinya34892@gmail.com>
wrote:
This follows the pattern already used elsewhere in PostgreSQL for
providing
helpful error hints to users.
Given we have 15 COPY options now, it sounds like a reasonable idea.
One concern about the patch is that when adding a new COPY option, we
could miss updating valid_copy_options list, resulting in providing a
wrong suggestion. I think we can consider refactoring the COPY option
handling so that we check the given option is a valid name or not by
checking valid_copy_options array and then process the option value.+1. Ideally, folks wouldn't need to update a separate list when adding
new
options.Additionally, this patch corrects a misleading comment for the
convert_selectively option. The comment stated it was"not-accessible-from-SQL",
but actualy it has been accessible from SQL due to PostgreSQL's
generic option parser.
The updated comment clarifies that while technically accessible, it's
intended for
internal use and not recommended for end-user use due to potential
data loss.
Hmm, I'm not sure the proposed comment improves the clarification.
It's essentially non-accessible from SQL since we cannot provide a
valid value for convert_selectively from SQL commands.Yeah, I'd leave it alone, at least for this patch.
--
nathanThanks for checking my proposal.
For the refactoring of the COPY options, it sounds reasonable to me. Let
me take that changes in my patch.Regarding the option “convert_selectively”, at first i thought it should
be blocked to be used like in the lexer and parser level but later I
succeeded in specifying and using it in the SQL interface, psql , in my
local environment, and also these accept the grammar for the option. You
can verify that it’s actually accessible in any SQL interface using the
following SQLs. Please let me know if there might be something I missed.
Also I would be happy to make another thread about this matter‘’’sql
CREATE TABLE conv_test (
a int,
b int,
c text
);COPY conv_test FROM STDIN (
FORMAT csv,
convert_selectively (a, b)
);— for stdin
1,2,foo
3,4,bar
—SELECT * FROM conv_test;
— result is
a | b | c
---+---+------
1 | 2 | NULL
3 | 4 | NULL
(2 rows)
—‘’’
Regards,
One concern about the patch is that when adding a new COPY option, we
could miss updating valid_copy_options list, resulting in providing a
wrong suggestion. I think we can consider refactoring the COPY option
handling so that we check the given option is a valid name or not by
checking valid_copy_options array and then process the option value.+1. Ideally, folks wouldn't need to update a separate list when adding new
options.
I updated my patch similar to src/backend/utils/adt/encode.c so that we can
put both
all the COPY options and its validation functions into one place, because I
believe this
should be the strongest way to prevent anyone from forgetting to add a new
option name to multiple places. However, I was wondering if any other
simple solutions
, such as putting some simple validation right under each iteration loop,
would be sufficient
or not, for example like the following pseudo code. It would be the minimum
and simplest
changes, although it would leave some room for some people to skip adding
proper code.
Please let me know freely if you have any opinions about this.
```
list valid_options = ["format", "force_null", ...]
foreach(opt, options)
{
validate_option(opt, valid_options) <--- THIS
if (opt.name == "format") ...
if (opt.name == "force_null") ...
...
}
```
Regarding the option `convert_selectively`, what do you think about this?
Still it seems to me that this option is accessible from SQL interfaces.
But for now
I just removed the comment changes from my v2 patch and would like to
discuss
this with you here or in another thread.
Regards,
Attachments:
v2-0001-Refactor-COPY-option-handling-with-table-driven-appr.patchapplication/octet-stream; name=v2-0001-Refactor-COPY-option-handling-with-table-driven-appr.patchDownload
From bb3ece5bb89b1aa265c77a17735d4a206184dabf Mon Sep 17 00:00:00 2001
From: Shinya Sugamoto <sugamoto@me.com>
Date: Sat, 22 Nov 2025 22:03:21 +0900
Subject: [PATCH] Refactor COPY option handling with table-driven approach
Refactor ProcessCopyOptions() to use a table-driven approach for option
handling. This provides a single source of truth for valid COPY options
- when adding a new option, simply add an entry to the copy_options[]
table and implement its processor function.
Also add error hints for invalid COPY options using ClosestMatch to
suggest similar valid option names, and list all valid values for
options with few choices (format, on_error, log_verbosity).
---
contrib/file_fdw/expected/file_fdw.out | 3 +
src/backend/commands/copy.c | 448 ++++++++++++++++---------
src/test/regress/expected/copy2.out | 27 ++
src/test/regress/sql/copy2.sql | 7 +
4 files changed, 323 insertions(+), 162 deletions(-)
diff --git a/contrib/file_fdw/expected/file_fdw.out b/contrib/file_fdw/expected/file_fdw.out
index 5121e27dce5..bf953c62a49 100644
--- a/contrib/file_fdw/expected/file_fdw.out
+++ b/contrib/file_fdw/expected/file_fdw.out
@@ -54,6 +54,7 @@ CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS ("a=b" 'true'); -- ERROR
ERROR: invalid option name "a=b": must not contain "="
CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (format 'xml'); -- ERROR
ERROR: COPY format "xml" not recognized
+HINT: Valid formats are "binary", "csv", and "text".
CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (format 'text', quote ':'); -- ERROR
ERROR: COPY QUOTE requires CSV mode
CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (format 'text', escape ':'); -- ERROR
@@ -96,10 +97,12 @@ CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (format 'csv', null '
ERROR: COPY null representation cannot use newline or carriage return
CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (on_error 'unsupported'); -- ERROR
ERROR: COPY ON_ERROR "unsupported" not recognized
+HINT: Valid values are "ignore" and "stop".
CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (format 'binary', on_error 'ignore'); -- ERROR
ERROR: only ON_ERROR STOP is allowed in BINARY mode
CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (log_verbosity 'unsupported'); -- ERROR
ERROR: COPY LOG_VERBOSITY "unsupported" not recognized
+HINT: Valid values are "default", "silent", and "verbose".
CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (reject_limit '1'); -- ERROR
ERROR: COPY REJECT_LIMIT requires ON_ERROR to be set to IGNORE
CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (on_error 'ignore', reject_limit '0'); -- ERROR
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 28e878c3688..b5ffb3c1062 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -38,6 +38,7 @@
#include "utils/lsyscache.h"
#include "utils/rel.h"
#include "utils/rls.h"
+#include "utils/varlena.h"
/*
* DoCopy executes the SQL COPY statement
@@ -467,6 +468,7 @@ defGetCopyOnErrorChoice(DefElem *def, ParseState *pstate, bool is_from)
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
/*- translator: first %s is the name of a COPY option, e.g. ON_ERROR */
errmsg("COPY %s \"%s\" not recognized", "ON_ERROR", sval),
+ errhint("Valid values are \"%s\" and \"%s\".", "ignore", "stop"),
parser_errposition(pstate, def->location)));
return COPY_ON_ERROR_STOP; /* keep compiler quiet */
}
@@ -525,10 +527,259 @@ defGetCopyLogVerbosityChoice(DefElem *def, ParseState *pstate)
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
/*- translator: first %s is the name of a COPY option, e.g. ON_ERROR */
errmsg("COPY %s \"%s\" not recognized", "LOG_VERBOSITY", sval),
+ errhint("Valid values are \"%s\", \"%s\", and \"%s\".", "default", "silent", "verbose"),
parser_errposition(pstate, def->location)));
return COPY_LOG_VERBOSITY_DEFAULT; /* keep compiler quiet */
}
+/*
+ * Context structure for COPY option processing.
+ */
+typedef struct CopyOptionContext
+{
+ ParseState *pstate;
+ CopyFormatOptions *opts_out;
+ bool is_from;
+ bool format_specified;
+ bool freeze_specified;
+ bool header_specified;
+ bool on_error_specified;
+ bool log_verbosity_specified;
+ bool reject_limit_specified;
+} CopyOptionContext;
+
+/* Individual option processors */
+
+static void
+copy_opt_format(CopyOptionContext *ctx, DefElem *defel)
+{
+ char *fmt = defGetString(defel);
+
+ if (ctx->format_specified)
+ errorConflictingDefElem(defel, ctx->pstate);
+ ctx->format_specified = true;
+
+ if (strcmp(fmt, "text") == 0)
+ /* default format */ ;
+ else if (strcmp(fmt, "csv") == 0)
+ ctx->opts_out->csv_mode = true;
+ else if (strcmp(fmt, "binary") == 0)
+ ctx->opts_out->binary = true;
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("COPY format \"%s\" not recognized", fmt),
+ errhint("Valid formats are \"%s\", \"%s\", and \"%s\".",
+ "binary", "csv", "text"),
+ parser_errposition(ctx->pstate, defel->location)));
+}
+
+static void
+copy_opt_freeze(CopyOptionContext *ctx, DefElem *defel)
+{
+ if (ctx->freeze_specified)
+ errorConflictingDefElem(defel, ctx->pstate);
+ ctx->freeze_specified = true;
+ ctx->opts_out->freeze = defGetBoolean(defel);
+}
+
+static void
+copy_opt_delimiter(CopyOptionContext *ctx, DefElem *defel)
+{
+ if (ctx->opts_out->delim)
+ errorConflictingDefElem(defel, ctx->pstate);
+ ctx->opts_out->delim = defGetString(defel);
+}
+
+static void
+copy_opt_null(CopyOptionContext *ctx, DefElem *defel)
+{
+ if (ctx->opts_out->null_print)
+ errorConflictingDefElem(defel, ctx->pstate);
+ ctx->opts_out->null_print = defGetString(defel);
+}
+
+static void
+copy_opt_default(CopyOptionContext *ctx, DefElem *defel)
+{
+ if (ctx->opts_out->default_print)
+ errorConflictingDefElem(defel, ctx->pstate);
+ ctx->opts_out->default_print = defGetString(defel);
+}
+
+static void
+copy_opt_header(CopyOptionContext *ctx, DefElem *defel)
+{
+ if (ctx->header_specified)
+ errorConflictingDefElem(defel, ctx->pstate);
+ ctx->header_specified = true;
+ ctx->opts_out->header_line = defGetCopyHeaderOption(defel, ctx->is_from);
+}
+
+static void
+copy_opt_quote(CopyOptionContext *ctx, DefElem *defel)
+{
+ if (ctx->opts_out->quote)
+ errorConflictingDefElem(defel, ctx->pstate);
+ ctx->opts_out->quote = defGetString(defel);
+}
+
+static void
+copy_opt_escape(CopyOptionContext *ctx, DefElem *defel)
+{
+ if (ctx->opts_out->escape)
+ errorConflictingDefElem(defel, ctx->pstate);
+ ctx->opts_out->escape = defGetString(defel);
+}
+
+static void
+copy_opt_force_quote(CopyOptionContext *ctx, DefElem *defel)
+{
+ if (ctx->opts_out->force_quote || ctx->opts_out->force_quote_all)
+ errorConflictingDefElem(defel, ctx->pstate);
+ if (defel->arg && IsA(defel->arg, A_Star))
+ ctx->opts_out->force_quote_all = true;
+ else if (defel->arg && IsA(defel->arg, List))
+ ctx->opts_out->force_quote = castNode(List, defel->arg);
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("argument to option \"%s\" must be a list of column names",
+ defel->defname),
+ parser_errposition(ctx->pstate, defel->location)));
+}
+
+static void
+copy_opt_force_not_null(CopyOptionContext *ctx, DefElem *defel)
+{
+ if (ctx->opts_out->force_notnull || ctx->opts_out->force_notnull_all)
+ errorConflictingDefElem(defel, ctx->pstate);
+ if (defel->arg && IsA(defel->arg, A_Star))
+ ctx->opts_out->force_notnull_all = true;
+ else if (defel->arg && IsA(defel->arg, List))
+ ctx->opts_out->force_notnull = castNode(List, defel->arg);
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("argument to option \"%s\" must be a list of column names",
+ defel->defname),
+ parser_errposition(ctx->pstate, defel->location)));
+}
+
+static void
+copy_opt_force_null(CopyOptionContext *ctx, DefElem *defel)
+{
+ if (ctx->opts_out->force_null || ctx->opts_out->force_null_all)
+ errorConflictingDefElem(defel, ctx->pstate);
+ if (defel->arg && IsA(defel->arg, A_Star))
+ ctx->opts_out->force_null_all = true;
+ else if (defel->arg && IsA(defel->arg, List))
+ ctx->opts_out->force_null = castNode(List, defel->arg);
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("argument to option \"%s\" must be a list of column names",
+ defel->defname),
+ parser_errposition(ctx->pstate, defel->location)));
+}
+
+static void
+copy_opt_convert_selectively(CopyOptionContext *ctx, DefElem *defel)
+{
+ /*
+ * Undocumented, not-accessible-from-SQL option: convert only the
+ * named columns to binary form, storing the rest as NULLs. It's
+ * allowed for the column list to be NIL.
+ */
+ if (ctx->opts_out->convert_selectively)
+ errorConflictingDefElem(defel, ctx->pstate);
+ ctx->opts_out->convert_selectively = true;
+ if (defel->arg == NULL || IsA(defel->arg, List))
+ ctx->opts_out->convert_select = castNode(List, defel->arg);
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("argument to option \"%s\" must be a list of column names",
+ defel->defname),
+ parser_errposition(ctx->pstate, defel->location)));
+}
+
+static void
+copy_opt_encoding(CopyOptionContext *ctx, DefElem *defel)
+{
+ if (ctx->opts_out->file_encoding >= 0)
+ errorConflictingDefElem(defel, ctx->pstate);
+ ctx->opts_out->file_encoding = pg_char_to_encoding(defGetString(defel));
+ if (ctx->opts_out->file_encoding < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("argument to option \"%s\" must be a valid encoding name",
+ defel->defname),
+ parser_errposition(ctx->pstate, defel->location)));
+}
+
+static void
+copy_opt_on_error(CopyOptionContext *ctx, DefElem *defel)
+{
+ if (ctx->on_error_specified)
+ errorConflictingDefElem(defel, ctx->pstate);
+ ctx->on_error_specified = true;
+ ctx->opts_out->on_error = defGetCopyOnErrorChoice(defel, ctx->pstate,
+ ctx->is_from);
+}
+
+static void
+copy_opt_log_verbosity(CopyOptionContext *ctx, DefElem *defel)
+{
+ if (ctx->log_verbosity_specified)
+ errorConflictingDefElem(defel, ctx->pstate);
+ ctx->log_verbosity_specified = true;
+ ctx->opts_out->log_verbosity = defGetCopyLogVerbosityChoice(defel,
+ ctx->pstate);
+}
+
+static void
+copy_opt_reject_limit(CopyOptionContext *ctx, DefElem *defel)
+{
+ if (ctx->reject_limit_specified)
+ errorConflictingDefElem(defel, ctx->pstate);
+ ctx->reject_limit_specified = true;
+ ctx->opts_out->reject_limit = defGetCopyRejectLimitOption(defel);
+}
+
+/*
+ * Table of all valid COPY options.
+ *
+ * This is the single source of truth for valid COPY options. When adding a
+ * new option, add an entry here and implement its processor function above.
+ */
+static const struct
+{
+ const char *name;
+ void (*processor) (CopyOptionContext *ctx, DefElem *defel);
+ bool suggest_in_hints;
+} copy_options[] =
+
+{
+ {"default", copy_opt_default, true},
+ {"delimiter", copy_opt_delimiter, true},
+ {"encoding", copy_opt_encoding, true},
+ {"escape", copy_opt_escape, true},
+ {"force_not_null", copy_opt_force_not_null, true},
+ {"force_null", copy_opt_force_null, true},
+ {"force_quote", copy_opt_force_quote, true},
+ {"format", copy_opt_format, true},
+ {"freeze", copy_opt_freeze, true},
+ {"header", copy_opt_header, true},
+ {"log_verbosity", copy_opt_log_verbosity, true},
+ {"null", copy_opt_null, true},
+ {"on_error", copy_opt_on_error, true},
+ {"quote", copy_opt_quote, true},
+ {"reject_limit", copy_opt_reject_limit, true},
+ {"convert_selectively", copy_opt_convert_selectively, false},
+ {NULL, NULL, false}
+};
+
/*
* Process the statement option list for COPY.
*
@@ -551,12 +802,7 @@ ProcessCopyOptions(ParseState *pstate,
bool is_from,
List *options)
{
- bool format_specified = false;
- bool freeze_specified = false;
- bool header_specified = false;
- bool on_error_specified = false;
- bool log_verbosity_specified = false;
- bool reject_limit_specified = false;
+ CopyOptionContext ctx;
ListCell *option;
/* Support external use for option sanity checking */
@@ -565,177 +811,55 @@ ProcessCopyOptions(ParseState *pstate,
opts_out->file_encoding = -1;
+ /* Initialize option processing context */
+ memset(&ctx, 0, sizeof(ctx));
+ ctx.pstate = pstate;
+ ctx.opts_out = opts_out;
+ ctx.is_from = is_from;
+
/* Extract options from the statement node tree */
foreach(option, options)
{
DefElem *defel = lfirst_node(DefElem, option);
+ bool found = false;
- if (strcmp(defel->defname, "format") == 0)
- {
- char *fmt = defGetString(defel);
-
- if (format_specified)
- errorConflictingDefElem(defel, pstate);
- format_specified = true;
- if (strcmp(fmt, "text") == 0)
- /* default format */ ;
- else if (strcmp(fmt, "csv") == 0)
- opts_out->csv_mode = true;
- else if (strcmp(fmt, "binary") == 0)
- opts_out->binary = true;
- else
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("COPY format \"%s\" not recognized", fmt),
- parser_errposition(pstate, defel->location)));
- }
- else if (strcmp(defel->defname, "freeze") == 0)
- {
- if (freeze_specified)
- errorConflictingDefElem(defel, pstate);
- freeze_specified = true;
- opts_out->freeze = defGetBoolean(defel);
- }
- else if (strcmp(defel->defname, "delimiter") == 0)
- {
- if (opts_out->delim)
- errorConflictingDefElem(defel, pstate);
- opts_out->delim = defGetString(defel);
- }
- else if (strcmp(defel->defname, "null") == 0)
- {
- if (opts_out->null_print)
- errorConflictingDefElem(defel, pstate);
- opts_out->null_print = defGetString(defel);
- }
- else if (strcmp(defel->defname, "default") == 0)
- {
- if (opts_out->default_print)
- errorConflictingDefElem(defel, pstate);
- opts_out->default_print = defGetString(defel);
- }
- else if (strcmp(defel->defname, "header") == 0)
- {
- if (header_specified)
- errorConflictingDefElem(defel, pstate);
- header_specified = true;
- opts_out->header_line = defGetCopyHeaderOption(defel, is_from);
- }
- else if (strcmp(defel->defname, "quote") == 0)
- {
- if (opts_out->quote)
- errorConflictingDefElem(defel, pstate);
- opts_out->quote = defGetString(defel);
- }
- else if (strcmp(defel->defname, "escape") == 0)
- {
- if (opts_out->escape)
- errorConflictingDefElem(defel, pstate);
- opts_out->escape = defGetString(defel);
- }
- else if (strcmp(defel->defname, "force_quote") == 0)
+ /* Look up the option in copy_options table */
+ for (int i = 0; copy_options[i].name != NULL; i++)
{
- if (opts_out->force_quote || opts_out->force_quote_all)
- errorConflictingDefElem(defel, pstate);
- if (defel->arg && IsA(defel->arg, A_Star))
- opts_out->force_quote_all = true;
- else if (defel->arg && IsA(defel->arg, List))
- opts_out->force_quote = castNode(List, defel->arg);
- else
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("argument to option \"%s\" must be a list of column names",
- defel->defname),
- parser_errposition(pstate, defel->location)));
- }
- else if (strcmp(defel->defname, "force_not_null") == 0)
- {
- if (opts_out->force_notnull || opts_out->force_notnull_all)
- errorConflictingDefElem(defel, pstate);
- if (defel->arg && IsA(defel->arg, A_Star))
- opts_out->force_notnull_all = true;
- else if (defel->arg && IsA(defel->arg, List))
- opts_out->force_notnull = castNode(List, defel->arg);
- else
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("argument to option \"%s\" must be a list of column names",
- defel->defname),
- parser_errposition(pstate, defel->location)));
- }
- else if (strcmp(defel->defname, "force_null") == 0)
- {
- if (opts_out->force_null || opts_out->force_null_all)
- errorConflictingDefElem(defel, pstate);
- if (defel->arg && IsA(defel->arg, A_Star))
- opts_out->force_null_all = true;
- else if (defel->arg && IsA(defel->arg, List))
- opts_out->force_null = castNode(List, defel->arg);
- else
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("argument to option \"%s\" must be a list of column names",
- defel->defname),
- parser_errposition(pstate, defel->location)));
+ if (strcmp(copy_options[i].name, defel->defname) == 0)
+ {
+ copy_options[i].processor(&ctx, defel);
+ found = true;
+ break;
+ }
}
- else if (strcmp(defel->defname, "convert_selectively") == 0)
+
+ if (!found)
{
/*
- * Undocumented, not-accessible-from-SQL option: convert only the
- * named columns to binary form, storing the rest as NULLs. It's
- * allowed for the column list to be NIL.
+ * When an unknown option is specified, complain about it.
+ * Provide a hint with a valid option that looks similar, if there
+ * is one.
*/
- if (opts_out->convert_selectively)
- errorConflictingDefElem(defel, pstate);
- opts_out->convert_selectively = true;
- if (defel->arg == NULL || IsA(defel->arg, List))
- opts_out->convert_select = castNode(List, defel->arg);
- else
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("argument to option \"%s\" must be a list of column names",
- defel->defname),
- parser_errposition(pstate, defel->location)));
- }
- else if (strcmp(defel->defname, "encoding") == 0)
- {
- if (opts_out->file_encoding >= 0)
- errorConflictingDefElem(defel, pstate);
- opts_out->file_encoding = pg_char_to_encoding(defGetString(defel));
- if (opts_out->file_encoding < 0)
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("argument to option \"%s\" must be a valid encoding name",
- defel->defname),
- parser_errposition(pstate, defel->location)));
- }
- else if (strcmp(defel->defname, "on_error") == 0)
- {
- if (on_error_specified)
- errorConflictingDefElem(defel, pstate);
- on_error_specified = true;
- opts_out->on_error = defGetCopyOnErrorChoice(defel, pstate, is_from);
- }
- else if (strcmp(defel->defname, "log_verbosity") == 0)
- {
- if (log_verbosity_specified)
- errorConflictingDefElem(defel, pstate);
- log_verbosity_specified = true;
- opts_out->log_verbosity = defGetCopyLogVerbosityChoice(defel, pstate);
- }
- else if (strcmp(defel->defname, "reject_limit") == 0)
- {
- if (reject_limit_specified)
- errorConflictingDefElem(defel, pstate);
- reject_limit_specified = true;
- opts_out->reject_limit = defGetCopyRejectLimitOption(defel);
- }
- else
+ ClosestMatchState match_state;
+ const char *closest_match;
+
+ initClosestMatch(&match_state, defel->defname, 4);
+ for (int i = 0; copy_options[i].name != NULL; i++)
+ {
+ if (copy_options[i].suggest_in_hints)
+ updateClosestMatch(&match_state, copy_options[i].name);
+ }
+
+ closest_match = getClosestMatch(&match_state);
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("option \"%s\" not recognized",
defel->defname),
+ closest_match ?
+ errhint("Perhaps you meant \"%s\".", closest_match) : 0,
parser_errposition(pstate, defel->location)));
+ }
}
/*
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index f3fdce23459..a4b4e33248f 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -96,6 +96,7 @@ COPY x from stdin (on_error unsupported);
ERROR: COPY ON_ERROR "unsupported" not recognized
LINE 1: COPY x from stdin (on_error unsupported);
^
+HINT: Valid values are "ignore" and "stop".
COPY x from stdin (format TEXT, force_quote(a));
ERROR: COPY FORCE_QUOTE requires CSV mode
COPY x from stdin (format TEXT, force_quote *);
@@ -128,6 +129,7 @@ COPY x from stdin (log_verbosity unsupported);
ERROR: COPY LOG_VERBOSITY "unsupported" not recognized
LINE 1: COPY x from stdin (log_verbosity unsupported);
^
+HINT: Valid values are "default", "silent", and "verbose".
COPY x from stdin with (reject_limit 1);
ERROR: COPY REJECT_LIMIT requires ON_ERROR to be set to IGNORE
COPY x from stdin with (on_error ignore, reject_limit 0);
@@ -138,6 +140,31 @@ COPY x from stdin with (header 2.5);
ERROR: header requires a Boolean value, a non-negative integer, or the string "match"
COPY x to stdout with (header 2);
ERROR: cannot use multi-line header in COPY TO
+-- test error hints for invalid COPY options
+COPY x from stdin (foramt CSV); -- error, suggests "format"
+ERROR: option "foramt" not recognized
+LINE 1: COPY x from stdin (foramt CSV);
+ ^
+HINT: Perhaps you meant "format".
+COPY x from stdin (completely_invalid_option 'value'); -- error, no suggestion
+ERROR: option "completely_invalid_option" not recognized
+LINE 1: COPY x from stdin (completely_invalid_option 'value');
+ ^
+COPY x from stdin (format cvs); -- error, lists valid formats
+ERROR: COPY format "cvs" not recognized
+LINE 1: COPY x from stdin (format cvs);
+ ^
+HINT: Valid formats are "binary", "csv", and "text".
+COPY x from stdin (format CSV, on_error ignor); -- error, lists valid on_error values
+ERROR: COPY ON_ERROR "ignor" not recognized
+LINE 1: COPY x from stdin (format CSV, on_error ignor);
+ ^
+HINT: Valid values are "ignore" and "stop".
+COPY x from stdin (format CSV, log_verbosity verbos); -- error, lists valid log_verbosity values
+ERROR: COPY LOG_VERBOSITY "verbos" not recognized
+LINE 1: COPY x from stdin (format CSV, log_verbosity verbos);
+ ^
+HINT: Valid values are "default", "silent", and "verbose".
-- too many columns in column list: should fail
COPY x (a, b, c, d, e, d, c) from stdin;
ERROR: column "d" specified more than once
diff --git a/src/test/regress/sql/copy2.sql b/src/test/regress/sql/copy2.sql
index cef45868db5..b8c9d2439ef 100644
--- a/src/test/regress/sql/copy2.sql
+++ b/src/test/regress/sql/copy2.sql
@@ -94,6 +94,13 @@ COPY x from stdin with (header -1);
COPY x from stdin with (header 2.5);
COPY x to stdout with (header 2);
+-- test error hints for invalid COPY options
+COPY x from stdin (foramt CSV); -- error, suggests "format"
+COPY x from stdin (completely_invalid_option 'value'); -- error, no suggestion
+COPY x from stdin (format cvs); -- error, lists valid formats
+COPY x from stdin (format CSV, on_error ignor); -- error, lists valid on_error values
+COPY x from stdin (format CSV, log_verbosity verbos); -- error, lists valid log_verbosity values
+
-- too many columns in column list: should fail
COPY x (a, b, c, d, e, d, c) from stdin;
--
2.50.1 (Apple Git-155)
Hi
On Thu, 27 Nov 2025 at 20:38, Sugamoto Shinya <shinya34892@gmail.com> wrote:
Regarding the option `convert_selectively`, what do you think about this?
Still it seems to me that this option is accessible from SQL interfaces. But for now
I just removed the comment changes from my v2 patch and would like to discuss
this with you here or in another thread.Regards,
It is worse than that - user can lose his data by using
convert_selectively from SQL:
```
db1=# COPY conv_test FROM STDIN (
FORMAT csv,
convert_selectively (a)
);
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
2,2,dsd
\.
COPY 1
db1=# table conv_test
db1-# ;
a | b | c
---+---+---
2 | |
(1 row)
```
So, this option should be rejected when manually specified. Looks like
we can do ereport() directly in parser, when reading option name
--
Best regards,
Kirill Reshke
On Wed, 26 Nov 2025 at 11:55, Sugamoto Shinya <shinya34892@gmail.com> wrote:
2025年11月25日(火) 6:50 Nathan Bossart <nathandbossart@gmail.com>:
On Mon, Nov 24, 2025 at 11:56:34AM -0800, Masahiko Sawada wrote:
On Sat, Nov 22, 2025 at 8:33 PM Sugamoto Shinya <shinya34892@gmail.com> wrote:
This follows the pattern already used elsewhere in PostgreSQL for providing
helpful error hints to users.Given we have 15 COPY options now, it sounds like a reasonable idea.
One concern about the patch is that when adding a new COPY option, we
could miss updating valid_copy_options list, resulting in providing a
wrong suggestion. I think we can consider refactoring the COPY option
handling so that we check the given option is a valid name or not by
checking valid_copy_options array and then process the option value.+1. Ideally, folks wouldn't need to update a separate list when adding new
options.Additionally, this patch corrects a misleading comment for the
convert_selectively option. The comment stated it was "not-accessible-from-SQL",
but actualy it has been accessible from SQL due to PostgreSQL's generic option parser.
The updated comment clarifies that while technically accessible, it's intended for
internal use and not recommended for end-user use due to potential data loss.Hmm, I'm not sure the proposed comment improves the clarification.
It's essentially non-accessible from SQL since we cannot provide a
valid value for convert_selectively from SQL commands.Yeah, I'd leave it alone, at least for this patch.
--
nathanThanks for checking my proposal.
For the refactoring of the COPY options, it sounds reasonable to me. Let me take that changes in my patch.
Also one little thing:
+{ + {"default", copy_opt_default, true}, + {"delimiter", copy_opt_delimiter, true}, + {"encoding", copy_opt_encoding, true}, + {"escape", copy_opt_escape, true}, + {"force_not_null", copy_opt_force_not_null, true}, + {"force_null", copy_opt_force_null, true}, + {"force_quote", copy_opt_force_quote, true}, + {"format", copy_opt_format, true}, + {"freeze", copy_opt_freeze, true}, + {"header", copy_opt_header, true}, + {"log_verbosity", copy_opt_log_verbosity, true}, + {"null", copy_opt_null, true}, + {"on_error", copy_opt_on_error, true}, + {"quote", copy_opt_quote, true}, + {"reject_limit", copy_opt_reject_limit, true}, + {"convert_selectively", copy_opt_convert_selectively, false}, + {NULL, NULL, false} +};
Maybe we need one more struct member here, to indicate which options
are valid to be specified by user?
Also, pattern
static const struct {..} array_name[] = ... is not used in PostgreSQL
sources. At least, I do not see any use of such .
--
Best regards,
Kirill Reshke
On Fri, Nov 28, 2025 at 2:59 AM Kirill Reshke <reshkekirill@gmail.com>
wrote:
On Wed, 26 Nov 2025 at 11:55, Sugamoto Shinya <shinya34892@gmail.com>
wrote:2025年11月25日(火) 6:50 Nathan Bossart <nathandbossart@gmail.com>:
On Mon, Nov 24, 2025 at 11:56:34AM -0800, Masahiko Sawada wrote:
On Sat, Nov 22, 2025 at 8:33 PM Sugamoto Shinya <
shinya34892@gmail.com> wrote:
This follows the pattern already used elsewhere in PostgreSQL for
providing
helpful error hints to users.
Given we have 15 COPY options now, it sounds like a reasonable idea.
One concern about the patch is that when adding a new COPY option, we
could miss updating valid_copy_options list, resulting in providing a
wrong suggestion. I think we can consider refactoring the COPY option
handling so that we check the given option is a valid name or not by
checking valid_copy_options array and then process the option value.+1. Ideally, folks wouldn't need to update a separate list when adding
new
options.
Additionally, this patch corrects a misleading comment for the
convert_selectively option. The comment stated it was"not-accessible-from-SQL",
but actualy it has been accessible from SQL due to PostgreSQL's
generic option parser.
The updated comment clarifies that while technically accessible,
it's intended for
internal use and not recommended for end-user use due to potential
data loss.
Hmm, I'm not sure the proposed comment improves the clarification.
It's essentially non-accessible from SQL since we cannot provide a
valid value for convert_selectively from SQL commands.Yeah, I'd leave it alone, at least for this patch.
--
nathanThanks for checking my proposal.
For the refactoring of the COPY options, it sounds reasonable to me. Let
me take that changes in my patch.
Also one little thing:
+{ + {"default", copy_opt_default, true}, + {"delimiter", copy_opt_delimiter, true}, + {"encoding", copy_opt_encoding, true}, + {"escape", copy_opt_escape, true}, + {"force_not_null", copy_opt_force_not_null, true}, + {"force_null", copy_opt_force_null, true}, + {"force_quote", copy_opt_force_quote, true}, + {"format", copy_opt_format, true}, + {"freeze", copy_opt_freeze, true}, + {"header", copy_opt_header, true}, + {"log_verbosity", copy_opt_log_verbosity, true}, + {"null", copy_opt_null, true}, + {"on_error", copy_opt_on_error, true}, + {"quote", copy_opt_quote, true}, + {"reject_limit", copy_opt_reject_limit, true}, + {"convert_selectively", copy_opt_convert_selectively, false}, + {NULL, NULL, false} +};Maybe we need one more struct member here, to indicate which options
are valid to be specified by user?Also, pattern
static const struct {..} array_name[] = ... is not used in PostgreSQL
sources. At least, I do not see any use of such .--
Best regards,
Kirill Reshke
Thanks for checking my proposal.
Maybe we need one more struct member here, to indicate which options
are valid to be specified by user?
If you don't mind, I would like to make a separate patch for fixing the
"convert_selectively"
option and focus on refactoring error handling here because I tend to feel
we should
separate refactoring changes and non-backward compatible changes into
different commits.
After this patch gets merged, I'll make another thread to discuss whether
we should block
unexpected "convert_selectively" use or not.
static const struct {..} array_name[] = ... is not used in PostgreSQL
sources. At least, I do not see any use of such .
I saw several places that use that sort of style, for example
src/backend/utils/adt/encode.c:836
and src/backend/access/heap/heapam.c:122, but you seems to be more or less
correct since
usually we define types explicitly like src/backend/foreign/foreign.c:576
and src/backend/backup/basebackup.c:191.
I updated my patch by defining a new type `CopyCoptionDef`.
Also, I added improvements to helper functions like defGet**. I just
removed and unified those
into corresponding proceeOption** functions.
Regards,
Attachments:
v3-0001-Refactor-COPY-option-handling.patchapplication/octet-stream; name=v3-0001-Refactor-COPY-option-handling.patchDownload
From c923fce80ee0d6cb8bc37f0dd893abc069a01c26 Mon Sep 17 00:00:00 2001
From: Shinya Sugamoto <sugamoto@me.com>
Date: Sat, 22 Nov 2025 22:03:21 +0900
Subject: [PATCH] Refactor COPY option handling and improve error hints
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
- Restructure ProcessCopyOptions() so that each option has its own
processor function. This makes option handling more modular and
easier to extend.
- Improve error messages by adding hints that show valid values
when users specify unrecognized option values.
---
contrib/file_fdw/expected/file_fdw.out | 3 +
src/backend/commands/copy.c | 597 ++++++++++++++-----------
src/test/regress/expected/copy2.out | 27 ++
src/test/regress/sql/copy2.sql | 7 +
4 files changed, 375 insertions(+), 259 deletions(-)
diff --git a/contrib/file_fdw/expected/file_fdw.out b/contrib/file_fdw/expected/file_fdw.out
index 5121e27dce5..bf953c62a49 100644
--- a/contrib/file_fdw/expected/file_fdw.out
+++ b/contrib/file_fdw/expected/file_fdw.out
@@ -54,6 +54,7 @@ CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS ("a=b" 'true'); -- ERROR
ERROR: invalid option name "a=b": must not contain "="
CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (format 'xml'); -- ERROR
ERROR: COPY format "xml" not recognized
+HINT: Valid formats are "binary", "csv", and "text".
CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (format 'text', quote ':'); -- ERROR
ERROR: COPY QUOTE requires CSV mode
CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (format 'text', escape ':'); -- ERROR
@@ -96,10 +97,12 @@ CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (format 'csv', null '
ERROR: COPY null representation cannot use newline or carriage return
CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (on_error 'unsupported'); -- ERROR
ERROR: COPY ON_ERROR "unsupported" not recognized
+HINT: Valid values are "ignore" and "stop".
CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (format 'binary', on_error 'ignore'); -- ERROR
ERROR: only ON_ERROR STOP is allowed in BINARY mode
CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (log_verbosity 'unsupported'); -- ERROR
ERROR: COPY LOG_VERBOSITY "unsupported" not recognized
+HINT: Valid values are "default", "silent", and "verbose".
CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (reject_limit '1'); -- ERROR
ERROR: COPY REJECT_LIMIT requires ON_ERROR to be set to IGNORE
CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (on_error 'ignore', reject_limit '0'); -- ERROR
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 28e878c3688..804810c2272 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -38,6 +38,7 @@
#include "utils/lsyscache.h"
#include "utils/rel.h"
#include "utils/rls.h"
+#include "utils/varlena.h"
/*
* DoCopy executes the SQL COPY statement
@@ -365,133 +366,328 @@ DoCopy(ParseState *pstate, const CopyStmt *stmt,
}
/*
- * Extract the CopyFormatOptions.header_line value from a DefElem.
- *
- * Parses the HEADER option for COPY, which can be a boolean, a non-negative
- * integer (number of lines to skip), or the special value "match".
+ * Context structure for COPY option processing.
*/
-static int
-defGetCopyHeaderOption(DefElem *def, bool is_from)
+typedef struct CopyOptionContext
{
- /*
- * If no parameter value given, assume "true" is meant.
- */
- if (def->arg == NULL)
- return COPY_HEADER_TRUE;
+ ParseState *pstate;
+ CopyFormatOptions *opts_out;
+ bool is_from;
+ bool format_specified;
+ bool freeze_specified;
+ bool header_specified;
+ bool on_error_specified;
+ bool log_verbosity_specified;
+ bool reject_limit_specified;
+} CopyOptionContext;
+
+/* Individual option processors */
+
+static void
+processOptionFormat(CopyOptionContext *ctx, DefElem *defel)
+{
+ char *fmt = defGetString(defel);
+
+ if (ctx->format_specified)
+ errorConflictingDefElem(defel, ctx->pstate);
+ ctx->format_specified = true;
+
+ if (strcmp(fmt, "text") == 0)
+ /* default format */ ;
+ else if (strcmp(fmt, "csv") == 0)
+ ctx->opts_out->csv_mode = true;
+ else if (strcmp(fmt, "binary") == 0)
+ ctx->opts_out->binary = true;
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("COPY format \"%s\" not recognized", fmt),
+ errhint("Valid formats are \"%s\", \"%s\", and \"%s\".",
+ "binary", "csv", "text"),
+ parser_errposition(ctx->pstate, defel->location)));
+}
- /*
- * Allow 0, 1, "true", "false", "on", "off", a non-negative integer, or
- * "match".
- */
- switch (nodeTag(def->arg))
+static void
+processOptionFreeze(CopyOptionContext *ctx, DefElem *defel)
+{
+ if (ctx->freeze_specified)
+ errorConflictingDefElem(defel, ctx->pstate);
+ ctx->freeze_specified = true;
+ ctx->opts_out->freeze = defGetBoolean(defel);
+}
+
+static void
+processOptionDelimiter(CopyOptionContext *ctx, DefElem *defel)
+{
+ if (ctx->opts_out->delim)
+ errorConflictingDefElem(defel, ctx->pstate);
+ ctx->opts_out->delim = defGetString(defel);
+}
+
+static void
+processOptionNull(CopyOptionContext *ctx, DefElem *defel)
+{
+ if (ctx->opts_out->null_print)
+ errorConflictingDefElem(defel, ctx->pstate);
+ ctx->opts_out->null_print = defGetString(defel);
+}
+
+static void
+processOptionDefault(CopyOptionContext *ctx, DefElem *defel)
+{
+ if (ctx->opts_out->default_print)
+ errorConflictingDefElem(defel, ctx->pstate);
+ ctx->opts_out->default_print = defGetString(defel);
+}
+
+static void
+processOptionHeader(CopyOptionContext *ctx, DefElem *defel)
+{
+ if (ctx->header_specified)
+ errorConflictingDefElem(defel, ctx->pstate);
+ ctx->header_specified = true;
+
+ if (defel->arg == NULL)
+ {
+ ctx->opts_out->header_line = COPY_HEADER_TRUE;
+ return;
+ }
+ switch (nodeTag(defel->arg))
{
case T_Integer:
{
- int ival = intVal(def->arg);
+ int ival = intVal(defel->arg);
if (ival < 0)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("a negative integer value cannot be "
- "specified for %s", def->defname)));
+ "specified for %s", defel->defname)));
- if (!is_from && ival > 1)
+ if (!ctx->is_from && ival > 1)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot use multi-line header in COPY TO")));
- return ival;
+ ctx->opts_out->header_line = ival;
+ return;
}
- break;
default:
{
- char *sval = defGetString(def);
+ char *sval = defGetString(defel);
- /*
- * The set of strings accepted here should match up with the
- * grammar's opt_boolean_or_string production.
- */
- if (pg_strcasecmp(sval, "true") == 0)
- return COPY_HEADER_TRUE;
- if (pg_strcasecmp(sval, "false") == 0)
- return COPY_HEADER_FALSE;
- if (pg_strcasecmp(sval, "on") == 0)
- return COPY_HEADER_TRUE;
- if (pg_strcasecmp(sval, "off") == 0)
- return COPY_HEADER_FALSE;
+ if (pg_strcasecmp(sval, "true") == 0 ||
+ pg_strcasecmp(sval, "on") == 0)
+ {
+ ctx->opts_out->header_line = COPY_HEADER_TRUE;
+ return;
+ }
+ if (pg_strcasecmp(sval, "false") == 0 ||
+ pg_strcasecmp(sval, "off") == 0)
+ {
+ ctx->opts_out->header_line = COPY_HEADER_FALSE;
+ return;
+ }
if (pg_strcasecmp(sval, "match") == 0)
{
- if (!is_from)
+ if (!ctx->is_from)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot use \"%s\" with HEADER in COPY TO",
sval)));
- return COPY_HEADER_MATCH;
+ ctx->opts_out->header_line = COPY_HEADER_MATCH;
+ return;
}
}
break;
}
+
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("%s requires a Boolean value, a non-negative integer, "
"or the string \"match\"",
- def->defname)));
- return COPY_HEADER_FALSE; /* keep compiler quiet */
+ defel->defname)));
}
-/*
- * Extract a CopyOnErrorChoice value from a DefElem.
- */
-static CopyOnErrorChoice
-defGetCopyOnErrorChoice(DefElem *def, ParseState *pstate, bool is_from)
+static void
+processOptionQuote(CopyOptionContext *ctx, DefElem *defel)
{
- char *sval = defGetString(def);
+ if (ctx->opts_out->quote)
+ errorConflictingDefElem(defel, ctx->pstate);
+ ctx->opts_out->quote = defGetString(defel);
+}
- if (!is_from)
+static void
+processOptionEscape(CopyOptionContext *ctx, DefElem *defel)
+{
+ if (ctx->opts_out->escape)
+ errorConflictingDefElem(defel, ctx->pstate);
+ ctx->opts_out->escape = defGetString(defel);
+}
+
+static void
+processOptionForceQuote(CopyOptionContext *ctx, DefElem *defel)
+{
+ if (ctx->opts_out->force_quote || ctx->opts_out->force_quote_all)
+ errorConflictingDefElem(defel, ctx->pstate);
+
+ if (defel->arg && IsA(defel->arg, A_Star))
+ ctx->opts_out->force_quote_all = true;
+ else if (defel->arg && IsA(defel->arg, List))
+ ctx->opts_out->force_quote = castNode(List, defel->arg);
+ else
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- /*- translator: first %s is the name of a COPY option, e.g. ON_ERROR,
- second %s is a COPY with direction, e.g. COPY TO */
- errmsg("COPY %s cannot be used with %s", "ON_ERROR", "COPY TO"),
- parser_errposition(pstate, def->location)));
+ errmsg("argument to option \"%s\" must be a list of column names",
+ defel->defname),
+ parser_errposition(ctx->pstate, defel->location)));
+}
+
+static void
+processOptionForceNotNull(CopyOptionContext *ctx, DefElem *defel)
+{
+ if (ctx->opts_out->force_notnull || ctx->opts_out->force_notnull_all)
+ errorConflictingDefElem(defel, ctx->pstate);
+
+ if (defel->arg && IsA(defel->arg, A_Star))
+ ctx->opts_out->force_notnull_all = true;
+ else if (defel->arg && IsA(defel->arg, List))
+ ctx->opts_out->force_notnull = castNode(List, defel->arg);
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("argument to option \"%s\" must be a list of column names",
+ defel->defname),
+ parser_errposition(ctx->pstate, defel->location)));
+}
+
+static void
+processOptionForceNull(CopyOptionContext *ctx, DefElem *defel)
+{
+ if (ctx->opts_out->force_null || ctx->opts_out->force_null_all)
+ errorConflictingDefElem(defel, ctx->pstate);
+
+ if (defel->arg && IsA(defel->arg, A_Star))
+ ctx->opts_out->force_null_all = true;
+ else if (defel->arg && IsA(defel->arg, List))
+ ctx->opts_out->force_null = castNode(List, defel->arg);
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("argument to option \"%s\" must be a list of column names",
+ defel->defname),
+ parser_errposition(ctx->pstate, defel->location)));
+}
+static void
+processOptionConvertSelectively(CopyOptionContext *ctx, DefElem *defel)
+{
/*
- * Allow "stop", or "ignore" values.
+ * Undocumented, not-accessible-from-SQL option: convert only the
+ * named columns to binary form, storing the rest as NULLs. It's
+ * allowed for the column list to be NIL.
*/
+ if (ctx->opts_out->convert_selectively)
+ errorConflictingDefElem(defel, ctx->pstate);
+ ctx->opts_out->convert_selectively = true;
+
+ if (defel->arg == NULL || IsA(defel->arg, List))
+ ctx->opts_out->convert_select = castNode(List, defel->arg);
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("argument to option \"%s\" must be a list of column names",
+ defel->defname),
+ parser_errposition(ctx->pstate, defel->location)));
+}
+
+static void
+processOptionEncoding(CopyOptionContext *ctx, DefElem *defel)
+{
+ if (ctx->opts_out->file_encoding >= 0)
+ errorConflictingDefElem(defel, ctx->pstate);
+
+ ctx->opts_out->file_encoding = pg_char_to_encoding(defGetString(defel));
+ if (ctx->opts_out->file_encoding < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("argument to option \"%s\" must be a valid encoding name",
+ defel->defname),
+ parser_errposition(ctx->pstate, defel->location)));
+}
+
+/* Process ON_ERROR option: "stop" or "ignore" */
+static void
+processOptionOnError(CopyOptionContext *ctx, DefElem *defel)
+{
+ char *sval;
+
+ if (ctx->on_error_specified)
+ errorConflictingDefElem(defel, ctx->pstate);
+ ctx->on_error_specified = true;
+
+ if (!ctx->is_from)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("COPY %s cannot be used with %s", "ON_ERROR", "COPY TO"),
+ parser_errposition(ctx->pstate, defel->location)));
+
+ sval = defGetString(defel);
if (pg_strcasecmp(sval, "stop") == 0)
- return COPY_ON_ERROR_STOP;
- if (pg_strcasecmp(sval, "ignore") == 0)
- return COPY_ON_ERROR_IGNORE;
+ ctx->opts_out->on_error = COPY_ON_ERROR_STOP;
+ else if (pg_strcasecmp(sval, "ignore") == 0)
+ ctx->opts_out->on_error = COPY_ON_ERROR_IGNORE;
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("COPY %s \"%s\" not recognized", "ON_ERROR", sval),
+ errhint("Valid values are \"%s\" and \"%s\".", "ignore", "stop"),
+ parser_errposition(ctx->pstate, defel->location)));
+}
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- /*- translator: first %s is the name of a COPY option, e.g. ON_ERROR */
- errmsg("COPY %s \"%s\" not recognized", "ON_ERROR", sval),
- parser_errposition(pstate, def->location)));
- return COPY_ON_ERROR_STOP; /* keep compiler quiet */
+static void
+processOptionLogVerbosity(CopyOptionContext *ctx, DefElem *defel)
+{
+ char *sval;
+
+ if (ctx->log_verbosity_specified)
+ errorConflictingDefElem(defel, ctx->pstate);
+ ctx->log_verbosity_specified = true;
+
+ sval = defGetString(defel);
+ if (pg_strcasecmp(sval, "silent") == 0)
+ ctx->opts_out->log_verbosity = COPY_LOG_VERBOSITY_SILENT;
+ else if (pg_strcasecmp(sval, "default") == 0)
+ ctx->opts_out->log_verbosity = COPY_LOG_VERBOSITY_DEFAULT;
+ else if (pg_strcasecmp(sval, "verbose") == 0)
+ ctx->opts_out->log_verbosity = COPY_LOG_VERBOSITY_VERBOSE;
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("COPY %s \"%s\" not recognized", "LOG_VERBOSITY", sval),
+ errhint("Valid values are \"%s\", \"%s\", and \"%s\".",
+ "default", "silent", "verbose"),
+ parser_errposition(ctx->pstate, defel->location)));
}
-/*
- * Extract REJECT_LIMIT value from a DefElem.
- *
- * REJECT_LIMIT can be specified in two ways: as an int64 for the COPY command
- * option or as a single-quoted string for the foreign table option using
- * file_fdw. Therefore this function needs to handle both formats.
- */
-static int64
-defGetCopyRejectLimitOption(DefElem *def)
+static void
+processOptionRejectLimit(CopyOptionContext *ctx, DefElem *defel)
{
int64 reject_limit;
- if (def->arg == NULL)
+ if (ctx->reject_limit_specified)
+ errorConflictingDefElem(defel, ctx->pstate);
+ ctx->reject_limit_specified = true;
+
+ if (defel->arg == NULL)
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("%s requires a numeric value",
- def->defname)));
- else if (nodeTag(def->arg) == T_String)
- reject_limit = pg_strtoint64(strVal(def->arg));
+ errmsg("%s requires a numeric value", defel->defname)));
+
+ if (nodeTag(defel->arg) == T_String)
+ reject_limit = pg_strtoint64(strVal(defel->arg));
else
- reject_limit = defGetInt64(def);
+ reject_limit = defGetInt64(defel);
if (reject_limit <= 0)
ereport(ERROR,
@@ -499,35 +695,45 @@ defGetCopyRejectLimitOption(DefElem *def)
errmsg("REJECT_LIMIT (%" PRId64 ") must be greater than zero",
reject_limit)));
- return reject_limit;
+ ctx->opts_out->reject_limit = reject_limit;
}
/*
- * Extract a CopyLogVerbosityChoice value from a DefElem.
+ * Definition of a COPY option.
*/
-static CopyLogVerbosityChoice
-defGetCopyLogVerbosityChoice(DefElem *def, ParseState *pstate)
+typedef struct CopyOptionDef
{
- char *sval;
+ const char *name;
+ void (*processor) (CopyOptionContext *ctx, DefElem *defel);
+ bool suggest_in_hints;
+} CopyOptionDef;
- /*
- * Allow "silent", "default", or "verbose" values.
- */
- sval = defGetString(def);
- if (pg_strcasecmp(sval, "silent") == 0)
- return COPY_LOG_VERBOSITY_SILENT;
- if (pg_strcasecmp(sval, "default") == 0)
- return COPY_LOG_VERBOSITY_DEFAULT;
- if (pg_strcasecmp(sval, "verbose") == 0)
- return COPY_LOG_VERBOSITY_VERBOSE;
-
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- /*- translator: first %s is the name of a COPY option, e.g. ON_ERROR */
- errmsg("COPY %s \"%s\" not recognized", "LOG_VERBOSITY", sval),
- parser_errposition(pstate, def->location)));
- return COPY_LOG_VERBOSITY_DEFAULT; /* keep compiler quiet */
-}
+/*
+ * Table of all valid COPY options.
+ *
+ * This is the single source of truth for valid COPY options. When adding a
+ * new option, add an entry here and implement its processor function above.
+ */
+static const CopyOptionDef copy_options[] =
+{
+ {"default", processOptionDefault, true},
+ {"delimiter", processOptionDelimiter, true},
+ {"encoding", processOptionEncoding, true},
+ {"escape", processOptionEscape, true},
+ {"force_not_null", processOptionForceNotNull, true},
+ {"force_null", processOptionForceNull, true},
+ {"force_quote", processOptionForceQuote, true},
+ {"format", processOptionFormat, true},
+ {"freeze", processOptionFreeze, true},
+ {"header", processOptionHeader, true},
+ {"log_verbosity", processOptionLogVerbosity, true},
+ {"null", processOptionNull, true},
+ {"on_error", processOptionOnError, true},
+ {"quote", processOptionQuote, true},
+ {"reject_limit", processOptionRejectLimit, true},
+ {"convert_selectively", processOptionConvertSelectively, false},
+ {NULL, NULL, false}
+};
/*
* Process the statement option list for COPY.
@@ -551,12 +757,7 @@ ProcessCopyOptions(ParseState *pstate,
bool is_from,
List *options)
{
- bool format_specified = false;
- bool freeze_specified = false;
- bool header_specified = false;
- bool on_error_specified = false;
- bool log_verbosity_specified = false;
- bool reject_limit_specified = false;
+ CopyOptionContext ctx;
ListCell *option;
/* Support external use for option sanity checking */
@@ -565,177 +766,55 @@ ProcessCopyOptions(ParseState *pstate,
opts_out->file_encoding = -1;
+ /* Initialize option processing context */
+ memset(&ctx, 0, sizeof(ctx));
+ ctx.pstate = pstate;
+ ctx.opts_out = opts_out;
+ ctx.is_from = is_from;
+
/* Extract options from the statement node tree */
foreach(option, options)
{
DefElem *defel = lfirst_node(DefElem, option);
+ bool found = false;
- if (strcmp(defel->defname, "format") == 0)
- {
- char *fmt = defGetString(defel);
-
- if (format_specified)
- errorConflictingDefElem(defel, pstate);
- format_specified = true;
- if (strcmp(fmt, "text") == 0)
- /* default format */ ;
- else if (strcmp(fmt, "csv") == 0)
- opts_out->csv_mode = true;
- else if (strcmp(fmt, "binary") == 0)
- opts_out->binary = true;
- else
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("COPY format \"%s\" not recognized", fmt),
- parser_errposition(pstate, defel->location)));
- }
- else if (strcmp(defel->defname, "freeze") == 0)
- {
- if (freeze_specified)
- errorConflictingDefElem(defel, pstate);
- freeze_specified = true;
- opts_out->freeze = defGetBoolean(defel);
- }
- else if (strcmp(defel->defname, "delimiter") == 0)
- {
- if (opts_out->delim)
- errorConflictingDefElem(defel, pstate);
- opts_out->delim = defGetString(defel);
- }
- else if (strcmp(defel->defname, "null") == 0)
- {
- if (opts_out->null_print)
- errorConflictingDefElem(defel, pstate);
- opts_out->null_print = defGetString(defel);
- }
- else if (strcmp(defel->defname, "default") == 0)
- {
- if (opts_out->default_print)
- errorConflictingDefElem(defel, pstate);
- opts_out->default_print = defGetString(defel);
- }
- else if (strcmp(defel->defname, "header") == 0)
- {
- if (header_specified)
- errorConflictingDefElem(defel, pstate);
- header_specified = true;
- opts_out->header_line = defGetCopyHeaderOption(defel, is_from);
- }
- else if (strcmp(defel->defname, "quote") == 0)
- {
- if (opts_out->quote)
- errorConflictingDefElem(defel, pstate);
- opts_out->quote = defGetString(defel);
- }
- else if (strcmp(defel->defname, "escape") == 0)
- {
- if (opts_out->escape)
- errorConflictingDefElem(defel, pstate);
- opts_out->escape = defGetString(defel);
- }
- else if (strcmp(defel->defname, "force_quote") == 0)
- {
- if (opts_out->force_quote || opts_out->force_quote_all)
- errorConflictingDefElem(defel, pstate);
- if (defel->arg && IsA(defel->arg, A_Star))
- opts_out->force_quote_all = true;
- else if (defel->arg && IsA(defel->arg, List))
- opts_out->force_quote = castNode(List, defel->arg);
- else
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("argument to option \"%s\" must be a list of column names",
- defel->defname),
- parser_errposition(pstate, defel->location)));
- }
- else if (strcmp(defel->defname, "force_not_null") == 0)
- {
- if (opts_out->force_notnull || opts_out->force_notnull_all)
- errorConflictingDefElem(defel, pstate);
- if (defel->arg && IsA(defel->arg, A_Star))
- opts_out->force_notnull_all = true;
- else if (defel->arg && IsA(defel->arg, List))
- opts_out->force_notnull = castNode(List, defel->arg);
- else
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("argument to option \"%s\" must be a list of column names",
- defel->defname),
- parser_errposition(pstate, defel->location)));
- }
- else if (strcmp(defel->defname, "force_null") == 0)
+ /* Look up the option in copy_options table */
+ for (int i = 0; copy_options[i].name != NULL; i++)
{
- if (opts_out->force_null || opts_out->force_null_all)
- errorConflictingDefElem(defel, pstate);
- if (defel->arg && IsA(defel->arg, A_Star))
- opts_out->force_null_all = true;
- else if (defel->arg && IsA(defel->arg, List))
- opts_out->force_null = castNode(List, defel->arg);
- else
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("argument to option \"%s\" must be a list of column names",
- defel->defname),
- parser_errposition(pstate, defel->location)));
+ if (strcmp(copy_options[i].name, defel->defname) == 0)
+ {
+ copy_options[i].processor(&ctx, defel);
+ found = true;
+ break;
+ }
}
- else if (strcmp(defel->defname, "convert_selectively") == 0)
+
+ if (!found)
{
/*
- * Undocumented, not-accessible-from-SQL option: convert only the
- * named columns to binary form, storing the rest as NULLs. It's
- * allowed for the column list to be NIL.
+ * When an unknown option is specified, complain about it.
+ * Provide a hint with a valid option that looks similar, if there
+ * is one.
*/
- if (opts_out->convert_selectively)
- errorConflictingDefElem(defel, pstate);
- opts_out->convert_selectively = true;
- if (defel->arg == NULL || IsA(defel->arg, List))
- opts_out->convert_select = castNode(List, defel->arg);
- else
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("argument to option \"%s\" must be a list of column names",
- defel->defname),
- parser_errposition(pstate, defel->location)));
- }
- else if (strcmp(defel->defname, "encoding") == 0)
- {
- if (opts_out->file_encoding >= 0)
- errorConflictingDefElem(defel, pstate);
- opts_out->file_encoding = pg_char_to_encoding(defGetString(defel));
- if (opts_out->file_encoding < 0)
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("argument to option \"%s\" must be a valid encoding name",
- defel->defname),
- parser_errposition(pstate, defel->location)));
- }
- else if (strcmp(defel->defname, "on_error") == 0)
- {
- if (on_error_specified)
- errorConflictingDefElem(defel, pstate);
- on_error_specified = true;
- opts_out->on_error = defGetCopyOnErrorChoice(defel, pstate, is_from);
- }
- else if (strcmp(defel->defname, "log_verbosity") == 0)
- {
- if (log_verbosity_specified)
- errorConflictingDefElem(defel, pstate);
- log_verbosity_specified = true;
- opts_out->log_verbosity = defGetCopyLogVerbosityChoice(defel, pstate);
- }
- else if (strcmp(defel->defname, "reject_limit") == 0)
- {
- if (reject_limit_specified)
- errorConflictingDefElem(defel, pstate);
- reject_limit_specified = true;
- opts_out->reject_limit = defGetCopyRejectLimitOption(defel);
- }
- else
+ ClosestMatchState match_state;
+ const char *closest_match;
+
+ initClosestMatch(&match_state, defel->defname, 4);
+ for (int i = 0; copy_options[i].name != NULL; i++)
+ {
+ if (copy_options[i].suggest_in_hints)
+ updateClosestMatch(&match_state, copy_options[i].name);
+ }
+
+ closest_match = getClosestMatch(&match_state);
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("option \"%s\" not recognized",
defel->defname),
+ closest_match ?
+ errhint("Perhaps you meant \"%s\".", closest_match) : 0,
parser_errposition(pstate, defel->location)));
+ }
}
/*
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index f3fdce23459..a4b4e33248f 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -96,6 +96,7 @@ COPY x from stdin (on_error unsupported);
ERROR: COPY ON_ERROR "unsupported" not recognized
LINE 1: COPY x from stdin (on_error unsupported);
^
+HINT: Valid values are "ignore" and "stop".
COPY x from stdin (format TEXT, force_quote(a));
ERROR: COPY FORCE_QUOTE requires CSV mode
COPY x from stdin (format TEXT, force_quote *);
@@ -128,6 +129,7 @@ COPY x from stdin (log_verbosity unsupported);
ERROR: COPY LOG_VERBOSITY "unsupported" not recognized
LINE 1: COPY x from stdin (log_verbosity unsupported);
^
+HINT: Valid values are "default", "silent", and "verbose".
COPY x from stdin with (reject_limit 1);
ERROR: COPY REJECT_LIMIT requires ON_ERROR to be set to IGNORE
COPY x from stdin with (on_error ignore, reject_limit 0);
@@ -138,6 +140,31 @@ COPY x from stdin with (header 2.5);
ERROR: header requires a Boolean value, a non-negative integer, or the string "match"
COPY x to stdout with (header 2);
ERROR: cannot use multi-line header in COPY TO
+-- test error hints for invalid COPY options
+COPY x from stdin (foramt CSV); -- error, suggests "format"
+ERROR: option "foramt" not recognized
+LINE 1: COPY x from stdin (foramt CSV);
+ ^
+HINT: Perhaps you meant "format".
+COPY x from stdin (completely_invalid_option 'value'); -- error, no suggestion
+ERROR: option "completely_invalid_option" not recognized
+LINE 1: COPY x from stdin (completely_invalid_option 'value');
+ ^
+COPY x from stdin (format cvs); -- error, lists valid formats
+ERROR: COPY format "cvs" not recognized
+LINE 1: COPY x from stdin (format cvs);
+ ^
+HINT: Valid formats are "binary", "csv", and "text".
+COPY x from stdin (format CSV, on_error ignor); -- error, lists valid on_error values
+ERROR: COPY ON_ERROR "ignor" not recognized
+LINE 1: COPY x from stdin (format CSV, on_error ignor);
+ ^
+HINT: Valid values are "ignore" and "stop".
+COPY x from stdin (format CSV, log_verbosity verbos); -- error, lists valid log_verbosity values
+ERROR: COPY LOG_VERBOSITY "verbos" not recognized
+LINE 1: COPY x from stdin (format CSV, log_verbosity verbos);
+ ^
+HINT: Valid values are "default", "silent", and "verbose".
-- too many columns in column list: should fail
COPY x (a, b, c, d, e, d, c) from stdin;
ERROR: column "d" specified more than once
diff --git a/src/test/regress/sql/copy2.sql b/src/test/regress/sql/copy2.sql
index cef45868db5..b8c9d2439ef 100644
--- a/src/test/regress/sql/copy2.sql
+++ b/src/test/regress/sql/copy2.sql
@@ -94,6 +94,13 @@ COPY x from stdin with (header -1);
COPY x from stdin with (header 2.5);
COPY x to stdout with (header 2);
+-- test error hints for invalid COPY options
+COPY x from stdin (foramt CSV); -- error, suggests "format"
+COPY x from stdin (completely_invalid_option 'value'); -- error, no suggestion
+COPY x from stdin (format cvs); -- error, lists valid formats
+COPY x from stdin (format CSV, on_error ignor); -- error, lists valid on_error values
+COPY x from stdin (format CSV, log_verbosity verbos); -- error, lists valid log_verbosity values
+
-- too many columns in column list: should fail
COPY x (a, b, c, d, e, d, c) from stdin;
--
2.50.1 (Apple Git-155)
On Sat, Nov 29, 2025 at 9:36 PM Sugamoto Shinya <shinya34892@gmail.com>
wrote:
On Fri, Nov 28, 2025 at 2:59 AM Kirill Reshke <reshkekirill@gmail.com>
wrote:On Wed, 26 Nov 2025 at 11:55, Sugamoto Shinya <shinya34892@gmail.com>
wrote:2025年11月25日(火) 6:50 Nathan Bossart <nathandbossart@gmail.com>:
On Mon, Nov 24, 2025 at 11:56:34AM -0800, Masahiko Sawada wrote:
On Sat, Nov 22, 2025 at 8:33 PM Sugamoto Shinya <
shinya34892@gmail.com> wrote:
This follows the pattern already used elsewhere in PostgreSQL for
providing
helpful error hints to users.
Given we have 15 COPY options now, it sounds like a reasonable idea.
One concern about the patch is that when adding a new COPY option, we
could miss updating valid_copy_options list, resulting in providing a
wrong suggestion. I think we can consider refactoring the COPY option
handling so that we check the given option is a valid name or not by
checking valid_copy_options array and then process the option value.+1. Ideally, folks wouldn't need to update a separate list when
adding new
options.
Additionally, this patch corrects a misleading comment for the
convert_selectively option. The comment stated it was"not-accessible-from-SQL",
but actualy it has been accessible from SQL due to PostgreSQL's
generic option parser.
The updated comment clarifies that while technically accessible,
it's intended for
internal use and not recommended for end-user use due to potential
data loss.
Hmm, I'm not sure the proposed comment improves the clarification.
It's essentially non-accessible from SQL since we cannot provide a
valid value for convert_selectively from SQL commands.Yeah, I'd leave it alone, at least for this patch.
--
nathanThanks for checking my proposal.
For the refactoring of the COPY options, it sounds reasonable to me.
Let me take that changes in my patch.
Also one little thing:
+{ + {"default", copy_opt_default, true}, + {"delimiter", copy_opt_delimiter, true}, + {"encoding", copy_opt_encoding, true}, + {"escape", copy_opt_escape, true}, + {"force_not_null", copy_opt_force_not_null, true}, + {"force_null", copy_opt_force_null, true}, + {"force_quote", copy_opt_force_quote, true}, + {"format", copy_opt_format, true}, + {"freeze", copy_opt_freeze, true}, + {"header", copy_opt_header, true}, + {"log_verbosity", copy_opt_log_verbosity, true}, + {"null", copy_opt_null, true}, + {"on_error", copy_opt_on_error, true}, + {"quote", copy_opt_quote, true}, + {"reject_limit", copy_opt_reject_limit, true}, + {"convert_selectively", copy_opt_convert_selectively, false}, + {NULL, NULL, false} +};Maybe we need one more struct member here, to indicate which options
are valid to be specified by user?Also, pattern
static const struct {..} array_name[] = ... is not used in PostgreSQL
sources. At least, I do not see any use of such .--
Best regards,
Kirill ReshkeThanks for checking my proposal.
Maybe we need one more struct member here, to indicate which options
are valid to be specified by user?If you don't mind, I would like to make a separate patch for fixing the
"convert_selectively"
option and focus on refactoring error handling here because I tend to feel
we should
separate refactoring changes and non-backward compatible changes into
different commits.
After this patch gets merged, I'll make another thread to discuss whether
we should block
unexpected "convert_selectively" use or not.static const struct {..} array_name[] = ... is not used in PostgreSQL
sources. At least, I do not see any use of such .I saw several places that use that sort of style, for example
src/backend/utils/adt/encode.c:836
and src/backend/access/heap/heapam.c:122, but you seems to be more or less
correct since
usually we define types explicitly like src/backend/foreign/foreign.c:576
and src/backend/backup/basebackup.c:191.
I updated my patch by defining a new type `CopyCoptionDef`.Also, I added improvements to helper functions like defGet**. I just
removed and unified those
into corresponding proceeOption** functions.Regards,
Hi,
Just a friendly ping on this thread.
In the latest version of the patch, I refactored COPY option handling so
that:
-
All the COPY options and their validation functions are defined in a
single table (CopyOptionDef), and
-
Error hints for invalid option names/values are generated based on that
table.
The goal was to make it harder to forget updating the error-hinting logic
when adding new options, and to keep validation logic in one place.
But on the other hand, I can also simplify this if you feel the current
approach is too heavy. For example, one alternative would be to keep the
existing per-option handling and just add a minimal option check like
validate_copy_option() near the top of the main options loop in order to
keep our implementations simple and small, even if that does not completely
eliminate the chance of someone missing an update.
This is an alternative approche what I mentioned here.
```
list valid_options = ["format", "force_null", ...]
foreach(opt, options)
{
validate_copy_option(opt, valid_options) <--- THIS
if (opt.name == "format") ...
if (opt.name == "force_null") ...
...
}
```
Regarding convert_selectively, I have kept behavior and comments unchanged
in this patch. As I said, I plan to propose a separate patch to address the
possibility of users specifying convert_selectively from SQL (e.g., by
rejecting it in the parser), once we agree on the direction for this
refactoring.
I would appreciate any feedback or preferences on the current approach. I
am happy to adjust the design and createe a new version accordingly.
Regards,
On Tue, Dec 2, 2025 at 3:42 AM Sugamoto Shinya <shinya34892@gmail.com> wrote:
On Sat, Nov 29, 2025 at 9:36 PM Sugamoto Shinya <shinya34892@gmail.com> wrote:
On Fri, Nov 28, 2025 at 2:59 AM Kirill Reshke <reshkekirill@gmail.com> wrote:
On Wed, 26 Nov 2025 at 11:55, Sugamoto Shinya <shinya34892@gmail.com> wrote:
2025年11月25日(火) 6:50 Nathan Bossart <nathandbossart@gmail.com>:
On Mon, Nov 24, 2025 at 11:56:34AM -0800, Masahiko Sawada wrote:
On Sat, Nov 22, 2025 at 8:33 PM Sugamoto Shinya <shinya34892@gmail.com> wrote:
This follows the pattern already used elsewhere in PostgreSQL for providing
helpful error hints to users.Given we have 15 COPY options now, it sounds like a reasonable idea.
One concern about the patch is that when adding a new COPY option, we
could miss updating valid_copy_options list, resulting in providing a
wrong suggestion. I think we can consider refactoring the COPY option
handling so that we check the given option is a valid name or not by
checking valid_copy_options array and then process the option value.+1. Ideally, folks wouldn't need to update a separate list when adding new
options.Additionally, this patch corrects a misleading comment for the
convert_selectively option. The comment stated it was "not-accessible-from-SQL",
but actualy it has been accessible from SQL due to PostgreSQL's generic option parser.
The updated comment clarifies that while technically accessible, it's intended for
internal use and not recommended for end-user use due to potential data loss.Hmm, I'm not sure the proposed comment improves the clarification.
It's essentially non-accessible from SQL since we cannot provide a
valid value for convert_selectively from SQL commands.Yeah, I'd leave it alone, at least for this patch.
--
nathanThanks for checking my proposal.
For the refactoring of the COPY options, it sounds reasonable to me. Let me take that changes in my patch.
Also one little thing:
+{ + {"default", copy_opt_default, true}, + {"delimiter", copy_opt_delimiter, true}, + {"encoding", copy_opt_encoding, true}, + {"escape", copy_opt_escape, true}, + {"force_not_null", copy_opt_force_not_null, true}, + {"force_null", copy_opt_force_null, true}, + {"force_quote", copy_opt_force_quote, true}, + {"format", copy_opt_format, true}, + {"freeze", copy_opt_freeze, true}, + {"header", copy_opt_header, true}, + {"log_verbosity", copy_opt_log_verbosity, true}, + {"null", copy_opt_null, true}, + {"on_error", copy_opt_on_error, true}, + {"quote", copy_opt_quote, true}, + {"reject_limit", copy_opt_reject_limit, true}, + {"convert_selectively", copy_opt_convert_selectively, false}, + {NULL, NULL, false} +};Maybe we need one more struct member here, to indicate which options
are valid to be specified by user?Also, pattern
static const struct {..} array_name[] = ... is not used in PostgreSQL
sources. At least, I do not see any use of such .--
Best regards,
Kirill ReshkeThanks for checking my proposal.
Maybe we need one more struct member here, to indicate which options
are valid to be specified by user?If you don't mind, I would like to make a separate patch for fixing the "convert_selectively"
option and focus on refactoring error handling here because I tend to feel we should
separate refactoring changes and non-backward compatible changes into different commits.
After this patch gets merged, I'll make another thread to discuss whether we should block
unexpected "convert_selectively" use or not.static const struct {..} array_name[] = ... is not used in PostgreSQL
sources. At least, I do not see any use of such .I saw several places that use that sort of style, for example src/backend/utils/adt/encode.c:836
and src/backend/access/heap/heapam.c:122, but you seems to be more or less correct since
usually we define types explicitly like src/backend/foreign/foreign.c:576 and src/backend/backup/basebackup.c:191.
I updated my patch by defining a new type `CopyCoptionDef`.Also, I added improvements to helper functions like defGet**. I just removed and unified those
into corresponding proceeOption** functions.Regards,
Hi,
Just a friendly ping on this thread.
In the latest version of the patch, I refactored COPY option handling so that:
All the COPY options and their validation functions are defined in a single table (CopyOptionDef), and
Error hints for invalid option names/values are generated based on that table.
The goal was to make it harder to forget updating the error-hinting logic when adding new options, and to keep validation logic in one place.
But on the other hand, I can also simplify this if you feel the current approach is too heavy. For example, one alternative would be to keep the existing per-option handling and just add a minimal option check like validate_copy_option() near the top of the main options loop in order to keep our implementations simple and small, even if that does not completely eliminate the chance of someone missing an update.
This is an alternative approche what I mentioned here.
```
list valid_options = ["format", "force_null", ...]foreach(opt, options)
{
validate_copy_option(opt, valid_options) <--- THISif (opt.name == "format") ...
if (opt.name == "force_null") ...
...
}
```
The proposed patch requires us to create one function per option. I'm
concerned that it could cause bloating functions and be overkill just
to save changing a separate list. I would suggest starting with
putting a validation check for options at the top of foreach() loop.
When adding a new COPY option in the future, it wouldn't work if we
miss either changing the valid-options list or handling the option,
which seems a good prevention.
Regarding convert_selectively, I have kept behavior and comments unchanged in this patch. As I said, I plan to propose a separate patch to address the possibility of users specifying convert_selectively from SQL (e.g., by rejecting it in the parser), once we agree on the direction for this refactoring.
+1 for a separate discussion and patch. Thank you for pointing out
that it actually can be accessible via SQL command.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Thu, 4 Dec 2025 at 03:56, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
The proposed patch requires us to create one function per option. I'm
concerned that it could cause bloating functions and be overkill just
to save changing a separate list. I would suggest starting with
putting a validation check for options at the top of foreach() loop.
When adding a new COPY option in the future, it wouldn't work if we
miss either changing the valid-options list or handling the option,
which seems a good prevention.
Yep, Im +1 on that. "bloating functions" - that's precise wording, I
did not know how to explain the same concern.
--
Best regards,
Kirill Reshke
On Thu, Dec 4, 2025 at 4:35 PM Kirill Reshke <reshkekirill@gmail.com> wrote:
On Thu, 4 Dec 2025 at 03:56, Masahiko Sawada <sawada.mshk@gmail.com>
wrote:The proposed patch requires us to create one function per option. I'm
concerned that it could cause bloating functions and be overkill just
to save changing a separate list. I would suggest starting with
putting a validation check for options at the top of foreach() loop.
When adding a new COPY option in the future, it wouldn't work if we
miss either changing the valid-options list or handling the option,
which seems a good prevention.Yep, Im +1 on that. "bloating functions" - that's precise wording, I
did not know how to explain the same concern.--
Best regards,
Kirill Reshke
Thanks everyone for reviewing my proposal.
The proposed patch requires us to create one function per option. I'm
concerned that it could cause bloating functions and be overkill just
to save changing a separate list. I would suggest starting with
putting a validation check for options at the top of foreach() loop.
When adding a new COPY option in the future, it wouldn't work if we
miss either changing the valid-options list or handling the option,
which seems a good prevention.
I completely agree with your feedback. I will proceed with a smaller and
simpler revision of the changes. The v3 patch was beneficial in that it
removed duplicated option names for COPY options in the code, but it
introduced too much refactoring for such a small improvement.
Attached is the new patch. One possible discussion point is that I choose
FATAL
over ERROR at src/backend/commands/copy.c#L816, since reaching that point
would
indicate an implementation problem. Please let me know if there is a better
option.
Regards,
Attachments:
v4-0001-Add-option-name-validation-and-hints-for-COPY-comman.patchapplication/octet-stream; name=v4-0001-Add-option-name-validation-and-hints-for-COPY-comman.patchDownload
From b7dd821ed89dcd2846169f46ff983e051b75ef5e Mon Sep 17 00:00:00 2001
From: Shinya Sugamoto <sugamoto@me.com>
Date: Fri, 5 Dec 2025 00:16:55 +0900
Subject: [PATCH] Add option name validation and hints for COPY command
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Add validation for COPY option names with helpful error messages:
- Suggest similar option names for typos (e.g., "foramt" -> "format")
- Show valid values for format, on_error, and log_verbosity options
- Internal options like convert_selectively are excluded from suggestions
---
contrib/file_fdw/expected/file_fdw.out | 3 +
src/backend/commands/copy.c | 91 ++++++++++++++++++++++++--
src/test/regress/expected/copy2.out | 31 +++++++++
src/test/regress/sql/copy2.sql | 8 +++
4 files changed, 128 insertions(+), 5 deletions(-)
diff --git a/contrib/file_fdw/expected/file_fdw.out b/contrib/file_fdw/expected/file_fdw.out
index 5121e27dce5..bf953c62a49 100644
--- a/contrib/file_fdw/expected/file_fdw.out
+++ b/contrib/file_fdw/expected/file_fdw.out
@@ -54,6 +54,7 @@ CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS ("a=b" 'true'); -- ERROR
ERROR: invalid option name "a=b": must not contain "="
CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (format 'xml'); -- ERROR
ERROR: COPY format "xml" not recognized
+HINT: Valid formats are "binary", "csv", and "text".
CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (format 'text', quote ':'); -- ERROR
ERROR: COPY QUOTE requires CSV mode
CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (format 'text', escape ':'); -- ERROR
@@ -96,10 +97,12 @@ CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (format 'csv', null '
ERROR: COPY null representation cannot use newline or carriage return
CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (on_error 'unsupported'); -- ERROR
ERROR: COPY ON_ERROR "unsupported" not recognized
+HINT: Valid values are "ignore" and "stop".
CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (format 'binary', on_error 'ignore'); -- ERROR
ERROR: only ON_ERROR STOP is allowed in BINARY mode
CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (log_verbosity 'unsupported'); -- ERROR
ERROR: COPY LOG_VERBOSITY "unsupported" not recognized
+HINT: Valid values are "default", "silent", and "verbose".
CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (reject_limit '1'); -- ERROR
ERROR: COPY REJECT_LIMIT requires ON_ERROR to be set to IGNORE
CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (on_error 'ignore', reject_limit '0'); -- ERROR
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 28e878c3688..1d48035e3e0 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -35,9 +35,83 @@
#include "parser/parse_relation.h"
#include "utils/acl.h"
#include "utils/builtins.h"
+#include "utils/elog.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
#include "utils/rls.h"
+#include "utils/varlena.h"
+
+/*
+ * List of valid COPY option names, used for validation and error hints.
+ *
+ * Options with suggest_in_hints=true will be suggested in error hints when a
+ * user specifies an unrecognized option name. Internal/undocumented options
+ * should set this to false.
+ */
+typedef struct CopyOptionDef
+{
+ const char *name;
+ bool suggest_in_hints;
+} CopyOptionDef;
+
+static const CopyOptionDef valid_copy_options[] = {
+ {"default", true},
+ {"delimiter", true},
+ {"encoding", true},
+ {"escape", true},
+ {"force_not_null", true},
+ {"force_null", true},
+ {"force_quote", true},
+ {"format", true},
+ {"freeze", true},
+ {"header", true},
+ {"log_verbosity", true},
+ {"null", true},
+ {"on_error", true},
+ {"quote", true},
+ {"reject_limit", true},
+ {"convert_selectively", false}, /* internal, undocumented */
+ {NULL, false}
+};
+
+/*
+ * Validate a COPY option name.
+ *
+ * If the option name is not recognized, report an error with a hint
+ * suggesting a similar valid option name if one exists.
+ */
+static void
+validate_copy_option(const char *option, ParseState *pstate, int location)
+{
+ ClosestMatchState match_state;
+ const char *closest_match;
+
+ /* Check if it's a valid option */
+ for (int i = 0; valid_copy_options[i].name != NULL; i++)
+ {
+ if (strcmp(valid_copy_options[i].name, option) == 0)
+ return; /* valid option */
+ }
+
+ /*
+ * Unknown option - build an error with a hint suggesting a similar option
+ * name if there is one.
+ */
+ initClosestMatch(&match_state, option, 4);
+ for (int i = 0; valid_copy_options[i].name != NULL; i++)
+ {
+ if (valid_copy_options[i].suggest_in_hints)
+ updateClosestMatch(&match_state, valid_copy_options[i].name);
+ }
+ closest_match = getClosestMatch(&match_state);
+
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("option \"%s\" not recognized", option),
+ closest_match ?
+ errhint("Perhaps you meant \"%s\".", closest_match) : 0,
+ parser_errposition(pstate, location)));
+}
/*
* DoCopy executes the SQL COPY statement
@@ -467,6 +541,7 @@ defGetCopyOnErrorChoice(DefElem *def, ParseState *pstate, bool is_from)
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
/*- translator: first %s is the name of a COPY option, e.g. ON_ERROR */
errmsg("COPY %s \"%s\" not recognized", "ON_ERROR", sval),
+ errhint("Valid values are \"%s\" and \"%s\".", "ignore", "stop"),
parser_errposition(pstate, def->location)));
return COPY_ON_ERROR_STOP; /* keep compiler quiet */
}
@@ -525,6 +600,8 @@ defGetCopyLogVerbosityChoice(DefElem *def, ParseState *pstate)
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
/*- translator: first %s is the name of a COPY option, e.g. ON_ERROR */
errmsg("COPY %s \"%s\" not recognized", "LOG_VERBOSITY", sval),
+ errhint("Valid values are \"%s\", \"%s\", and \"%s\".",
+ "default", "silent", "verbose"),
parser_errposition(pstate, def->location)));
return COPY_LOG_VERBOSITY_DEFAULT; /* keep compiler quiet */
}
@@ -570,6 +647,9 @@ ProcessCopyOptions(ParseState *pstate,
{
DefElem *defel = lfirst_node(DefElem, option);
+ /* Validate the option name; errors out if unrecognized */
+ validate_copy_option(defel->defname, pstate, defel->location);
+
if (strcmp(defel->defname, "format") == 0)
{
char *fmt = defGetString(defel);
@@ -587,6 +667,8 @@ ProcessCopyOptions(ParseState *pstate,
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("COPY format \"%s\" not recognized", fmt),
+ errhint("Valid formats are \"%s\", \"%s\", and \"%s\".",
+ "binary", "csv", "text"),
parser_errposition(pstate, defel->location)));
}
else if (strcmp(defel->defname, "freeze") == 0)
@@ -731,11 +813,10 @@ ProcessCopyOptions(ParseState *pstate,
opts_out->reject_limit = defGetCopyRejectLimitOption(defel);
}
else
- ereport(ERROR,
- (errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("option \"%s\" not recognized",
- defel->defname),
- parser_errposition(pstate, defel->location)));
+ ereport(FATAL,
+ (errcode(ERRCODE_INTERNAL_ERROR),
+ errmsg("unhandled COPY option \"%s\"", defel->defname)));
+
}
/*
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index f3fdce23459..7480903023f 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -96,6 +96,7 @@ COPY x from stdin (on_error unsupported);
ERROR: COPY ON_ERROR "unsupported" not recognized
LINE 1: COPY x from stdin (on_error unsupported);
^
+HINT: Valid values are "ignore" and "stop".
COPY x from stdin (format TEXT, force_quote(a));
ERROR: COPY FORCE_QUOTE requires CSV mode
COPY x from stdin (format TEXT, force_quote *);
@@ -128,6 +129,7 @@ COPY x from stdin (log_verbosity unsupported);
ERROR: COPY LOG_VERBOSITY "unsupported" not recognized
LINE 1: COPY x from stdin (log_verbosity unsupported);
^
+HINT: Valid values are "default", "silent", and "verbose".
COPY x from stdin with (reject_limit 1);
ERROR: COPY REJECT_LIMIT requires ON_ERROR to be set to IGNORE
COPY x from stdin with (on_error ignore, reject_limit 0);
@@ -138,6 +140,35 @@ COPY x from stdin with (header 2.5);
ERROR: header requires a Boolean value, a non-negative integer, or the string "match"
COPY x to stdout with (header 2);
ERROR: cannot use multi-line header in COPY TO
+-- test error hints for invalid COPY options
+COPY x from stdin (foramt CSV); -- error, suggests "format"
+ERROR: option "foramt" not recognized
+LINE 1: COPY x from stdin (foramt CSV);
+ ^
+HINT: Perhaps you meant "format".
+COPY x from stdin (convert_selectivel (a)); -- error, should NOT suggest convert_selectively
+ERROR: option "convert_selectivel" not recognized
+LINE 1: COPY x from stdin (convert_selectivel (a));
+ ^
+COPY x from stdin (completely_invalid_option 'value'); -- error, no suggestion
+ERROR: option "completely_invalid_option" not recognized
+LINE 1: COPY x from stdin (completely_invalid_option 'value');
+ ^
+COPY x from stdin (format cvs); -- error, lists valid formats
+ERROR: COPY format "cvs" not recognized
+LINE 1: COPY x from stdin (format cvs);
+ ^
+HINT: Valid formats are "binary", "csv", and "text".
+COPY x from stdin (format CSV, on_error ignor); -- error, lists valid on_error values
+ERROR: COPY ON_ERROR "ignor" not recognized
+LINE 1: COPY x from stdin (format CSV, on_error ignor);
+ ^
+HINT: Valid values are "ignore" and "stop".
+COPY x from stdin (format CSV, log_verbosity verbos); -- error, lists valid log_verbosity values
+ERROR: COPY LOG_VERBOSITY "verbos" not recognized
+LINE 1: COPY x from stdin (format CSV, log_verbosity verbos);
+ ^
+HINT: Valid values are "default", "silent", and "verbose".
-- too many columns in column list: should fail
COPY x (a, b, c, d, e, d, c) from stdin;
ERROR: column "d" specified more than once
diff --git a/src/test/regress/sql/copy2.sql b/src/test/regress/sql/copy2.sql
index cef45868db5..39d8e3fd992 100644
--- a/src/test/regress/sql/copy2.sql
+++ b/src/test/regress/sql/copy2.sql
@@ -94,6 +94,14 @@ COPY x from stdin with (header -1);
COPY x from stdin with (header 2.5);
COPY x to stdout with (header 2);
+-- test error hints for invalid COPY options
+COPY x from stdin (foramt CSV); -- error, suggests "format"
+COPY x from stdin (convert_selectivel (a)); -- error, should NOT suggest convert_selectively
+COPY x from stdin (completely_invalid_option 'value'); -- error, no suggestion
+COPY x from stdin (format cvs); -- error, lists valid formats
+COPY x from stdin (format CSV, on_error ignor); -- error, lists valid on_error values
+COPY x from stdin (format CSV, log_verbosity verbos); -- error, lists valid log_verbosity values
+
-- too many columns in column list: should fail
COPY x (a, b, c, d, e, d, c) from stdin;
--
2.50.1 (Apple Git-155)
On Sun, Dec 7, 2025 at 6:32 AM Sugamoto Shinya <shinya34892@gmail.com> wrote:
On Thu, Dec 4, 2025 at 4:35 PM Kirill Reshke <reshkekirill@gmail.com> wrote:
On Thu, 4 Dec 2025 at 03:56, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
The proposed patch requires us to create one function per option. I'm
concerned that it could cause bloating functions and be overkill just
to save changing a separate list. I would suggest starting with
putting a validation check for options at the top of foreach() loop.
When adding a new COPY option in the future, it wouldn't work if we
miss either changing the valid-options list or handling the option,
which seems a good prevention.Yep, Im +1 on that. "bloating functions" - that's precise wording, I
did not know how to explain the same concern.--
Best regards,
Kirill ReshkeThanks everyone for reviewing my proposal.
The proposed patch requires us to create one function per option. I'm
concerned that it could cause bloating functions and be overkill just
to save changing a separate list. I would suggest starting with
putting a validation check for options at the top of foreach() loop.
When adding a new COPY option in the future, it wouldn't work if we
miss either changing the valid-options list or handling the option,
which seems a good prevention.I completely agree with your feedback. I will proceed with a smaller and
simpler revision of the changes. The v3 patch was beneficial in that it
removed duplicated option names for COPY options in the code, but it
introduced too much refactoring for such a small improvement.Attached is the new patch. One possible discussion point is that I choose FATAL
over ERROR at src/backend/commands/copy.c#L816, since reaching that point would
indicate an implementation problem. Please let me know if there is a better option.
We typically use elog(ERROR) for should-not-happen errors instead of
ereport(ERROR), but I don't think we need the last 'else if' branch
since we have validate_copy_option.
The patch mostly looks good to me. Here are some minor comments:
+#include "utils/elog.h"
I think we don't need to #include elog.h.
---
src/backend/commands/copy.c:819: trailing whitespace.
+
There is unnecessary whitespace.
---
+static const CopyOptionDef valid_copy_options[] = {
+ {"default", true},
+ {"delimiter", true},
+ {"encoding", true},
+ {"escape", true},
+ {"force_not_null", true},
+ {"force_null", true},
+ {"force_quote", true},
+ {"format", true},
+ {"freeze", true},
+ {"header", true},
+ {"log_verbosity", true},
+ {"null", true},
+ {"on_error", true},
+ {"quote", true},
+ {"reject_limit", true},
+ {"convert_selectively", false}, /* internal, undocumented */
+ {NULL, false}
+};
I think we can maintain this list in option name order (not sorted by
suggest_in_hints).
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Sun, Dec 07, 2025 at 11:32:28PM +0900, Sugamoto Shinya wrote:
Attached is the new patch. One possible discussion point is that I choose
FATAL over ERROR at src/backend/commands/copy.c#L816, since reaching that
point would indicate an implementation problem. Please let me know if
there is a better option.
Since it indicates a coding error, I would probably choose something like
Assert(false); /* should never happen */
That seems to be a standard way to handle these situations.
It's a little unfortunate that we are essentially validating the option
twice, but the way this is structured should at least prevent folks from
forgetting to add it in one place or another. One way to make this a
little better could be to add an enum that validate_copy_option() returns
(so that we don't have to repeat the strcmp()s). Or we could replace each
strcmp() in ProcessCopyOptions()'s option extraction block with a function
that does the strcmp() and also calls updateClosestMatch(). Then, by the
time we reach the final "else", we've already done the work for the hint.
--
nathan