Rejecting redundant options in Create Collation

Started by Daniel Veriteover 5 years ago5 messages
#1Daniel Verite
daniel@manitou-mail.org
1 attachment(s)

Hi,

Currently, it's not an error for CREATE COLLATION to be invoked
with options repeated several times. The last (rightmost) value is kept
and the others are lost.
For instance CREATE COLLATION x (lc_ctype='en_US.UTF8',
lc_collate='en_US.UTF8', lc_ctype='C')
silently ignores lc_ctype='en_US.UTF8'. But that kind of invocation
isn't likely to be legit. It's more plausible that it's the result of
some mistake or confusion.
The same goes for the other options:

CREATE COLLATION [ IF NOT EXISTS ] name (
[ LOCALE = locale, ]
[ LC_COLLATE = lc_collate, ]
[ LC_CTYPE = lc_ctype, ]
[ PROVIDER = provider, ]
[ DETERMINISTIC = boolean, ]
[ VERSION = version ]
)

I suggest the attached simple patch to raise an error when any of
these options is specified multiple times.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: https://www.manitou-mail.org
Twitter: @DanielVerite

Attachments:

create-collation-redundant-options.patchtext/plainDownload
diff --git a/src/backend/commands/collationcmds.c b/src/backend/commands/collationcmds.c
index 9f6582c530..7dfd544396 100644
--- a/src/backend/commands/collationcmds.c
+++ b/src/backend/commands/collationcmds.c
@@ -108,6 +108,19 @@ DefineCollation(ParseState *pstate, List *names, List *parameters, bool if_not_e
 			break;
 		}
 
+		if (*defelp != NULL)
+		{
+			/*
+			 * If the option was previously set, it means that it occurs
+			 * several times in the list, which is not allowed.
+			 */
+			ereport(ERROR,
+					(errcode(ERRCODE_SYNTAX_ERROR),
+					 errmsg("conflicting or redundant options",
+							defel->defname),
+					 parser_errposition(pstate, defel->location)));
+			break;
+		}
 		*defelp = defel;
 	}
 
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Daniel Verite (#1)
Re: Rejecting redundant options in Create Collation

"Daniel Verite" <daniel@manitou-mail.org> writes:

Currently, it's not an error for CREATE COLLATION to be invoked
with options repeated several times. The last (rightmost) value is kept
and the others are lost.
...
I suggest the attached simple patch to raise an error when any of
these options is specified multiple times.

Hmm ... I think that that is pretty standard behavior for a lot of
our utility commands. Trying something at random,

regression=# create operator <<< (leftarg = int, leftarg = int,
regression(# rightarg = int99, procedure = int4pl, rightarg = int);
CREATE OPERATOR

Note the lack of any comment on the flat-out-wrong first value
for rightarg :-(

I can see the argument for complaining about repeated options rather
than letting this pass. But shouldn't we try to make it uniform
instead of letting different commands have different policies?

regards, tom lane

#3Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#2)
Re: Rejecting redundant options in Create Collation

On Thu, Oct 01, 2020 at 02:58:53PM -0400, Tom Lane wrote:

Hmm ... I think that that is pretty standard behavior for a lot of
our utility commands. Trying something at random,

The behavior handling is a bit inconsistent. For example EXPLAIN and
VACUUM don't do that, because their parenthesized grammar got
introduced after the flavor that handles options as separate items in
the query, so redundant options was not something possible with only
the original grammar.

I can see the argument for complaining about repeated options rather
than letting this pass. But shouldn't we try to make it uniform
instead of letting different commands have different policies?

Consistency would be a good thing. There is a very recent discussion
in this area for COPY actually, where the option redundancy is handled
incorrectly:
/messages/by-id/20200929072433.GA15570@paquier.xyz

The argument goes in both ways. If we handle redundant options, I
would also suspect that users complain back about the lack of error
checks with stupid input values, but I'd like to think that it can
also be beneficial in some cases. Now, if we don't handle redundant
options, that's less code, less risk to introduce semi-broken
behaviors like what COPY has done for years and a bit less work for
new translators, but the last point is not really an issue as we have
a generic error message here. From the point of view of a code
maintainer, not handling redundant options at all is more appealing
IMO in the long-term. From the user perspective, there are benefits
and disadvantages with both. So, at the end, it seems to me that not
doing it anymore for any commands would have more benefits in the
long-term than doing it.
--
Michael

#4Daniel Verite
daniel@manitou-mail.org
In reply to: Michael Paquier (#3)
Re: Rejecting redundant options in Create Collation

Michael Paquier wrote:

Hmm ... I think that that is pretty standard behavior for a lot of
our utility commands. Trying something at random,

The behavior handling is a bit inconsistent. For example EXPLAIN and
VACUUM don't do that, because their parenthesized grammar got
introduced after the flavor that handles options as separate items in
the query, so redundant options was not something possible with only
the original grammar.

Assuming we agree that redundant options should consistently
raise an error for a certain class of statements, could it be handled
at the grammar level?
If "list of options enforcing uniqueness" was a grammatical construct,
the redundancy would be caught by the parser and there would be no
need for ad-hoc code in the implementation of utility statements.
I don't know if that makes sense, unfortunately I know next to nothing
about bison.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: https://www.manitou-mail.org
Twitter: @DanielVerite

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Daniel Verite (#4)
Re: Rejecting redundant options in Create Collation

"Daniel Verite" <daniel@manitou-mail.org> writes:

Assuming we agree that redundant options should consistently
raise an error for a certain class of statements, could it be handled
at the grammar level?

I don't think this'd be a great idea. The grammar would have to
do something pretty brute-force to check for duplication, whereas
the individual statements typically know already whether they've
seen an instance of a particular option.

It will be kind of annoying to make similar changes in every statement,
agreed. I wonder if there'd be any value in trying to make a subroutine
that deconstructs a DefElem list according to a provided list of option
names, allowing centralized handling of unknown-option and
duplicate-option cases, and maybe sharing handling of the simpler cases
such as boolean and integer options. I'm not quite sure what the output
ought to look like though. Some sort of statement-specific struct would
be really ideal, but C doesn't make that easy.

regards, tom lane