[PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges

Started by Tommy Pavlicekover 2 years ago52 messages
#1Tommy Pavlicek
tommypav122@gmail.com
2 attachment(s)

Hi All,

I've attached a couple of patches to allow ALTER OPERATOR to add
commutators, negators, hashes and merges to operators that lack them.

The need for this arose adding hash functions to the ltree type after the
operator had been created without hash support[1]/messages/by-id/CAEhP-W9ZEoHeaP_nKnPCVd_o1c3BAUvq1gWHrq8EbkNRiS9CvQ@mail.gmail.com. There are potential
issues with modifying these attributes that have been discussed
previously[2]/messages/by-id/3348985.V7xMLFDaJO@dinodell, but I understand that setting them, if they have not been
set before, is ok.

I belatedly realised that it may not be desirable or necessary to allow
adding commutators and negators in ALTER OPERATOR because the linkage can
already be added when creating a new operator. I don't know what's best, so
I thought I'd post this here and get feedback before removing anything.

The first patch is create_op_fixes_v1.patch and it includes some
refactoring in preparation for the ALTER OPERATOR changes and fixes a
couple of minor bugs in CREATE OPERATOR:
- prevents self negation when filling in/creating an existing shell operator
- remove reference to sort operator in the self negation error message as
the sort attribute appears to be deprecated in Postgres 8.3

The second patch is alter_op_v1.patch which contains the changes to ALTER
OPERATOR and depends on create_op_fixes_v1.patch.

Additionally, I wasn't sure whether it was preferred to fail or succeed on
ALTERs that have no effect, such as adding hashes on an operator that
already allows them or disabling hashes on one that does not. I chose to
raise an error when this happens, on the thinking it was more explicit and
made the code simpler, even though the end result would be what the user
wanted.

Comments appreciated.

Thanks,
Tommy

[1]: /messages/by-id/CAEhP-W9ZEoHeaP_nKnPCVd_o1c3BAUvq1gWHrq8EbkNRiS9CvQ@mail.gmail.com
/messages/by-id/CAEhP-W9ZEoHeaP_nKnPCVd_o1c3BAUvq1gWHrq8EbkNRiS9CvQ@mail.gmail.com
[2]: /messages/by-id/3348985.V7xMLFDaJO@dinodell

Attachments:

alter_op_v1.patchapplication/octet-stream; name=alter_op_v1.patchDownload
diff --git a/doc/src/sgml/ref/alter_operator.sgml b/doc/src/sgml/ref/alter_operator.sgml
index a4a1af564f..04af611cca 100644
--- a/doc/src/sgml/ref/alter_operator.sgml
+++ b/doc/src/sgml/ref/alter_operator.sgml
@@ -29,8 +29,12 @@ ALTER OPERATOR <replaceable>name</replaceable> ( { <replaceable>left_type</repla
 
 ALTER OPERATOR <replaceable>name</replaceable> ( { <replaceable>left_type</replaceable> | NONE } , <replaceable>right_type</replaceable> )
     SET ( {  RESTRICT = { <replaceable class="parameter">res_proc</replaceable> | NONE }
-           | JOIN = { <replaceable class="parameter">join_proc</replaceable> | NONE }
-         } [, ... ] )
+            | JOIN = { <replaceable class="parameter">join_proc</replaceable> | NONE }
+            | COMMUTATOR = <replaceable class="parameter">com_op</replaceable> 
+            | NEGATOR = <replaceable class="parameter">neg_op</replaceable>
+            | HASHES
+            | MERGES
+          } [, ... ] )
 </synopsis>
  </refsynopsisdiv>
 
@@ -121,6 +125,42 @@ ALTER OPERATOR <replaceable>name</replaceable> ( { <replaceable>left_type</repla
      </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><replaceable class="parameter">com_op</replaceable></term>
+    <listitem>
+     <para>
+      The commutator of this operator. Can only be set if the operator does not have an existing commutator.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
+    <term><replaceable class="parameter">neg_op</replaceable></term>
+    <listitem>
+     <para>
+      The negator of this operator. Can only be set if the operator does not have an existing negator.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
+    <term><literal>HASHES</literal></term>
+    <listitem>
+     <para>
+     Indicates this operator can support a hash join. Can only be set when the operator does not support a hash join.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
+    <term><literal>MERGES</literal></term>
+    <listitem>
+     <para>
+     Indicates this operator can support a merge join. Can only be set when the operator does not support a merge join.
+     </para>
+    </listitem>
+   </varlistentry>
+
   </variablelist>
  </refsect1>
 
diff --git a/src/backend/commands/operatorcmds.c b/src/backend/commands/operatorcmds.c
index cd7f83136f..3787c234d4 100644
--- a/src/backend/commands/operatorcmds.c
+++ b/src/backend/commands/operatorcmds.c
@@ -54,6 +54,7 @@
 
 static Oid	ValidateRestrictionEstimator(List *restrictionName);
 static Oid	ValidateJoinEstimator(List *joinName);
+static void ValidateBooleanIsTrue(DefElem *def, const char *paramName);
 
 /*
  * DefineOperator
@@ -359,6 +360,18 @@ ValidateJoinEstimator(List *joinName)
 	return joinOid;
 }
 
+static void
+ValidateBooleanIsTrue(DefElem *def, const char *paramName)
+{
+	if (!defGetBoolean(def))
+	{
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+				 errmsg("alter operator cannot disable \"%s\"",
+						paramName)));
+	}
+}
+
 /*
  * Guts of operator deletion.
  */
@@ -426,6 +439,12 @@ AlterOperator(AlterOperatorStmt *stmt)
 	List	   *joinName = NIL; /* optional join sel. function */
 	bool		updateJoin = false;
 	Oid			joinOid;
+	List	   *commutatorName = NIL;	/* optional commutator operator name */
+	Oid			commutatorOid;
+	List	   *negatorName = NIL;	/* optional negator operator name */
+	Oid			negatorOid;
+	bool		updateHashes = false;
+	bool		updateMerges = false;
 
 	/* Look up the operator */
 	oprId = LookupOperWithArgs(stmt->opername, false);
@@ -456,6 +475,24 @@ AlterOperator(AlterOperatorStmt *stmt)
 			joinName = param;
 			updateJoin = true;
 		}
+		else if (strcmp(defel->defname, "commutator") == 0)
+		{
+			commutatorName = defGetQualifiedName(defel);
+		}
+		else if (strcmp(defel->defname, "negator") == 0)
+		{
+			negatorName = defGetQualifiedName(defel);
+		}
+		else if (strcmp(defel->defname, "hashes") == 0)
+		{
+			ValidateBooleanIsTrue(defel, "hashes");
+			updateHashes = true;
+		}
+		else if (strcmp(defel->defname, "merges") == 0)
+		{
+			ValidateBooleanIsTrue(defel, "merges");
+			updateMerges = true;
+		}
 
 		/*
 		 * The rest of the options that CREATE accepts cannot be changed.
@@ -464,11 +501,7 @@ AlterOperator(AlterOperatorStmt *stmt)
 		else if (strcmp(defel->defname, "leftarg") == 0 ||
 				 strcmp(defel->defname, "rightarg") == 0 ||
 				 strcmp(defel->defname, "function") == 0 ||
-				 strcmp(defel->defname, "procedure") == 0 ||
-				 strcmp(defel->defname, "commutator") == 0 ||
-				 strcmp(defel->defname, "negator") == 0 ||
-				 strcmp(defel->defname, "hashes") == 0 ||
-				 strcmp(defel->defname, "merges") == 0)
+				 strcmp(defel->defname, "procedure") == 0)
 		{
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -488,7 +521,7 @@ AlterOperator(AlterOperatorStmt *stmt)
 					   NameStr(oprForm->oprname));
 
 	/*
-	 * Look up restriction and join estimators if specified
+	 * Look up Oid for any parameters specified
 	 */
 	if (restrictionName)
 		restrictionOid = ValidateRestrictionEstimator(restrictionName);
@@ -499,28 +532,77 @@ AlterOperator(AlterOperatorStmt *stmt)
 	else
 		joinOid = InvalidOid;
 
-	/* Perform additional checks, like OperatorCreate does */
-	if (!(OidIsValid(oprForm->oprleft) && OidIsValid(oprForm->oprright)))
+	if (commutatorName)
 	{
-		/* If it's not a binary op, these things mustn't be set: */
-		if (OidIsValid(joinOid))
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only binary operators can have join selectivity")));
+		commutatorOid = OperatorGetOrCreateValidCommutator(commutatorName,
+														   oprForm->oid,
+														   oprForm->oprname.data,
+														   oprForm->oprnamespace,
+														   oprForm->oprleft,
+														   oprForm->oprright);
+
+		/*
+		 * we don't need to do anything extra for a self commutator as in
+		 * OperatorCreate as there we have to create the operator before
+		 * setting the commutator, but here the operator already exists and
+		 * the commutatorOid above is valid (and is the operator oid).
+		 */
 	}
+	else
+		commutatorOid = InvalidOid;
+
+	if (negatorName)
+		negatorOid = OperatorGetOrCreateValidNegator(negatorName,
+													 oprForm->oid,
+													 oprForm->oprname.data,
+													 oprForm->oprnamespace,
+													 oprForm->oprleft,
+													 oprForm->oprright);
+	else
+		negatorOid = InvalidOid;
 
-	if (oprForm->oprresult != BOOLOID)
+	/*
+	 * check that we're not changing any existing values that can't be changed
+	 */
+	if (OidIsValid(commutatorOid) && OidIsValid(oprForm->oprcom))
 	{
-		if (OidIsValid(restrictionOid))
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only boolean operators can have restriction selectivity")));
-		if (OidIsValid(joinOid))
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only boolean operators can have join selectivity")));
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+				 errmsg("operator attribute \"commutator\" cannot be changed if it has already been set")));
 	}
 
+	if (OidIsValid(negatorOid) && OidIsValid(oprForm->oprnegate))
+	{
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+				 errmsg("operator attribute \"negator\" cannot be changed if it has already been set")));
+	}
+
+	if (updateHashes && oprForm->oprcanhash)
+	{
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+				 errmsg("operator attribute \"hashes\" cannot be changed if it has already been set")));
+	}
+
+	if (updateMerges && oprForm->oprcanmerge)
+	{
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+				 errmsg("operator attribute \"merges\" cannot be changed if it has already been set")));
+	}
+
+	/* Perform additional checks, like OperatorCreate does */
+	OperatorValidateParams(oprForm->oprleft,
+						   oprForm->oprright,
+						   oprForm->oprresult,
+						   OidIsValid(commutatorOid),
+						   OidIsValid(negatorOid),
+						   OidIsValid(joinOid),
+						   OidIsValid(restrictionOid),
+						   updateMerges,
+						   updateHashes);
+
 	/* Update the tuple */
 	for (i = 0; i < Natts_pg_operator; ++i)
 	{
@@ -539,6 +621,30 @@ AlterOperator(AlterOperatorStmt *stmt)
 		values[Anum_pg_operator_oprjoin - 1] = joinOid;
 	}
 
+	if (OidIsValid(commutatorOid))
+	{
+		replaces[Anum_pg_operator_oprcom - 1] = true;
+		values[Anum_pg_operator_oprcom - 1] = ObjectIdGetDatum(commutatorOid);
+	}
+
+	if (OidIsValid(negatorOid))
+	{
+		replaces[Anum_pg_operator_oprnegate - 1] = true;
+		values[Anum_pg_operator_oprnegate - 1] = ObjectIdGetDatum(negatorOid);
+	}
+
+	if (updateMerges)
+	{
+		replaces[Anum_pg_operator_oprcanmerge - 1] = true;
+		values[Anum_pg_operator_oprcanmerge - 1] = true;
+	}
+
+	if (updateHashes)
+	{
+		replaces[Anum_pg_operator_oprcanhash - 1] = true;
+		values[Anum_pg_operator_oprcanhash - 1] = true;
+	}
+
 	tup = heap_modify_tuple(tup, RelationGetDescr(catalog),
 							values, nulls, replaces);
 
@@ -550,5 +656,8 @@ AlterOperator(AlterOperatorStmt *stmt)
 
 	table_close(catalog, NoLock);
 
+	if (OidIsValid(commutatorOid) || OidIsValid(negatorOid))
+		OperatorUpd(oprId, commutatorOid, negatorOid, false);
+
 	return address;
 }
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 39ab7eac0d..84017da3ee 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -10101,6 +10101,8 @@ operator_def_elem: ColLabel '=' NONE
 						{ $$ = makeDefElem($1, NULL, @1); }
 				   | ColLabel '=' operator_def_arg
 						{ $$ = makeDefElem($1, (Node *) $3, @1); }
+				   | ColLabel
+						{ $$ = makeDefElem($1, NULL, @1); }
 		;
 
 /* must be similar enough to def_arg to avoid reduce/reduce conflicts */
diff --git a/src/test/regress/expected/alter_operator.out b/src/test/regress/expected/alter_operator.out
index 71bd484282..57cf4df155 100644
--- a/src/test/regress/expected/alter_operator.out
+++ b/src/test/regress/expected/alter_operator.out
@@ -25,7 +25,7 @@ ORDER BY 1;
 (3 rows)
 
 --
--- Reset and set params
+-- test reset and set restrict and join
 --
 ALTER OPERATOR === (boolean, boolean) SET (RESTRICT = NONE);
 ALTER OPERATOR === (boolean, boolean) SET (JOIN = NONE);
@@ -106,34 +106,240 @@ ORDER BY 1;
  schema public                                         | n
 (3 rows)
 
---
--- Test invalid options.
---
-ALTER OPERATOR === (boolean, boolean) SET (COMMUTATOR = ====);
-ERROR:  operator attribute "commutator" cannot be changed
-ALTER OPERATOR === (boolean, boolean) SET (NEGATOR = ====);
-ERROR:  operator attribute "negator" cannot be changed
+-- test cannot set non existant function
 ALTER OPERATOR === (boolean, boolean) SET (RESTRICT = non_existent_func);
 ERROR:  function non_existent_func(internal, oid, internal, integer) does not exist
 ALTER OPERATOR === (boolean, boolean) SET (JOIN = non_existent_func);
 ERROR:  function non_existent_func(internal, oid, internal, smallint, internal) does not exist
-ALTER OPERATOR === (boolean, boolean) SET (COMMUTATOR = !==);
-ERROR:  operator attribute "commutator" cannot be changed
-ALTER OPERATOR === (boolean, boolean) SET (NEGATOR = !==);
-ERROR:  operator attribute "negator" cannot be changed
--- invalid: non-lowercase quoted identifiers
+-- test non-lowercase quoted identifiers invalid
 ALTER OPERATOR & (bit, bit) SET ("Restrict" = _int_contsel, "Join" = _int_contjoinsel);
 ERROR:  operator attribute "Restrict" not recognized
 --
--- Test permission check. Must be owner to ALTER OPERATOR.
+-- test must be owner of operator to ALTER OPERATOR.
 --
 CREATE USER regress_alter_op_user;
 SET SESSION AUTHORIZATION regress_alter_op_user;
 ALTER OPERATOR === (boolean, boolean) SET (RESTRICT = NONE);
 ERROR:  must be owner of operator ===
--- Clean up
 RESET SESSION AUTHORIZATION;
+--
+-- test set commutator, negator, hashes, and merges which can only be set if not
+-- already set
+--
+-- for these tests create operators with different left and right types so that
+-- we can validate that function signatures are handled correctly
+CREATE FUNCTION alter_op_test_fn_bool_real(boolean, real)
+RETURNS boolean AS $$ SELECT NULL::BOOLEAN; $$ LANGUAGE sql IMMUTABLE;
+CREATE FUNCTION alter_op_test_fn_real_bool(real, boolean)
+RETURNS boolean AS $$ SELECT NULL::BOOLEAN; $$ LANGUAGE sql IMMUTABLE;
+-- operator
+CREATE OPERATOR === (
+    LEFTARG = boolean,
+    RIGHTARG = real,
+    PROCEDURE = alter_op_test_fn_bool_real
+);
+-- commutator
+CREATE OPERATOR ==== (
+    LEFTARG = real,
+    RIGHTARG = boolean,
+    PROCEDURE = alter_op_test_fn_real_bool
+);
+-- negator
+CREATE OPERATOR !==== (
+    LEFTARG = boolean,
+    RIGHTARG = real,
+    PROCEDURE = alter_op_test_fn_bool_real
+);
+-- test cannot set hashes or merges to false
+ALTER OPERATOR === (boolean, real) SET (HASHES = false);
+ERROR:  alter operator cannot disable "hashes"
+ALTER OPERATOR === (boolean, real) SET (MERGES = false);
+ERROR:  alter operator cannot disable "merges"
+-- test cannot set commutator or negator without owning them
+SET SESSION AUTHORIZATION regress_alter_op_user;
+-- we need need a new operator owned by regress_alter_op_user so that we are
+-- allowed to alter it. ==== and !==== are owned by the test user so we expect
+-- the alters below to fail.
+CREATE OPERATOR ===@@@ (
+    LEFTARG = boolean,
+    RIGHTARG = real,
+    PROCEDURE = alter_op_test_fn_bool_real
+);
+ALTER OPERATOR ===@@@ (boolean, real) SET (COMMUTATOR= ====);
+ERROR:  must be owner of operator ====
+ALTER OPERATOR ===@@@ (boolean, real) SET (NEGATOR = !====);
+ERROR:  must be owner of operator !====
+-- validate operator is unchanged and commutator and negator are unset
+SELECT oprcom, oprnegate FROM pg_operator WHERE oprname = '===@@@'
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+ oprcom | oprnegate 
+--------+-----------
+      0 |         0
+(1 row)
+
+DROP OPERATOR ===@@@ (boolean, real);
+RESET SESSION AUTHORIZATION;
+-- test cannot set self negator
+ALTER OPERATOR === (boolean, real) SET (NEGATOR = ===);
+ERROR:  operator cannot be its own negator
+-- validate no changes made
+SELECT oprcanmerge, oprcanhash, oprcom, oprnegate
+FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+ oprcanmerge | oprcanhash | oprcom | oprnegate 
+-------------+------------+--------+-----------
+ f           | f          |      0 |         0
+(1 row)
+
+-- test set hashes
+ALTER OPERATOR === (boolean, real) SET (HASHES);
+SELECT oprcanhash FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+ oprcanhash 
+------------
+ t
+(1 row)
+
+-- test set merges
+ALTER OPERATOR === (boolean, real) SET (MERGES);
+SELECT oprcanmerge FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+ oprcanmerge 
+-------------
+ t
+(1 row)
+
+-- test set commutator
+ALTER OPERATOR === (boolean, real) SET (COMMUTATOR = ====);
+-- validate that the commutator has been set on both the operator and commutator,
+-- that they reference each other, and that the operator used is the existing
+-- one we created and not a new shell operator
+SELECT op.oprname AS operator_name, com.oprname AS commutator_name,
+  com.oprcode AS commutator_func
+  FROM pg_operator op
+  INNER JOIN pg_operator com ON (op.oid = com.oprcom AND op.oprcom = com.oid)
+  WHERE op.oprname = '==='
+  AND op.oprleft = 'boolean'::regtype AND op.oprright = 'real'::regtype;
+ operator_name | commutator_name |      commutator_func       
+---------------+-----------------+----------------------------
+ ===           | ====            | alter_op_test_fn_real_bool
+(1 row)
+
+-- test set negator
+ALTER OPERATOR === (boolean, real) SET (NEGATOR = !====);
+-- validate that the negator has been set on both the operator and negator, that
+-- they reference each other, and that the operator used is the existing one we
+-- created and not a new shell operator
+SELECT op.oprname AS operator_name, neg.oprname AS negator_name,
+  neg.oprcode AS negator_func
+  FROM pg_operator op
+  INNER JOIN pg_operator neg ON (op.oid = neg.oprnegate AND op.oprnegate = neg.oid)
+  WHERE op.oprname = '==='
+  AND op.oprleft = 'boolean'::regtype AND op.oprright = 'real'::regtype;
+ operator_name | negator_name |        negator_func        
+---------------+--------------+----------------------------
+ ===           | !====        | alter_op_test_fn_bool_real
+(1 row)
+
+-- validate that the final state of the operator is as we expect
+SELECT oprcanmerge, oprcanhash,
+       pg_describe_object('pg_operator'::regclass, oprcom, 0) AS commutator,
+       pg_describe_object('pg_operator'::regclass, oprnegate, 0) AS negator
+  FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+ oprcanmerge | oprcanhash |         commutator          |           negator            
+-------------+------------+-----------------------------+------------------------------
+ t           | t          | operator ====(real,boolean) | operator !====(boolean,real)
+(1 row)
+
+-- test cannot set commutator, negator, hashes, and merges when already set
+ALTER OPERATOR === (boolean, real) SET (COMMUTATOR = =);
+ERROR:  operator attribute "commutator" cannot be changed if it has already been set
+ALTER OPERATOR === (boolean, real) SET (NEGATOR = =);
+ERROR:  operator attribute "negator" cannot be changed if it has already been set
+ALTER OPERATOR === (boolean, real) SET (COMMUTATOR = !=);
+ERROR:  operator attribute "commutator" cannot be changed if it has already been set
+ALTER OPERATOR === (boolean, real) SET (NEGATOR = !=);
+ERROR:  operator attribute "negator" cannot be changed if it has already been set
+ALTER OPERATOR === (boolean, real) SET (HASHES);
+ERROR:  operator attribute "hashes" cannot be changed if it has already been set
+ALTER OPERATOR === (boolean, real) SET (MERGES);
+ERROR:  operator attribute "merges" cannot be changed if it has already been set
+ALTER OPERATOR === (boolean, real) SET (HASHES = false);
+ERROR:  alter operator cannot disable "hashes"
+ALTER OPERATOR === (boolean, real) SET (MERGES = false);
+ERROR:  alter operator cannot disable "merges"
+-- validate no changes made
+SELECT oprcanmerge, oprcanhash,
+       pg_describe_object('pg_operator'::regclass, oprcom, 0) AS commutator,
+       pg_describe_object('pg_operator'::regclass, oprnegate, 0) AS negator
+  FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+ oprcanmerge | oprcanhash |         commutator          |           negator            
+-------------+------------+-----------------------------+------------------------------
+ t           | t          | operator ====(real,boolean) | operator !====(boolean,real)
+(1 row)
+
+--
+-- test setting undefined operator creates shell operator (matches the
+-- behaviour of CREATE OPERATOR)
+--
+DROP OPERATOR === (boolean, real);
+CREATE OPERATOR === (
+    LEFTARG = boolean,
+    RIGHTARG = real,
+    PROCEDURE = alter_op_test_fn_bool_real
+);
+ALTER OPERATOR === (boolean, real) SET (COMMUTATOR = ===@@@);
+ALTER OPERATOR === (boolean, real) SET (NEGATOR = !===@@@);
+-- validate that shell operators are created for the commutator and negator with
+-- the commutator having reversed args and the negator matching the operator.
+-- The shell operators should have an empty function below.
+SELECT pg_describe_object('pg_operator'::regclass, op.oid, 0) AS operator,
+       pg_describe_object('pg_operator'::regclass, com.oid, 0) AS commutator,
+       com.oprcode AS commutator_func,
+       pg_describe_object('pg_operator'::regclass, neg.oid, 0) AS negator,
+       neg.oprcode AS negator_func
+  FROM pg_operator op
+  INNER JOIN pg_operator com ON (op.oid = com.oprcom AND op.oprcom = com.oid)
+  INNER JOIN pg_operator neg ON (op.oid = neg.oprnegate AND op.oprnegate = neg.oid)
+  WHERE op.oprname = '===' AND com.oprname = '===@@@' AND neg.oprname = '!===@@@'
+  AND op.oprleft = 'boolean'::regtype AND op.oprright = 'real'::regtype;
+          operator          |          commutator           | commutator_func |            negator             | negator_func 
+----------------------------+-------------------------------+-----------------+--------------------------------+--------------
+ operator ===(boolean,real) | operator ===@@@(real,boolean) | -               | operator !===@@@(boolean,real) | -
+(1 row)
+
+--
+-- test setting self commutator
+--
+DROP OPERATOR === (boolean, boolean);
+CREATE OPERATOR === (
+    LEFTARG = boolean,
+    RIGHTARG = boolean,
+    PROCEDURE = alter_op_test_fn
+);
+ALTER OPERATOR === (boolean, boolean) SET (COMMUTATOR = ===);
+-- validate that the oprcom is the operator oid
+SELECT oprname FROM pg_operator 
+  WHERE oprname = '===' AND oid = oprcom
+  AND oprleft = 'boolean'::regtype AND oprright = 'boolean'::regtype;
+ oprname 
+---------
+ ===
+(1 row)
+
+--
+-- Clean up
+--
 DROP USER regress_alter_op_user;
 DROP OPERATOR === (boolean, boolean);
+DROP OPERATOR === (boolean, real);
+DROP OPERATOR ==== (real, boolean);
+DROP OPERATOR !==== (boolean, real);
+DROP OPERATOR ===@@@ (real, boolean);
+DROP OPERATOR !===@@@(boolean, real);
 DROP FUNCTION customcontsel(internal, oid, internal, integer);
 DROP FUNCTION alter_op_test_fn(boolean, boolean);
+DROP FUNCTION alter_op_test_fn_bool_real(boolean, real);
+DROP FUNCTION alter_op_test_fn_real_bool(real, boolean);
diff --git a/src/test/regress/sql/alter_operator.sql b/src/test/regress/sql/alter_operator.sql
index fd40370165..a7e0d4331f 100644
--- a/src/test/regress/sql/alter_operator.sql
+++ b/src/test/regress/sql/alter_operator.sql
@@ -22,7 +22,7 @@ WHERE classid = 'pg_operator'::regclass AND
 ORDER BY 1;
 
 --
--- Reset and set params
+-- test reset and set restrict and join
 --
 
 ALTER OPERATOR === (boolean, boolean) SET (RESTRICT = NONE);
@@ -71,30 +71,215 @@ WHERE classid = 'pg_operator'::regclass AND
       objid = '===(bool,bool)'::regoperator
 ORDER BY 1;
 
---
--- Test invalid options.
---
-ALTER OPERATOR === (boolean, boolean) SET (COMMUTATOR = ====);
-ALTER OPERATOR === (boolean, boolean) SET (NEGATOR = ====);
+-- test cannot set non existant function
 ALTER OPERATOR === (boolean, boolean) SET (RESTRICT = non_existent_func);
 ALTER OPERATOR === (boolean, boolean) SET (JOIN = non_existent_func);
-ALTER OPERATOR === (boolean, boolean) SET (COMMUTATOR = !==);
-ALTER OPERATOR === (boolean, boolean) SET (NEGATOR = !==);
 
--- invalid: non-lowercase quoted identifiers
+-- test non-lowercase quoted identifiers invalid
 ALTER OPERATOR & (bit, bit) SET ("Restrict" = _int_contsel, "Join" = _int_contjoinsel);
 
 --
--- Test permission check. Must be owner to ALTER OPERATOR.
+-- test must be owner of operator to ALTER OPERATOR.
 --
 CREATE USER regress_alter_op_user;
 SET SESSION AUTHORIZATION regress_alter_op_user;
 
 ALTER OPERATOR === (boolean, boolean) SET (RESTRICT = NONE);
 
--- Clean up
 RESET SESSION AUTHORIZATION;
+
+--
+-- test set commutator, negator, hashes, and merges which can only be set if not
+-- already set
+--
+
+-- for these tests create operators with different left and right types so that
+-- we can validate that function signatures are handled correctly
+
+CREATE FUNCTION alter_op_test_fn_bool_real(boolean, real)
+RETURNS boolean AS $$ SELECT NULL::BOOLEAN; $$ LANGUAGE sql IMMUTABLE;
+
+CREATE FUNCTION alter_op_test_fn_real_bool(real, boolean)
+RETURNS boolean AS $$ SELECT NULL::BOOLEAN; $$ LANGUAGE sql IMMUTABLE;
+
+-- operator
+CREATE OPERATOR === (
+    LEFTARG = boolean,
+    RIGHTARG = real,
+    PROCEDURE = alter_op_test_fn_bool_real
+);
+
+-- commutator
+CREATE OPERATOR ==== (
+    LEFTARG = real,
+    RIGHTARG = boolean,
+    PROCEDURE = alter_op_test_fn_real_bool
+);
+
+-- negator
+CREATE OPERATOR !==== (
+    LEFTARG = boolean,
+    RIGHTARG = real,
+    PROCEDURE = alter_op_test_fn_bool_real
+);
+
+-- test cannot set hashes or merges to false
+ALTER OPERATOR === (boolean, real) SET (HASHES = false);
+ALTER OPERATOR === (boolean, real) SET (MERGES = false);
+
+-- test cannot set commutator or negator without owning them
+SET SESSION AUTHORIZATION regress_alter_op_user;
+
+-- we need need a new operator owned by regress_alter_op_user so that we are
+-- allowed to alter it. ==== and !==== are owned by the test user so we expect
+-- the alters below to fail.
+CREATE OPERATOR ===@@@ (
+    LEFTARG = boolean,
+    RIGHTARG = real,
+    PROCEDURE = alter_op_test_fn_bool_real
+);
+
+ALTER OPERATOR ===@@@ (boolean, real) SET (COMMUTATOR= ====);
+ALTER OPERATOR ===@@@ (boolean, real) SET (NEGATOR = !====);
+
+-- validate operator is unchanged and commutator and negator are unset
+SELECT oprcom, oprnegate FROM pg_operator WHERE oprname = '===@@@'
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+
+DROP OPERATOR ===@@@ (boolean, real);
+
+RESET SESSION AUTHORIZATION;
+
+-- test cannot set self negator
+ALTER OPERATOR === (boolean, real) SET (NEGATOR = ===);
+
+-- validate no changes made
+SELECT oprcanmerge, oprcanhash, oprcom, oprnegate
+FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+
+-- test set hashes
+ALTER OPERATOR === (boolean, real) SET (HASHES);
+SELECT oprcanhash FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+
+-- test set merges
+ALTER OPERATOR === (boolean, real) SET (MERGES);
+SELECT oprcanmerge FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+
+-- test set commutator
+ALTER OPERATOR === (boolean, real) SET (COMMUTATOR = ====);
+
+-- validate that the commutator has been set on both the operator and commutator,
+-- that they reference each other, and that the operator used is the existing
+-- one we created and not a new shell operator
+SELECT op.oprname AS operator_name, com.oprname AS commutator_name,
+  com.oprcode AS commutator_func
+  FROM pg_operator op
+  INNER JOIN pg_operator com ON (op.oid = com.oprcom AND op.oprcom = com.oid)
+  WHERE op.oprname = '==='
+  AND op.oprleft = 'boolean'::regtype AND op.oprright = 'real'::regtype;
+
+-- test set negator
+ALTER OPERATOR === (boolean, real) SET (NEGATOR = !====);
+
+-- validate that the negator has been set on both the operator and negator, that
+-- they reference each other, and that the operator used is the existing one we
+-- created and not a new shell operator
+SELECT op.oprname AS operator_name, neg.oprname AS negator_name,
+  neg.oprcode AS negator_func
+  FROM pg_operator op
+  INNER JOIN pg_operator neg ON (op.oid = neg.oprnegate AND op.oprnegate = neg.oid)
+  WHERE op.oprname = '==='
+  AND op.oprleft = 'boolean'::regtype AND op.oprright = 'real'::regtype;
+
+-- validate that the final state of the operator is as we expect
+SELECT oprcanmerge, oprcanhash,
+       pg_describe_object('pg_operator'::regclass, oprcom, 0) AS commutator,
+       pg_describe_object('pg_operator'::regclass, oprnegate, 0) AS negator
+  FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+
+-- test cannot set commutator, negator, hashes, and merges when already set
+ALTER OPERATOR === (boolean, real) SET (COMMUTATOR = =);
+ALTER OPERATOR === (boolean, real) SET (NEGATOR = =);
+ALTER OPERATOR === (boolean, real) SET (COMMUTATOR = !=);
+ALTER OPERATOR === (boolean, real) SET (NEGATOR = !=);
+ALTER OPERATOR === (boolean, real) SET (HASHES);
+ALTER OPERATOR === (boolean, real) SET (MERGES);
+ALTER OPERATOR === (boolean, real) SET (HASHES = false);
+ALTER OPERATOR === (boolean, real) SET (MERGES = false);
+
+-- validate no changes made
+SELECT oprcanmerge, oprcanhash,
+       pg_describe_object('pg_operator'::regclass, oprcom, 0) AS commutator,
+       pg_describe_object('pg_operator'::regclass, oprnegate, 0) AS negator
+  FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+
+--
+-- test setting undefined operator creates shell operator (matches the
+-- behaviour of CREATE OPERATOR)
+--
+
+DROP OPERATOR === (boolean, real);
+CREATE OPERATOR === (
+    LEFTARG = boolean,
+    RIGHTARG = real,
+    PROCEDURE = alter_op_test_fn_bool_real
+);
+
+ALTER OPERATOR === (boolean, real) SET (COMMUTATOR = ===@@@);
+ALTER OPERATOR === (boolean, real) SET (NEGATOR = !===@@@);
+
+-- validate that shell operators are created for the commutator and negator with
+-- the commutator having reversed args and the negator matching the operator.
+-- The shell operators should have an empty function below.
+SELECT pg_describe_object('pg_operator'::regclass, op.oid, 0) AS operator,
+       pg_describe_object('pg_operator'::regclass, com.oid, 0) AS commutator,
+       com.oprcode AS commutator_func,
+       pg_describe_object('pg_operator'::regclass, neg.oid, 0) AS negator,
+       neg.oprcode AS negator_func
+  FROM pg_operator op
+  INNER JOIN pg_operator com ON (op.oid = com.oprcom AND op.oprcom = com.oid)
+  INNER JOIN pg_operator neg ON (op.oid = neg.oprnegate AND op.oprnegate = neg.oid)
+  WHERE op.oprname = '===' AND com.oprname = '===@@@' AND neg.oprname = '!===@@@'
+  AND op.oprleft = 'boolean'::regtype AND op.oprright = 'real'::regtype;
+
+
+--
+-- test setting self commutator
+--
+
+DROP OPERATOR === (boolean, boolean);
+CREATE OPERATOR === (
+    LEFTARG = boolean,
+    RIGHTARG = boolean,
+    PROCEDURE = alter_op_test_fn
+);
+
+ALTER OPERATOR === (boolean, boolean) SET (COMMUTATOR = ===);
+
+-- validate that the oprcom is the operator oid
+SELECT oprname FROM pg_operator 
+  WHERE oprname = '===' AND oid = oprcom
+  AND oprleft = 'boolean'::regtype AND oprright = 'boolean'::regtype;
+
+--
+-- Clean up
+--
+
 DROP USER regress_alter_op_user;
+
 DROP OPERATOR === (boolean, boolean);
+DROP OPERATOR === (boolean, real);
+DROP OPERATOR ==== (real, boolean);
+DROP OPERATOR !==== (boolean, real);
+DROP OPERATOR ===@@@ (real, boolean);
+DROP OPERATOR !===@@@(boolean, real);
+
 DROP FUNCTION customcontsel(internal, oid, internal, integer);
 DROP FUNCTION alter_op_test_fn(boolean, boolean);
+DROP FUNCTION alter_op_test_fn_bool_real(boolean, real);
+DROP FUNCTION alter_op_test_fn_real_bool(real, boolean);
create_op_fixes_v1.patchapplication/octet-stream; name=create_op_fixes_v1.patchDownload
diff --git a/src/backend/catalog/pg_operator.c b/src/backend/catalog/pg_operator.c
index 95918a77a1..e2590cfce8 100644
--- a/src/backend/catalog/pg_operator.c
+++ b/src/backend/catalog/pg_operator.c
@@ -54,11 +54,10 @@ static Oid	OperatorShellMake(const char *operatorName,
 							  Oid leftTypeId,
 							  Oid rightTypeId);
 
-static Oid	get_other_operator(List *otherOp,
-							   Oid otherLeftTypeId, Oid otherRightTypeId,
-							   const char *operatorName, Oid operatorNamespace,
-							   Oid leftTypeId, Oid rightTypeId,
-							   bool isCommutator);
+static Oid	get_or_create_other_operator(List *otherOp,
+										 Oid otherLeftTypeId, Oid otherRightTypeId,
+										 const char *operatorName, Oid operatorNamespace,
+										 Oid leftTypeId, Oid rightTypeId);
 
 
 /*
@@ -361,53 +360,17 @@ OperatorCreate(const char *operatorName,
 				 errmsg("\"%s\" is not a valid operator name",
 						operatorName)));
 
-	if (!(OidIsValid(leftTypeId) && OidIsValid(rightTypeId)))
-	{
-		/* If it's not a binary op, these things mustn't be set: */
-		if (commutatorName)
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only binary operators can have commutators")));
-		if (OidIsValid(joinId))
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only binary operators can have join selectivity")));
-		if (canMerge)
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only binary operators can merge join")));
-		if (canHash)
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only binary operators can hash")));
-	}
-
 	operResultType = get_func_rettype(procedureId);
 
-	if (operResultType != BOOLOID)
-	{
-		/* If it's not a boolean op, these things mustn't be set: */
-		if (negatorName)
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only boolean operators can have negators")));
-		if (OidIsValid(restrictionId))
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only boolean operators can have restriction selectivity")));
-		if (OidIsValid(joinId))
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only boolean operators can have join selectivity")));
-		if (canMerge)
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only boolean operators can merge join")));
-		if (canHash)
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only boolean operators can hash")));
-	}
+	OperatorValidateParams(leftTypeId,
+						   rightTypeId,
+						   operResultType,
+						   commutatorName,
+						   negatorName,
+						   OidIsValid(joinId),
+						   OidIsValid(restrictionId),
+						   canMerge,
+						   canHash);
 
 	operatorObjectId = OperatorGet(operatorName,
 								   operatorNamespace,
@@ -438,18 +401,12 @@ OperatorCreate(const char *operatorName,
 
 	if (commutatorName)
 	{
-		/* commutator has reversed arg types */
-		commutatorId = get_other_operator(commutatorName,
-										  rightTypeId, leftTypeId,
-										  operatorName, operatorNamespace,
-										  leftTypeId, rightTypeId,
-										  true);
-
-		/* Permission check: must own other operator */
-		if (OidIsValid(commutatorId) &&
-			!object_ownercheck(OperatorRelationId, commutatorId, GetUserId()))
-			aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_OPERATOR,
-						   NameListToString(commutatorName));
+		commutatorId = OperatorGetOrCreateValidCommutator(commutatorName,
+														  operatorObjectId,
+														  operatorName,
+														  operatorNamespace,
+														  leftTypeId,
+														  rightTypeId);
 
 		/*
 		 * self-linkage to this operator; will fix below. Note that only
@@ -462,20 +419,12 @@ OperatorCreate(const char *operatorName,
 		commutatorId = InvalidOid;
 
 	if (negatorName)
-	{
-		/* negator has same arg types */
-		negatorId = get_other_operator(negatorName,
-									   leftTypeId, rightTypeId,
-									   operatorName, operatorNamespace,
-									   leftTypeId, rightTypeId,
-									   false);
-
-		/* Permission check: must own other operator */
-		if (OidIsValid(negatorId) &&
-			!object_ownercheck(OperatorRelationId, negatorId, GetUserId()))
-			aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_OPERATOR,
-						   NameListToString(negatorName));
-	}
+		negatorId = OperatorGetOrCreateValidNegator(negatorName,
+													operatorObjectId,
+													operatorName,
+													operatorNamespace,
+													leftTypeId,
+													rightTypeId);
 	else
 		negatorId = InvalidOid;
 
@@ -574,17 +523,16 @@ OperatorCreate(const char *operatorName,
 }
 
 /*
- * Try to lookup another operator (commutator, etc)
+ * Try to lookup another operator (commutator, negator)
  *
  * If not found, check to see if it is exactly the operator we are trying
- * to define; if so, return InvalidOid.  (Note that this case is only
- * sensible for a commutator, so we error out otherwise.)  If it is not
- * the same operator, create a shell operator.
+ * to define; if so, return InvalidOid. If it is not the same operator, create
+ * a shell operator.
  */
 static Oid
-get_other_operator(List *otherOp, Oid otherLeftTypeId, Oid otherRightTypeId,
-				   const char *operatorName, Oid operatorNamespace,
-				   Oid leftTypeId, Oid rightTypeId, bool isCommutator)
+get_or_create_other_operator(List *otherOp, Oid otherLeftTypeId, Oid otherRightTypeId,
+							 const char *operatorName, Oid operatorNamespace,
+							 Oid leftTypeId, Oid rightTypeId)
 {
 	Oid			other_oid;
 	bool		otherDefined;
@@ -612,13 +560,12 @@ get_other_operator(List *otherOp, Oid otherLeftTypeId, Oid otherRightTypeId,
 		otherRightTypeId == rightTypeId)
 	{
 		/*
-		 * self-linkage to this operator; caller will fix later. Note that
-		 * only self-linkage for commutation makes sense.
+		 * Do not create as the other operator is the same as the operator:
+		 * self-linkage. This only occurs when the operator has not already
+		 * been created and we expect that the caller is going to create it.
+		 * We can't create it in this case so we return and let the caller
+		 * handle this scenario.
 		 */
-		if (!isCommutator)
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("operator cannot be its own negator or sort operator")));
 		return InvalidOid;
 	}
 
@@ -867,3 +814,166 @@ makeOperatorDependencies(HeapTuple tuple,
 
 	return myself;
 }
+
+/*
+ * OperatorGetOrCreateValidCommutator
+ *
+ * This function gets or creates a commutator for an existing or yet to be
+ * created operator.
+ *
+ * For an operator that does not exist, pass in InvalidOid for the operatorOid.
+ *
+ * It will return:
+ * - InvalidOid if the commutator is the same as the operator and the operator
+ *   does not exist.
+ * - The Oid of the commutator (including the operator itself) if it exists
+ * - The Oid of a newly created commutator shell operator if the commutator does
+ *   not exist and is not the same as the operator.
+ *
+ * The function will error if the commutator is not owned by the user.
+ */
+
+Oid
+OperatorGetOrCreateValidCommutator(List *commutatorName,
+								   Oid operatorOid,
+								   const char *operatorName,
+								   Oid operatorNamespace,
+								   Oid operatorLeftTypeId,
+								   Oid operatorRightTypeId)
+{
+	Oid			commutatorId;
+
+	/* commutator has reversed arg types */
+	commutatorId = get_or_create_other_operator(commutatorName,
+												operatorRightTypeId, operatorLeftTypeId,
+												operatorName, operatorNamespace,
+												operatorLeftTypeId, operatorRightTypeId);
+
+	/* Permission check: must own other operator */
+	if (OidIsValid(commutatorId) &&
+		!object_ownercheck(OperatorRelationId, commutatorId, GetUserId()))
+		aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_OPERATOR,
+					   NameListToString(commutatorName));
+
+	return commutatorId;
+}
+
+/*
+ * OperatorGetOrCreateValidNegator
+ *
+ * This function gets or creates a negator for an existing or yet to be created
+ * operator. It will either return an existing operator or a newly created shell
+ * operator.
+ *
+ * For an operator that does not exist, pass in InvalidOid for the operatorOid.
+ *
+ * This function will either return a valid negator oid or error:
+ * - if the operator is the same as the negator as a negator cannot self link
+ * - if the negator is not owned by the user
+ */
+
+Oid
+OperatorGetOrCreateValidNegator(List *negatorName,
+								Oid operatorOid,
+								const char *operatorName,
+								Oid operatorNamespace,
+								Oid operatorLeftTypeId,
+								Oid operatorRightTypeId)
+{
+	Oid			negatorId;
+
+	/* negator has same arg types */
+	negatorId = get_or_create_other_operator(negatorName,
+											 operatorLeftTypeId, operatorRightTypeId,
+											 operatorName, operatorNamespace,
+											 operatorLeftTypeId, operatorRightTypeId);
+
+	/*
+	 * Check we're not self negating. An invalid negatorID means that the
+	 * neither the operator or negator exist, but that their signatures are
+	 * the same and therefore it's a self negator.
+	 */
+	if (!OidIsValid(negatorId) || negatorId == operatorOid)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+				 errmsg("operator cannot be its own negator")));
+
+	/* Permission check: must own other operator */
+	if (!object_ownercheck(OperatorRelationId, negatorId, GetUserId()))
+		aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_OPERATOR,
+					   NameListToString(negatorName));
+
+	return negatorId;
+}
+
+/*
+ * OperatorValidateParams
+ *
+ * This function validates that an operator with arguments of leftTypeId and
+ * rightTypeId returning operResultType can have the attributes that are set
+ * to true. If any attributes set are not compatible with the operator then this
+ * function will raise an error, otherwise it simply returns.
+ *
+ * This function doesn't check for missing requirements so any attribute that's
+ * false is considered valid by this function. As such, if you are modifying an
+ * operator you can set any attributes you are not modifying to false to validate
+ * the changes you are making.
+ */
+
+void
+OperatorValidateParams(Oid leftTypeId,
+					   Oid rightTypeId,
+					   Oid operResultType,
+					   bool hasCommutator,
+					   bool hasNegator,
+					   bool hasJoinSelectivity,
+					   bool hasRestrictionSelectivity,
+					   bool canMerge,
+					   bool canHash)
+{
+	if (!(OidIsValid(leftTypeId) && OidIsValid(rightTypeId)))
+	{
+		/* If it's not a binary op, these things mustn't be set: */
+		if (hasCommutator)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+					 errmsg("only binary operators can have commutators")));
+		if (hasJoinSelectivity)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+					 errmsg("only binary operators can have join selectivity")));
+		if (canMerge)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+					 errmsg("only binary operators can merge join")));
+		if (canHash)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+					 errmsg("only binary operators can hash")));
+	}
+
+	if (operResultType != BOOLOID)
+	{
+		/* If it's not a boolean op, these things mustn't be set: */
+		if (hasNegator)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+					 errmsg("only boolean operators can have negators")));
+		if (hasRestrictionSelectivity)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+					 errmsg("only boolean operators can have restriction selectivity")));
+		if (hasJoinSelectivity)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+					 errmsg("only boolean operators can have join selectivity")));
+		if (canMerge)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+					 errmsg("only boolean operators can merge join")));
+		if (canHash)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+					 errmsg("only boolean operators can hash")));
+	}
+}
diff --git a/src/include/catalog/pg_operator.h b/src/include/catalog/pg_operator.h
index e60cf782a6..7e1f8680a2 100644
--- a/src/include/catalog/pg_operator.h
+++ b/src/include/catalog/pg_operator.h
@@ -104,4 +104,28 @@ extern ObjectAddress makeOperatorDependencies(HeapTuple tuple,
 
 extern void OperatorUpd(Oid baseId, Oid commId, Oid negId, bool isDelete);
 
+extern Oid	OperatorGetOrCreateValidCommutator(List *commutatorName,
+											   Oid operatorOid,
+											   const char *operatorName,
+											   Oid operatorNamespace,
+											   Oid operatorLeftTypeId,
+											   Oid operatorRightTypeId);
+
+extern Oid	OperatorGetOrCreateValidNegator(List *negatorName,
+											Oid operatorOid,
+											const char *operatorName,
+											Oid operatorNamespace,
+											Oid operatorLeftTypeId,
+											Oid operatorRightTypeId);
+
+extern void OperatorValidateParams(Oid leftTypeId,
+								   Oid rightTypeId,
+								   Oid operResultType,
+								   bool hasCommutator,
+								   bool hasNegator,
+								   bool hasJoinSelectivity,
+								   bool hasRestrictionSelectivity,
+								   bool canMerge,
+								   bool canHash);
+
 #endif							/* PG_OPERATOR_H */
diff --git a/src/test/regress/expected/create_operator.out b/src/test/regress/expected/create_operator.out
index f71b601f2d..49070ce90f 100644
--- a/src/test/regress/expected/create_operator.out
+++ b/src/test/regress/expected/create_operator.out
@@ -260,6 +260,42 @@ CREATE OPERATOR #*# (
 );
 ERROR:  permission denied for type type_op6
 ROLLBACK;
+-- Should fail. An operator cannot have a self negator.
+BEGIN TRANSACTION;
+CREATE FUNCTION create_op_test_fn(boolean, boolean)
+RETURNS boolean AS $$
+    SELECT NULL::BOOLEAN;
+$$ LANGUAGE sql IMMUTABLE;
+CREATE OPERATOR === (
+    leftarg = boolean,
+    rightarg = boolean,
+    procedure = create_op_test_fn,
+    negator = ===
+);
+ERROR:  operator cannot be its own negator
+ROLLBACK;
+-- Should fail. An operator cannot have a self negator. Here we test that when
+-- 'creating' an existing shell operator, it checks the negator is not self.
+BEGIN TRANSACTION;
+CREATE FUNCTION create_op_test_fn(boolean, boolean)
+RETURNS boolean AS $$
+    SELECT NULL::BOOLEAN;
+$$ LANGUAGE sql IMMUTABLE;
+-- create a shell operator for ===!!! by referencing it as a commutator
+CREATE OPERATOR === (
+    leftarg = boolean,
+    rightarg = boolean,
+    procedure = create_op_test_fn,
+    commutator = ===!!!
+);
+CREATE OPERATOR ===!!! (
+    leftarg = boolean,
+    rightarg = boolean,
+    procedure = create_op_test_fn,
+    negator = ===!!!
+);
+ERROR:  operator cannot be its own negator
+ROLLBACK;
 -- invalid: non-lowercase quoted identifiers
 CREATE OPERATOR ===
 (
diff --git a/src/test/regress/sql/create_operator.sql b/src/test/regress/sql/create_operator.sql
index f53e24db3c..168cad3814 100644
--- a/src/test/regress/sql/create_operator.sql
+++ b/src/test/regress/sql/create_operator.sql
@@ -210,6 +210,42 @@ CREATE OPERATOR #*# (
 );
 ROLLBACK;
 
+-- Should fail. An operator cannot have a self negator.
+BEGIN TRANSACTION;
+CREATE FUNCTION create_op_test_fn(boolean, boolean)
+RETURNS boolean AS $$
+    SELECT NULL::BOOLEAN;
+$$ LANGUAGE sql IMMUTABLE;
+CREATE OPERATOR === (
+    leftarg = boolean,
+    rightarg = boolean,
+    procedure = create_op_test_fn,
+    negator = ===
+);
+ROLLBACK;
+
+-- Should fail. An operator cannot have a self negator. Here we test that when
+-- 'creating' an existing shell operator, it checks the negator is not self.
+BEGIN TRANSACTION;
+CREATE FUNCTION create_op_test_fn(boolean, boolean)
+RETURNS boolean AS $$
+    SELECT NULL::BOOLEAN;
+$$ LANGUAGE sql IMMUTABLE;
+-- create a shell operator for ===!!! by referencing it as a commutator
+CREATE OPERATOR === (
+    leftarg = boolean,
+    rightarg = boolean,
+    procedure = create_op_test_fn,
+    commutator = ===!!!
+);
+CREATE OPERATOR ===!!! (
+    leftarg = boolean,
+    rightarg = boolean,
+    procedure = create_op_test_fn,
+    negator = ===!!!
+);
+ROLLBACK;
+
 -- invalid: non-lowercase quoted identifiers
 CREATE OPERATOR ===
 (
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tommy Pavlicek (#1)
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges

Tommy Pavlicek <tommypav122@gmail.com> writes:

I've attached a couple of patches to allow ALTER OPERATOR to add
commutators, negators, hashes and merges to operators that lack them.

Please add this to the upcoming commitfest [1]https://commitfest.postgresql.org/43/, to ensure we don't
lose track of it.

The first patch is create_op_fixes_v1.patch and it includes some
refactoring in preparation for the ALTER OPERATOR changes and fixes a
couple of minor bugs in CREATE OPERATOR:
- prevents self negation when filling in/creating an existing shell operator
- remove reference to sort operator in the self negation error message as
the sort attribute appears to be deprecated in Postgres 8.3

Hmm, yeah, I bet nobody has looked at those edge cases in awhile.

Additionally, I wasn't sure whether it was preferred to fail or succeed on
ALTERs that have no effect, such as adding hashes on an operator that
already allows them or disabling hashes on one that does not. I chose to
raise an error when this happens, on the thinking it was more explicit and
made the code simpler, even though the end result would be what the user
wanted.

You could argue that both ways I guess. We definitely need to raise error
if the command tries to change an existing nondefault setting, since that
might break things as per previous discussion. But perhaps rejecting
an attempt to set the existing setting is overly nanny-ish. Personally
I think I'd lean to "don't throw an error if we don't have to", but I'm
not strongly set on that position.

(Don't we have existing precedents that apply here? I can't offhand
think of any existing ALTER commands that would reject no-op requests,
but maybe that's not a direct precedent.)

regards, tom lane

[1]: https://commitfest.postgresql.org/43/

In reply to: Tom Lane (#2)
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges

Tom Lane <tgl@sss.pgh.pa.us> writes:

Tommy Pavlicek <tommypav122@gmail.com> writes:

Additionally, I wasn't sure whether it was preferred to fail or succeed on
ALTERs that have no effect, such as adding hashes on an operator that
already allows them or disabling hashes on one that does not. I chose to
raise an error when this happens, on the thinking it was more explicit and
made the code simpler, even though the end result would be what the user
wanted.

You could argue that both ways I guess. We definitely need to raise error
if the command tries to change an existing nondefault setting, since that
might break things as per previous discussion. But perhaps rejecting
an attempt to set the existing setting is overly nanny-ish. Personally
I think I'd lean to "don't throw an error if we don't have to", but I'm
not strongly set on that position.

(Don't we have existing precedents that apply here? I can't offhand
think of any existing ALTER commands that would reject no-op requests,
but maybe that's not a direct precedent.)

Since it only supports adding these operations if they don't already
exist, should it not be ALTER OPERATOR ADD <thing>, not SET <thing>?

That makes it natural to add an IF NOT EXISTS clause, like ALTER TABLE
ADD COLUMN has, to make it a no-op instead of an error.

regards, tom lane

- ilmari

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dagfinn Ilmari Mannsåker (#3)
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges

=?utf-8?Q?Dagfinn_Ilmari_Manns=C3=A5ker?= <ilmari@ilmari.org> writes:

Tom Lane <tgl@sss.pgh.pa.us> writes:

(Don't we have existing precedents that apply here? I can't offhand
think of any existing ALTER commands that would reject no-op requests,
but maybe that's not a direct precedent.)

Since it only supports adding these operations if they don't already
exist, should it not be ALTER OPERATOR ADD <thing>, not SET <thing>?
That makes it natural to add an IF NOT EXISTS clause, like ALTER TABLE
ADD COLUMN has, to make it a no-op instead of an error.

Hmm, maybe. But it feels like choosing syntax and semantics based
on what might be only a temporary implementation restriction. We
certainly don't handle any other property-setting commands that way.

Admittedly, "can't change an existing setting" is likely to be pretty
permanent in this case, just because I don't see a use-case for it
that'd justify the work involved. (My wife recently gave me a coffee
cup that says "Nothing is as permanent as a temporary fix.") But
still, if someone did show up and do that work, we'd regret this
choice of syntax because it'd then be uselessly unlike every other
ALTER command.

regards, tom lane

#5Tommy Pavlicek
tommypav122@gmail.com
In reply to: Tom Lane (#4)
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges

Tom Lane <tgl@sss.pgh.pa.us> writes:

Please add this to the upcoming commitfest [1], to ensure we don't
lose track of it.

I've added a single patch here: https://commitfest.postgresql.org/43/4389/

It wasn't obvious whether I should create a second commitfest entry
because I've included 2 patches so I've just done 1 to begin with. On
that note, is it preferred here to split patches of this size into
separate patches, and if so, additionally, separate threads?

Tom Lane <tgl@sss.pgh.pa.us> writes:

Additionally, I wasn't sure whether it was preferred to fail or succeed on
ALTERs that have no effect, such as adding hashes on an operator that
already allows them or disabling hashes on one that does not. I chose to
raise an error when this happens, on the thinking it was more explicit and
made the code simpler, even though the end result would be what the user
wanted.

You could argue that both ways I guess. We definitely need to raise error
if the command tries to change an existing nondefault setting, since that
might break things as per previous discussion. But perhaps rejecting
an attempt to set the existing setting is overly nanny-ish. Personally
I think I'd lean to "don't throw an error if we don't have to", but I'm
not strongly set on that position.

(Don't we have existing precedents that apply here? I can't offhand
think of any existing ALTER commands that would reject no-op requests,
but maybe that's not a direct precedent.)

My initial thinking behind the error for a no-op was largely driven by
the existence of 'DROP.. IF EXISTS'. However, I did some ad hoc
testing on ALTER commands and it does seem that they mostly allow
no-ops. I did find that renaming an object to the same name will fail
due to the object already existing, but that seems to be more of a
coincidence than a design decision to me. Given this, I also lean
towards allowing the no-ops and will change it unless there are
objections.

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tommy Pavlicek (#5)
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges

Tommy Pavlicek <tommypav122@gmail.com> writes:

I've added a single patch here: https://commitfest.postgresql.org/43/4389/

It wasn't obvious whether I should create a second commitfest entry
because I've included 2 patches so I've just done 1 to begin with. On
that note, is it preferred here to split patches of this size into
separate patches, and if so, additionally, separate threads?

No, our commitfest infrastructure is unable to deal with patches that have
interdependencies unless they're presented in a single email. So just use
one thread, and be sure to attach all the patches each time.

(BTW, while you seem to have gotten away with it so far, it's usually
advisable to name the patch files like 0001-foo.patch, 0002-bar.patch,
etc, to make sure the cfbot understands what order to apply them in.)

regards, tom lane

#7Tommy Pavlicek
tommypav122@gmail.com
In reply to: Tom Lane (#6)
2 attachment(s)
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges

On Fri, Jun 23, 2023 at 12:21 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Tommy Pavlicek <tommypav122@gmail.com> writes:

I've added a single patch here: https://commitfest.postgresql.org/43/4389/

It wasn't obvious whether I should create a second commitfest entry
because I've included 2 patches so I've just done 1 to begin with. On
that note, is it preferred here to split patches of this size into
separate patches, and if so, additionally, separate threads?

No, our commitfest infrastructure is unable to deal with patches that have
interdependencies unless they're presented in a single email. So just use
one thread, and be sure to attach all the patches each time.

(BTW, while you seem to have gotten away with it so far, it's usually
advisable to name the patch files like 0001-foo.patch, 0002-bar.patch,
etc, to make sure the cfbot understands what order to apply them in.)

regards, tom lane

Thanks.

I've attached a new version of the ALTER OPERATOR patch that allows
no-ops. It should be ready to review now.

Attachments:

0001-create_op_fixes_v1.patchapplication/octet-stream; name=0001-create_op_fixes_v1.patchDownload
diff --git a/src/backend/catalog/pg_operator.c b/src/backend/catalog/pg_operator.c
index 95918a77a1..e2590cfce8 100644
--- a/src/backend/catalog/pg_operator.c
+++ b/src/backend/catalog/pg_operator.c
@@ -54,11 +54,10 @@ static Oid	OperatorShellMake(const char *operatorName,
 							  Oid leftTypeId,
 							  Oid rightTypeId);
 
-static Oid	get_other_operator(List *otherOp,
-							   Oid otherLeftTypeId, Oid otherRightTypeId,
-							   const char *operatorName, Oid operatorNamespace,
-							   Oid leftTypeId, Oid rightTypeId,
-							   bool isCommutator);
+static Oid	get_or_create_other_operator(List *otherOp,
+										 Oid otherLeftTypeId, Oid otherRightTypeId,
+										 const char *operatorName, Oid operatorNamespace,
+										 Oid leftTypeId, Oid rightTypeId);
 
 
 /*
@@ -361,53 +360,17 @@ OperatorCreate(const char *operatorName,
 				 errmsg("\"%s\" is not a valid operator name",
 						operatorName)));
 
-	if (!(OidIsValid(leftTypeId) && OidIsValid(rightTypeId)))
-	{
-		/* If it's not a binary op, these things mustn't be set: */
-		if (commutatorName)
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only binary operators can have commutators")));
-		if (OidIsValid(joinId))
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only binary operators can have join selectivity")));
-		if (canMerge)
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only binary operators can merge join")));
-		if (canHash)
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only binary operators can hash")));
-	}
-
 	operResultType = get_func_rettype(procedureId);
 
-	if (operResultType != BOOLOID)
-	{
-		/* If it's not a boolean op, these things mustn't be set: */
-		if (negatorName)
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only boolean operators can have negators")));
-		if (OidIsValid(restrictionId))
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only boolean operators can have restriction selectivity")));
-		if (OidIsValid(joinId))
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only boolean operators can have join selectivity")));
-		if (canMerge)
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only boolean operators can merge join")));
-		if (canHash)
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only boolean operators can hash")));
-	}
+	OperatorValidateParams(leftTypeId,
+						   rightTypeId,
+						   operResultType,
+						   commutatorName,
+						   negatorName,
+						   OidIsValid(joinId),
+						   OidIsValid(restrictionId),
+						   canMerge,
+						   canHash);
 
 	operatorObjectId = OperatorGet(operatorName,
 								   operatorNamespace,
@@ -438,18 +401,12 @@ OperatorCreate(const char *operatorName,
 
 	if (commutatorName)
 	{
-		/* commutator has reversed arg types */
-		commutatorId = get_other_operator(commutatorName,
-										  rightTypeId, leftTypeId,
-										  operatorName, operatorNamespace,
-										  leftTypeId, rightTypeId,
-										  true);
-
-		/* Permission check: must own other operator */
-		if (OidIsValid(commutatorId) &&
-			!object_ownercheck(OperatorRelationId, commutatorId, GetUserId()))
-			aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_OPERATOR,
-						   NameListToString(commutatorName));
+		commutatorId = OperatorGetOrCreateValidCommutator(commutatorName,
+														  operatorObjectId,
+														  operatorName,
+														  operatorNamespace,
+														  leftTypeId,
+														  rightTypeId);
 
 		/*
 		 * self-linkage to this operator; will fix below. Note that only
@@ -462,20 +419,12 @@ OperatorCreate(const char *operatorName,
 		commutatorId = InvalidOid;
 
 	if (negatorName)
-	{
-		/* negator has same arg types */
-		negatorId = get_other_operator(negatorName,
-									   leftTypeId, rightTypeId,
-									   operatorName, operatorNamespace,
-									   leftTypeId, rightTypeId,
-									   false);
-
-		/* Permission check: must own other operator */
-		if (OidIsValid(negatorId) &&
-			!object_ownercheck(OperatorRelationId, negatorId, GetUserId()))
-			aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_OPERATOR,
-						   NameListToString(negatorName));
-	}
+		negatorId = OperatorGetOrCreateValidNegator(negatorName,
+													operatorObjectId,
+													operatorName,
+													operatorNamespace,
+													leftTypeId,
+													rightTypeId);
 	else
 		negatorId = InvalidOid;
 
@@ -574,17 +523,16 @@ OperatorCreate(const char *operatorName,
 }
 
 /*
- * Try to lookup another operator (commutator, etc)
+ * Try to lookup another operator (commutator, negator)
  *
  * If not found, check to see if it is exactly the operator we are trying
- * to define; if so, return InvalidOid.  (Note that this case is only
- * sensible for a commutator, so we error out otherwise.)  If it is not
- * the same operator, create a shell operator.
+ * to define; if so, return InvalidOid. If it is not the same operator, create
+ * a shell operator.
  */
 static Oid
-get_other_operator(List *otherOp, Oid otherLeftTypeId, Oid otherRightTypeId,
-				   const char *operatorName, Oid operatorNamespace,
-				   Oid leftTypeId, Oid rightTypeId, bool isCommutator)
+get_or_create_other_operator(List *otherOp, Oid otherLeftTypeId, Oid otherRightTypeId,
+							 const char *operatorName, Oid operatorNamespace,
+							 Oid leftTypeId, Oid rightTypeId)
 {
 	Oid			other_oid;
 	bool		otherDefined;
@@ -612,13 +560,12 @@ get_other_operator(List *otherOp, Oid otherLeftTypeId, Oid otherRightTypeId,
 		otherRightTypeId == rightTypeId)
 	{
 		/*
-		 * self-linkage to this operator; caller will fix later. Note that
-		 * only self-linkage for commutation makes sense.
+		 * Do not create as the other operator is the same as the operator:
+		 * self-linkage. This only occurs when the operator has not already
+		 * been created and we expect that the caller is going to create it.
+		 * We can't create it in this case so we return and let the caller
+		 * handle this scenario.
 		 */
-		if (!isCommutator)
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("operator cannot be its own negator or sort operator")));
 		return InvalidOid;
 	}
 
@@ -867,3 +814,166 @@ makeOperatorDependencies(HeapTuple tuple,
 
 	return myself;
 }
+
+/*
+ * OperatorGetOrCreateValidCommutator
+ *
+ * This function gets or creates a commutator for an existing or yet to be
+ * created operator.
+ *
+ * For an operator that does not exist, pass in InvalidOid for the operatorOid.
+ *
+ * It will return:
+ * - InvalidOid if the commutator is the same as the operator and the operator
+ *   does not exist.
+ * - The Oid of the commutator (including the operator itself) if it exists
+ * - The Oid of a newly created commutator shell operator if the commutator does
+ *   not exist and is not the same as the operator.
+ *
+ * The function will error if the commutator is not owned by the user.
+ */
+
+Oid
+OperatorGetOrCreateValidCommutator(List *commutatorName,
+								   Oid operatorOid,
+								   const char *operatorName,
+								   Oid operatorNamespace,
+								   Oid operatorLeftTypeId,
+								   Oid operatorRightTypeId)
+{
+	Oid			commutatorId;
+
+	/* commutator has reversed arg types */
+	commutatorId = get_or_create_other_operator(commutatorName,
+												operatorRightTypeId, operatorLeftTypeId,
+												operatorName, operatorNamespace,
+												operatorLeftTypeId, operatorRightTypeId);
+
+	/* Permission check: must own other operator */
+	if (OidIsValid(commutatorId) &&
+		!object_ownercheck(OperatorRelationId, commutatorId, GetUserId()))
+		aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_OPERATOR,
+					   NameListToString(commutatorName));
+
+	return commutatorId;
+}
+
+/*
+ * OperatorGetOrCreateValidNegator
+ *
+ * This function gets or creates a negator for an existing or yet to be created
+ * operator. It will either return an existing operator or a newly created shell
+ * operator.
+ *
+ * For an operator that does not exist, pass in InvalidOid for the operatorOid.
+ *
+ * This function will either return a valid negator oid or error:
+ * - if the operator is the same as the negator as a negator cannot self link
+ * - if the negator is not owned by the user
+ */
+
+Oid
+OperatorGetOrCreateValidNegator(List *negatorName,
+								Oid operatorOid,
+								const char *operatorName,
+								Oid operatorNamespace,
+								Oid operatorLeftTypeId,
+								Oid operatorRightTypeId)
+{
+	Oid			negatorId;
+
+	/* negator has same arg types */
+	negatorId = get_or_create_other_operator(negatorName,
+											 operatorLeftTypeId, operatorRightTypeId,
+											 operatorName, operatorNamespace,
+											 operatorLeftTypeId, operatorRightTypeId);
+
+	/*
+	 * Check we're not self negating. An invalid negatorID means that the
+	 * neither the operator or negator exist, but that their signatures are
+	 * the same and therefore it's a self negator.
+	 */
+	if (!OidIsValid(negatorId) || negatorId == operatorOid)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+				 errmsg("operator cannot be its own negator")));
+
+	/* Permission check: must own other operator */
+	if (!object_ownercheck(OperatorRelationId, negatorId, GetUserId()))
+		aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_OPERATOR,
+					   NameListToString(negatorName));
+
+	return negatorId;
+}
+
+/*
+ * OperatorValidateParams
+ *
+ * This function validates that an operator with arguments of leftTypeId and
+ * rightTypeId returning operResultType can have the attributes that are set
+ * to true. If any attributes set are not compatible with the operator then this
+ * function will raise an error, otherwise it simply returns.
+ *
+ * This function doesn't check for missing requirements so any attribute that's
+ * false is considered valid by this function. As such, if you are modifying an
+ * operator you can set any attributes you are not modifying to false to validate
+ * the changes you are making.
+ */
+
+void
+OperatorValidateParams(Oid leftTypeId,
+					   Oid rightTypeId,
+					   Oid operResultType,
+					   bool hasCommutator,
+					   bool hasNegator,
+					   bool hasJoinSelectivity,
+					   bool hasRestrictionSelectivity,
+					   bool canMerge,
+					   bool canHash)
+{
+	if (!(OidIsValid(leftTypeId) && OidIsValid(rightTypeId)))
+	{
+		/* If it's not a binary op, these things mustn't be set: */
+		if (hasCommutator)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+					 errmsg("only binary operators can have commutators")));
+		if (hasJoinSelectivity)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+					 errmsg("only binary operators can have join selectivity")));
+		if (canMerge)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+					 errmsg("only binary operators can merge join")));
+		if (canHash)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+					 errmsg("only binary operators can hash")));
+	}
+
+	if (operResultType != BOOLOID)
+	{
+		/* If it's not a boolean op, these things mustn't be set: */
+		if (hasNegator)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+					 errmsg("only boolean operators can have negators")));
+		if (hasRestrictionSelectivity)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+					 errmsg("only boolean operators can have restriction selectivity")));
+		if (hasJoinSelectivity)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+					 errmsg("only boolean operators can have join selectivity")));
+		if (canMerge)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+					 errmsg("only boolean operators can merge join")));
+		if (canHash)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+					 errmsg("only boolean operators can hash")));
+	}
+}
diff --git a/src/include/catalog/pg_operator.h b/src/include/catalog/pg_operator.h
index e60cf782a6..7e1f8680a2 100644
--- a/src/include/catalog/pg_operator.h
+++ b/src/include/catalog/pg_operator.h
@@ -104,4 +104,28 @@ extern ObjectAddress makeOperatorDependencies(HeapTuple tuple,
 
 extern void OperatorUpd(Oid baseId, Oid commId, Oid negId, bool isDelete);
 
+extern Oid	OperatorGetOrCreateValidCommutator(List *commutatorName,
+											   Oid operatorOid,
+											   const char *operatorName,
+											   Oid operatorNamespace,
+											   Oid operatorLeftTypeId,
+											   Oid operatorRightTypeId);
+
+extern Oid	OperatorGetOrCreateValidNegator(List *negatorName,
+											Oid operatorOid,
+											const char *operatorName,
+											Oid operatorNamespace,
+											Oid operatorLeftTypeId,
+											Oid operatorRightTypeId);
+
+extern void OperatorValidateParams(Oid leftTypeId,
+								   Oid rightTypeId,
+								   Oid operResultType,
+								   bool hasCommutator,
+								   bool hasNegator,
+								   bool hasJoinSelectivity,
+								   bool hasRestrictionSelectivity,
+								   bool canMerge,
+								   bool canHash);
+
 #endif							/* PG_OPERATOR_H */
diff --git a/src/test/regress/expected/create_operator.out b/src/test/regress/expected/create_operator.out
index f71b601f2d..49070ce90f 100644
--- a/src/test/regress/expected/create_operator.out
+++ b/src/test/regress/expected/create_operator.out
@@ -260,6 +260,42 @@ CREATE OPERATOR #*# (
 );
 ERROR:  permission denied for type type_op6
 ROLLBACK;
+-- Should fail. An operator cannot have a self negator.
+BEGIN TRANSACTION;
+CREATE FUNCTION create_op_test_fn(boolean, boolean)
+RETURNS boolean AS $$
+    SELECT NULL::BOOLEAN;
+$$ LANGUAGE sql IMMUTABLE;
+CREATE OPERATOR === (
+    leftarg = boolean,
+    rightarg = boolean,
+    procedure = create_op_test_fn,
+    negator = ===
+);
+ERROR:  operator cannot be its own negator
+ROLLBACK;
+-- Should fail. An operator cannot have a self negator. Here we test that when
+-- 'creating' an existing shell operator, it checks the negator is not self.
+BEGIN TRANSACTION;
+CREATE FUNCTION create_op_test_fn(boolean, boolean)
+RETURNS boolean AS $$
+    SELECT NULL::BOOLEAN;
+$$ LANGUAGE sql IMMUTABLE;
+-- create a shell operator for ===!!! by referencing it as a commutator
+CREATE OPERATOR === (
+    leftarg = boolean,
+    rightarg = boolean,
+    procedure = create_op_test_fn,
+    commutator = ===!!!
+);
+CREATE OPERATOR ===!!! (
+    leftarg = boolean,
+    rightarg = boolean,
+    procedure = create_op_test_fn,
+    negator = ===!!!
+);
+ERROR:  operator cannot be its own negator
+ROLLBACK;
 -- invalid: non-lowercase quoted identifiers
 CREATE OPERATOR ===
 (
diff --git a/src/test/regress/sql/create_operator.sql b/src/test/regress/sql/create_operator.sql
index f53e24db3c..168cad3814 100644
--- a/src/test/regress/sql/create_operator.sql
+++ b/src/test/regress/sql/create_operator.sql
@@ -210,6 +210,42 @@ CREATE OPERATOR #*# (
 );
 ROLLBACK;
 
+-- Should fail. An operator cannot have a self negator.
+BEGIN TRANSACTION;
+CREATE FUNCTION create_op_test_fn(boolean, boolean)
+RETURNS boolean AS $$
+    SELECT NULL::BOOLEAN;
+$$ LANGUAGE sql IMMUTABLE;
+CREATE OPERATOR === (
+    leftarg = boolean,
+    rightarg = boolean,
+    procedure = create_op_test_fn,
+    negator = ===
+);
+ROLLBACK;
+
+-- Should fail. An operator cannot have a self negator. Here we test that when
+-- 'creating' an existing shell operator, it checks the negator is not self.
+BEGIN TRANSACTION;
+CREATE FUNCTION create_op_test_fn(boolean, boolean)
+RETURNS boolean AS $$
+    SELECT NULL::BOOLEAN;
+$$ LANGUAGE sql IMMUTABLE;
+-- create a shell operator for ===!!! by referencing it as a commutator
+CREATE OPERATOR === (
+    leftarg = boolean,
+    rightarg = boolean,
+    procedure = create_op_test_fn,
+    commutator = ===!!!
+);
+CREATE OPERATOR ===!!! (
+    leftarg = boolean,
+    rightarg = boolean,
+    procedure = create_op_test_fn,
+    negator = ===!!!
+);
+ROLLBACK;
+
 -- invalid: non-lowercase quoted identifiers
 CREATE OPERATOR ===
 (
0002-alter_op_v2.patchapplication/octet-stream; name=0002-alter_op_v2.patchDownload
diff --git a/doc/src/sgml/ref/alter_operator.sgml b/doc/src/sgml/ref/alter_operator.sgml
index a4a1af564f..0bee668f33 100644
--- a/doc/src/sgml/ref/alter_operator.sgml
+++ b/doc/src/sgml/ref/alter_operator.sgml
@@ -29,8 +29,12 @@ ALTER OPERATOR <replaceable>name</replaceable> ( { <replaceable>left_type</repla
 
 ALTER OPERATOR <replaceable>name</replaceable> ( { <replaceable>left_type</replaceable> | NONE } , <replaceable>right_type</replaceable> )
     SET ( {  RESTRICT = { <replaceable class="parameter">res_proc</replaceable> | NONE }
-           | JOIN = { <replaceable class="parameter">join_proc</replaceable> | NONE }
-         } [, ... ] )
+            | JOIN = { <replaceable class="parameter">join_proc</replaceable> | NONE }
+            | COMMUTATOR = <replaceable class="parameter">com_op</replaceable> 
+            | NEGATOR = <replaceable class="parameter">neg_op</replaceable>
+            | HASHES
+            | MERGES
+          } [, ... ] )
 </synopsis>
  </refsynopsisdiv>
 
@@ -121,6 +125,42 @@ ALTER OPERATOR <replaceable>name</replaceable> ( { <replaceable>left_type</repla
      </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><replaceable class="parameter">com_op</replaceable></term>
+    <listitem>
+     <para>
+      The commutator of this operator. Can only be changed if the operator does not have an existing commutator.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
+    <term><replaceable class="parameter">neg_op</replaceable></term>
+    <listitem>
+     <para>
+      The negator of this operator. Can only be changed if the operator does not have an existing negator.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
+    <term><literal>HASHES</literal></term>
+    <listitem>
+     <para>
+     Indicates this operator can support a hash join. Can only be set when the operator does not support a hash join.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
+    <term><literal>MERGES</literal></term>
+    <listitem>
+     <para>
+     Indicates this operator can support a merge join. Can only be set when the operator does not support a merge join.
+     </para>
+    </listitem>
+   </varlistentry>
+
   </variablelist>
  </refsect1>
 
diff --git a/src/backend/commands/operatorcmds.c b/src/backend/commands/operatorcmds.c
index cd7f83136f..bb976b14a5 100644
--- a/src/backend/commands/operatorcmds.c
+++ b/src/backend/commands/operatorcmds.c
@@ -426,6 +426,14 @@ AlterOperator(AlterOperatorStmt *stmt)
 	List	   *joinName = NIL; /* optional join sel. function */
 	bool		updateJoin = false;
 	Oid			joinOid;
+	List	   *commutatorName = NIL;	/* optional commutator operator name */
+	Oid			commutatorOid;
+	List	   *negatorName = NIL;	/* optional negator operator name */
+	Oid			negatorOid;
+	bool		updateHashes = false;
+	bool		canHash = false;
+	bool		updateMerges = false;
+	bool		canMerge = false;
 
 	/* Look up the operator */
 	oprId = LookupOperWithArgs(stmt->opername, false);
@@ -456,6 +464,24 @@ AlterOperator(AlterOperatorStmt *stmt)
 			joinName = param;
 			updateJoin = true;
 		}
+		else if (strcmp(defel->defname, "commutator") == 0)
+		{
+			commutatorName = defGetQualifiedName(defel);
+		}
+		else if (strcmp(defel->defname, "negator") == 0)
+		{
+			negatorName = defGetQualifiedName(defel);
+		}
+		else if (strcmp(defel->defname, "hashes") == 0)
+		{
+			canHash = defGetBoolean(defel);
+			updateHashes = true;
+		}
+		else if (strcmp(defel->defname, "merges") == 0)
+		{
+			canMerge = defGetBoolean(defel);
+			updateMerges = true;
+		}
 
 		/*
 		 * The rest of the options that CREATE accepts cannot be changed.
@@ -464,11 +490,7 @@ AlterOperator(AlterOperatorStmt *stmt)
 		else if (strcmp(defel->defname, "leftarg") == 0 ||
 				 strcmp(defel->defname, "rightarg") == 0 ||
 				 strcmp(defel->defname, "function") == 0 ||
-				 strcmp(defel->defname, "procedure") == 0 ||
-				 strcmp(defel->defname, "commutator") == 0 ||
-				 strcmp(defel->defname, "negator") == 0 ||
-				 strcmp(defel->defname, "hashes") == 0 ||
-				 strcmp(defel->defname, "merges") == 0)
+				 strcmp(defel->defname, "procedure") == 0)
 		{
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -488,7 +510,7 @@ AlterOperator(AlterOperatorStmt *stmt)
 					   NameStr(oprForm->oprname));
 
 	/*
-	 * Look up restriction and join estimators if specified
+	 * Look up Oid for any parameters specified
 	 */
 	if (restrictionName)
 		restrictionOid = ValidateRestrictionEstimator(restrictionName);
@@ -499,28 +521,80 @@ AlterOperator(AlterOperatorStmt *stmt)
 	else
 		joinOid = InvalidOid;
 
-	/* Perform additional checks, like OperatorCreate does */
-	if (!(OidIsValid(oprForm->oprleft) && OidIsValid(oprForm->oprright)))
+	if (commutatorName)
 	{
-		/* If it's not a binary op, these things mustn't be set: */
-		if (OidIsValid(joinOid))
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only binary operators can have join selectivity")));
+		commutatorOid = OperatorGetOrCreateValidCommutator(commutatorName,
+														   oprForm->oid,
+														   oprForm->oprname.data,
+														   oprForm->oprnamespace,
+														   oprForm->oprleft,
+														   oprForm->oprright);
+
+		/*
+		 * we don't need to do anything extra for a self commutator as in
+		 * OperatorCreate as there we have to create the operator before
+		 * setting the commutator, but here the operator already exists and
+		 * the commutatorOid above is valid (and is the operator oid).
+		 */
 	}
+	else
+		commutatorOid = InvalidOid;
+
+	if (negatorName)
+		negatorOid = OperatorGetOrCreateValidNegator(negatorName,
+													 oprForm->oid,
+													 oprForm->oprname.data,
+													 oprForm->oprnamespace,
+													 oprForm->oprleft,
+													 oprForm->oprright);
+	else
+		negatorOid = InvalidOid;
 
-	if (oprForm->oprresult != BOOLOID)
+	/*
+	 * check that we're not changing any existing values that might be depended
+	 * on elsewhere and may be expected to never change, while allowing no-ops.
+	 */
+	if (OidIsValid(commutatorOid) && OidIsValid(oprForm->oprcom)
+		&& commutatorOid != oprForm->oprcom)
 	{
-		if (OidIsValid(restrictionOid))
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only boolean operators can have restriction selectivity")));
-		if (OidIsValid(joinOid))
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only boolean operators can have join selectivity")));
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+				 errmsg("operator attribute \"commutator\" cannot be changed if it has already been set")));
+	}
+
+	if (OidIsValid(negatorOid) && OidIsValid(oprForm->oprnegate)
+		&& negatorOid != oprForm->oprnegate)
+	{
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+				 errmsg("operator attribute \"negator\" cannot be changed if it has already been set")));
+	}
+
+	if (updateHashes && oprForm->oprcanhash && !canHash)
+	{
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+				 errmsg("operator attribute \"hashes\" cannot be changed if it has already been set")));
+	}
+
+	if (updateMerges && oprForm->oprcanmerge && !canMerge)
+	{
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+				 errmsg("operator attribute \"merges\" cannot be changed if it has already been set")));
 	}
 
+	/* Perform additional checks, like OperatorCreate does */
+	OperatorValidateParams(oprForm->oprleft,
+						   oprForm->oprright,
+						   oprForm->oprresult,
+						   OidIsValid(commutatorOid),
+						   OidIsValid(negatorOid),
+						   OidIsValid(joinOid),
+						   OidIsValid(restrictionOid),
+						   updateMerges,
+						   updateHashes);
+
 	/* Update the tuple */
 	for (i = 0; i < Natts_pg_operator; ++i)
 	{
@@ -539,6 +613,30 @@ AlterOperator(AlterOperatorStmt *stmt)
 		values[Anum_pg_operator_oprjoin - 1] = joinOid;
 	}
 
+	if (OidIsValid(commutatorOid))
+	{
+		replaces[Anum_pg_operator_oprcom - 1] = true;
+		values[Anum_pg_operator_oprcom - 1] = ObjectIdGetDatum(commutatorOid);
+	}
+
+	if (OidIsValid(negatorOid))
+	{
+		replaces[Anum_pg_operator_oprnegate - 1] = true;
+		values[Anum_pg_operator_oprnegate - 1] = ObjectIdGetDatum(negatorOid);
+	}
+
+	if (updateMerges)
+	{
+		replaces[Anum_pg_operator_oprcanmerge - 1] = true;
+		values[Anum_pg_operator_oprcanmerge - 1] = canMerge;
+	}
+
+	if (updateHashes)
+	{
+		replaces[Anum_pg_operator_oprcanhash - 1] = true;
+		values[Anum_pg_operator_oprcanhash - 1] = canHash;
+	}
+
 	tup = heap_modify_tuple(tup, RelationGetDescr(catalog),
 							values, nulls, replaces);
 
@@ -550,5 +648,8 @@ AlterOperator(AlterOperatorStmt *stmt)
 
 	table_close(catalog, NoLock);
 
+	if (OidIsValid(commutatorOid) || OidIsValid(negatorOid))
+		OperatorUpd(oprId, commutatorOid, negatorOid, false);
+
 	return address;
 }
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 39ab7eac0d..84017da3ee 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -10101,6 +10101,8 @@ operator_def_elem: ColLabel '=' NONE
 						{ $$ = makeDefElem($1, NULL, @1); }
 				   | ColLabel '=' operator_def_arg
 						{ $$ = makeDefElem($1, (Node *) $3, @1); }
+				   | ColLabel
+						{ $$ = makeDefElem($1, NULL, @1); }
 		;
 
 /* must be similar enough to def_arg to avoid reduce/reduce conflicts */
diff --git a/src/test/regress/expected/alter_operator.out b/src/test/regress/expected/alter_operator.out
index 71bd484282..50a0c61a39 100644
--- a/src/test/regress/expected/alter_operator.out
+++ b/src/test/regress/expected/alter_operator.out
@@ -25,7 +25,7 @@ ORDER BY 1;
 (3 rows)
 
 --
--- Reset and set params
+-- test reset and set restrict and join
 --
 ALTER OPERATOR === (boolean, boolean) SET (RESTRICT = NONE);
 ALTER OPERATOR === (boolean, boolean) SET (JOIN = NONE);
@@ -106,34 +106,243 @@ ORDER BY 1;
  schema public                                         | n
 (3 rows)
 
---
--- Test invalid options.
---
-ALTER OPERATOR === (boolean, boolean) SET (COMMUTATOR = ====);
-ERROR:  operator attribute "commutator" cannot be changed
-ALTER OPERATOR === (boolean, boolean) SET (NEGATOR = ====);
-ERROR:  operator attribute "negator" cannot be changed
+-- test cannot set non existant function
 ALTER OPERATOR === (boolean, boolean) SET (RESTRICT = non_existent_func);
 ERROR:  function non_existent_func(internal, oid, internal, integer) does not exist
 ALTER OPERATOR === (boolean, boolean) SET (JOIN = non_existent_func);
 ERROR:  function non_existent_func(internal, oid, internal, smallint, internal) does not exist
-ALTER OPERATOR === (boolean, boolean) SET (COMMUTATOR = !==);
-ERROR:  operator attribute "commutator" cannot be changed
-ALTER OPERATOR === (boolean, boolean) SET (NEGATOR = !==);
-ERROR:  operator attribute "negator" cannot be changed
--- invalid: non-lowercase quoted identifiers
+-- test non-lowercase quoted identifiers invalid
 ALTER OPERATOR & (bit, bit) SET ("Restrict" = _int_contsel, "Join" = _int_contjoinsel);
 ERROR:  operator attribute "Restrict" not recognized
 --
--- Test permission check. Must be owner to ALTER OPERATOR.
+-- test must be owner of operator to ALTER OPERATOR.
 --
 CREATE USER regress_alter_op_user;
 SET SESSION AUTHORIZATION regress_alter_op_user;
 ALTER OPERATOR === (boolean, boolean) SET (RESTRICT = NONE);
 ERROR:  must be owner of operator ===
--- Clean up
 RESET SESSION AUTHORIZATION;
+--
+-- test set commutator, negator, hashes, and merges which can only be set if not
+-- already set
+--
+-- for these tests create operators with different left and right types so that
+-- we can validate that function signatures are handled correctly
+CREATE FUNCTION alter_op_test_fn_bool_real(boolean, real)
+RETURNS boolean AS $$ SELECT NULL::BOOLEAN; $$ LANGUAGE sql IMMUTABLE;
+CREATE FUNCTION alter_op_test_fn_real_bool(real, boolean)
+RETURNS boolean AS $$ SELECT NULL::BOOLEAN; $$ LANGUAGE sql IMMUTABLE;
+-- operator
+CREATE OPERATOR === (
+    LEFTARG = boolean,
+    RIGHTARG = real,
+    PROCEDURE = alter_op_test_fn_bool_real
+);
+-- commutator
+CREATE OPERATOR ==== (
+    LEFTARG = real,
+    RIGHTARG = boolean,
+    PROCEDURE = alter_op_test_fn_real_bool
+);
+-- negator
+CREATE OPERATOR !==== (
+    LEFTARG = boolean,
+    RIGHTARG = real,
+    PROCEDURE = alter_op_test_fn_bool_real
+);
+-- test no-op setting already false hashes and merges to false works
+ALTER OPERATOR === (boolean, real) SET (HASHES = false);
+ALTER OPERATOR === (boolean, real) SET (MERGES = false);
+-- validate still false after no-op
+SELECT oprcanhash, oprcanmerge FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+ oprcanhash | oprcanmerge 
+------------+-------------
+ f          | f
+(1 row)
+
+-- test cannot set commutator or negator without owning them
+SET SESSION AUTHORIZATION regress_alter_op_user;
+-- we need need a new operator owned by regress_alter_op_user so that we are
+-- allowed to alter it. ==== and !==== are owned by the test user so we expect
+-- the alters below to fail.
+CREATE OPERATOR ===@@@ (
+    LEFTARG = boolean,
+    RIGHTARG = real,
+    PROCEDURE = alter_op_test_fn_bool_real
+);
+ALTER OPERATOR ===@@@ (boolean, real) SET (COMMUTATOR= ====);
+ERROR:  must be owner of operator ====
+ALTER OPERATOR ===@@@ (boolean, real) SET (NEGATOR = !====);
+ERROR:  must be owner of operator !====
+-- validate operator is unchanged and commutator and negator are unset
+SELECT oprcom, oprnegate FROM pg_operator WHERE oprname = '===@@@'
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+ oprcom | oprnegate 
+--------+-----------
+      0 |         0
+(1 row)
+
+DROP OPERATOR ===@@@ (boolean, real);
+RESET SESSION AUTHORIZATION;
+-- test cannot set self negator
+ALTER OPERATOR === (boolean, real) SET (NEGATOR = ===);
+ERROR:  operator cannot be its own negator
+-- validate no changes made
+SELECT oprcanmerge, oprcanhash, oprcom, oprnegate
+FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+ oprcanmerge | oprcanhash | oprcom | oprnegate 
+-------------+------------+--------+-----------
+ f           | f          |      0 |         0
+(1 row)
+
+-- test set hashes
+ALTER OPERATOR === (boolean, real) SET (HASHES);
+SELECT oprcanhash FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+ oprcanhash 
+------------
+ t
+(1 row)
+
+-- test set merges
+ALTER OPERATOR === (boolean, real) SET (MERGES);
+SELECT oprcanmerge FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+ oprcanmerge 
+-------------
+ t
+(1 row)
+
+-- test set commutator
+ALTER OPERATOR === (boolean, real) SET (COMMUTATOR = ====);
+-- validate that the commutator has been set on both the operator and commutator,
+-- that they reference each other, and that the operator used is the existing
+-- one we created and not a new shell operator
+SELECT op.oprname AS operator_name, com.oprname AS commutator_name,
+  com.oprcode AS commutator_func
+  FROM pg_operator op
+  INNER JOIN pg_operator com ON (op.oid = com.oprcom AND op.oprcom = com.oid)
+  WHERE op.oprname = '==='
+  AND op.oprleft = 'boolean'::regtype AND op.oprright = 'real'::regtype;
+ operator_name | commutator_name |      commutator_func       
+---------------+-----------------+----------------------------
+ ===           | ====            | alter_op_test_fn_real_bool
+(1 row)
+
+-- test set negator
+ALTER OPERATOR === (boolean, real) SET (NEGATOR = !====);
+-- validate that the negator has been set on both the operator and negator, that
+-- they reference each other, and that the operator used is the existing one we
+-- created and not a new shell operator
+SELECT op.oprname AS operator_name, neg.oprname AS negator_name,
+  neg.oprcode AS negator_func
+  FROM pg_operator op
+  INNER JOIN pg_operator neg ON (op.oid = neg.oprnegate AND op.oprnegate = neg.oid)
+  WHERE op.oprname = '==='
+  AND op.oprleft = 'boolean'::regtype AND op.oprright = 'real'::regtype;
+ operator_name | negator_name |        negator_func        
+---------------+--------------+----------------------------
+ ===           | !====        | alter_op_test_fn_bool_real
+(1 row)
+
+-- validate that the final state of the operator is as we expect
+SELECT oprcanmerge, oprcanhash,
+       pg_describe_object('pg_operator'::regclass, oprcom, 0) AS commutator,
+       pg_describe_object('pg_operator'::regclass, oprnegate, 0) AS negator
+  FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+ oprcanmerge | oprcanhash |         commutator          |           negator            
+-------------+------------+-----------------------------+------------------------------
+ t           | t          | operator ====(real,boolean) | operator !====(boolean,real)
+(1 row)
+
+-- test no-op set 'succeeds'
+ALTER OPERATOR === (boolean, real) SET (NEGATOR = !====);
+ALTER OPERATOR === (boolean, real) SET (COMMUTATOR = ====);
+ALTER OPERATOR === (boolean, real) SET (HASHES);
+ALTER OPERATOR === (boolean, real) SET (MERGES);
+-- test cannot change commutator, negator, hashes, and merges when already set
+ALTER OPERATOR === (boolean, real) SET (COMMUTATOR = =);
+ERROR:  operator attribute "commutator" cannot be changed if it has already been set
+ALTER OPERATOR === (boolean, real) SET (NEGATOR = =);
+ERROR:  operator attribute "negator" cannot be changed if it has already been set
+ALTER OPERATOR === (boolean, real) SET (HASHES = false);
+ERROR:  operator attribute "hashes" cannot be changed if it has already been set
+ALTER OPERATOR === (boolean, real) SET (MERGES = false);
+ERROR:  operator attribute "merges" cannot be changed if it has already been set
+-- validate no changes made
+SELECT oprcanmerge, oprcanhash,
+       pg_describe_object('pg_operator'::regclass, oprcom, 0) AS commutator,
+       pg_describe_object('pg_operator'::regclass, oprnegate, 0) AS negator
+  FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+ oprcanmerge | oprcanhash |         commutator          |           negator            
+-------------+------------+-----------------------------+------------------------------
+ t           | t          | operator ====(real,boolean) | operator !====(boolean,real)
+(1 row)
+
+--
+-- test setting undefined operator creates shell operator (matches the
+-- behaviour of CREATE OPERATOR)
+--
+DROP OPERATOR === (boolean, real);
+CREATE OPERATOR === (
+    LEFTARG = boolean,
+    RIGHTARG = real,
+    PROCEDURE = alter_op_test_fn_bool_real
+);
+ALTER OPERATOR === (boolean, real) SET (COMMUTATOR = ===@@@);
+ALTER OPERATOR === (boolean, real) SET (NEGATOR = !===@@@);
+-- validate that shell operators are created for the commutator and negator with
+-- the commutator having reversed args and the negator matching the operator.
+-- The shell operators should have an empty function below.
+SELECT pg_describe_object('pg_operator'::regclass, op.oid, 0) AS operator,
+       pg_describe_object('pg_operator'::regclass, com.oid, 0) AS commutator,
+       com.oprcode AS commutator_func,
+       pg_describe_object('pg_operator'::regclass, neg.oid, 0) AS negator,
+       neg.oprcode AS negator_func
+  FROM pg_operator op
+  INNER JOIN pg_operator com ON (op.oid = com.oprcom AND op.oprcom = com.oid)
+  INNER JOIN pg_operator neg ON (op.oid = neg.oprnegate AND op.oprnegate = neg.oid)
+  WHERE op.oprname = '===' AND com.oprname = '===@@@' AND neg.oprname = '!===@@@'
+  AND op.oprleft = 'boolean'::regtype AND op.oprright = 'real'::regtype;
+          operator          |          commutator           | commutator_func |            negator             | negator_func 
+----------------------------+-------------------------------+-----------------+--------------------------------+--------------
+ operator ===(boolean,real) | operator ===@@@(real,boolean) | -               | operator !===@@@(boolean,real) | -
+(1 row)
+
+--
+-- test setting self commutator
+--
+DROP OPERATOR === (boolean, boolean);
+CREATE OPERATOR === (
+    LEFTARG = boolean,
+    RIGHTARG = boolean,
+    PROCEDURE = alter_op_test_fn
+);
+ALTER OPERATOR === (boolean, boolean) SET (COMMUTATOR = ===);
+-- validate that the oprcom is the operator oid
+SELECT oprname FROM pg_operator 
+  WHERE oprname = '===' AND oid = oprcom
+  AND oprleft = 'boolean'::regtype AND oprright = 'boolean'::regtype;
+ oprname 
+---------
+ ===
+(1 row)
+
+--
+-- Clean up
+--
 DROP USER regress_alter_op_user;
 DROP OPERATOR === (boolean, boolean);
+DROP OPERATOR === (boolean, real);
+DROP OPERATOR ==== (real, boolean);
+DROP OPERATOR !==== (boolean, real);
+DROP OPERATOR ===@@@ (real, boolean);
+DROP OPERATOR !===@@@(boolean, real);
 DROP FUNCTION customcontsel(internal, oid, internal, integer);
 DROP FUNCTION alter_op_test_fn(boolean, boolean);
+DROP FUNCTION alter_op_test_fn_bool_real(boolean, real);
+DROP FUNCTION alter_op_test_fn_real_bool(real, boolean);
diff --git a/src/test/regress/sql/alter_operator.sql b/src/test/regress/sql/alter_operator.sql
index fd40370165..978290be0c 100644
--- a/src/test/regress/sql/alter_operator.sql
+++ b/src/test/regress/sql/alter_operator.sql
@@ -22,7 +22,7 @@ WHERE classid = 'pg_operator'::regclass AND
 ORDER BY 1;
 
 --
--- Reset and set params
+-- test reset and set restrict and join
 --
 
 ALTER OPERATOR === (boolean, boolean) SET (RESTRICT = NONE);
@@ -71,30 +71,221 @@ WHERE classid = 'pg_operator'::regclass AND
       objid = '===(bool,bool)'::regoperator
 ORDER BY 1;
 
---
--- Test invalid options.
---
-ALTER OPERATOR === (boolean, boolean) SET (COMMUTATOR = ====);
-ALTER OPERATOR === (boolean, boolean) SET (NEGATOR = ====);
+-- test cannot set non existant function
 ALTER OPERATOR === (boolean, boolean) SET (RESTRICT = non_existent_func);
 ALTER OPERATOR === (boolean, boolean) SET (JOIN = non_existent_func);
-ALTER OPERATOR === (boolean, boolean) SET (COMMUTATOR = !==);
-ALTER OPERATOR === (boolean, boolean) SET (NEGATOR = !==);
 
--- invalid: non-lowercase quoted identifiers
+-- test non-lowercase quoted identifiers invalid
 ALTER OPERATOR & (bit, bit) SET ("Restrict" = _int_contsel, "Join" = _int_contjoinsel);
 
 --
--- Test permission check. Must be owner to ALTER OPERATOR.
+-- test must be owner of operator to ALTER OPERATOR.
 --
 CREATE USER regress_alter_op_user;
 SET SESSION AUTHORIZATION regress_alter_op_user;
 
 ALTER OPERATOR === (boolean, boolean) SET (RESTRICT = NONE);
 
--- Clean up
 RESET SESSION AUTHORIZATION;
+
+--
+-- test set commutator, negator, hashes, and merges which can only be set if not
+-- already set
+--
+
+-- for these tests create operators with different left and right types so that
+-- we can validate that function signatures are handled correctly
+
+CREATE FUNCTION alter_op_test_fn_bool_real(boolean, real)
+RETURNS boolean AS $$ SELECT NULL::BOOLEAN; $$ LANGUAGE sql IMMUTABLE;
+
+CREATE FUNCTION alter_op_test_fn_real_bool(real, boolean)
+RETURNS boolean AS $$ SELECT NULL::BOOLEAN; $$ LANGUAGE sql IMMUTABLE;
+
+-- operator
+CREATE OPERATOR === (
+    LEFTARG = boolean,
+    RIGHTARG = real,
+    PROCEDURE = alter_op_test_fn_bool_real
+);
+
+-- commutator
+CREATE OPERATOR ==== (
+    LEFTARG = real,
+    RIGHTARG = boolean,
+    PROCEDURE = alter_op_test_fn_real_bool
+);
+
+-- negator
+CREATE OPERATOR !==== (
+    LEFTARG = boolean,
+    RIGHTARG = real,
+    PROCEDURE = alter_op_test_fn_bool_real
+);
+
+-- test no-op setting already false hashes and merges to false works
+ALTER OPERATOR === (boolean, real) SET (HASHES = false);
+ALTER OPERATOR === (boolean, real) SET (MERGES = false);
+
+-- validate still false after no-op
+SELECT oprcanhash, oprcanmerge FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+
+-- test cannot set commutator or negator without owning them
+SET SESSION AUTHORIZATION regress_alter_op_user;
+
+-- we need need a new operator owned by regress_alter_op_user so that we are
+-- allowed to alter it. ==== and !==== are owned by the test user so we expect
+-- the alters below to fail.
+CREATE OPERATOR ===@@@ (
+    LEFTARG = boolean,
+    RIGHTARG = real,
+    PROCEDURE = alter_op_test_fn_bool_real
+);
+
+ALTER OPERATOR ===@@@ (boolean, real) SET (COMMUTATOR= ====);
+ALTER OPERATOR ===@@@ (boolean, real) SET (NEGATOR = !====);
+
+-- validate operator is unchanged and commutator and negator are unset
+SELECT oprcom, oprnegate FROM pg_operator WHERE oprname = '===@@@'
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+
+DROP OPERATOR ===@@@ (boolean, real);
+
+RESET SESSION AUTHORIZATION;
+
+-- test cannot set self negator
+ALTER OPERATOR === (boolean, real) SET (NEGATOR = ===);
+
+-- validate no changes made
+SELECT oprcanmerge, oprcanhash, oprcom, oprnegate
+FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+
+-- test set hashes
+ALTER OPERATOR === (boolean, real) SET (HASHES);
+SELECT oprcanhash FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+
+-- test set merges
+ALTER OPERATOR === (boolean, real) SET (MERGES);
+SELECT oprcanmerge FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+
+-- test set commutator
+ALTER OPERATOR === (boolean, real) SET (COMMUTATOR = ====);
+
+-- validate that the commutator has been set on both the operator and commutator,
+-- that they reference each other, and that the operator used is the existing
+-- one we created and not a new shell operator
+SELECT op.oprname AS operator_name, com.oprname AS commutator_name,
+  com.oprcode AS commutator_func
+  FROM pg_operator op
+  INNER JOIN pg_operator com ON (op.oid = com.oprcom AND op.oprcom = com.oid)
+  WHERE op.oprname = '==='
+  AND op.oprleft = 'boolean'::regtype AND op.oprright = 'real'::regtype;
+
+-- test set negator
+ALTER OPERATOR === (boolean, real) SET (NEGATOR = !====);
+
+-- validate that the negator has been set on both the operator and negator, that
+-- they reference each other, and that the operator used is the existing one we
+-- created and not a new shell operator
+SELECT op.oprname AS operator_name, neg.oprname AS negator_name,
+  neg.oprcode AS negator_func
+  FROM pg_operator op
+  INNER JOIN pg_operator neg ON (op.oid = neg.oprnegate AND op.oprnegate = neg.oid)
+  WHERE op.oprname = '==='
+  AND op.oprleft = 'boolean'::regtype AND op.oprright = 'real'::regtype;
+
+-- validate that the final state of the operator is as we expect
+SELECT oprcanmerge, oprcanhash,
+       pg_describe_object('pg_operator'::regclass, oprcom, 0) AS commutator,
+       pg_describe_object('pg_operator'::regclass, oprnegate, 0) AS negator
+  FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+
+-- test no-op set 'succeeds'
+ALTER OPERATOR === (boolean, real) SET (NEGATOR = !====);
+ALTER OPERATOR === (boolean, real) SET (COMMUTATOR = ====);
+ALTER OPERATOR === (boolean, real) SET (HASHES);
+ALTER OPERATOR === (boolean, real) SET (MERGES);
+
+-- test cannot change commutator, negator, hashes, and merges when already set
+ALTER OPERATOR === (boolean, real) SET (COMMUTATOR = =);
+ALTER OPERATOR === (boolean, real) SET (NEGATOR = =);
+ALTER OPERATOR === (boolean, real) SET (HASHES = false);
+ALTER OPERATOR === (boolean, real) SET (MERGES = false);
+
+-- validate no changes made
+SELECT oprcanmerge, oprcanhash,
+       pg_describe_object('pg_operator'::regclass, oprcom, 0) AS commutator,
+       pg_describe_object('pg_operator'::regclass, oprnegate, 0) AS negator
+  FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+
+--
+-- test setting undefined operator creates shell operator (matches the
+-- behaviour of CREATE OPERATOR)
+--
+
+DROP OPERATOR === (boolean, real);
+CREATE OPERATOR === (
+    LEFTARG = boolean,
+    RIGHTARG = real,
+    PROCEDURE = alter_op_test_fn_bool_real
+);
+
+ALTER OPERATOR === (boolean, real) SET (COMMUTATOR = ===@@@);
+ALTER OPERATOR === (boolean, real) SET (NEGATOR = !===@@@);
+
+-- validate that shell operators are created for the commutator and negator with
+-- the commutator having reversed args and the negator matching the operator.
+-- The shell operators should have an empty function below.
+SELECT pg_describe_object('pg_operator'::regclass, op.oid, 0) AS operator,
+       pg_describe_object('pg_operator'::regclass, com.oid, 0) AS commutator,
+       com.oprcode AS commutator_func,
+       pg_describe_object('pg_operator'::regclass, neg.oid, 0) AS negator,
+       neg.oprcode AS negator_func
+  FROM pg_operator op
+  INNER JOIN pg_operator com ON (op.oid = com.oprcom AND op.oprcom = com.oid)
+  INNER JOIN pg_operator neg ON (op.oid = neg.oprnegate AND op.oprnegate = neg.oid)
+  WHERE op.oprname = '===' AND com.oprname = '===@@@' AND neg.oprname = '!===@@@'
+  AND op.oprleft = 'boolean'::regtype AND op.oprright = 'real'::regtype;
+
+
+--
+-- test setting self commutator
+--
+
+DROP OPERATOR === (boolean, boolean);
+CREATE OPERATOR === (
+    LEFTARG = boolean,
+    RIGHTARG = boolean,
+    PROCEDURE = alter_op_test_fn
+);
+
+ALTER OPERATOR === (boolean, boolean) SET (COMMUTATOR = ===);
+
+-- validate that the oprcom is the operator oid
+SELECT oprname FROM pg_operator 
+  WHERE oprname = '===' AND oid = oprcom
+  AND oprleft = 'boolean'::regtype AND oprright = 'boolean'::regtype;
+
+--
+-- Clean up
+--
+
 DROP USER regress_alter_op_user;
+
 DROP OPERATOR === (boolean, boolean);
+DROP OPERATOR === (boolean, real);
+DROP OPERATOR ==== (real, boolean);
+DROP OPERATOR !==== (boolean, real);
+DROP OPERATOR ===@@@ (real, boolean);
+DROP OPERATOR !===@@@(boolean, real);
+
 DROP FUNCTION customcontsel(internal, oid, internal, integer);
 DROP FUNCTION alter_op_test_fn(boolean, boolean);
+DROP FUNCTION alter_op_test_fn_bool_real(boolean, real);
+DROP FUNCTION alter_op_test_fn_real_bool(real, boolean);
#8jian he
jian.universality@gmail.com
In reply to: Tommy Pavlicek (#7)
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges

/*
* AlterOperator
* routine implementing ALTER OPERATOR <operator> SET (option = ...).
*
* Currently, only RESTRICT and JOIN estimator functions can be changed.
*/
ObjectAddress
AlterOperator(AlterOperatorStmt *stmt)

The above comment needs to change, other than that, it passed the
test, works as expected.

#9jian he
jian.universality@gmail.com
In reply to: jian he (#8)
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges

in doc/src/sgml/ref/alter_operator.sgml

<varlistentry>
<term><literal>HASHES</literal></term>
<listitem>
<para>
Indicates this operator can support a hash join. Can only be set
when the operator does not support a hash join.
</para>
</listitem>
</varlistentry>

<varlistentry>
<term><literal>MERGES</literal></term>
<listitem>
<para>
Indicates this operator can support a merge join. Can only be set
when the operator does not support a merge join.
</para>
</listitem>
</varlistentry>
------------------------
if the operator cannot support hash/merge join, it can't do ALTER
OPERATOR oprname(left_arg, right_arg) SET (HASHES/MERGES = false);

I think it means:
Can only be set when the operator does support a hash/merge join. Once
set to true, it cannot be reset to false.

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tommy Pavlicek (#7)
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges

Tommy Pavlicek <tommypav122@gmail.com> writes:

I've attached a new version of the ALTER OPERATOR patch that allows
no-ops. It should be ready to review now.

I got around to looking through this finally (sorry about the delay).
I'm mostly on board with the functionality, with the exception that
I don't see why we should allow ALTER OPERATOR to cause new shell
operators to be created. The argument for allowing that in CREATE
OPERATOR was mainly to allow a linked pair of operators to be created
without a lot of complexity (specifically, being careful to specify
the commutator or negator linkage only in the second CREATE, which
is a rule that requires another exception for a self-commutator).
However, if you're using ALTER OPERATOR then you might as well create
both operators first and then link them with an ALTER command.
In fact, I don't really see a use-case where the operators wouldn't
both exist; isn't this feature mainly to allow retrospective
correction of omitted linkages? So I think allowing ALTER to create a
second operator is more likely to allow mistakes to sneak by than to
do anything useful --- and they will be mistakes you can't correct
except by starting over. I'd even question whether we want to let
ALTER establish a linkage to an existing shell operator, rather than
insisting you turn it into a valid operator via CREATE first.

If we implement it with that restriction then I don't think the
refactorization done in 0001 is correct, or at least not ideal.

(In any case, it seems like a bad idea that the command reference
pages make no mention of this stuff about shell operators. It's
explained in 38.15. Operator Optimization Information, but it'd
be worth at least alluding to that section here. Or maybe we
should move that info to CREATE OPERATOR?)

More generally, you muttered something about 0001 fixing some
existing bugs, but if so I can't see those trees for the forest of
refactorization. I'd suggest splitting any bug fixes apart from
the pure-refactorization step.

regards, tom lane

#11Tommy Pavlicek
tommypav122@gmail.com
In reply to: Tom Lane (#10)
3 attachment(s)
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges

On Thu, Sep 28, 2023 at 9:18 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Tommy Pavlicek <tommypav122@gmail.com> writes:

I've attached a new version of the ALTER OPERATOR patch that allows
no-ops. It should be ready to review now.

I got around to looking through this finally (sorry about the delay).
I'm mostly on board with the functionality, with the exception that
I don't see why we should allow ALTER OPERATOR to cause new shell
operators to be created. The argument for allowing that in CREATE
OPERATOR was mainly to allow a linked pair of operators to be created
without a lot of complexity (specifically, being careful to specify
the commutator or negator linkage only in the second CREATE, which
is a rule that requires another exception for a self-commutator).
However, if you're using ALTER OPERATOR then you might as well create
both operators first and then link them with an ALTER command.
In fact, I don't really see a use-case where the operators wouldn't
both exist; isn't this feature mainly to allow retrospective
correction of omitted linkages? So I think allowing ALTER to create a
second operator is more likely to allow mistakes to sneak by than to
do anything useful --- and they will be mistakes you can't correct
except by starting over. I'd even question whether we want to let
ALTER establish a linkage to an existing shell operator, rather than
insisting you turn it into a valid operator via CREATE first.

If we implement it with that restriction then I don't think the
refactorization done in 0001 is correct, or at least not ideal.

(In any case, it seems like a bad idea that the command reference
pages make no mention of this stuff about shell operators. It's
explained in 38.15. Operator Optimization Information, but it'd
be worth at least alluding to that section here. Or maybe we
should move that info to CREATE OPERATOR?)

More generally, you muttered something about 0001 fixing some
existing bugs, but if so I can't see those trees for the forest of
refactorization. I'd suggest splitting any bug fixes apart from
the pure-refactorization step.

regards, tom lane

Thanks Tom.

The rationale behind the shell operator and that part of section 38.15
of the documentation had escaped me, but what you're saying makes
complete sense. Based on your comments, I've made some changes:

1. I've isolated the bug fixes (fixing the error message and
disallowing self negation when filling in a shell operator) into
0001-bug-fixes-v3.patch.
2. I've scaled back most of the refactoring as I agree it no longer makes sense.
3. I updated the logic to prevent the creation of or linking to shell operators.
4. I made further updates to the documentation including referencing
38.15 directly in the CREATE and ALTER pages (It's easy to miss if
only 38.14 is referenced) and moved the commentary about creating
commutators and negators into the CREATE section as with the the ALTER
changes it now seems specific to CREATE. I didn't move the rest of
38.15 as I think this applies to both CREATE and ALTER.

I did notice one further potential bug. When creating an operator and
adding a commutator, PostgreSQL only links the commutator back to the
operator if the commutator has no commutator of its own, but the
create operation succeeds regardless of whether this linkage happens.

In other words, if A and B are a pair of commutators and one creates
another operator, C, with A as its commutator, then C will link to A,
but A will still link to B (and B to A). It's not clear to me if this
in itself is a problem, but unless I've misunderstood something
operator C must be the same as B so it implies an error by the user
and there could also be issues if A was deleted since C would continue
to refer to the deleted A.

The same applies for negators and alter operations.

Do you know if this behaviour is intentional or if I've missed
something because it seems undesirable to me. If it is a bug, then I
think I can see how to fix it, but wanted to ask before making any
changes.

On Mon, Sep 25, 2023 at 11:52 AM jian he <jian.universality@gmail.com> wrote:

/*
* AlterOperator
* routine implementing ALTER OPERATOR <operator> SET (option = ...).
*
* Currently, only RESTRICT and JOIN estimator functions can be changed.
*/
ObjectAddress
AlterOperator(AlterOperatorStmt *stmt)

The above comment needs to change, other than that, it passed the
test, works as expected.

Thanks, added a comment.

Can only be set when the operator does support a hash/merge join. Once
set to true, it cannot be reset to false.

Yes, I updated the wording. Is it clearer now?

Attachments:

0001-bug-fixes-v3.patchapplication/octet-stream; name=0001-bug-fixes-v3.patchDownload
diff --git a/src/backend/catalog/pg_operator.c b/src/backend/catalog/pg_operator.c
index 95918a77a1..b0570c33b6 100644
--- a/src/backend/catalog/pg_operator.c
+++ b/src/backend/catalog/pg_operator.c
@@ -57,8 +57,7 @@ static Oid	OperatorShellMake(const char *operatorName,
 static Oid	get_other_operator(List *otherOp,
 							   Oid otherLeftTypeId, Oid otherRightTypeId,
 							   const char *operatorName, Oid operatorNamespace,
-							   Oid leftTypeId, Oid rightTypeId,
-							   bool isCommutator);
+							   Oid leftTypeId, Oid rightTypeId);
 
 
 /*
@@ -442,8 +441,7 @@ OperatorCreate(const char *operatorName,
 		commutatorId = get_other_operator(commutatorName,
 										  rightTypeId, leftTypeId,
 										  operatorName, operatorNamespace,
-										  leftTypeId, rightTypeId,
-										  true);
+										  leftTypeId, rightTypeId);
 
 		/* Permission check: must own other operator */
 		if (OidIsValid(commutatorId) &&
@@ -467,8 +465,19 @@ OperatorCreate(const char *operatorName,
 		negatorId = get_other_operator(negatorName,
 									   leftTypeId, rightTypeId,
 									   operatorName, operatorNamespace,
-									   leftTypeId, rightTypeId,
-									   false);
+									   leftTypeId, rightTypeId);
+
+		/*
+		 * Prevent self negation as it doesn't make sense. When self negating,
+		 * get_other_operator returns InvalidOid when creating a new operator
+		 * and the OID of the operator we are 'creating' when upgrading a
+		 * shell operator.
+		 */
+
+		if (!OidIsValid(negatorId) || operatorObjectId == negatorId)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+					 errmsg("operator cannot be its own negator")));
 
 		/* Permission check: must own other operator */
 		if (OidIsValid(negatorId) &&
@@ -584,7 +593,7 @@ OperatorCreate(const char *operatorName,
 static Oid
 get_other_operator(List *otherOp, Oid otherLeftTypeId, Oid otherRightTypeId,
 				   const char *operatorName, Oid operatorNamespace,
-				   Oid leftTypeId, Oid rightTypeId, bool isCommutator)
+				   Oid leftTypeId, Oid rightTypeId)
 {
 	Oid			other_oid;
 	bool		otherDefined;
@@ -611,14 +620,7 @@ get_other_operator(List *otherOp, Oid otherLeftTypeId, Oid otherRightTypeId,
 		otherLeftTypeId == leftTypeId &&
 		otherRightTypeId == rightTypeId)
 	{
-		/*
-		 * self-linkage to this operator; caller will fix later. Note that
-		 * only self-linkage for commutation makes sense.
-		 */
-		if (!isCommutator)
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("operator cannot be its own negator or sort operator")));
+		/* self-linkage to this operator; caller will fix later. */
 		return InvalidOid;
 	}
 
diff --git a/src/test/regress/expected/create_operator.out b/src/test/regress/expected/create_operator.out
index f71b601f2d..49070ce90f 100644
--- a/src/test/regress/expected/create_operator.out
+++ b/src/test/regress/expected/create_operator.out
@@ -260,6 +260,42 @@ CREATE OPERATOR #*# (
 );
 ERROR:  permission denied for type type_op6
 ROLLBACK;
+-- Should fail. An operator cannot have a self negator.
+BEGIN TRANSACTION;
+CREATE FUNCTION create_op_test_fn(boolean, boolean)
+RETURNS boolean AS $$
+    SELECT NULL::BOOLEAN;
+$$ LANGUAGE sql IMMUTABLE;
+CREATE OPERATOR === (
+    leftarg = boolean,
+    rightarg = boolean,
+    procedure = create_op_test_fn,
+    negator = ===
+);
+ERROR:  operator cannot be its own negator
+ROLLBACK;
+-- Should fail. An operator cannot have a self negator. Here we test that when
+-- 'creating' an existing shell operator, it checks the negator is not self.
+BEGIN TRANSACTION;
+CREATE FUNCTION create_op_test_fn(boolean, boolean)
+RETURNS boolean AS $$
+    SELECT NULL::BOOLEAN;
+$$ LANGUAGE sql IMMUTABLE;
+-- create a shell operator for ===!!! by referencing it as a commutator
+CREATE OPERATOR === (
+    leftarg = boolean,
+    rightarg = boolean,
+    procedure = create_op_test_fn,
+    commutator = ===!!!
+);
+CREATE OPERATOR ===!!! (
+    leftarg = boolean,
+    rightarg = boolean,
+    procedure = create_op_test_fn,
+    negator = ===!!!
+);
+ERROR:  operator cannot be its own negator
+ROLLBACK;
 -- invalid: non-lowercase quoted identifiers
 CREATE OPERATOR ===
 (
diff --git a/src/test/regress/sql/create_operator.sql b/src/test/regress/sql/create_operator.sql
index f53e24db3c..168cad3814 100644
--- a/src/test/regress/sql/create_operator.sql
+++ b/src/test/regress/sql/create_operator.sql
@@ -210,6 +210,42 @@ CREATE OPERATOR #*# (
 );
 ROLLBACK;
 
+-- Should fail. An operator cannot have a self negator.
+BEGIN TRANSACTION;
+CREATE FUNCTION create_op_test_fn(boolean, boolean)
+RETURNS boolean AS $$
+    SELECT NULL::BOOLEAN;
+$$ LANGUAGE sql IMMUTABLE;
+CREATE OPERATOR === (
+    leftarg = boolean,
+    rightarg = boolean,
+    procedure = create_op_test_fn,
+    negator = ===
+);
+ROLLBACK;
+
+-- Should fail. An operator cannot have a self negator. Here we test that when
+-- 'creating' an existing shell operator, it checks the negator is not self.
+BEGIN TRANSACTION;
+CREATE FUNCTION create_op_test_fn(boolean, boolean)
+RETURNS boolean AS $$
+    SELECT NULL::BOOLEAN;
+$$ LANGUAGE sql IMMUTABLE;
+-- create a shell operator for ===!!! by referencing it as a commutator
+CREATE OPERATOR === (
+    leftarg = boolean,
+    rightarg = boolean,
+    procedure = create_op_test_fn,
+    commutator = ===!!!
+);
+CREATE OPERATOR ===!!! (
+    leftarg = boolean,
+    rightarg = boolean,
+    procedure = create_op_test_fn,
+    negator = ===!!!
+);
+ROLLBACK;
+
 -- invalid: non-lowercase quoted identifiers
 CREATE OPERATOR ===
 (
0003-alter_op-v3.patchapplication/octet-stream; name=0003-alter_op-v3.patchDownload
diff --git a/doc/src/sgml/ref/alter_operator.sgml b/doc/src/sgml/ref/alter_operator.sgml
index a4a1af564f..ce447e22b0 100644
--- a/doc/src/sgml/ref/alter_operator.sgml
+++ b/doc/src/sgml/ref/alter_operator.sgml
@@ -29,8 +29,12 @@ ALTER OPERATOR <replaceable>name</replaceable> ( { <replaceable>left_type</repla
 
 ALTER OPERATOR <replaceable>name</replaceable> ( { <replaceable>left_type</replaceable> | NONE } , <replaceable>right_type</replaceable> )
     SET ( {  RESTRICT = { <replaceable class="parameter">res_proc</replaceable> | NONE }
-           | JOIN = { <replaceable class="parameter">join_proc</replaceable> | NONE }
-         } [, ... ] )
+            | JOIN = { <replaceable class="parameter">join_proc</replaceable> | NONE }
+            | COMMUTATOR = <replaceable class="parameter">com_op</replaceable> 
+            | NEGATOR = <replaceable class="parameter">neg_op</replaceable>
+            | HASHES
+            | MERGES
+          } [, ... ] )
 </synopsis>
  </refsynopsisdiv>
 
@@ -121,9 +125,64 @@ ALTER OPERATOR <replaceable>name</replaceable> ( { <replaceable>left_type</repla
      </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><replaceable class="parameter">com_op</replaceable></term>
+    <listitem>
+     <para>
+      The commutator of this operator. Can only be set if the operator does not have an existing commutator.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
+    <term><replaceable class="parameter">neg_op</replaceable></term>
+    <listitem>
+     <para>
+      The negator of this operator. Can only be set if the operator does not have an existing negator.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
+    <term><literal>HASHES</literal></term>
+    <listitem>
+     <para>
+     Indicates this operator can support a hash join. Can only be enabled and not disabled.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
+    <term><literal>MERGES</literal></term>
+    <listitem>
+     <para>
+     Indicates this operator can support a merge join. Can only be enabled and not disabled.
+     </para>
+    </listitem>
+   </varlistentry>
+
   </variablelist>
  </refsect1>
 
+  <refsect1>
+  <title>Notes</title>
+
+  <para>
+   Refer to <xref linkend="xoper"/> and <xref linkend="xoper-optimization"/> for further information.
+  </para>
+
+  <para>
+  Since commutators come in pairs and are commutators of each other,
+  <productname>PostgreSQL</productname> will set the commutator of the
+  <replaceable class="parameter">com_op</replaceable> to the operator.
+  </para>
+
+  <para>
+  Like with commutators, <productname>PostgreSQL</productname> will set the
+  negator of the <replaceable class="parameter">neg_op</replaceable> to the operator.
+  </para>
+  </refsect1>
+
  <refsect1>
   <title>Examples</title>
 
diff --git a/doc/src/sgml/ref/create_operator.sgml b/doc/src/sgml/ref/create_operator.sgml
index e27512ff39..9801eb3ed0 100644
--- a/doc/src/sgml/ref/create_operator.sgml
+++ b/doc/src/sgml/ref/create_operator.sgml
@@ -231,9 +231,60 @@ COMMUTATOR = OPERATOR(myschema.===) ,
   <title>Notes</title>
 
   <para>
-   Refer to <xref linkend="xoper"/> for further information.
+   Refer to <xref linkend="xoper"/> and <xref linkend="xoper-optimization"/> for further information.
   </para>
 
+  <para>
+    When you are defining a self-commutative operator, you just do it.
+    When you are defining a pair of commutative operators, things are
+    a little trickier: how can the first one to be defined refer to the
+    other one, which you haven't defined yet?  There are three solutions
+    to this problem:
+
+    <itemizedlist>
+     <listitem>
+      <para>
+       One way is to omit the <literal>COMMUTATOR</literal> clause in the first operator that
+       you define, and then provide one in the second operator's definition.
+       Since <productname>PostgreSQL</productname> knows that commutative
+       operators come in pairs, when it sees the second definition it will
+       automatically go back and fill in the missing <literal>COMMUTATOR</literal> clause in
+       the first definition.
+      </para>
+     </listitem>
+
+     <listitem>
+      <para>
+       Another, more straightforward way is just to include <literal>COMMUTATOR</literal> clauses
+       in both definitions.  When <productname>PostgreSQL</productname> processes
+       the first definition and realizes that <literal>COMMUTATOR</literal> refers to a nonexistent
+       operator, the system will make a dummy entry for that operator in the
+       system catalog.  This dummy entry will have valid data only
+       for the operator name, left and right operand types, and result type,
+       since that's all that <productname>PostgreSQL</productname> can deduce
+       at this point.  The first operator's catalog entry will link to this
+       dummy entry.  Later, when you define the second operator, the system
+       updates the dummy entry with the additional information from the second
+       definition.  If you try to use the dummy operator before it's been filled
+       in, you'll just get an error message.
+      </para>
+     </listitem>
+
+     <listitem>
+      <para>
+      Alternatively, both operators can be defined without a <literal>COMMUTATOR</literal>
+      and then <command>ALTER OPERATOR</command> can be used to set the <literal>COMMUTATOR</literal>
+      for the operator.
+      </para>
+     </listitem>
+    </itemizedlist>
+   </para>
+
+   <para>
+    Pairs of negator operators can be defined using the same methods
+    explained above for commutator pairs.
+   </para>
+
   <para>
    It is not possible to specify an operator's lexical precedence in
    <command>CREATE OPERATOR</command>, because the parser's precedence behavior
diff --git a/doc/src/sgml/xoper.sgml b/doc/src/sgml/xoper.sgml
index a929ced07d..31bc8df3de 100644
--- a/doc/src/sgml/xoper.sgml
+++ b/doc/src/sgml/xoper.sgml
@@ -146,44 +146,6 @@ SELECT (a + b) AS c FROM test_complex;
      <literal>=</literal> operator must specify that it is valid, by marking the
      operator with commutator information.
     </para>
-
-    <para>
-     When you are defining a self-commutative operator, you just do it.
-     When you are defining a pair of commutative operators, things are
-     a little trickier: how can the first one to be defined refer to the
-     other one, which you haven't defined yet?  There are two solutions
-     to this problem:
-
-     <itemizedlist>
-      <listitem>
-       <para>
-        One way is to omit the <literal>COMMUTATOR</literal> clause in the first operator that
-        you define, and then provide one in the second operator's definition.
-        Since <productname>PostgreSQL</productname> knows that commutative
-        operators come in pairs, when it sees the second definition it will
-        automatically go back and fill in the missing <literal>COMMUTATOR</literal> clause in
-        the first definition.
-       </para>
-      </listitem>
-
-      <listitem>
-       <para>
-        The other, more straightforward way is just to include <literal>COMMUTATOR</literal> clauses
-        in both definitions.  When <productname>PostgreSQL</productname> processes
-        the first definition and realizes that <literal>COMMUTATOR</literal> refers to a nonexistent
-        operator, the system will make a dummy entry for that operator in the
-        system catalog.  This dummy entry will have valid data only
-        for the operator name, left and right operand types, and result type,
-        since that's all that <productname>PostgreSQL</productname> can deduce
-        at this point.  The first operator's catalog entry will link to this
-        dummy entry.  Later, when you define the second operator, the system
-        updates the dummy entry with the additional information from the second
-        definition.  If you try to use the dummy operator before it's been filled
-        in, you'll just get an error message.
-       </para>
-      </listitem>
-     </itemizedlist>
-    </para>
    </sect2>
 
    <sect2 id="xoper-negator">
@@ -218,12 +180,7 @@ SELECT (a + b) AS c FROM test_complex;
     <literal>NOT</literal> operations can be inserted as a consequence of other rearrangements.
    </para>
 
-   <para>
-    Pairs of negator operators can be defined using the same methods
-    explained above for commutator pairs.
-   </para>
-
-  </sect2>
+   </sect2>
 
   <sect2 id="xoper-restrict">
    <title><literal>RESTRICT</literal></title>
diff --git a/src/backend/catalog/pg_operator.c b/src/backend/catalog/pg_operator.c
index cda543a03b..35f6e04b47 100644
--- a/src/backend/catalog/pg_operator.c
+++ b/src/backend/catalog/pg_operator.c
@@ -44,11 +44,6 @@ static Oid	OperatorGet(const char *operatorName,
 						Oid rightObjectId,
 						bool *defined);
 
-static Oid	OperatorLookup(List *operatorName,
-						   Oid leftObjectId,
-						   Oid rightObjectId,
-						   bool *defined);
-
 static Oid	OperatorShellMake(const char *operatorName,
 							  Oid operatorNamespace,
 							  Oid leftTypeId,
@@ -165,7 +160,7 @@ OperatorGet(const char *operatorName,
  *
  *		*defined is set true if defined (not a shell)
  */
-static Oid
+Oid
 OperatorLookup(List *operatorName,
 			   Oid leftObjectId,
 			   Oid rightObjectId,
diff --git a/src/backend/commands/operatorcmds.c b/src/backend/commands/operatorcmds.c
index cd7f83136f..e3d2619081 100644
--- a/src/backend/commands/operatorcmds.c
+++ b/src/backend/commands/operatorcmds.c
@@ -54,6 +54,9 @@
 
 static Oid	ValidateRestrictionEstimator(List *restrictionName);
 static Oid	ValidateJoinEstimator(List *joinName);
+static Oid	ValidateOperatorDefined(List *name,
+									Oid leftTypeId,
+									Oid rightTypeId);
 
 /*
  * DefineOperator
@@ -359,6 +362,44 @@ ValidateJoinEstimator(List *joinName)
 	return joinOid;
 }
 
+/*
+ * Looks up and returns an oid for an operator given a possibly-qualified name
+ * and left and right type IDs, verifies the operator is defined (non-shell),
+ * and owned by the current user. Raises an error otherwise.
+ */
+static Oid
+ValidateOperatorDefined(List *name,
+						Oid leftTypeId,
+						Oid rightTypeId)
+{
+	Oid			oid;
+	bool		defined;
+
+	oid = OperatorLookup(name,
+						 leftTypeId,
+						 rightTypeId,
+						 &defined);
+
+	if (!OidIsValid(oid))
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_FUNCTION),
+				 errmsg("operator %s does not exist",
+						NameListToString(name))));
+
+	if (!defined)
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_FUNCTION),
+				 errmsg("operator %s is undefined shell operator",
+						NameListToString(name))));
+
+	if (!object_ownercheck(OperatorRelationId, oid, GetUserId()))
+		aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_OPERATOR,
+					   NameListToString(name));
+
+	return oid;
+}
+
+
 /*
  * Guts of operator deletion.
  */
@@ -406,6 +447,8 @@ RemoveOperatorById(Oid operOid)
  *		routine implementing ALTER OPERATOR <operator> SET (option = ...).
  *
  * Currently, only RESTRICT and JOIN estimator functions can be changed.
+ * COMMUTATOR and NEGATOR, HASHES, and MERGES can be set if they have not been set
+ * previously.
  */
 ObjectAddress
 AlterOperator(AlterOperatorStmt *stmt)
@@ -426,6 +469,14 @@ AlterOperator(AlterOperatorStmt *stmt)
 	List	   *joinName = NIL; /* optional join sel. function */
 	bool		updateJoin = false;
 	Oid			joinOid;
+	List	   *commutatorName = NIL;	/* optional commutator operator name */
+	Oid			commutatorOid;
+	List	   *negatorName = NIL;	/* optional negator operator name */
+	Oid			negatorOid;
+	bool		updateHashes = false;
+	bool		canHash = false;
+	bool		updateMerges = false;
+	bool		canMerge = false;
 
 	/* Look up the operator */
 	oprId = LookupOperWithArgs(stmt->opername, false);
@@ -456,6 +507,24 @@ AlterOperator(AlterOperatorStmt *stmt)
 			joinName = param;
 			updateJoin = true;
 		}
+		else if (strcmp(defel->defname, "commutator") == 0)
+		{
+			commutatorName = defGetQualifiedName(defel);
+		}
+		else if (strcmp(defel->defname, "negator") == 0)
+		{
+			negatorName = defGetQualifiedName(defel);
+		}
+		else if (strcmp(defel->defname, "hashes") == 0)
+		{
+			canHash = defGetBoolean(defel);
+			updateHashes = true;
+		}
+		else if (strcmp(defel->defname, "merges") == 0)
+		{
+			canMerge = defGetBoolean(defel);
+			updateMerges = true;
+		}
 
 		/*
 		 * The rest of the options that CREATE accepts cannot be changed.
@@ -464,11 +533,7 @@ AlterOperator(AlterOperatorStmt *stmt)
 		else if (strcmp(defel->defname, "leftarg") == 0 ||
 				 strcmp(defel->defname, "rightarg") == 0 ||
 				 strcmp(defel->defname, "function") == 0 ||
-				 strcmp(defel->defname, "procedure") == 0 ||
-				 strcmp(defel->defname, "commutator") == 0 ||
-				 strcmp(defel->defname, "negator") == 0 ||
-				 strcmp(defel->defname, "hashes") == 0 ||
-				 strcmp(defel->defname, "merges") == 0)
+				 strcmp(defel->defname, "procedure") == 0)
 		{
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -488,7 +553,7 @@ AlterOperator(AlterOperatorStmt *stmt)
 					   NameStr(oprForm->oprname));
 
 	/*
-	 * Look up restriction and join estimators if specified
+	 * Look up Oid for any parameters specified
 	 */
 	if (restrictionName)
 		restrictionOid = ValidateRestrictionEstimator(restrictionName);
@@ -499,28 +564,85 @@ AlterOperator(AlterOperatorStmt *stmt)
 	else
 		joinOid = InvalidOid;
 
-	/* Perform additional checks, like OperatorCreate does */
-	if (!(OidIsValid(oprForm->oprleft) && OidIsValid(oprForm->oprright)))
+	if (commutatorName)
 	{
-		/* If it's not a binary op, these things mustn't be set: */
-		if (OidIsValid(joinOid))
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only binary operators can have join selectivity")));
+		/* commutator has reversed arg types */
+		commutatorOid = ValidateOperatorDefined(commutatorName,
+												oprForm->oprright,
+												oprForm->oprleft);
+
+		/*
+		 * we don't need to do anything extra for a self commutator as in
+		 * OperatorCreate as there we have to create the operator before
+		 * setting the commutator, but here the operator already exists and
+		 * the commutatorOid above is valid (and is the operator oid).
+		 */
 	}
+	else
+		commutatorOid = InvalidOid;
 
-	if (oprForm->oprresult != BOOLOID)
+	if (negatorName)
 	{
-		if (OidIsValid(restrictionOid))
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only boolean operators can have restriction selectivity")));
-		if (OidIsValid(joinOid))
+		negatorOid = ValidateOperatorDefined(negatorName,
+											 oprForm->oprleft,
+											 oprForm->oprright);
+
+		if (negatorOid == oprForm->oid)
 			ereport(ERROR,
 					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only boolean operators can have join selectivity")));
+					 errmsg("operator cannot be its own negator")));
+	}
+	else
+	{
+		negatorOid = InvalidOid;
+	}
+
+	/*
+	 * check that we're not changing any existing values that might be
+	 * depended on elsewhere and may be expected to never change, while
+	 * allowing no-ops.
+	 */
+	if (OidIsValid(commutatorOid) && OidIsValid(oprForm->oprcom)
+		&& commutatorOid != oprForm->oprcom)
+	{
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+				 errmsg("operator attribute \"commutator\" cannot be changed if it has already been set")));
+	}
+
+	if (OidIsValid(negatorOid) && OidIsValid(oprForm->oprnegate)
+		&& negatorOid != oprForm->oprnegate)
+	{
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+				 errmsg("operator attribute \"negator\" cannot be changed if it has already been set")));
+	}
+
+	if (updateHashes && oprForm->oprcanhash && !canHash)
+	{
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+				 errmsg("operator attribute \"hashes\" cannot be disabled after it has been enabled")));
+	}
+
+	if (updateMerges && oprForm->oprcanmerge && !canMerge)
+	{
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+				 errmsg("operator attribute \"merges\" cannot be disabled after it has been enabled")));
 	}
 
+	/* Perform additional checks, like OperatorCreate does */
+	OperatorValidateParams(oprForm->oprleft,
+						   oprForm->oprright,
+						   oprForm->oprresult,
+						   OidIsValid(commutatorOid),
+						   OidIsValid(negatorOid),
+						   OidIsValid(joinOid),
+						   OidIsValid(restrictionOid),
+						   updateMerges,
+						   updateHashes);
+
 	/* Update the tuple */
 	for (i = 0; i < Natts_pg_operator; ++i)
 	{
@@ -539,6 +661,30 @@ AlterOperator(AlterOperatorStmt *stmt)
 		values[Anum_pg_operator_oprjoin - 1] = joinOid;
 	}
 
+	if (OidIsValid(commutatorOid))
+	{
+		replaces[Anum_pg_operator_oprcom - 1] = true;
+		values[Anum_pg_operator_oprcom - 1] = ObjectIdGetDatum(commutatorOid);
+	}
+
+	if (OidIsValid(negatorOid))
+	{
+		replaces[Anum_pg_operator_oprnegate - 1] = true;
+		values[Anum_pg_operator_oprnegate - 1] = ObjectIdGetDatum(negatorOid);
+	}
+
+	if (updateMerges)
+	{
+		replaces[Anum_pg_operator_oprcanmerge - 1] = true;
+		values[Anum_pg_operator_oprcanmerge - 1] = canMerge;
+	}
+
+	if (updateHashes)
+	{
+		replaces[Anum_pg_operator_oprcanhash - 1] = true;
+		values[Anum_pg_operator_oprcanhash - 1] = canHash;
+	}
+
 	tup = heap_modify_tuple(tup, RelationGetDescr(catalog),
 							values, nulls, replaces);
 
@@ -550,5 +696,8 @@ AlterOperator(AlterOperatorStmt *stmt)
 
 	table_close(catalog, NoLock);
 
+	if (OidIsValid(commutatorOid) || OidIsValid(negatorOid))
+		OperatorUpd(oprId, commutatorOid, negatorOid, false);
+
 	return address;
 }
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index fec9b2604f..8c9a35b5f3 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -10124,6 +10124,8 @@ operator_def_elem: ColLabel '=' NONE
 						{ $$ = makeDefElem($1, NULL, @1); }
 				   | ColLabel '=' operator_def_arg
 						{ $$ = makeDefElem($1, (Node *) $3, @1); }
+				   | ColLabel
+						{ $$ = makeDefElem($1, NULL, @1); }
 		;
 
 /* must be similar enough to def_arg to avoid reduce/reduce conflicts */
diff --git a/src/include/catalog/pg_operator.h b/src/include/catalog/pg_operator.h
index a670973ab2..13ebc592f2 100644
--- a/src/include/catalog/pg_operator.h
+++ b/src/include/catalog/pg_operator.h
@@ -85,6 +85,11 @@ typedef FormData_pg_operator *Form_pg_operator;
 DECLARE_UNIQUE_INDEX_PKEY(pg_operator_oid_index, 2688, OperatorOidIndexId, pg_operator, btree(oid oid_ops));
 DECLARE_UNIQUE_INDEX(pg_operator_oprname_l_r_n_index, 2689, OperatorNameNspIndexId, pg_operator, btree(oprname name_ops, oprleft oid_ops, oprright oid_ops, oprnamespace oid_ops));
 
+extern Oid
+			OperatorLookup(List *operatorName,
+						   Oid leftObjectId,
+						   Oid rightObjectId,
+						   bool *defined);
 
 extern ObjectAddress OperatorCreate(const char *operatorName,
 									Oid operatorNamespace,
diff --git a/src/test/regress/expected/alter_operator.out b/src/test/regress/expected/alter_operator.out
index 71bd484282..87bf0618fe 100644
--- a/src/test/regress/expected/alter_operator.out
+++ b/src/test/regress/expected/alter_operator.out
@@ -25,7 +25,7 @@ ORDER BY 1;
 (3 rows)
 
 --
--- Reset and set params
+-- test reset and set restrict and join
 --
 ALTER OPERATOR === (boolean, boolean) SET (RESTRICT = NONE);
 ALTER OPERATOR === (boolean, boolean) SET (JOIN = NONE);
@@ -106,34 +106,264 @@ ORDER BY 1;
  schema public                                         | n
 (3 rows)
 
---
--- Test invalid options.
---
-ALTER OPERATOR === (boolean, boolean) SET (COMMUTATOR = ====);
-ERROR:  operator attribute "commutator" cannot be changed
-ALTER OPERATOR === (boolean, boolean) SET (NEGATOR = ====);
-ERROR:  operator attribute "negator" cannot be changed
+-- test cannot set non existant function
 ALTER OPERATOR === (boolean, boolean) SET (RESTRICT = non_existent_func);
 ERROR:  function non_existent_func(internal, oid, internal, integer) does not exist
 ALTER OPERATOR === (boolean, boolean) SET (JOIN = non_existent_func);
 ERROR:  function non_existent_func(internal, oid, internal, smallint, internal) does not exist
-ALTER OPERATOR === (boolean, boolean) SET (COMMUTATOR = !==);
-ERROR:  operator attribute "commutator" cannot be changed
-ALTER OPERATOR === (boolean, boolean) SET (NEGATOR = !==);
-ERROR:  operator attribute "negator" cannot be changed
--- invalid: non-lowercase quoted identifiers
+-- test non-lowercase quoted identifiers invalid
 ALTER OPERATOR & (bit, bit) SET ("Restrict" = _int_contsel, "Join" = _int_contjoinsel);
 ERROR:  operator attribute "Restrict" not recognized
 --
--- Test permission check. Must be owner to ALTER OPERATOR.
+-- test must be owner of operator to ALTER OPERATOR.
 --
 CREATE USER regress_alter_op_user;
 SET SESSION AUTHORIZATION regress_alter_op_user;
 ALTER OPERATOR === (boolean, boolean) SET (RESTRICT = NONE);
 ERROR:  must be owner of operator ===
--- Clean up
 RESET SESSION AUTHORIZATION;
+--
+-- test set commutator, negator, hashes, and merges which can only be set if not
+-- already set
+--
+-- for these tests create operators with different left and right types so that
+-- we can validate that function signatures are handled correctly
+CREATE FUNCTION alter_op_test_fn_bool_real(boolean, real)
+RETURNS boolean AS $$ SELECT NULL::BOOLEAN; $$ LANGUAGE sql IMMUTABLE;
+CREATE FUNCTION alter_op_test_fn_real_bool(real, boolean)
+RETURNS boolean AS $$ SELECT NULL::BOOLEAN; $$ LANGUAGE sql IMMUTABLE;
+-- operator
+CREATE OPERATOR === (
+    LEFTARG = boolean,
+    RIGHTARG = real,
+    PROCEDURE = alter_op_test_fn_bool_real
+);
+-- commutator
+CREATE OPERATOR ==== (
+    LEFTARG = real,
+    RIGHTARG = boolean,
+    PROCEDURE = alter_op_test_fn_real_bool
+);
+-- negator
+CREATE OPERATOR !==== (
+    LEFTARG = boolean,
+    RIGHTARG = real,
+    PROCEDURE = alter_op_test_fn_bool_real
+);
+-- test no-op setting already false hashes and merges to false works
+ALTER OPERATOR === (boolean, real) SET (HASHES = false);
+ALTER OPERATOR === (boolean, real) SET (MERGES = false);
+-- validate still false after no-op
+SELECT oprcanhash, oprcanmerge FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+ oprcanhash | oprcanmerge 
+------------+-------------
+ f          | f
+(1 row)
+
+-- test cannot set commutator or negator without owning them
+SET SESSION AUTHORIZATION regress_alter_op_user;
+-- we need need a new operator owned by regress_alter_op_user so that we are
+-- allowed to alter it. ==== and !==== are owned by the test user so we expect
+-- the alters below to fail.
+CREATE OPERATOR ===@@@ (
+    LEFTARG = boolean,
+    RIGHTARG = real,
+    PROCEDURE = alter_op_test_fn_bool_real
+);
+ALTER OPERATOR ===@@@ (boolean, real) SET (COMMUTATOR= ====);
+ERROR:  must be owner of operator ====
+ALTER OPERATOR ===@@@ (boolean, real) SET (NEGATOR = !====);
+ERROR:  must be owner of operator !====
+-- validate operator is unchanged and commutator and negator are unset
+SELECT oprcom, oprnegate FROM pg_operator WHERE oprname = '===@@@'
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+ oprcom | oprnegate 
+--------+-----------
+      0 |         0
+(1 row)
+
+DROP OPERATOR ===@@@ (boolean, real);
+RESET SESSION AUTHORIZATION;
+-- test cannot set self negator
+ALTER OPERATOR === (boolean, real) SET (NEGATOR = ===);
+ERROR:  operator cannot be its own negator
+-- validate no changes made
+SELECT oprcanmerge, oprcanhash, oprcom, oprnegate
+FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+ oprcanmerge | oprcanhash | oprcom | oprnegate 
+-------------+------------+--------+-----------
+ f           | f          |      0 |         0
+(1 row)
+
+-- test set hashes
+ALTER OPERATOR === (boolean, real) SET (HASHES);
+SELECT oprcanhash FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+ oprcanhash 
+------------
+ t
+(1 row)
+
+-- test set merges
+ALTER OPERATOR === (boolean, real) SET (MERGES);
+SELECT oprcanmerge FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+ oprcanmerge 
+-------------
+ t
+(1 row)
+
+-- test set commutator
+ALTER OPERATOR === (boolean, real) SET (COMMUTATOR = ====);
+-- validate that the commutator has been set on both the operator and commutator,
+-- that they reference each other, and that the operator used is the existing
+-- one we created and not a new shell operator
+SELECT op.oprname AS operator_name, com.oprname AS commutator_name,
+  com.oprcode AS commutator_func
+  FROM pg_operator op
+  INNER JOIN pg_operator com ON (op.oid = com.oprcom AND op.oprcom = com.oid)
+  WHERE op.oprname = '==='
+  AND op.oprleft = 'boolean'::regtype AND op.oprright = 'real'::regtype;
+ operator_name | commutator_name |      commutator_func       
+---------------+-----------------+----------------------------
+ ===           | ====            | alter_op_test_fn_real_bool
+(1 row)
+
+-- test set negator
+ALTER OPERATOR === (boolean, real) SET (NEGATOR = !====);
+-- validate that the negator has been set on both the operator and negator, that
+-- they reference each other, and that the operator used is the existing one we
+-- created and not a new shell operator
+SELECT op.oprname AS operator_name, neg.oprname AS negator_name,
+  neg.oprcode AS negator_func
+  FROM pg_operator op
+  INNER JOIN pg_operator neg ON (op.oid = neg.oprnegate AND op.oprnegate = neg.oid)
+  WHERE op.oprname = '==='
+  AND op.oprleft = 'boolean'::regtype AND op.oprright = 'real'::regtype;
+ operator_name | negator_name |        negator_func        
+---------------+--------------+----------------------------
+ ===           | !====        | alter_op_test_fn_bool_real
+(1 row)
+
+-- validate that the final state of the operator is as we expect
+SELECT oprcanmerge, oprcanhash,
+       pg_describe_object('pg_operator'::regclass, oprcom, 0) AS commutator,
+       pg_describe_object('pg_operator'::regclass, oprnegate, 0) AS negator
+  FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+ oprcanmerge | oprcanhash |         commutator          |           negator            
+-------------+------------+-----------------------------+------------------------------
+ t           | t          | operator ====(real,boolean) | operator !====(boolean,real)
+(1 row)
+
+-- test no-op set 'succeeds'
+ALTER OPERATOR === (boolean, real) SET (NEGATOR = !====);
+ALTER OPERATOR === (boolean, real) SET (COMMUTATOR = ====);
+ALTER OPERATOR === (boolean, real) SET (HASHES);
+ALTER OPERATOR === (boolean, real) SET (MERGES);
+-- test cannot change commutator, negator, hashes, and merges when already set
+CREATE OPERATOR @= (
+    LEFTARG = real,
+    RIGHTARG = boolean,
+    PROCEDURE = alter_op_test_fn_real_bool
+);
+CREATE OPERATOR @!= (
+    LEFTARG = boolean,
+    RIGHTARG = real,
+    PROCEDURE = alter_op_test_fn_bool_real
+);
+ALTER OPERATOR === (boolean, real) SET (COMMUTATOR = @=);
+ERROR:  operator attribute "commutator" cannot be changed if it has already been set
+ALTER OPERATOR === (boolean, real) SET (NEGATOR = @!=);
+ERROR:  operator attribute "negator" cannot be changed if it has already been set
+ALTER OPERATOR === (boolean, real) SET (HASHES = false);
+ERROR:  operator attribute "hashes" cannot be disabled after it has been enabled
+ALTER OPERATOR === (boolean, real) SET (MERGES = false);
+ERROR:  operator attribute "merges" cannot be disabled after it has been enabled
+-- validate no changes made
+SELECT oprcanmerge, oprcanhash,
+       pg_describe_object('pg_operator'::regclass, oprcom, 0) AS commutator,
+       pg_describe_object('pg_operator'::regclass, oprnegate, 0) AS negator
+  FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+ oprcanmerge | oprcanhash |         commutator          |           negator            
+-------------+------------+-----------------------------+------------------------------
+ t           | t          | operator ====(real,boolean) | operator !====(boolean,real)
+(1 row)
+
+DROP OPERATOR @=(real, boolean);
+DROP OPERATOR @!=(boolean, real);
+--
+-- test setting commutator and negator to undefined or shell operator fails
+--
+DROP OPERATOR === (boolean, real);
+CREATE OPERATOR === (
+    LEFTARG = boolean,
+    RIGHTARG = real,
+    PROCEDURE = alter_op_test_fn_bool_real
+);
+-- test cannot set undefined
+ALTER OPERATOR === (boolean, real) SET (COMMUTATOR = ===@@@);
+ERROR:  operator ===@@@ does not exist
+ALTER OPERATOR === (boolean, real) SET (NEGATOR = !===@@@);
+ERROR:  operator !===@@@ does not exist
+-- create shell operators for ===@@@ and !===@@@ 
+CREATE OPERATOR ===@@ (
+    LEFTARG = boolean,
+    RIGHTARG = real,
+    PROCEDURE = alter_op_test_fn_bool_real,
+    COMMUTATOR = ===@@@,
+    NEGATOR = !===@@@
+);
+-- test cannot set shell operators
+ALTER OPERATOR === (boolean, real) SET (COMMUTATOR = ===@@@);
+ERROR:  operator ===@@@ is undefined shell operator
+ALTER OPERATOR === (boolean, real) SET (NEGATOR = !===@@@);
+ERROR:  operator !===@@@ is undefined shell operator
+-- validate no changes made
+SELECT oprcanmerge, oprcanhash,
+       pg_describe_object('pg_operator'::regclass, oprcom, 0) AS commutator,
+       pg_describe_object('pg_operator'::regclass, oprnegate, 0) AS negator
+  FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+ oprcanmerge | oprcanhash | commutator | negator 
+-------------+------------+------------+---------
+ f           | f          |            | 
+(1 row)
+
+DROP OPERATOR ===@@(boolean, real);
+DROP OPERATOR ===@@@(real, boolean);
+DROP OPERATOR !===@@@(boolean, real);
+--
+-- test setting self commutator
+--
+DROP OPERATOR === (boolean, boolean);
+CREATE OPERATOR === (
+    LEFTARG = boolean,
+    RIGHTARG = boolean,
+    PROCEDURE = alter_op_test_fn
+);
+ALTER OPERATOR === (boolean, boolean) SET (COMMUTATOR = ===);
+-- validate that the oprcom is the operator oid
+SELECT oprname FROM pg_operator 
+  WHERE oprname = '===' AND oid = oprcom
+  AND oprleft = 'boolean'::regtype AND oprright = 'boolean'::regtype;
+ oprname 
+---------
+ ===
+(1 row)
+
+--
+-- Clean up
+--
 DROP USER regress_alter_op_user;
 DROP OPERATOR === (boolean, boolean);
+DROP OPERATOR === (boolean, real);
+DROP OPERATOR ==== (real, boolean);
+DROP OPERATOR !==== (boolean, real);
 DROP FUNCTION customcontsel(internal, oid, internal, integer);
 DROP FUNCTION alter_op_test_fn(boolean, boolean);
+DROP FUNCTION alter_op_test_fn_bool_real(boolean, real);
+DROP FUNCTION alter_op_test_fn_real_bool(real, boolean);
diff --git a/src/test/regress/sql/alter_operator.sql b/src/test/regress/sql/alter_operator.sql
index fd40370165..9a3b3f0cde 100644
--- a/src/test/regress/sql/alter_operator.sql
+++ b/src/test/regress/sql/alter_operator.sql
@@ -22,7 +22,7 @@ WHERE classid = 'pg_operator'::regclass AND
 ORDER BY 1;
 
 --
--- Reset and set params
+-- test reset and set restrict and join
 --
 
 ALTER OPERATOR === (boolean, boolean) SET (RESTRICT = NONE);
@@ -71,30 +71,243 @@ WHERE classid = 'pg_operator'::regclass AND
       objid = '===(bool,bool)'::regoperator
 ORDER BY 1;
 
---
--- Test invalid options.
---
-ALTER OPERATOR === (boolean, boolean) SET (COMMUTATOR = ====);
-ALTER OPERATOR === (boolean, boolean) SET (NEGATOR = ====);
+-- test cannot set non existant function
 ALTER OPERATOR === (boolean, boolean) SET (RESTRICT = non_existent_func);
 ALTER OPERATOR === (boolean, boolean) SET (JOIN = non_existent_func);
-ALTER OPERATOR === (boolean, boolean) SET (COMMUTATOR = !==);
-ALTER OPERATOR === (boolean, boolean) SET (NEGATOR = !==);
 
--- invalid: non-lowercase quoted identifiers
+-- test non-lowercase quoted identifiers invalid
 ALTER OPERATOR & (bit, bit) SET ("Restrict" = _int_contsel, "Join" = _int_contjoinsel);
 
 --
--- Test permission check. Must be owner to ALTER OPERATOR.
+-- test must be owner of operator to ALTER OPERATOR.
 --
 CREATE USER regress_alter_op_user;
 SET SESSION AUTHORIZATION regress_alter_op_user;
 
 ALTER OPERATOR === (boolean, boolean) SET (RESTRICT = NONE);
 
--- Clean up
 RESET SESSION AUTHORIZATION;
+
+--
+-- test set commutator, negator, hashes, and merges which can only be set if not
+-- already set
+--
+
+-- for these tests create operators with different left and right types so that
+-- we can validate that function signatures are handled correctly
+
+CREATE FUNCTION alter_op_test_fn_bool_real(boolean, real)
+RETURNS boolean AS $$ SELECT NULL::BOOLEAN; $$ LANGUAGE sql IMMUTABLE;
+
+CREATE FUNCTION alter_op_test_fn_real_bool(real, boolean)
+RETURNS boolean AS $$ SELECT NULL::BOOLEAN; $$ LANGUAGE sql IMMUTABLE;
+
+-- operator
+CREATE OPERATOR === (
+    LEFTARG = boolean,
+    RIGHTARG = real,
+    PROCEDURE = alter_op_test_fn_bool_real
+);
+
+-- commutator
+CREATE OPERATOR ==== (
+    LEFTARG = real,
+    RIGHTARG = boolean,
+    PROCEDURE = alter_op_test_fn_real_bool
+);
+
+-- negator
+CREATE OPERATOR !==== (
+    LEFTARG = boolean,
+    RIGHTARG = real,
+    PROCEDURE = alter_op_test_fn_bool_real
+);
+
+-- test no-op setting already false hashes and merges to false works
+ALTER OPERATOR === (boolean, real) SET (HASHES = false);
+ALTER OPERATOR === (boolean, real) SET (MERGES = false);
+
+-- validate still false after no-op
+SELECT oprcanhash, oprcanmerge FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+
+-- test cannot set commutator or negator without owning them
+SET SESSION AUTHORIZATION regress_alter_op_user;
+
+-- we need need a new operator owned by regress_alter_op_user so that we are
+-- allowed to alter it. ==== and !==== are owned by the test user so we expect
+-- the alters below to fail.
+CREATE OPERATOR ===@@@ (
+    LEFTARG = boolean,
+    RIGHTARG = real,
+    PROCEDURE = alter_op_test_fn_bool_real
+);
+
+ALTER OPERATOR ===@@@ (boolean, real) SET (COMMUTATOR= ====);
+ALTER OPERATOR ===@@@ (boolean, real) SET (NEGATOR = !====);
+
+-- validate operator is unchanged and commutator and negator are unset
+SELECT oprcom, oprnegate FROM pg_operator WHERE oprname = '===@@@'
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+
+DROP OPERATOR ===@@@ (boolean, real);
+
+RESET SESSION AUTHORIZATION;
+
+-- test cannot set self negator
+ALTER OPERATOR === (boolean, real) SET (NEGATOR = ===);
+
+-- validate no changes made
+SELECT oprcanmerge, oprcanhash, oprcom, oprnegate
+FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+
+-- test set hashes
+ALTER OPERATOR === (boolean, real) SET (HASHES);
+SELECT oprcanhash FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+
+-- test set merges
+ALTER OPERATOR === (boolean, real) SET (MERGES);
+SELECT oprcanmerge FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+
+-- test set commutator
+ALTER OPERATOR === (boolean, real) SET (COMMUTATOR = ====);
+
+-- validate that the commutator has been set on both the operator and commutator,
+-- that they reference each other, and that the operator used is the existing
+-- one we created and not a new shell operator
+SELECT op.oprname AS operator_name, com.oprname AS commutator_name,
+  com.oprcode AS commutator_func
+  FROM pg_operator op
+  INNER JOIN pg_operator com ON (op.oid = com.oprcom AND op.oprcom = com.oid)
+  WHERE op.oprname = '==='
+  AND op.oprleft = 'boolean'::regtype AND op.oprright = 'real'::regtype;
+
+-- test set negator
+ALTER OPERATOR === (boolean, real) SET (NEGATOR = !====);
+
+-- validate that the negator has been set on both the operator and negator, that
+-- they reference each other, and that the operator used is the existing one we
+-- created and not a new shell operator
+SELECT op.oprname AS operator_name, neg.oprname AS negator_name,
+  neg.oprcode AS negator_func
+  FROM pg_operator op
+  INNER JOIN pg_operator neg ON (op.oid = neg.oprnegate AND op.oprnegate = neg.oid)
+  WHERE op.oprname = '==='
+  AND op.oprleft = 'boolean'::regtype AND op.oprright = 'real'::regtype;
+
+-- validate that the final state of the operator is as we expect
+SELECT oprcanmerge, oprcanhash,
+       pg_describe_object('pg_operator'::regclass, oprcom, 0) AS commutator,
+       pg_describe_object('pg_operator'::regclass, oprnegate, 0) AS negator
+  FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+
+-- test no-op set 'succeeds'
+ALTER OPERATOR === (boolean, real) SET (NEGATOR = !====);
+ALTER OPERATOR === (boolean, real) SET (COMMUTATOR = ====);
+ALTER OPERATOR === (boolean, real) SET (HASHES);
+ALTER OPERATOR === (boolean, real) SET (MERGES);
+
+-- test cannot change commutator, negator, hashes, and merges when already set
+
+CREATE OPERATOR @= (
+    LEFTARG = real,
+    RIGHTARG = boolean,
+    PROCEDURE = alter_op_test_fn_real_bool
+);
+CREATE OPERATOR @!= (
+    LEFTARG = boolean,
+    RIGHTARG = real,
+    PROCEDURE = alter_op_test_fn_bool_real
+);
+
+ALTER OPERATOR === (boolean, real) SET (COMMUTATOR = @=);
+ALTER OPERATOR === (boolean, real) SET (NEGATOR = @!=);
+ALTER OPERATOR === (boolean, real) SET (HASHES = false);
+ALTER OPERATOR === (boolean, real) SET (MERGES = false);
+
+-- validate no changes made
+SELECT oprcanmerge, oprcanhash,
+       pg_describe_object('pg_operator'::regclass, oprcom, 0) AS commutator,
+       pg_describe_object('pg_operator'::regclass, oprnegate, 0) AS negator
+  FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+
+DROP OPERATOR @=(real, boolean);
+DROP OPERATOR @!=(boolean, real);
+
+--
+-- test setting commutator and negator to undefined or shell operator fails
+--
+
+DROP OPERATOR === (boolean, real);
+CREATE OPERATOR === (
+    LEFTARG = boolean,
+    RIGHTARG = real,
+    PROCEDURE = alter_op_test_fn_bool_real
+);
+
+-- test cannot set undefined
+ALTER OPERATOR === (boolean, real) SET (COMMUTATOR = ===@@@);
+ALTER OPERATOR === (boolean, real) SET (NEGATOR = !===@@@);
+
+-- create shell operators for ===@@@ and !===@@@ 
+CREATE OPERATOR ===@@ (
+    LEFTARG = boolean,
+    RIGHTARG = real,
+    PROCEDURE = alter_op_test_fn_bool_real,
+    COMMUTATOR = ===@@@,
+    NEGATOR = !===@@@
+);
+
+-- test cannot set shell operators
+ALTER OPERATOR === (boolean, real) SET (COMMUTATOR = ===@@@);
+ALTER OPERATOR === (boolean, real) SET (NEGATOR = !===@@@);
+
+-- validate no changes made
+SELECT oprcanmerge, oprcanhash,
+       pg_describe_object('pg_operator'::regclass, oprcom, 0) AS commutator,
+       pg_describe_object('pg_operator'::regclass, oprnegate, 0) AS negator
+  FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+
+DROP OPERATOR ===@@(boolean, real);
+DROP OPERATOR ===@@@(real, boolean);
+DROP OPERATOR !===@@@(boolean, real);
+
+--
+-- test setting self commutator
+--
+
+DROP OPERATOR === (boolean, boolean);
+CREATE OPERATOR === (
+    LEFTARG = boolean,
+    RIGHTARG = boolean,
+    PROCEDURE = alter_op_test_fn
+);
+
+ALTER OPERATOR === (boolean, boolean) SET (COMMUTATOR = ===);
+
+-- validate that the oprcom is the operator oid
+SELECT oprname FROM pg_operator 
+  WHERE oprname = '===' AND oid = oprcom
+  AND oprleft = 'boolean'::regtype AND oprright = 'boolean'::regtype;
+
+--
+-- Clean up
+--
+
 DROP USER regress_alter_op_user;
+
 DROP OPERATOR === (boolean, boolean);
+DROP OPERATOR === (boolean, real);
+DROP OPERATOR ==== (real, boolean);
+DROP OPERATOR !==== (boolean, real);
+
 DROP FUNCTION customcontsel(internal, oid, internal, integer);
 DROP FUNCTION alter_op_test_fn(boolean, boolean);
+DROP FUNCTION alter_op_test_fn_bool_real(boolean, real);
+DROP FUNCTION alter_op_test_fn_real_bool(real, boolean);
0002-refactor-v3.patchapplication/octet-stream; name=0002-refactor-v3.patchDownload
diff --git a/src/backend/catalog/pg_operator.c b/src/backend/catalog/pg_operator.c
index b0570c33b6..cda543a03b 100644
--- a/src/backend/catalog/pg_operator.c
+++ b/src/backend/catalog/pg_operator.c
@@ -360,53 +360,17 @@ OperatorCreate(const char *operatorName,
 				 errmsg("\"%s\" is not a valid operator name",
 						operatorName)));
 
-	if (!(OidIsValid(leftTypeId) && OidIsValid(rightTypeId)))
-	{
-		/* If it's not a binary op, these things mustn't be set: */
-		if (commutatorName)
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only binary operators can have commutators")));
-		if (OidIsValid(joinId))
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only binary operators can have join selectivity")));
-		if (canMerge)
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only binary operators can merge join")));
-		if (canHash)
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only binary operators can hash")));
-	}
-
 	operResultType = get_func_rettype(procedureId);
 
-	if (operResultType != BOOLOID)
-	{
-		/* If it's not a boolean op, these things mustn't be set: */
-		if (negatorName)
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only boolean operators can have negators")));
-		if (OidIsValid(restrictionId))
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only boolean operators can have restriction selectivity")));
-		if (OidIsValid(joinId))
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only boolean operators can have join selectivity")));
-		if (canMerge)
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only boolean operators can merge join")));
-		if (canHash)
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only boolean operators can hash")));
-	}
+	OperatorValidateParams(leftTypeId,
+						   rightTypeId,
+						   operResultType,
+						   commutatorName,
+						   negatorName,
+						   OidIsValid(joinId),
+						   OidIsValid(restrictionId),
+						   canMerge,
+						   canHash);
 
 	operatorObjectId = OperatorGet(operatorName,
 								   operatorNamespace,
@@ -763,6 +727,78 @@ OperatorUpd(Oid baseId, Oid commId, Oid negId, bool isDelete)
 	table_close(pg_operator_desc, RowExclusiveLock);
 }
 
+/*
+ * OperatorValidateParams
+ *
+ * This function validates that an operator with arguments of leftTypeId and
+ * rightTypeId returning operResultType can have the attributes that are set
+ * to true. If any attributes set are not compatible with the operator then this
+ * function will raise an error, otherwise it simply returns.
+ *
+ * This function doesn't check for missing requirements so any attribute that's
+ * false is considered valid by this function. As such, if you are modifying an
+ * operator you can set any attributes you are not modifying to false to validate
+ * the changes you are making.
+ */
+
+void
+OperatorValidateParams(Oid leftTypeId,
+					   Oid rightTypeId,
+					   Oid operResultType,
+					   bool hasCommutator,
+					   bool hasNegator,
+					   bool hasJoinSelectivity,
+					   bool hasRestrictionSelectivity,
+					   bool canMerge,
+					   bool canHash)
+{
+	if (!(OidIsValid(leftTypeId) && OidIsValid(rightTypeId)))
+	{
+		/* If it's not a binary op, these things mustn't be set: */
+		if (hasCommutator)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+					 errmsg("only binary operators can have commutators")));
+		if (hasJoinSelectivity)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+					 errmsg("only binary operators can have join selectivity")));
+		if (canMerge)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+					 errmsg("only binary operators can merge join")));
+		if (canHash)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+					 errmsg("only binary operators can hash")));
+	}
+
+	if (operResultType != BOOLOID)
+	{
+		/* If it's not a boolean op, these things mustn't be set: */
+		if (hasNegator)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+					 errmsg("only boolean operators can have negators")));
+		if (hasRestrictionSelectivity)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+					 errmsg("only boolean operators can have restriction selectivity")));
+		if (hasJoinSelectivity)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+					 errmsg("only boolean operators can have join selectivity")));
+		if (canMerge)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+					 errmsg("only boolean operators can merge join")));
+		if (canHash)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+					 errmsg("only boolean operators can hash")));
+	}
+}
+
 /*
  * Create dependencies for an operator (either a freshly inserted
  * complete operator, a new shell operator, a just-updated shell,
diff --git a/src/include/catalog/pg_operator.h b/src/include/catalog/pg_operator.h
index aff372b4bb..a670973ab2 100644
--- a/src/include/catalog/pg_operator.h
+++ b/src/include/catalog/pg_operator.h
@@ -104,4 +104,14 @@ extern ObjectAddress makeOperatorDependencies(HeapTuple tuple,
 
 extern void OperatorUpd(Oid baseId, Oid commId, Oid negId, bool isDelete);
 
+extern void OperatorValidateParams(Oid leftTypeId,
+								   Oid rightTypeId,
+								   Oid operResultType,
+								   bool hasCommutator,
+								   bool hasNegator,
+								   bool hasJoinSelectivity,
+								   bool hasRestrictionSelectivity,
+								   bool canMerge,
+								   bool canHash);
+
 #endif							/* PG_OPERATOR_H */
#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tommy Pavlicek (#11)
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges

Tommy Pavlicek <tommypav122@gmail.com> writes:

I did notice one further potential bug. When creating an operator and
adding a commutator, PostgreSQL only links the commutator back to the
operator if the commutator has no commutator of its own, but the
create operation succeeds regardless of whether this linkage happens.

In other words, if A and B are a pair of commutators and one creates
another operator, C, with A as its commutator, then C will link to A,
but A will still link to B (and B to A). It's not clear to me if this
in itself is a problem, but unless I've misunderstood something
operator C must be the same as B so it implies an error by the user
and there could also be issues if A was deleted since C would continue
to refer to the deleted A.

Yeah, it'd make sense to tighten that up. Per the discussion so far,
we should not allow an operator's commutator/negator links to change
once set, so overwriting the existing link with a different value
would be wrong. But allowing creation of the new operator to proceed
with a different outcome than expected isn't good either. I think
we should start throwing an error for that.

regards, tom lane

#13Tommy Pavlicek
tommypav122@gmail.com
In reply to: Tom Lane (#12)
4 attachment(s)
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges

On Tue, Oct 10, 2023 at 9:32 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Tommy Pavlicek <tommypav122@gmail.com> writes:

I did notice one further potential bug. When creating an operator and
adding a commutator, PostgreSQL only links the commutator back to the
operator if the commutator has no commutator of its own, but the
create operation succeeds regardless of whether this linkage happens.

In other words, if A and B are a pair of commutators and one creates
another operator, C, with A as its commutator, then C will link to A,
but A will still link to B (and B to A). It's not clear to me if this
in itself is a problem, but unless I've misunderstood something
operator C must be the same as B so it implies an error by the user
and there could also be issues if A was deleted since C would continue
to refer to the deleted A.

Yeah, it'd make sense to tighten that up. Per the discussion so far,
we should not allow an operator's commutator/negator links to change
once set, so overwriting the existing link with a different value
would be wrong. But allowing creation of the new operator to proceed
with a different outcome than expected isn't good either. I think
we should start throwing an error for that.

regards, tom lane

Thanks.

I've added another patch (0002-require_unused_neg_com-v1.patch) that
prevents using a commutator or negator that's already part of a pair.
The only other changes from my email yesterday are that in the ALTER
command I moved the post alter hook to after OperatorUpd and the
addition of tests to verify that we can't use an existing commutator
or negator with the ALTER command.

I believe this can all be looked at again.

Cheers,
Tommy

Attachments:

0002-require_unused_neg_com-v1.patchapplication/octet-stream; name=0002-require_unused_neg_com-v1.patchDownload
diff --git a/src/backend/catalog/pg_operator.c b/src/backend/catalog/pg_operator.c
index b0570c33b6..15128235d4 100644
--- a/src/backend/catalog/pg_operator.c
+++ b/src/backend/catalog/pg_operator.c
@@ -557,11 +557,6 @@ OperatorCreate(const char *operatorName,
 	/* Add dependencies for the entry */
 	address = makeOperatorDependencies(tup, true, isUpdate);
 
-	/* Post creation hook for new operator */
-	InvokeObjectPostCreateHook(OperatorRelationId, operatorObjectId, 0);
-
-	table_close(pg_operator_desc, RowExclusiveLock);
-
 	/*
 	 * If a commutator and/or negator link is provided, update the other
 	 * operator(s) to point at this one, if they don't already have a link.
@@ -579,6 +574,11 @@ OperatorCreate(const char *operatorName,
 	if (OidIsValid(commutatorId) || OidIsValid(negatorId))
 		OperatorUpd(operatorObjectId, commutatorId, negatorId, false);
 
+	/* Post creation hook for new operator */
+	InvokeObjectPostCreateHook(OperatorRelationId, operatorObjectId, 0);
+
+	table_close(pg_operator_desc, RowExclusiveLock);
+
 	return address;
 }
 
@@ -692,8 +692,28 @@ OperatorUpd(Oid baseId, Oid commId, Oid negId, bool isDelete)
 			t->oprcom = InvalidOid;
 			update_commutator = true;
 		}
-		else if (!isDelete && !OidIsValid(t->oprcom))
+		else if (!isDelete && baseId != t->oprcom)
 		{
+			/*
+			 * When filling in a shell operator the commutator's oprcom field
+			 * will already be set to the baseId so we don't need to make an
+			 * update in that case.
+			 */
+
+			if (OidIsValid(t->oprcom))
+			{
+				/*
+				 * When commutator's oprcom is already filled in that means
+				 * that it's part of a pair of commutators and by definition
+				 * the existing oprcom must be equivalent to the baseId
+				 * operator. It doesn't make sense to allow the baseId alter
+				 * or create to succeed in this case so raise an error.
+				 */
+				ereport(ERROR,
+						(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+						 errmsg("commutator commutates another operator")));
+			}
+
 			t->oprcom = baseId;
 			update_commutator = true;
 		}
@@ -737,8 +757,28 @@ OperatorUpd(Oid baseId, Oid commId, Oid negId, bool isDelete)
 			t->oprnegate = InvalidOid;
 			update_negator = true;
 		}
-		else if (!isDelete && !OidIsValid(t->oprnegate))
+		else if (!isDelete && baseId != t->oprnegate)
 		{
+			/*
+			 * When filling in a shell operator the negator's oprnegate field
+			 * will already be set to the baseId so we don't need to make an
+			 * update in that case.
+			 */
+
+			if (OidIsValid(t->oprnegate))
+			{
+				/*
+				 * When negator's oprnegate is already filled in that means
+				 * that it's part of a pair of negators and by definition the
+				 * existing oprnegate must be equivalent to the baseId
+				 * operator. It doesn't make sense to allow the baseId alter
+				 * or create to succeed in this case so raise an error.
+				 */
+				ereport(ERROR,
+						(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+						 errmsg("negator negates another operator")));
+			}
+
 			t->oprnegate = baseId;
 			update_negator = true;
 		}
diff --git a/src/test/regress/expected/create_operator.out b/src/test/regress/expected/create_operator.out
index 49070ce90f..600699f407 100644
--- a/src/test/regress/expected/create_operator.out
+++ b/src/test/regress/expected/create_operator.out
@@ -296,6 +296,52 @@ CREATE OPERATOR ===!!! (
 );
 ERROR:  operator cannot be its own negator
 ROLLBACK;
+-- test that we can't use part of an existing commutator or negator pair as a
+-- commutator or negator
+CREATE FUNCTION create_op_test_fn(boolean, boolean)
+RETURNS boolean AS $$
+    SELECT NULL::BOOLEAN;
+$$ LANGUAGE sql IMMUTABLE;
+CREATE OPERATOR === (
+    leftarg = boolean,
+    rightarg = boolean,
+    procedure = create_op_test_fn,
+    commutator = ===@@@,
+    negator = ===!!!
+);
+CREATE OPERATOR ===!!! (
+    leftarg = boolean,
+    rightarg = boolean,
+    procedure = create_op_test_fn,
+    negator = ===
+);
+CREATE OPERATOR ===@@@ (
+    leftarg = boolean,
+    rightarg = boolean,
+    procedure = create_op_test_fn,
+    commutator = ===
+);
+-- test trying to use === as a commutator fails
+CREATE OPERATOR ===@ (
+    leftarg = boolean,
+    rightarg = boolean,
+    procedure = create_op_test_fn,
+    commutator = ===
+);
+ERROR:  commutator commutates another operator
+-- test trying to use === as a negator fails
+CREATE OPERATOR ===@ (
+    leftarg = boolean,
+    rightarg = boolean,
+    procedure = create_op_test_fn,
+    negator = ===
+);
+ERROR:  negator negates another operator
+-- cleanup
+DROP OPERATOR ===@@@(bool, bool);
+DROP OPERATOR ===!!!(bool, bool);
+DROP OPERATOR ===(bool, bool);
+DROP FUNCTION create_op_test_fn;
 -- invalid: non-lowercase quoted identifiers
 CREATE OPERATOR ===
 (
diff --git a/src/test/regress/sql/create_operator.sql b/src/test/regress/sql/create_operator.sql
index 168cad3814..03e2157a57 100644
--- a/src/test/regress/sql/create_operator.sql
+++ b/src/test/regress/sql/create_operator.sql
@@ -246,6 +246,51 @@ CREATE OPERATOR ===!!! (
 );
 ROLLBACK;
 
+-- test that we can't use part of an existing commutator or negator pair as a
+-- commutator or negator
+CREATE FUNCTION create_op_test_fn(boolean, boolean)
+RETURNS boolean AS $$
+    SELECT NULL::BOOLEAN;
+$$ LANGUAGE sql IMMUTABLE;
+CREATE OPERATOR === (
+    leftarg = boolean,
+    rightarg = boolean,
+    procedure = create_op_test_fn,
+    commutator = ===@@@,
+    negator = ===!!!
+);
+CREATE OPERATOR ===!!! (
+    leftarg = boolean,
+    rightarg = boolean,
+    procedure = create_op_test_fn,
+    negator = ===
+);
+CREATE OPERATOR ===@@@ (
+    leftarg = boolean,
+    rightarg = boolean,
+    procedure = create_op_test_fn,
+    commutator = ===
+);
+-- test trying to use === as a commutator fails
+CREATE OPERATOR ===@ (
+    leftarg = boolean,
+    rightarg = boolean,
+    procedure = create_op_test_fn,
+    commutator = ===
+);
+-- test trying to use === as a negator fails
+CREATE OPERATOR ===@ (
+    leftarg = boolean,
+    rightarg = boolean,
+    procedure = create_op_test_fn,
+    negator = ===
+);
+-- cleanup
+DROP OPERATOR ===@@@(bool, bool);
+DROP OPERATOR ===!!!(bool, bool);
+DROP OPERATOR ===(bool, bool);
+DROP FUNCTION create_op_test_fn;
+
 -- invalid: non-lowercase quoted identifiers
 CREATE OPERATOR ===
 (
0003-refactor-v3.patchapplication/octet-stream; name=0003-refactor-v3.patchDownload
diff --git a/src/backend/catalog/pg_operator.c b/src/backend/catalog/pg_operator.c
index b0570c33b6..cda543a03b 100644
--- a/src/backend/catalog/pg_operator.c
+++ b/src/backend/catalog/pg_operator.c
@@ -360,53 +360,17 @@ OperatorCreate(const char *operatorName,
 				 errmsg("\"%s\" is not a valid operator name",
 						operatorName)));
 
-	if (!(OidIsValid(leftTypeId) && OidIsValid(rightTypeId)))
-	{
-		/* If it's not a binary op, these things mustn't be set: */
-		if (commutatorName)
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only binary operators can have commutators")));
-		if (OidIsValid(joinId))
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only binary operators can have join selectivity")));
-		if (canMerge)
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only binary operators can merge join")));
-		if (canHash)
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only binary operators can hash")));
-	}
-
 	operResultType = get_func_rettype(procedureId);
 
-	if (operResultType != BOOLOID)
-	{
-		/* If it's not a boolean op, these things mustn't be set: */
-		if (negatorName)
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only boolean operators can have negators")));
-		if (OidIsValid(restrictionId))
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only boolean operators can have restriction selectivity")));
-		if (OidIsValid(joinId))
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only boolean operators can have join selectivity")));
-		if (canMerge)
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only boolean operators can merge join")));
-		if (canHash)
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only boolean operators can hash")));
-	}
+	OperatorValidateParams(leftTypeId,
+						   rightTypeId,
+						   operResultType,
+						   commutatorName,
+						   negatorName,
+						   OidIsValid(joinId),
+						   OidIsValid(restrictionId),
+						   canMerge,
+						   canHash);
 
 	operatorObjectId = OperatorGet(operatorName,
 								   operatorNamespace,
@@ -763,6 +727,78 @@ OperatorUpd(Oid baseId, Oid commId, Oid negId, bool isDelete)
 	table_close(pg_operator_desc, RowExclusiveLock);
 }
 
+/*
+ * OperatorValidateParams
+ *
+ * This function validates that an operator with arguments of leftTypeId and
+ * rightTypeId returning operResultType can have the attributes that are set
+ * to true. If any attributes set are not compatible with the operator then this
+ * function will raise an error, otherwise it simply returns.
+ *
+ * This function doesn't check for missing requirements so any attribute that's
+ * false is considered valid by this function. As such, if you are modifying an
+ * operator you can set any attributes you are not modifying to false to validate
+ * the changes you are making.
+ */
+
+void
+OperatorValidateParams(Oid leftTypeId,
+					   Oid rightTypeId,
+					   Oid operResultType,
+					   bool hasCommutator,
+					   bool hasNegator,
+					   bool hasJoinSelectivity,
+					   bool hasRestrictionSelectivity,
+					   bool canMerge,
+					   bool canHash)
+{
+	if (!(OidIsValid(leftTypeId) && OidIsValid(rightTypeId)))
+	{
+		/* If it's not a binary op, these things mustn't be set: */
+		if (hasCommutator)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+					 errmsg("only binary operators can have commutators")));
+		if (hasJoinSelectivity)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+					 errmsg("only binary operators can have join selectivity")));
+		if (canMerge)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+					 errmsg("only binary operators can merge join")));
+		if (canHash)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+					 errmsg("only binary operators can hash")));
+	}
+
+	if (operResultType != BOOLOID)
+	{
+		/* If it's not a boolean op, these things mustn't be set: */
+		if (hasNegator)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+					 errmsg("only boolean operators can have negators")));
+		if (hasRestrictionSelectivity)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+					 errmsg("only boolean operators can have restriction selectivity")));
+		if (hasJoinSelectivity)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+					 errmsg("only boolean operators can have join selectivity")));
+		if (canMerge)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+					 errmsg("only boolean operators can merge join")));
+		if (canHash)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+					 errmsg("only boolean operators can hash")));
+	}
+}
+
 /*
  * Create dependencies for an operator (either a freshly inserted
  * complete operator, a new shell operator, a just-updated shell,
diff --git a/src/include/catalog/pg_operator.h b/src/include/catalog/pg_operator.h
index aff372b4bb..a670973ab2 100644
--- a/src/include/catalog/pg_operator.h
+++ b/src/include/catalog/pg_operator.h
@@ -104,4 +104,14 @@ extern ObjectAddress makeOperatorDependencies(HeapTuple tuple,
 
 extern void OperatorUpd(Oid baseId, Oid commId, Oid negId, bool isDelete);
 
+extern void OperatorValidateParams(Oid leftTypeId,
+								   Oid rightTypeId,
+								   Oid operResultType,
+								   bool hasCommutator,
+								   bool hasNegator,
+								   bool hasJoinSelectivity,
+								   bool hasRestrictionSelectivity,
+								   bool canMerge,
+								   bool canHash);
+
 #endif							/* PG_OPERATOR_H */
0001-bug-fixes-v3.patchapplication/octet-stream; name=0001-bug-fixes-v3.patchDownload
diff --git a/src/backend/catalog/pg_operator.c b/src/backend/catalog/pg_operator.c
index 95918a77a1..b0570c33b6 100644
--- a/src/backend/catalog/pg_operator.c
+++ b/src/backend/catalog/pg_operator.c
@@ -57,8 +57,7 @@ static Oid	OperatorShellMake(const char *operatorName,
 static Oid	get_other_operator(List *otherOp,
 							   Oid otherLeftTypeId, Oid otherRightTypeId,
 							   const char *operatorName, Oid operatorNamespace,
-							   Oid leftTypeId, Oid rightTypeId,
-							   bool isCommutator);
+							   Oid leftTypeId, Oid rightTypeId);
 
 
 /*
@@ -442,8 +441,7 @@ OperatorCreate(const char *operatorName,
 		commutatorId = get_other_operator(commutatorName,
 										  rightTypeId, leftTypeId,
 										  operatorName, operatorNamespace,
-										  leftTypeId, rightTypeId,
-										  true);
+										  leftTypeId, rightTypeId);
 
 		/* Permission check: must own other operator */
 		if (OidIsValid(commutatorId) &&
@@ -467,8 +465,19 @@ OperatorCreate(const char *operatorName,
 		negatorId = get_other_operator(negatorName,
 									   leftTypeId, rightTypeId,
 									   operatorName, operatorNamespace,
-									   leftTypeId, rightTypeId,
-									   false);
+									   leftTypeId, rightTypeId);
+
+		/*
+		 * Prevent self negation as it doesn't make sense. When self negating,
+		 * get_other_operator returns InvalidOid when creating a new operator
+		 * and the OID of the operator we are 'creating' when upgrading a
+		 * shell operator.
+		 */
+
+		if (!OidIsValid(negatorId) || operatorObjectId == negatorId)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+					 errmsg("operator cannot be its own negator")));
 
 		/* Permission check: must own other operator */
 		if (OidIsValid(negatorId) &&
@@ -584,7 +593,7 @@ OperatorCreate(const char *operatorName,
 static Oid
 get_other_operator(List *otherOp, Oid otherLeftTypeId, Oid otherRightTypeId,
 				   const char *operatorName, Oid operatorNamespace,
-				   Oid leftTypeId, Oid rightTypeId, bool isCommutator)
+				   Oid leftTypeId, Oid rightTypeId)
 {
 	Oid			other_oid;
 	bool		otherDefined;
@@ -611,14 +620,7 @@ get_other_operator(List *otherOp, Oid otherLeftTypeId, Oid otherRightTypeId,
 		otherLeftTypeId == leftTypeId &&
 		otherRightTypeId == rightTypeId)
 	{
-		/*
-		 * self-linkage to this operator; caller will fix later. Note that
-		 * only self-linkage for commutation makes sense.
-		 */
-		if (!isCommutator)
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("operator cannot be its own negator or sort operator")));
+		/* self-linkage to this operator; caller will fix later. */
 		return InvalidOid;
 	}
 
diff --git a/src/test/regress/expected/create_operator.out b/src/test/regress/expected/create_operator.out
index f71b601f2d..49070ce90f 100644
--- a/src/test/regress/expected/create_operator.out
+++ b/src/test/regress/expected/create_operator.out
@@ -260,6 +260,42 @@ CREATE OPERATOR #*# (
 );
 ERROR:  permission denied for type type_op6
 ROLLBACK;
+-- Should fail. An operator cannot have a self negator.
+BEGIN TRANSACTION;
+CREATE FUNCTION create_op_test_fn(boolean, boolean)
+RETURNS boolean AS $$
+    SELECT NULL::BOOLEAN;
+$$ LANGUAGE sql IMMUTABLE;
+CREATE OPERATOR === (
+    leftarg = boolean,
+    rightarg = boolean,
+    procedure = create_op_test_fn,
+    negator = ===
+);
+ERROR:  operator cannot be its own negator
+ROLLBACK;
+-- Should fail. An operator cannot have a self negator. Here we test that when
+-- 'creating' an existing shell operator, it checks the negator is not self.
+BEGIN TRANSACTION;
+CREATE FUNCTION create_op_test_fn(boolean, boolean)
+RETURNS boolean AS $$
+    SELECT NULL::BOOLEAN;
+$$ LANGUAGE sql IMMUTABLE;
+-- create a shell operator for ===!!! by referencing it as a commutator
+CREATE OPERATOR === (
+    leftarg = boolean,
+    rightarg = boolean,
+    procedure = create_op_test_fn,
+    commutator = ===!!!
+);
+CREATE OPERATOR ===!!! (
+    leftarg = boolean,
+    rightarg = boolean,
+    procedure = create_op_test_fn,
+    negator = ===!!!
+);
+ERROR:  operator cannot be its own negator
+ROLLBACK;
 -- invalid: non-lowercase quoted identifiers
 CREATE OPERATOR ===
 (
diff --git a/src/test/regress/sql/create_operator.sql b/src/test/regress/sql/create_operator.sql
index f53e24db3c..168cad3814 100644
--- a/src/test/regress/sql/create_operator.sql
+++ b/src/test/regress/sql/create_operator.sql
@@ -210,6 +210,42 @@ CREATE OPERATOR #*# (
 );
 ROLLBACK;
 
+-- Should fail. An operator cannot have a self negator.
+BEGIN TRANSACTION;
+CREATE FUNCTION create_op_test_fn(boolean, boolean)
+RETURNS boolean AS $$
+    SELECT NULL::BOOLEAN;
+$$ LANGUAGE sql IMMUTABLE;
+CREATE OPERATOR === (
+    leftarg = boolean,
+    rightarg = boolean,
+    procedure = create_op_test_fn,
+    negator = ===
+);
+ROLLBACK;
+
+-- Should fail. An operator cannot have a self negator. Here we test that when
+-- 'creating' an existing shell operator, it checks the negator is not self.
+BEGIN TRANSACTION;
+CREATE FUNCTION create_op_test_fn(boolean, boolean)
+RETURNS boolean AS $$
+    SELECT NULL::BOOLEAN;
+$$ LANGUAGE sql IMMUTABLE;
+-- create a shell operator for ===!!! by referencing it as a commutator
+CREATE OPERATOR === (
+    leftarg = boolean,
+    rightarg = boolean,
+    procedure = create_op_test_fn,
+    commutator = ===!!!
+);
+CREATE OPERATOR ===!!! (
+    leftarg = boolean,
+    rightarg = boolean,
+    procedure = create_op_test_fn,
+    negator = ===!!!
+);
+ROLLBACK;
+
 -- invalid: non-lowercase quoted identifiers
 CREATE OPERATOR ===
 (
0004-alter_op-v4.patchapplication/octet-stream; name=0004-alter_op-v4.patchDownload
diff --git a/doc/src/sgml/ref/alter_operator.sgml b/doc/src/sgml/ref/alter_operator.sgml
index a4a1af564f..ce447e22b0 100644
--- a/doc/src/sgml/ref/alter_operator.sgml
+++ b/doc/src/sgml/ref/alter_operator.sgml
@@ -29,8 +29,12 @@ ALTER OPERATOR <replaceable>name</replaceable> ( { <replaceable>left_type</repla
 
 ALTER OPERATOR <replaceable>name</replaceable> ( { <replaceable>left_type</replaceable> | NONE } , <replaceable>right_type</replaceable> )
     SET ( {  RESTRICT = { <replaceable class="parameter">res_proc</replaceable> | NONE }
-           | JOIN = { <replaceable class="parameter">join_proc</replaceable> | NONE }
-         } [, ... ] )
+            | JOIN = { <replaceable class="parameter">join_proc</replaceable> | NONE }
+            | COMMUTATOR = <replaceable class="parameter">com_op</replaceable> 
+            | NEGATOR = <replaceable class="parameter">neg_op</replaceable>
+            | HASHES
+            | MERGES
+          } [, ... ] )
 </synopsis>
  </refsynopsisdiv>
 
@@ -121,9 +125,64 @@ ALTER OPERATOR <replaceable>name</replaceable> ( { <replaceable>left_type</repla
      </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><replaceable class="parameter">com_op</replaceable></term>
+    <listitem>
+     <para>
+      The commutator of this operator. Can only be set if the operator does not have an existing commutator.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
+    <term><replaceable class="parameter">neg_op</replaceable></term>
+    <listitem>
+     <para>
+      The negator of this operator. Can only be set if the operator does not have an existing negator.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
+    <term><literal>HASHES</literal></term>
+    <listitem>
+     <para>
+     Indicates this operator can support a hash join. Can only be enabled and not disabled.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
+    <term><literal>MERGES</literal></term>
+    <listitem>
+     <para>
+     Indicates this operator can support a merge join. Can only be enabled and not disabled.
+     </para>
+    </listitem>
+   </varlistentry>
+
   </variablelist>
  </refsect1>
 
+  <refsect1>
+  <title>Notes</title>
+
+  <para>
+   Refer to <xref linkend="xoper"/> and <xref linkend="xoper-optimization"/> for further information.
+  </para>
+
+  <para>
+  Since commutators come in pairs and are commutators of each other,
+  <productname>PostgreSQL</productname> will set the commutator of the
+  <replaceable class="parameter">com_op</replaceable> to the operator.
+  </para>
+
+  <para>
+  Like with commutators, <productname>PostgreSQL</productname> will set the
+  negator of the <replaceable class="parameter">neg_op</replaceable> to the operator.
+  </para>
+  </refsect1>
+
  <refsect1>
   <title>Examples</title>
 
diff --git a/doc/src/sgml/ref/create_operator.sgml b/doc/src/sgml/ref/create_operator.sgml
index e27512ff39..9801eb3ed0 100644
--- a/doc/src/sgml/ref/create_operator.sgml
+++ b/doc/src/sgml/ref/create_operator.sgml
@@ -231,9 +231,60 @@ COMMUTATOR = OPERATOR(myschema.===) ,
   <title>Notes</title>
 
   <para>
-   Refer to <xref linkend="xoper"/> for further information.
+   Refer to <xref linkend="xoper"/> and <xref linkend="xoper-optimization"/> for further information.
   </para>
 
+  <para>
+    When you are defining a self-commutative operator, you just do it.
+    When you are defining a pair of commutative operators, things are
+    a little trickier: how can the first one to be defined refer to the
+    other one, which you haven't defined yet?  There are three solutions
+    to this problem:
+
+    <itemizedlist>
+     <listitem>
+      <para>
+       One way is to omit the <literal>COMMUTATOR</literal> clause in the first operator that
+       you define, and then provide one in the second operator's definition.
+       Since <productname>PostgreSQL</productname> knows that commutative
+       operators come in pairs, when it sees the second definition it will
+       automatically go back and fill in the missing <literal>COMMUTATOR</literal> clause in
+       the first definition.
+      </para>
+     </listitem>
+
+     <listitem>
+      <para>
+       Another, more straightforward way is just to include <literal>COMMUTATOR</literal> clauses
+       in both definitions.  When <productname>PostgreSQL</productname> processes
+       the first definition and realizes that <literal>COMMUTATOR</literal> refers to a nonexistent
+       operator, the system will make a dummy entry for that operator in the
+       system catalog.  This dummy entry will have valid data only
+       for the operator name, left and right operand types, and result type,
+       since that's all that <productname>PostgreSQL</productname> can deduce
+       at this point.  The first operator's catalog entry will link to this
+       dummy entry.  Later, when you define the second operator, the system
+       updates the dummy entry with the additional information from the second
+       definition.  If you try to use the dummy operator before it's been filled
+       in, you'll just get an error message.
+      </para>
+     </listitem>
+
+     <listitem>
+      <para>
+      Alternatively, both operators can be defined without a <literal>COMMUTATOR</literal>
+      and then <command>ALTER OPERATOR</command> can be used to set the <literal>COMMUTATOR</literal>
+      for the operator.
+      </para>
+     </listitem>
+    </itemizedlist>
+   </para>
+
+   <para>
+    Pairs of negator operators can be defined using the same methods
+    explained above for commutator pairs.
+   </para>
+
   <para>
    It is not possible to specify an operator's lexical precedence in
    <command>CREATE OPERATOR</command>, because the parser's precedence behavior
diff --git a/doc/src/sgml/xoper.sgml b/doc/src/sgml/xoper.sgml
index a929ced07d..31bc8df3de 100644
--- a/doc/src/sgml/xoper.sgml
+++ b/doc/src/sgml/xoper.sgml
@@ -146,44 +146,6 @@ SELECT (a + b) AS c FROM test_complex;
      <literal>=</literal> operator must specify that it is valid, by marking the
      operator with commutator information.
     </para>
-
-    <para>
-     When you are defining a self-commutative operator, you just do it.
-     When you are defining a pair of commutative operators, things are
-     a little trickier: how can the first one to be defined refer to the
-     other one, which you haven't defined yet?  There are two solutions
-     to this problem:
-
-     <itemizedlist>
-      <listitem>
-       <para>
-        One way is to omit the <literal>COMMUTATOR</literal> clause in the first operator that
-        you define, and then provide one in the second operator's definition.
-        Since <productname>PostgreSQL</productname> knows that commutative
-        operators come in pairs, when it sees the second definition it will
-        automatically go back and fill in the missing <literal>COMMUTATOR</literal> clause in
-        the first definition.
-       </para>
-      </listitem>
-
-      <listitem>
-       <para>
-        The other, more straightforward way is just to include <literal>COMMUTATOR</literal> clauses
-        in both definitions.  When <productname>PostgreSQL</productname> processes
-        the first definition and realizes that <literal>COMMUTATOR</literal> refers to a nonexistent
-        operator, the system will make a dummy entry for that operator in the
-        system catalog.  This dummy entry will have valid data only
-        for the operator name, left and right operand types, and result type,
-        since that's all that <productname>PostgreSQL</productname> can deduce
-        at this point.  The first operator's catalog entry will link to this
-        dummy entry.  Later, when you define the second operator, the system
-        updates the dummy entry with the additional information from the second
-        definition.  If you try to use the dummy operator before it's been filled
-        in, you'll just get an error message.
-       </para>
-      </listitem>
-     </itemizedlist>
-    </para>
    </sect2>
 
    <sect2 id="xoper-negator">
@@ -218,12 +180,7 @@ SELECT (a + b) AS c FROM test_complex;
     <literal>NOT</literal> operations can be inserted as a consequence of other rearrangements.
    </para>
 
-   <para>
-    Pairs of negator operators can be defined using the same methods
-    explained above for commutator pairs.
-   </para>
-
-  </sect2>
+   </sect2>
 
   <sect2 id="xoper-restrict">
    <title><literal>RESTRICT</literal></title>
diff --git a/src/backend/catalog/pg_operator.c b/src/backend/catalog/pg_operator.c
index b75f7a9bc1..578c89b6f9 100644
--- a/src/backend/catalog/pg_operator.c
+++ b/src/backend/catalog/pg_operator.c
@@ -44,11 +44,6 @@ static Oid	OperatorGet(const char *operatorName,
 						Oid rightObjectId,
 						bool *defined);
 
-static Oid	OperatorLookup(List *operatorName,
-						   Oid leftObjectId,
-						   Oid rightObjectId,
-						   bool *defined);
-
 static Oid	OperatorShellMake(const char *operatorName,
 							  Oid operatorNamespace,
 							  Oid leftTypeId,
@@ -165,7 +160,7 @@ OperatorGet(const char *operatorName,
  *
  *		*defined is set true if defined (not a shell)
  */
-static Oid
+Oid
 OperatorLookup(List *operatorName,
 			   Oid leftObjectId,
 			   Oid rightObjectId,
diff --git a/src/backend/commands/operatorcmds.c b/src/backend/commands/operatorcmds.c
index cd7f83136f..6666bf476f 100644
--- a/src/backend/commands/operatorcmds.c
+++ b/src/backend/commands/operatorcmds.c
@@ -54,6 +54,9 @@
 
 static Oid	ValidateRestrictionEstimator(List *restrictionName);
 static Oid	ValidateJoinEstimator(List *joinName);
+static Oid	ValidateOperatorDefined(List *name,
+									Oid leftTypeId,
+									Oid rightTypeId);
 
 /*
  * DefineOperator
@@ -359,6 +362,44 @@ ValidateJoinEstimator(List *joinName)
 	return joinOid;
 }
 
+/*
+ * Looks up and returns an oid for an operator given a possibly-qualified name
+ * and left and right type IDs, verifies the operator is defined (non-shell),
+ * and owned by the current user. Raises an error otherwise.
+ */
+static Oid
+ValidateOperatorDefined(List *name,
+						Oid leftTypeId,
+						Oid rightTypeId)
+{
+	Oid			oid;
+	bool		defined;
+
+	oid = OperatorLookup(name,
+						 leftTypeId,
+						 rightTypeId,
+						 &defined);
+
+	if (!OidIsValid(oid))
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_FUNCTION),
+				 errmsg("operator %s does not exist",
+						NameListToString(name))));
+
+	if (!defined)
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_FUNCTION),
+				 errmsg("operator %s is undefined shell operator",
+						NameListToString(name))));
+
+	if (!object_ownercheck(OperatorRelationId, oid, GetUserId()))
+		aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_OPERATOR,
+					   NameListToString(name));
+
+	return oid;
+}
+
+
 /*
  * Guts of operator deletion.
  */
@@ -406,6 +447,8 @@ RemoveOperatorById(Oid operOid)
  *		routine implementing ALTER OPERATOR <operator> SET (option = ...).
  *
  * Currently, only RESTRICT and JOIN estimator functions can be changed.
+ * COMMUTATOR and NEGATOR, HASHES, and MERGES can be set if they have not been set
+ * previously.
  */
 ObjectAddress
 AlterOperator(AlterOperatorStmt *stmt)
@@ -426,6 +469,14 @@ AlterOperator(AlterOperatorStmt *stmt)
 	List	   *joinName = NIL; /* optional join sel. function */
 	bool		updateJoin = false;
 	Oid			joinOid;
+	List	   *commutatorName = NIL;	/* optional commutator operator name */
+	Oid			commutatorOid;
+	List	   *negatorName = NIL;	/* optional negator operator name */
+	Oid			negatorOid;
+	bool		updateHashes = false;
+	bool		canHash = false;
+	bool		updateMerges = false;
+	bool		canMerge = false;
 
 	/* Look up the operator */
 	oprId = LookupOperWithArgs(stmt->opername, false);
@@ -456,6 +507,24 @@ AlterOperator(AlterOperatorStmt *stmt)
 			joinName = param;
 			updateJoin = true;
 		}
+		else if (strcmp(defel->defname, "commutator") == 0)
+		{
+			commutatorName = defGetQualifiedName(defel);
+		}
+		else if (strcmp(defel->defname, "negator") == 0)
+		{
+			negatorName = defGetQualifiedName(defel);
+		}
+		else if (strcmp(defel->defname, "hashes") == 0)
+		{
+			canHash = defGetBoolean(defel);
+			updateHashes = true;
+		}
+		else if (strcmp(defel->defname, "merges") == 0)
+		{
+			canMerge = defGetBoolean(defel);
+			updateMerges = true;
+		}
 
 		/*
 		 * The rest of the options that CREATE accepts cannot be changed.
@@ -464,11 +533,7 @@ AlterOperator(AlterOperatorStmt *stmt)
 		else if (strcmp(defel->defname, "leftarg") == 0 ||
 				 strcmp(defel->defname, "rightarg") == 0 ||
 				 strcmp(defel->defname, "function") == 0 ||
-				 strcmp(defel->defname, "procedure") == 0 ||
-				 strcmp(defel->defname, "commutator") == 0 ||
-				 strcmp(defel->defname, "negator") == 0 ||
-				 strcmp(defel->defname, "hashes") == 0 ||
-				 strcmp(defel->defname, "merges") == 0)
+				 strcmp(defel->defname, "procedure") == 0)
 		{
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -488,7 +553,7 @@ AlterOperator(AlterOperatorStmt *stmt)
 					   NameStr(oprForm->oprname));
 
 	/*
-	 * Look up restriction and join estimators if specified
+	 * Look up Oid for any parameters specified
 	 */
 	if (restrictionName)
 		restrictionOid = ValidateRestrictionEstimator(restrictionName);
@@ -499,28 +564,85 @@ AlterOperator(AlterOperatorStmt *stmt)
 	else
 		joinOid = InvalidOid;
 
-	/* Perform additional checks, like OperatorCreate does */
-	if (!(OidIsValid(oprForm->oprleft) && OidIsValid(oprForm->oprright)))
+	if (commutatorName)
 	{
-		/* If it's not a binary op, these things mustn't be set: */
-		if (OidIsValid(joinOid))
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only binary operators can have join selectivity")));
+		/* commutator has reversed arg types */
+		commutatorOid = ValidateOperatorDefined(commutatorName,
+												oprForm->oprright,
+												oprForm->oprleft);
+
+		/*
+		 * we don't need to do anything extra for a self commutator as in
+		 * OperatorCreate as there we have to create the operator before
+		 * setting the commutator, but here the operator already exists and
+		 * the commutatorOid above is valid (and is the operator oid).
+		 */
 	}
+	else
+		commutatorOid = InvalidOid;
 
-	if (oprForm->oprresult != BOOLOID)
+	if (negatorName)
 	{
-		if (OidIsValid(restrictionOid))
-			ereport(ERROR,
-					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only boolean operators can have restriction selectivity")));
-		if (OidIsValid(joinOid))
+		negatorOid = ValidateOperatorDefined(negatorName,
+											 oprForm->oprleft,
+											 oprForm->oprright);
+
+		if (negatorOid == oprForm->oid)
 			ereport(ERROR,
 					(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
-					 errmsg("only boolean operators can have join selectivity")));
+					 errmsg("operator cannot be its own negator")));
+	}
+	else
+	{
+		negatorOid = InvalidOid;
+	}
+
+	/*
+	 * check that we're not changing any existing values that might be
+	 * depended on elsewhere and may be expected to never change, while
+	 * allowing no-ops.
+	 */
+	if (OidIsValid(commutatorOid) && OidIsValid(oprForm->oprcom)
+		&& commutatorOid != oprForm->oprcom)
+	{
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+				 errmsg("operator attribute \"commutator\" cannot be changed if it has already been set")));
+	}
+
+	if (OidIsValid(negatorOid) && OidIsValid(oprForm->oprnegate)
+		&& negatorOid != oprForm->oprnegate)
+	{
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+				 errmsg("operator attribute \"negator\" cannot be changed if it has already been set")));
+	}
+
+	if (updateHashes && oprForm->oprcanhash && !canHash)
+	{
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+				 errmsg("operator attribute \"hashes\" cannot be disabled after it has been enabled")));
+	}
+
+	if (updateMerges && oprForm->oprcanmerge && !canMerge)
+	{
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
+				 errmsg("operator attribute \"merges\" cannot be disabled after it has been enabled")));
 	}
 
+	/* Perform additional checks, like OperatorCreate does */
+	OperatorValidateParams(oprForm->oprleft,
+						   oprForm->oprright,
+						   oprForm->oprresult,
+						   OidIsValid(commutatorOid),
+						   OidIsValid(negatorOid),
+						   OidIsValid(joinOid),
+						   OidIsValid(restrictionOid),
+						   updateMerges,
+						   updateHashes);
+
 	/* Update the tuple */
 	for (i = 0; i < Natts_pg_operator; ++i)
 	{
@@ -539,6 +661,30 @@ AlterOperator(AlterOperatorStmt *stmt)
 		values[Anum_pg_operator_oprjoin - 1] = joinOid;
 	}
 
+	if (OidIsValid(commutatorOid))
+	{
+		replaces[Anum_pg_operator_oprcom - 1] = true;
+		values[Anum_pg_operator_oprcom - 1] = ObjectIdGetDatum(commutatorOid);
+	}
+
+	if (OidIsValid(negatorOid))
+	{
+		replaces[Anum_pg_operator_oprnegate - 1] = true;
+		values[Anum_pg_operator_oprnegate - 1] = ObjectIdGetDatum(negatorOid);
+	}
+
+	if (updateMerges)
+	{
+		replaces[Anum_pg_operator_oprcanmerge - 1] = true;
+		values[Anum_pg_operator_oprcanmerge - 1] = canMerge;
+	}
+
+	if (updateHashes)
+	{
+		replaces[Anum_pg_operator_oprcanhash - 1] = true;
+		values[Anum_pg_operator_oprcanhash - 1] = canHash;
+	}
+
 	tup = heap_modify_tuple(tup, RelationGetDescr(catalog),
 							values, nulls, replaces);
 
@@ -546,6 +692,9 @@ AlterOperator(AlterOperatorStmt *stmt)
 
 	address = makeOperatorDependencies(tup, false, true);
 
+	if (OidIsValid(commutatorOid) || OidIsValid(negatorOid))
+		OperatorUpd(oprId, commutatorOid, negatorOid, false);
+
 	InvokeObjectPostAlterHook(OperatorRelationId, oprId, 0);
 
 	table_close(catalog, NoLock);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index e56cbe77cb..0c7a1f77e0 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -10151,6 +10151,8 @@ operator_def_elem: ColLabel '=' NONE
 						{ $$ = makeDefElem($1, NULL, @1); }
 				   | ColLabel '=' operator_def_arg
 						{ $$ = makeDefElem($1, (Node *) $3, @1); }
+				   | ColLabel
+						{ $$ = makeDefElem($1, NULL, @1); }
 		;
 
 /* must be similar enough to def_arg to avoid reduce/reduce conflicts */
diff --git a/src/include/catalog/pg_operator.h b/src/include/catalog/pg_operator.h
index a670973ab2..13ebc592f2 100644
--- a/src/include/catalog/pg_operator.h
+++ b/src/include/catalog/pg_operator.h
@@ -85,6 +85,11 @@ typedef FormData_pg_operator *Form_pg_operator;
 DECLARE_UNIQUE_INDEX_PKEY(pg_operator_oid_index, 2688, OperatorOidIndexId, pg_operator, btree(oid oid_ops));
 DECLARE_UNIQUE_INDEX(pg_operator_oprname_l_r_n_index, 2689, OperatorNameNspIndexId, pg_operator, btree(oprname name_ops, oprleft oid_ops, oprright oid_ops, oprnamespace oid_ops));
 
+extern Oid
+			OperatorLookup(List *operatorName,
+						   Oid leftObjectId,
+						   Oid rightObjectId,
+						   bool *defined);
 
 extern ObjectAddress OperatorCreate(const char *operatorName,
 									Oid operatorNamespace,
diff --git a/src/test/regress/expected/alter_operator.out b/src/test/regress/expected/alter_operator.out
index 71bd484282..ea5d79b054 100644
--- a/src/test/regress/expected/alter_operator.out
+++ b/src/test/regress/expected/alter_operator.out
@@ -25,7 +25,7 @@ ORDER BY 1;
 (3 rows)
 
 --
--- Reset and set params
+-- test reset and set restrict and join
 --
 ALTER OPERATOR === (boolean, boolean) SET (RESTRICT = NONE);
 ALTER OPERATOR === (boolean, boolean) SET (JOIN = NONE);
@@ -106,34 +106,282 @@ ORDER BY 1;
  schema public                                         | n
 (3 rows)
 
---
--- Test invalid options.
---
-ALTER OPERATOR === (boolean, boolean) SET (COMMUTATOR = ====);
-ERROR:  operator attribute "commutator" cannot be changed
-ALTER OPERATOR === (boolean, boolean) SET (NEGATOR = ====);
-ERROR:  operator attribute "negator" cannot be changed
+-- test cannot set non existant function
 ALTER OPERATOR === (boolean, boolean) SET (RESTRICT = non_existent_func);
 ERROR:  function non_existent_func(internal, oid, internal, integer) does not exist
 ALTER OPERATOR === (boolean, boolean) SET (JOIN = non_existent_func);
 ERROR:  function non_existent_func(internal, oid, internal, smallint, internal) does not exist
-ALTER OPERATOR === (boolean, boolean) SET (COMMUTATOR = !==);
-ERROR:  operator attribute "commutator" cannot be changed
-ALTER OPERATOR === (boolean, boolean) SET (NEGATOR = !==);
-ERROR:  operator attribute "negator" cannot be changed
--- invalid: non-lowercase quoted identifiers
+-- test non-lowercase quoted identifiers invalid
 ALTER OPERATOR & (bit, bit) SET ("Restrict" = _int_contsel, "Join" = _int_contjoinsel);
 ERROR:  operator attribute "Restrict" not recognized
 --
--- Test permission check. Must be owner to ALTER OPERATOR.
+-- test must be owner of operator to ALTER OPERATOR.
 --
 CREATE USER regress_alter_op_user;
 SET SESSION AUTHORIZATION regress_alter_op_user;
 ALTER OPERATOR === (boolean, boolean) SET (RESTRICT = NONE);
 ERROR:  must be owner of operator ===
--- Clean up
 RESET SESSION AUTHORIZATION;
+--
+-- test set commutator, negator, hashes, and merges which can only be set if not
+-- already set
+--
+-- for these tests create operators with different left and right types so that
+-- we can validate that function signatures are handled correctly
+CREATE FUNCTION alter_op_test_fn_bool_real(boolean, real)
+RETURNS boolean AS $$ SELECT NULL::BOOLEAN; $$ LANGUAGE sql IMMUTABLE;
+CREATE FUNCTION alter_op_test_fn_real_bool(real, boolean)
+RETURNS boolean AS $$ SELECT NULL::BOOLEAN; $$ LANGUAGE sql IMMUTABLE;
+-- operator
+CREATE OPERATOR === (
+    LEFTARG = boolean,
+    RIGHTARG = real,
+    PROCEDURE = alter_op_test_fn_bool_real
+);
+-- commutator
+CREATE OPERATOR ==== (
+    LEFTARG = real,
+    RIGHTARG = boolean,
+    PROCEDURE = alter_op_test_fn_real_bool
+);
+-- negator
+CREATE OPERATOR !==== (
+    LEFTARG = boolean,
+    RIGHTARG = real,
+    PROCEDURE = alter_op_test_fn_bool_real
+);
+-- test no-op setting already false hashes and merges to false works
+ALTER OPERATOR === (boolean, real) SET (HASHES = false);
+ALTER OPERATOR === (boolean, real) SET (MERGES = false);
+-- validate still false after no-op
+SELECT oprcanhash, oprcanmerge FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+ oprcanhash | oprcanmerge 
+------------+-------------
+ f          | f
+(1 row)
+
+-- test cannot set commutator or negator without owning them
+SET SESSION AUTHORIZATION regress_alter_op_user;
+-- we need need a new operator owned by regress_alter_op_user so that we are
+-- allowed to alter it. ==== and !==== are owned by the test user so we expect
+-- the alters below to fail.
+CREATE OPERATOR ===@@@ (
+    LEFTARG = boolean,
+    RIGHTARG = real,
+    PROCEDURE = alter_op_test_fn_bool_real
+);
+ALTER OPERATOR ===@@@ (boolean, real) SET (COMMUTATOR= ====);
+ERROR:  must be owner of operator ====
+ALTER OPERATOR ===@@@ (boolean, real) SET (NEGATOR = !====);
+ERROR:  must be owner of operator !====
+-- validate operator is unchanged and commutator and negator are unset
+SELECT oprcom, oprnegate FROM pg_operator WHERE oprname = '===@@@'
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+ oprcom | oprnegate 
+--------+-----------
+      0 |         0
+(1 row)
+
+DROP OPERATOR ===@@@ (boolean, real);
+RESET SESSION AUTHORIZATION;
+-- test cannot set self negator
+ALTER OPERATOR === (boolean, real) SET (NEGATOR = ===);
+ERROR:  operator cannot be its own negator
+-- validate no changes made
+SELECT oprcanmerge, oprcanhash, oprcom, oprnegate
+FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+ oprcanmerge | oprcanhash | oprcom | oprnegate 
+-------------+------------+--------+-----------
+ f           | f          |      0 |         0
+(1 row)
+
+-- test set hashes
+ALTER OPERATOR === (boolean, real) SET (HASHES);
+SELECT oprcanhash FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+ oprcanhash 
+------------
+ t
+(1 row)
+
+-- test set merges
+ALTER OPERATOR === (boolean, real) SET (MERGES);
+SELECT oprcanmerge FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+ oprcanmerge 
+-------------
+ t
+(1 row)
+
+-- test set commutator
+ALTER OPERATOR === (boolean, real) SET (COMMUTATOR = ====);
+-- validate that the commutator has been set on both the operator and commutator,
+-- that they reference each other, and that the operator used is the existing
+-- one we created and not a new shell operator
+SELECT op.oprname AS operator_name, com.oprname AS commutator_name,
+  com.oprcode AS commutator_func
+  FROM pg_operator op
+  INNER JOIN pg_operator com ON (op.oid = com.oprcom AND op.oprcom = com.oid)
+  WHERE op.oprname = '==='
+  AND op.oprleft = 'boolean'::regtype AND op.oprright = 'real'::regtype;
+ operator_name | commutator_name |      commutator_func       
+---------------+-----------------+----------------------------
+ ===           | ====            | alter_op_test_fn_real_bool
+(1 row)
+
+-- test set negator
+ALTER OPERATOR === (boolean, real) SET (NEGATOR = !====);
+-- validate that the negator has been set on both the operator and negator, that
+-- they reference each other, and that the operator used is the existing one we
+-- created and not a new shell operator
+SELECT op.oprname AS operator_name, neg.oprname AS negator_name,
+  neg.oprcode AS negator_func
+  FROM pg_operator op
+  INNER JOIN pg_operator neg ON (op.oid = neg.oprnegate AND op.oprnegate = neg.oid)
+  WHERE op.oprname = '==='
+  AND op.oprleft = 'boolean'::regtype AND op.oprright = 'real'::regtype;
+ operator_name | negator_name |        negator_func        
+---------------+--------------+----------------------------
+ ===           | !====        | alter_op_test_fn_bool_real
+(1 row)
+
+-- validate that the final state of the operator is as we expect
+SELECT oprcanmerge, oprcanhash,
+       pg_describe_object('pg_operator'::regclass, oprcom, 0) AS commutator,
+       pg_describe_object('pg_operator'::regclass, oprnegate, 0) AS negator
+  FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+ oprcanmerge | oprcanhash |         commutator          |           negator            
+-------------+------------+-----------------------------+------------------------------
+ t           | t          | operator ====(real,boolean) | operator !====(boolean,real)
+(1 row)
+
+-- test no-op set 'succeeds'
+ALTER OPERATOR === (boolean, real) SET (NEGATOR = !====);
+ALTER OPERATOR === (boolean, real) SET (COMMUTATOR = ====);
+ALTER OPERATOR === (boolean, real) SET (HASHES);
+ALTER OPERATOR === (boolean, real) SET (MERGES);
+-- test cannot change commutator, negator, hashes, and merges when already set
+CREATE OPERATOR @= (
+    LEFTARG = real,
+    RIGHTARG = boolean,
+    PROCEDURE = alter_op_test_fn_real_bool
+);
+CREATE OPERATOR @!= (
+    LEFTARG = boolean,
+    RIGHTARG = real,
+    PROCEDURE = alter_op_test_fn_bool_real
+);
+ALTER OPERATOR === (boolean, real) SET (COMMUTATOR = @=);
+ERROR:  operator attribute "commutator" cannot be changed if it has already been set
+ALTER OPERATOR === (boolean, real) SET (NEGATOR = @!=);
+ERROR:  operator attribute "negator" cannot be changed if it has already been set
+ALTER OPERATOR === (boolean, real) SET (HASHES = false);
+ERROR:  operator attribute "hashes" cannot be disabled after it has been enabled
+ALTER OPERATOR === (boolean, real) SET (MERGES = false);
+ERROR:  operator attribute "merges" cannot be disabled after it has been enabled
+-- test cannot set an operator that aleady has a commutator as the commutator
+ALTER OPERATOR @=(real, boolean) SET (COMMUTATOR = ===);
+ERROR:  commutator commutates another operator
+SELECT oprcom FROM pg_operator WHERE oprname = '@=';
+ oprcom 
+--------
+      0
+(1 row)
+
+-- test cannot set an operator that already has a negator as the negator
+ALTER OPERATOR @!=(boolean, real) SET (NEGATOR = ===);
+ERROR:  negator negates another operator
+SELECT oprnegate FROM pg_operator WHERE oprname = '@!=';
+ oprnegate 
+-----------
+         0
+(1 row)
+
+-- validate no changes made
+SELECT oprcanmerge, oprcanhash,
+       pg_describe_object('pg_operator'::regclass, oprcom, 0) AS commutator,
+       pg_describe_object('pg_operator'::regclass, oprnegate, 0) AS negator
+  FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+ oprcanmerge | oprcanhash |         commutator          |           negator            
+-------------+------------+-----------------------------+------------------------------
+ t           | t          | operator ====(real,boolean) | operator !====(boolean,real)
+(1 row)
+
+DROP OPERATOR @=(real, boolean);
+DROP OPERATOR @!=(boolean, real);
+--
+-- test setting commutator and negator to undefined or shell operator fails
+--
+DROP OPERATOR === (boolean, real);
+CREATE OPERATOR === (
+    LEFTARG = boolean,
+    RIGHTARG = real,
+    PROCEDURE = alter_op_test_fn_bool_real
+);
+-- test cannot set undefined
+ALTER OPERATOR === (boolean, real) SET (COMMUTATOR = ===@@@);
+ERROR:  operator ===@@@ does not exist
+ALTER OPERATOR === (boolean, real) SET (NEGATOR = !===@@@);
+ERROR:  operator !===@@@ does not exist
+-- create shell operators for ===@@@ and !===@@@ 
+CREATE OPERATOR ===@@ (
+    LEFTARG = boolean,
+    RIGHTARG = real,
+    PROCEDURE = alter_op_test_fn_bool_real,
+    COMMUTATOR = ===@@@,
+    NEGATOR = !===@@@
+);
+-- test cannot set shell operators
+ALTER OPERATOR === (boolean, real) SET (COMMUTATOR = ===@@@);
+ERROR:  operator ===@@@ is undefined shell operator
+ALTER OPERATOR === (boolean, real) SET (NEGATOR = !===@@@);
+ERROR:  operator !===@@@ is undefined shell operator
+-- validate no changes made
+SELECT oprcanmerge, oprcanhash,
+       pg_describe_object('pg_operator'::regclass, oprcom, 0) AS commutator,
+       pg_describe_object('pg_operator'::regclass, oprnegate, 0) AS negator
+  FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+ oprcanmerge | oprcanhash | commutator | negator 
+-------------+------------+------------+---------
+ f           | f          |            | 
+(1 row)
+
+DROP OPERATOR ===@@(boolean, real);
+DROP OPERATOR ===@@@(real, boolean);
+DROP OPERATOR !===@@@(boolean, real);
+--
+-- test setting self commutator
+--
+DROP OPERATOR === (boolean, boolean);
+CREATE OPERATOR === (
+    LEFTARG = boolean,
+    RIGHTARG = boolean,
+    PROCEDURE = alter_op_test_fn
+);
+ALTER OPERATOR === (boolean, boolean) SET (COMMUTATOR = ===);
+-- validate that the oprcom is the operator oid
+SELECT oprname FROM pg_operator 
+  WHERE oprname = '===' AND oid = oprcom
+  AND oprleft = 'boolean'::regtype AND oprright = 'boolean'::regtype;
+ oprname 
+---------
+ ===
+(1 row)
+
+--
+-- Clean up
+--
 DROP USER regress_alter_op_user;
 DROP OPERATOR === (boolean, boolean);
+DROP OPERATOR === (boolean, real);
+DROP OPERATOR ==== (real, boolean);
+DROP OPERATOR !==== (boolean, real);
 DROP FUNCTION customcontsel(internal, oid, internal, integer);
 DROP FUNCTION alter_op_test_fn(boolean, boolean);
+DROP FUNCTION alter_op_test_fn_bool_real(boolean, real);
+DROP FUNCTION alter_op_test_fn_real_bool(real, boolean);
diff --git a/src/test/regress/sql/alter_operator.sql b/src/test/regress/sql/alter_operator.sql
index fd40370165..85004d4191 100644
--- a/src/test/regress/sql/alter_operator.sql
+++ b/src/test/regress/sql/alter_operator.sql
@@ -22,7 +22,7 @@ WHERE classid = 'pg_operator'::regclass AND
 ORDER BY 1;
 
 --
--- Reset and set params
+-- test reset and set restrict and join
 --
 
 ALTER OPERATOR === (boolean, boolean) SET (RESTRICT = NONE);
@@ -71,30 +71,251 @@ WHERE classid = 'pg_operator'::regclass AND
       objid = '===(bool,bool)'::regoperator
 ORDER BY 1;
 
---
--- Test invalid options.
---
-ALTER OPERATOR === (boolean, boolean) SET (COMMUTATOR = ====);
-ALTER OPERATOR === (boolean, boolean) SET (NEGATOR = ====);
+-- test cannot set non existant function
 ALTER OPERATOR === (boolean, boolean) SET (RESTRICT = non_existent_func);
 ALTER OPERATOR === (boolean, boolean) SET (JOIN = non_existent_func);
-ALTER OPERATOR === (boolean, boolean) SET (COMMUTATOR = !==);
-ALTER OPERATOR === (boolean, boolean) SET (NEGATOR = !==);
 
--- invalid: non-lowercase quoted identifiers
+-- test non-lowercase quoted identifiers invalid
 ALTER OPERATOR & (bit, bit) SET ("Restrict" = _int_contsel, "Join" = _int_contjoinsel);
 
 --
--- Test permission check. Must be owner to ALTER OPERATOR.
+-- test must be owner of operator to ALTER OPERATOR.
 --
 CREATE USER regress_alter_op_user;
 SET SESSION AUTHORIZATION regress_alter_op_user;
 
 ALTER OPERATOR === (boolean, boolean) SET (RESTRICT = NONE);
 
--- Clean up
 RESET SESSION AUTHORIZATION;
+
+--
+-- test set commutator, negator, hashes, and merges which can only be set if not
+-- already set
+--
+
+-- for these tests create operators with different left and right types so that
+-- we can validate that function signatures are handled correctly
+
+CREATE FUNCTION alter_op_test_fn_bool_real(boolean, real)
+RETURNS boolean AS $$ SELECT NULL::BOOLEAN; $$ LANGUAGE sql IMMUTABLE;
+
+CREATE FUNCTION alter_op_test_fn_real_bool(real, boolean)
+RETURNS boolean AS $$ SELECT NULL::BOOLEAN; $$ LANGUAGE sql IMMUTABLE;
+
+-- operator
+CREATE OPERATOR === (
+    LEFTARG = boolean,
+    RIGHTARG = real,
+    PROCEDURE = alter_op_test_fn_bool_real
+);
+
+-- commutator
+CREATE OPERATOR ==== (
+    LEFTARG = real,
+    RIGHTARG = boolean,
+    PROCEDURE = alter_op_test_fn_real_bool
+);
+
+-- negator
+CREATE OPERATOR !==== (
+    LEFTARG = boolean,
+    RIGHTARG = real,
+    PROCEDURE = alter_op_test_fn_bool_real
+);
+
+-- test no-op setting already false hashes and merges to false works
+ALTER OPERATOR === (boolean, real) SET (HASHES = false);
+ALTER OPERATOR === (boolean, real) SET (MERGES = false);
+
+-- validate still false after no-op
+SELECT oprcanhash, oprcanmerge FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+
+-- test cannot set commutator or negator without owning them
+SET SESSION AUTHORIZATION regress_alter_op_user;
+
+-- we need need a new operator owned by regress_alter_op_user so that we are
+-- allowed to alter it. ==== and !==== are owned by the test user so we expect
+-- the alters below to fail.
+CREATE OPERATOR ===@@@ (
+    LEFTARG = boolean,
+    RIGHTARG = real,
+    PROCEDURE = alter_op_test_fn_bool_real
+);
+
+ALTER OPERATOR ===@@@ (boolean, real) SET (COMMUTATOR= ====);
+ALTER OPERATOR ===@@@ (boolean, real) SET (NEGATOR = !====);
+
+-- validate operator is unchanged and commutator and negator are unset
+SELECT oprcom, oprnegate FROM pg_operator WHERE oprname = '===@@@'
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+
+DROP OPERATOR ===@@@ (boolean, real);
+
+RESET SESSION AUTHORIZATION;
+
+-- test cannot set self negator
+ALTER OPERATOR === (boolean, real) SET (NEGATOR = ===);
+
+-- validate no changes made
+SELECT oprcanmerge, oprcanhash, oprcom, oprnegate
+FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+
+-- test set hashes
+ALTER OPERATOR === (boolean, real) SET (HASHES);
+SELECT oprcanhash FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+
+-- test set merges
+ALTER OPERATOR === (boolean, real) SET (MERGES);
+SELECT oprcanmerge FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+
+-- test set commutator
+ALTER OPERATOR === (boolean, real) SET (COMMUTATOR = ====);
+
+-- validate that the commutator has been set on both the operator and commutator,
+-- that they reference each other, and that the operator used is the existing
+-- one we created and not a new shell operator
+SELECT op.oprname AS operator_name, com.oprname AS commutator_name,
+  com.oprcode AS commutator_func
+  FROM pg_operator op
+  INNER JOIN pg_operator com ON (op.oid = com.oprcom AND op.oprcom = com.oid)
+  WHERE op.oprname = '==='
+  AND op.oprleft = 'boolean'::regtype AND op.oprright = 'real'::regtype;
+
+-- test set negator
+ALTER OPERATOR === (boolean, real) SET (NEGATOR = !====);
+
+-- validate that the negator has been set on both the operator and negator, that
+-- they reference each other, and that the operator used is the existing one we
+-- created and not a new shell operator
+SELECT op.oprname AS operator_name, neg.oprname AS negator_name,
+  neg.oprcode AS negator_func
+  FROM pg_operator op
+  INNER JOIN pg_operator neg ON (op.oid = neg.oprnegate AND op.oprnegate = neg.oid)
+  WHERE op.oprname = '==='
+  AND op.oprleft = 'boolean'::regtype AND op.oprright = 'real'::regtype;
+
+-- validate that the final state of the operator is as we expect
+SELECT oprcanmerge, oprcanhash,
+       pg_describe_object('pg_operator'::regclass, oprcom, 0) AS commutator,
+       pg_describe_object('pg_operator'::regclass, oprnegate, 0) AS negator
+  FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+
+-- test no-op set 'succeeds'
+ALTER OPERATOR === (boolean, real) SET (NEGATOR = !====);
+ALTER OPERATOR === (boolean, real) SET (COMMUTATOR = ====);
+ALTER OPERATOR === (boolean, real) SET (HASHES);
+ALTER OPERATOR === (boolean, real) SET (MERGES);
+
+-- test cannot change commutator, negator, hashes, and merges when already set
+
+CREATE OPERATOR @= (
+    LEFTARG = real,
+    RIGHTARG = boolean,
+    PROCEDURE = alter_op_test_fn_real_bool
+);
+CREATE OPERATOR @!= (
+    LEFTARG = boolean,
+    RIGHTARG = real,
+    PROCEDURE = alter_op_test_fn_bool_real
+);
+
+ALTER OPERATOR === (boolean, real) SET (COMMUTATOR = @=);
+ALTER OPERATOR === (boolean, real) SET (NEGATOR = @!=);
+ALTER OPERATOR === (boolean, real) SET (HASHES = false);
+ALTER OPERATOR === (boolean, real) SET (MERGES = false);
+
+-- test cannot set an operator that aleady has a commutator as the commutator
+ALTER OPERATOR @=(real, boolean) SET (COMMUTATOR = ===);
+SELECT oprcom FROM pg_operator WHERE oprname = '@=';
+
+-- test cannot set an operator that already has a negator as the negator
+ALTER OPERATOR @!=(boolean, real) SET (NEGATOR = ===);
+SELECT oprnegate FROM pg_operator WHERE oprname = '@!=';
+
+-- validate no changes made
+SELECT oprcanmerge, oprcanhash,
+       pg_describe_object('pg_operator'::regclass, oprcom, 0) AS commutator,
+       pg_describe_object('pg_operator'::regclass, oprnegate, 0) AS negator
+  FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+
+DROP OPERATOR @=(real, boolean);
+DROP OPERATOR @!=(boolean, real);
+
+--
+-- test setting commutator and negator to undefined or shell operator fails
+--
+
+DROP OPERATOR === (boolean, real);
+CREATE OPERATOR === (
+    LEFTARG = boolean,
+    RIGHTARG = real,
+    PROCEDURE = alter_op_test_fn_bool_real
+);
+
+-- test cannot set undefined
+ALTER OPERATOR === (boolean, real) SET (COMMUTATOR = ===@@@);
+ALTER OPERATOR === (boolean, real) SET (NEGATOR = !===@@@);
+
+-- create shell operators for ===@@@ and !===@@@ 
+CREATE OPERATOR ===@@ (
+    LEFTARG = boolean,
+    RIGHTARG = real,
+    PROCEDURE = alter_op_test_fn_bool_real,
+    COMMUTATOR = ===@@@,
+    NEGATOR = !===@@@
+);
+
+-- test cannot set shell operators
+ALTER OPERATOR === (boolean, real) SET (COMMUTATOR = ===@@@);
+ALTER OPERATOR === (boolean, real) SET (NEGATOR = !===@@@);
+
+-- validate no changes made
+SELECT oprcanmerge, oprcanhash,
+       pg_describe_object('pg_operator'::regclass, oprcom, 0) AS commutator,
+       pg_describe_object('pg_operator'::regclass, oprnegate, 0) AS negator
+  FROM pg_operator WHERE oprname = '==='
+  AND oprleft = 'boolean'::regtype AND oprright = 'real'::regtype;
+
+DROP OPERATOR ===@@(boolean, real);
+DROP OPERATOR ===@@@(real, boolean);
+DROP OPERATOR !===@@@(boolean, real);
+
+--
+-- test setting self commutator
+--
+
+DROP OPERATOR === (boolean, boolean);
+CREATE OPERATOR === (
+    LEFTARG = boolean,
+    RIGHTARG = boolean,
+    PROCEDURE = alter_op_test_fn
+);
+
+ALTER OPERATOR === (boolean, boolean) SET (COMMUTATOR = ===);
+
+-- validate that the oprcom is the operator oid
+SELECT oprname FROM pg_operator 
+  WHERE oprname = '===' AND oid = oprcom
+  AND oprleft = 'boolean'::regtype AND oprright = 'boolean'::regtype;
+
+--
+-- Clean up
+--
+
 DROP USER regress_alter_op_user;
+
 DROP OPERATOR === (boolean, boolean);
+DROP OPERATOR === (boolean, real);
+DROP OPERATOR ==== (real, boolean);
+DROP OPERATOR !==== (boolean, real);
+
 DROP FUNCTION customcontsel(internal, oid, internal, integer);
 DROP FUNCTION alter_op_test_fn(boolean, boolean);
+DROP FUNCTION alter_op_test_fn_bool_real(boolean, real);
+DROP FUNCTION alter_op_test_fn_real_bool(real, boolean);
#14jian he
jian.universality@gmail.com
In reply to: Tommy Pavlicek (#13)
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges

errmsg("operator attribute \"negator\" cannot be changed if it has
already been set")));
I feel like the above message is not very helpful.

Something like the following may be more helpful for diagnosis.
errmsg("operator %s's attribute \"negator\" cannot be changed if it
has already been set", operatorname)));

when I "git apply", I've noticed some minor whitespace warning.

Other than that, it looks fine.

#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: jian he (#14)
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges

jian he <jian.universality@gmail.com> writes:

errmsg("operator attribute \"negator\" cannot be changed if it has
already been set")));
I feel like the above message is not very helpful.

I think it's okay to be concise about this as long as the operator
we're referring to is the target of the ALTER. I agree that when
we're complaining about some *other* operator, we'd better spell
out which one we mean, and I made some changes to the patch to
improve that.

Pushed after a round of editorialization -- mostly cosmetic
stuff, except for tweaking some error messages. I shortened the
test cases a bit too, as I thought they were somewhat excessive
to have as a permanent thing.

regards, tom lane

#16Christoph Berg
myon@debian.org
In reply to: Tommy Pavlicek (#13)
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges

Re: Tommy Pavlicek

I've added another patch (0002-require_unused_neg_com-v1.patch) that
prevents using a commutator or negator that's already part of a pair.

Hmm. I agree with the general idea of adding sanity checks, but this
might be overzealous:

This change is breaking pgsphere which has <@ @> operator pairs, but
for historical reasons also includes alternative spellings of these
operators (both called @ with swapped operand types) which now
explodes because we can't add them with the "proper" commutator and
negators declared (which point to the canonical <@ @> !<@ !@>
operators).

https://github.com/postgrespro/pgsphere/blob/master/pgs_moc_compat.sql.in

We might be able to simply delete the @ operators, but doesn't this
new check break the general possibility to have more than one spelling
for the same operator?

Christoph

#17Tom Lane
tgl@sss.pgh.pa.us
In reply to: Christoph Berg (#16)
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges

Christoph Berg <myon@debian.org> writes:

This change is breaking pgsphere which has <@ @> operator pairs, but
for historical reasons also includes alternative spellings of these
operators (both called @ with swapped operand types) which now
explodes because we can't add them with the "proper" commutator and
negators declared (which point to the canonical <@ @> !<@ !@>
operators).

Should have guessed that somebody might be depending on the previous
squishy behavior. Still, I can't see how the above situation is a
good idea. Commutators/negators should come in pairs, not have
completely random links. I think it's only accidental that this
setup isn't triggering other strange behavior.

We might be able to simply delete the @ operators, but doesn't this
new check break the general possibility to have more than one spelling
for the same operator?

You can have more than one operator atop the same function.
But why didn't you make the @ operators commutators of each other,
rather than this mess?

regards, tom lane

#18Christoph Berg
myon@debian.org
In reply to: Tom Lane (#17)
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges

Re: Tom Lane

We might be able to simply delete the @ operators, but doesn't this
new check break the general possibility to have more than one spelling
for the same operator?

You can have more than one operator atop the same function.
But why didn't you make the @ operators commutators of each other,
rather than this mess?

Historical something.

You are right that the commutators could be fixed that way, but the
negators are a different question. There is no legacy spelling for
these.

Anyway, if this doesn't raise any "oh we didn't think of this"
concerns, we'll just remove the old operators in pgsphere.

Christoph

#19Tom Lane
tgl@sss.pgh.pa.us
In reply to: Christoph Berg (#18)
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges

Christoph Berg <myon@debian.org> writes:

Anyway, if this doesn't raise any "oh we didn't think of this"
concerns, we'll just remove the old operators in pgsphere.

Well, the idea was exactly to forbid that sort of setup.
However, if we get sufficient pushback maybe we should
reconsider --- for example, maybe it'd be sane to enforce
the restriction in ALTER but not CREATE?

I'm inclined to wait and see if there are more complaints.

regards, tom lane

#20Tommy Pavlicek
tommypav122@gmail.com
In reply to: Christoph Berg (#16)
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges

On Fri, Oct 20, 2023 at 5:33 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Pushed after a round of editorialization -- mostly cosmetic
stuff, except for tweaking some error messages. I shortened the
test cases a bit too, as I thought they were somewhat excessive
to have as a permanent thing.

Thanks Tom.

On Tue, Oct 24, 2023 at 2:51 PM Christoph Berg <myon@debian.org> wrote:

Re: Tommy Pavlicek

I've added another patch (0002-require_unused_neg_com-v1.patch) that
prevents using a commutator or negator that's already part of a pair.

Hmm. I agree with the general idea of adding sanity checks, but this
might be overzealous:

I can't add much beyond what Tom said, but I think this does go a bit
beyond a sanity check. I forgot to mention it in my previous message,
but the main reason I noticed this was because the DELETE operator
code cleans up commutator and negator links to the operator being
deleted and that code expects each to be part of exactly a pair.

#21Christoph Berg
myon@debian.org
In reply to: Tom Lane (#19)
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges

Re: Tom Lane

Well, the idea was exactly to forbid that sort of setup.

Fwiw, pgsphere has remove the problematic operators now:

https://github.com/postgrespro/pgsphere/commit/e810f5ddd827881b06a92a303c5c9fbf997b892e

However, if we get sufficient pushback maybe we should
reconsider --- for example, maybe it'd be sane to enforce
the restriction in ALTER but not CREATE?

Hmm, that seems backwards, I'd expect that CREATE might have some
checks that could circumvent using ALTER if I really insisted. If
CREATE can create things that I can't reach by ALTERing existing other
things, that's weird.

Let's keep it like it is now in PG17.

Christoph

#22Christoph Berg
myon@debian.org
In reply to: Christoph Berg (#21)
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges

Re: To Tom Lane

Let's keep it like it is now in PG17.

Late followup news: This feature has actually found a bug in
postgresql-debversion:

CREATE OPERATOR > (
LEFTARG = debversion,
RIGHTARG = debversion,
COMMUTATOR = <,
- NEGATOR = >=,
+ NEGATOR = <=,
RESTRICT = scalargtsel,
JOIN = scalargtjoinsel
);

https://salsa.debian.org/postgresql/postgresql-debversion/-/commit/8ef08ccbea1438468249b0e94048b1a8a25fc625#000e84a71f8a28b762658375c194b25d529336f3

So, thanks!

Christoph

#23Michael Banck
mbanck@gmx.net
In reply to: Tom Lane (#19)
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges

Hi,

On Tue, Oct 24, 2023 at 11:42:15AM -0400, Tom Lane wrote:

Christoph Berg <myon@debian.org> writes:

Anyway, if this doesn't raise any "oh we didn't think of this"
concerns, we'll just remove the old operators in pgsphere.

Well, the idea was exactly to forbid that sort of setup.
However, if we get sufficient pushback maybe we should
reconsider --- for example, maybe it'd be sane to enforce
the restriction in ALTER but not CREATE?

I'm inclined to wait and see if there are more complaints.

FWIW, rdkit also fails, but that seems to be an ancient thing as well:

https://github.com/rdkit/rdkit/issues/7843

I guess there's no way to make that error a bit more helpful, like
printing out the offenbding SQL command, presumably because we are
loding an extension?

Michael

#24Christoph Berg
myon@debian.org
In reply to: Michael Banck (#23)
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges

Re: Michael Banck

I guess there's no way to make that error a bit more helpful, like
printing out the offenbding SQL command, presumably because we are
loding an extension?

I wish there was. The error reporting from failing extension scripts
is really bad with no context at all, it has hit me a few times in the
past already.

Christoph

#25Tom Lane
tgl@sss.pgh.pa.us
In reply to: Christoph Berg (#24)
1 attachment(s)
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges

Christoph Berg <myon@debian.org> writes:

I wish there was. The error reporting from failing extension scripts
is really bad with no context at all, it has hit me a few times in the
past already.

Nobody's spent any work on that :-(. A really basic reporting
facility is not hard to add, as in the attached finger exercise.
The trouble with it can be explained by showing what I get after
intentionally breaking a script file command:

regression=# create extension cube;
ERROR: syntax error at or near "CREAT"
LINE 16: CREAT FUNCTION cube_send(cube)
^
QUERY: /* contrib/cube/cube--1.4--1.5.sql */

-- complain if script is sourced in psql, rather than via ALTER EXTENSION

-- Remove @ and ~
DROP OPERATOR @ (cube, cube);
DROP OPERATOR ~ (cube, cube);

-- Add binary input/output handlers
CREATE FUNCTION cube_recv(internal)
RETURNS cube
AS '$libdir/cube'
LANGUAGE C IMMUTABLE STRICT PARALLEL SAFE;

CREAT FUNCTION cube_send(cube)
RETURNS bytea
AS '$libdir/cube'
LANGUAGE C IMMUTABLE STRICT PARALLEL SAFE;

ALTER TYPE cube SET ( RECEIVE = cube_recv, SEND = cube_send );

CONTEXT: extension script file "/home/postgres/install/share/extension/cube--1.4--1.5.sql"

So the first part of that is great, but if your script file is
large you probably won't be happy about having the whole thing
repeated in the "QUERY" field. So this needs some work on
user-friendliness.

I'm inclined to think that maybe we'd be best off keeping the server
end of it straightforward, and trying to teach psql to abbreviate the
QUERY field in a useful way. IIRC you get this same type of problem
with very large SQL-language functions and suchlike.

Also, I believe this doesn't help much for non-syntax errors
(those that aren't reported with an error location). It might
be interesting to use the RawStmt.stmt_location/stmt_len fields
for the current parsetree to identify what to report, but
I've not dug further than this.

regards, tom lane

Attachments:

extension-error-reporting-poc.patchtext/x-diff; charset=us-ascii; name=extension-error-reporting-poc.patchDownload
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index fab59ad5f6..6af72ff422 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -107,6 +107,13 @@ typedef struct ExtensionVersionInfo
 	struct ExtensionVersionInfo *previous;	/* current best predecessor */
 } ExtensionVersionInfo;
 
+/* Info structure for script_error_callback() */
+typedef struct
+{
+	const char *sql;
+	const char *filename;
+} script_error_callback_arg;
+
 /* Local functions */
 static List *find_update_path(List *evi_list,
 							  ExtensionVersionInfo *evi_start,
@@ -670,9 +677,32 @@ read_extension_script_file(const ExtensionControlFile *control,
 	return dest_str;
 }
 
+/*
+ * error context callback for failures in script-file execution
+ */
+static void
+script_error_callback(void *arg)
+{
+	script_error_callback_arg *callback_arg = (script_error_callback_arg *) arg;
+	int			syntaxerrposition;
+
+	/* If it's a syntax error, convert to internal syntax error report */
+	syntaxerrposition = geterrposition();
+	if (syntaxerrposition > 0)
+	{
+		errposition(0);
+		internalerrposition(syntaxerrposition);
+		internalerrquery(callback_arg->sql);
+	}
+
+	errcontext("extension script file \"%s\"", callback_arg->filename);
+}
+
 /*
  * Execute given SQL string.
  *
+ * The filename the string came from is also provided, for error reporting.
+ *
  * Note: it's tempting to just use SPI to execute the string, but that does
  * not work very well.  The really serious problem is that SPI will parse,
  * analyze, and plan the whole string before executing any of it; of course
@@ -682,12 +712,25 @@ read_extension_script_file(const ExtensionControlFile *control,
  * could be very long.
  */
 static void
-execute_sql_string(const char *sql)
+execute_sql_string(const char *sql, const char *filename)
 {
+	script_error_callback_arg callback_arg;
+	ErrorContextCallback scripterrcontext;
 	List	   *raw_parsetree_list;
 	DestReceiver *dest;
 	ListCell   *lc1;
 
+	/*
+	 * Setup error traceback support for ereport().
+	 */
+	callback_arg.sql = sql;
+	callback_arg.filename = filename;
+
+	scripterrcontext.callback = script_error_callback;
+	scripterrcontext.arg = (void *) &callback_arg;
+	scripterrcontext.previous = error_context_stack;
+	error_context_stack = &scripterrcontext;
+
 	/*
 	 * Parse the SQL string into a list of raw parse trees.
 	 */
@@ -780,6 +823,8 @@ execute_sql_string(const char *sql)
 
 	/* Be sure to advance the command counter after the last script command */
 	CommandCounterIncrement();
+
+	error_context_stack = scripterrcontext.previous;
 }
 
 /*
@@ -1054,7 +1099,7 @@ execute_extension_script(Oid extensionOid, ExtensionControlFile *control,
 		/* And now back to C string */
 		c_sql = text_to_cstring(DatumGetTextPP(t_sql));
 
-		execute_sql_string(c_sql);
+		execute_sql_string(c_sql, filename);
 	}
 	PG_FINALLY();
 	{
#26Christoph Berg
myon@debian.org
In reply to: Tom Lane (#25)
Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)

Re: Tom Lane

So the first part of that is great, but if your script file is
large you probably won't be happy about having the whole thing
repeated in the "QUERY" field. So this needs some work on
user-friendliness.

Does this really have to be addressed? It would be way better than it
is now, and errors during extension creation are rare and mostly for
developers only, so it doesn't have to be pretty.

I'm inclined to think that maybe we'd be best off keeping the server
end of it straightforward, and trying to teach psql to abbreviate the
QUERY field in a useful way. IIRC you get this same type of problem
with very large SQL-language functions and suchlike.

I'd treat this as a separate patch, if it's considered to be a good
idea.

Christoph

#27Tom Lane
tgl@sss.pgh.pa.us
In reply to: Christoph Berg (#26)
1 attachment(s)
Re: Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)

Christoph Berg <myon@debian.org> writes:

Re: Tom Lane

So the first part of that is great, but if your script file is
large you probably won't be happy about having the whole thing
repeated in the "QUERY" field. So this needs some work on
user-friendliness.

Does this really have to be addressed? It would be way better than it
is now, and errors during extension creation are rare and mostly for
developers only, so it doesn't have to be pretty.

Perhaps. I spent a little more effort on this and added code to
report errors that don't come with an error location. On those,
we don't have any constraints about what to report in the QUERY
field, so I made it trim the string to just the current query
within the script, which makes things quite a bit better. You
can see the results in the test_extensions regression test changes.

(It might be worth some effort to trim away comments appearing
just before a command, but I didn't tackle that here.)

regards, tom lane

Attachments:

v1-better-extension-error-reporting.patchtext/x-diff; charset=us-ascii; name=v1-better-extension-error-reporting.patchDownload
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index fab59ad5f6..0fae1332d2 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -54,6 +54,7 @@
 #include "funcapi.h"
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
+#include "nodes/queryjumble.h"
 #include "storage/fd.h"
 #include "tcop/utility.h"
 #include "utils/acl.h"
@@ -107,6 +108,17 @@ typedef struct ExtensionVersionInfo
 	struct ExtensionVersionInfo *previous;	/* current best predecessor */
 } ExtensionVersionInfo;
 
+/*
+ * Information for script_error_callback()
+ */
+typedef struct
+{
+	const char *sql;			/* entire script file contents */
+	const char *filename;		/* script file pathname */
+	ParseLoc	stmt_location;	/* current stmt start loc, or -1 if unknown */
+	ParseLoc	stmt_len;		/* length in bytes; 0 means "rest of string" */
+} script_error_callback_arg;
+
 /* Local functions */
 static List *find_update_path(List *evi_list,
 							  ExtensionVersionInfo *evi_start,
@@ -670,9 +682,60 @@ read_extension_script_file(const ExtensionControlFile *control,
 	return dest_str;
 }
 
+/*
+ * error context callback for failures in script-file execution
+ */
+static void
+script_error_callback(void *arg)
+{
+	script_error_callback_arg *callback_arg = (script_error_callback_arg *) arg;
+	int			syntaxerrposition;
+	const char *lastslash;
+
+	/* If it's a syntax error, convert to internal syntax error report */
+	syntaxerrposition = geterrposition();
+	if (syntaxerrposition > 0)
+	{
+		/*
+		 * We must report the whole string because otherwise details such as
+		 * psql's line number report would be wrong.
+		 */
+		errposition(0);
+		internalerrposition(syntaxerrposition);
+		internalerrquery(callback_arg->sql);
+	}
+	else if (callback_arg->stmt_location >= 0)
+	{
+		/*
+		 * Since no syntax cursor will be shown, it's okay and helpful to trim
+		 * the reported query string to just the current statement.
+		 */
+		const char *query = callback_arg->sql;
+		int			location = callback_arg->stmt_location;
+		int			len = callback_arg->stmt_len;
+
+		query = CleanQuerytext(query, &location, &len);
+		internalerrquery(pnstrdup(query, len));
+	}
+
+	/*
+	 * Trim the reported file name to remove the path.  We know that
+	 * get_extension_script_filename() inserted a '/', regardless of whether
+	 * we're on Windows.
+	 */
+	lastslash = strrchr(callback_arg->filename, '/');
+	if (lastslash)
+		lastslash++;
+	else
+		lastslash = callback_arg->filename; /* shouldn't happen, but cope */
+	errcontext("extension script file \"%s\"", lastslash);
+}
+
 /*
  * Execute given SQL string.
  *
+ * The filename the string came from is also provided, for error reporting.
+ *
  * Note: it's tempting to just use SPI to execute the string, but that does
  * not work very well.  The really serious problem is that SPI will parse,
  * analyze, and plan the whole string before executing any of it; of course
@@ -682,12 +745,27 @@ read_extension_script_file(const ExtensionControlFile *control,
  * could be very long.
  */
 static void
-execute_sql_string(const char *sql)
+execute_sql_string(const char *sql, const char *filename)
 {
+	script_error_callback_arg callback_arg;
+	ErrorContextCallback scripterrcontext;
 	List	   *raw_parsetree_list;
 	DestReceiver *dest;
 	ListCell   *lc1;
 
+	/*
+	 * Setup error traceback support for ereport().
+	 */
+	callback_arg.sql = sql;
+	callback_arg.filename = filename;
+	callback_arg.stmt_location = -1;
+	callback_arg.stmt_len = -1;
+
+	scripterrcontext.callback = script_error_callback;
+	scripterrcontext.arg = (void *) &callback_arg;
+	scripterrcontext.previous = error_context_stack;
+	error_context_stack = &scripterrcontext;
+
 	/*
 	 * Parse the SQL string into a list of raw parse trees.
 	 */
@@ -709,6 +787,10 @@ execute_sql_string(const char *sql)
 		List	   *stmt_list;
 		ListCell   *lc2;
 
+		/* Report location of this query for error context callback */
+		callback_arg.stmt_location = parsetree->stmt_location;
+		callback_arg.stmt_len = parsetree->stmt_len;
+
 		/*
 		 * We do the work for each parsetree in a short-lived context, to
 		 * limit the memory used when there are many commands in the string.
@@ -778,6 +860,8 @@ execute_sql_string(const char *sql)
 		MemoryContextDelete(per_parsetree_context);
 	}
 
+	error_context_stack = scripterrcontext.previous;
+
 	/* Be sure to advance the command counter after the last script command */
 	CommandCounterIncrement();
 }
@@ -1054,7 +1138,7 @@ execute_extension_script(Oid extensionOid, ExtensionControlFile *control,
 		/* And now back to C string */
 		c_sql = text_to_cstring(DatumGetTextPP(t_sql));
 
-		execute_sql_string(c_sql);
+		execute_sql_string(c_sql, filename);
 	}
 	PG_FINALLY();
 	{
diff --git a/src/test/modules/test_extensions/expected/test_extensions.out b/src/test/modules/test_extensions/expected/test_extensions.out
index f357cc21aa..b6370b3b4c 100644
--- a/src/test/modules/test_extensions/expected/test_extensions.out
+++ b/src/test/modules/test_extensions/expected/test_extensions.out
@@ -295,6 +295,16 @@ CREATE FUNCTION ext_cor_func() RETURNS text
 CREATE EXTENSION test_ext_cor;  -- fail
 ERROR:  function ext_cor_func() is not a member of extension "test_ext_cor"
 DETAIL:  An extension is not allowed to replace an object that it does not own.
+QUERY:  /* src/test/modules/test_extensions/test_ext_cor--1.0.sql */
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+
+
+-- It's generally bad style to use CREATE OR REPLACE unnecessarily.
+-- Test what happens if an extension does it anyway.
+
+CREATE OR REPLACE FUNCTION ext_cor_func() RETURNS text
+  AS $$ SELECT 'ext_cor_func: from extension'::text $$ LANGUAGE sql
+CONTEXT:  extension script file "test_ext_cor--1.0.sql"
 SELECT ext_cor_func();
       ext_cor_func      
 ------------------------
@@ -307,6 +317,9 @@ CREATE VIEW ext_cor_view AS
 CREATE EXTENSION test_ext_cor;  -- fail
 ERROR:  view ext_cor_view is not a member of extension "test_ext_cor"
 DETAIL:  An extension is not allowed to replace an object that it does not own.
+QUERY:  CREATE OR REPLACE VIEW ext_cor_view AS
+  SELECT 'ext_cor_view: from extension'::text AS col
+CONTEXT:  extension script file "test_ext_cor--1.0.sql"
 SELECT ext_cor_func();
 ERROR:  function ext_cor_func() does not exist
 LINE 1: SELECT ext_cor_func();
@@ -323,6 +336,11 @@ CREATE TYPE test_ext_type;
 CREATE EXTENSION test_ext_cor;  -- fail
 ERROR:  type test_ext_type is not a member of extension "test_ext_cor"
 DETAIL:  An extension is not allowed to replace an object that it does not own.
+QUERY:  -- These are for testing replacement of a shell type/operator, which works
+-- enough like an implicit OR REPLACE to be important to check.
+
+CREATE TYPE test_ext_type AS ENUM('x', 'y')
+CONTEXT:  extension script file "test_ext_cor--1.0.sql"
 DROP TYPE test_ext_type;
 -- this makes a shell "point <<@@ polygon" operator too
 CREATE OPERATOR @@>> ( PROCEDURE = poly_contain_pt,
@@ -331,6 +349,9 @@ CREATE OPERATOR @@>> ( PROCEDURE = poly_contain_pt,
 CREATE EXTENSION test_ext_cor;  -- fail
 ERROR:  operator <<@@(point,polygon) is not a member of extension "test_ext_cor"
 DETAIL:  An extension is not allowed to replace an object that it does not own.
+QUERY:  CREATE OPERATOR <<@@ ( PROCEDURE = pt_contained_poly,
+  LEFTARG = point, RIGHTARG = polygon )
+CONTEXT:  extension script file "test_ext_cor--1.0.sql"
 DROP OPERATOR <<@@ (point, polygon);
 CREATE EXTENSION test_ext_cor;  -- now it should work
 SELECT ext_cor_func();
@@ -379,37 +400,61 @@ CREATE COLLATION ext_cine_coll
 CREATE EXTENSION test_ext_cine;  -- fail
 ERROR:  collation ext_cine_coll is not a member of extension "test_ext_cine"
 DETAIL:  An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one that it already owns.
+QUERY:  /* src/test/modules/test_extensions/test_ext_cine--1.0.sql */
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+
+
+--
+-- CREATE IF NOT EXISTS is an entirely unsound thing for an extension
+-- to be doing, but let's at least plug the major security hole in it.
+--
+
+CREATE COLLATION IF NOT EXISTS ext_cine_coll
+  ( LC_COLLATE = "POSIX", LC_CTYPE = "POSIX" )
+CONTEXT:  extension script file "test_ext_cine--1.0.sql"
 DROP COLLATION ext_cine_coll;
 CREATE MATERIALIZED VIEW ext_cine_mv AS SELECT 11 AS f1;
 CREATE EXTENSION test_ext_cine;  -- fail
 ERROR:  materialized view ext_cine_mv is not a member of extension "test_ext_cine"
 DETAIL:  An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one that it already owns.
+QUERY:  CREATE MATERIALIZED VIEW IF NOT EXISTS ext_cine_mv AS SELECT 42 AS f1
+CONTEXT:  extension script file "test_ext_cine--1.0.sql"
 DROP MATERIALIZED VIEW ext_cine_mv;
 CREATE FOREIGN DATA WRAPPER dummy;
 CREATE SERVER ext_cine_srv FOREIGN DATA WRAPPER dummy;
 CREATE EXTENSION test_ext_cine;  -- fail
 ERROR:  server ext_cine_srv is not a member of extension "test_ext_cine"
 DETAIL:  An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one that it already owns.
+QUERY:  CREATE SERVER IF NOT EXISTS ext_cine_srv FOREIGN DATA WRAPPER ext_cine_fdw
+CONTEXT:  extension script file "test_ext_cine--1.0.sql"
 DROP SERVER ext_cine_srv;
 CREATE SCHEMA ext_cine_schema;
 CREATE EXTENSION test_ext_cine;  -- fail
 ERROR:  schema ext_cine_schema is not a member of extension "test_ext_cine"
 DETAIL:  An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one that it already owns.
+QUERY:  CREATE SCHEMA IF NOT EXISTS ext_cine_schema
+CONTEXT:  extension script file "test_ext_cine--1.0.sql"
 DROP SCHEMA ext_cine_schema;
 CREATE SEQUENCE ext_cine_seq;
 CREATE EXTENSION test_ext_cine;  -- fail
 ERROR:  sequence ext_cine_seq is not a member of extension "test_ext_cine"
 DETAIL:  An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one that it already owns.
+QUERY:  CREATE SEQUENCE IF NOT EXISTS ext_cine_seq
+CONTEXT:  extension script file "test_ext_cine--1.0.sql"
 DROP SEQUENCE ext_cine_seq;
 CREATE TABLE ext_cine_tab1 (x int);
 CREATE EXTENSION test_ext_cine;  -- fail
 ERROR:  table ext_cine_tab1 is not a member of extension "test_ext_cine"
 DETAIL:  An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one that it already owns.
+QUERY:  CREATE TABLE IF NOT EXISTS ext_cine_tab1 (x int)
+CONTEXT:  extension script file "test_ext_cine--1.0.sql"
 DROP TABLE ext_cine_tab1;
 CREATE TABLE ext_cine_tab2 AS SELECT 42 AS y;
 CREATE EXTENSION test_ext_cine;  -- fail
 ERROR:  table ext_cine_tab2 is not a member of extension "test_ext_cine"
 DETAIL:  An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one that it already owns.
+QUERY:  CREATE TABLE IF NOT EXISTS ext_cine_tab2 AS SELECT 42 AS y
+CONTEXT:  extension script file "test_ext_cine--1.0.sql"
 DROP TABLE ext_cine_tab2;
 CREATE EXTENSION test_ext_cine;
 \dx+ test_ext_cine
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index b6135f0347..9620d3e9d2 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -3882,6 +3882,7 @@ saophash_hash
 save_buffer
 scram_state
 scram_state_enum
+script_error_callback_arg
 security_class_t
 sem_t
 sepgsql_context_info_t
#28Christoph Berg
myon@debian.org
In reply to: Tom Lane (#27)
Re: Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)

Re: Tom Lane

Perhaps. I spent a little more effort on this and added code to
report errors that don't come with an error location. On those,
we don't have any constraints about what to report in the QUERY
field, so I made it trim the string to just the current query
within the script, which makes things quite a bit better. You
can see the results in the test_extensions regression test changes.

That looks very good me, thanks!

(It might be worth some effort to trim away comments appearing
just before a command, but I didn't tackle that here.)

The "error when psql" comments do look confusing, but I guess in other
places the comment just before the query adds valuable context, so I'd
say leaving the comments in is ok.

Christoph

#29Tom Lane
tgl@sss.pgh.pa.us
In reply to: Christoph Berg (#28)
2 attachment(s)
Re: Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)

Christoph Berg <myon@debian.org> writes:

Re: Tom Lane

(It might be worth some effort to trim away comments appearing
just before a command, but I didn't tackle that here.)

The "error when psql" comments do look confusing, but I guess in other
places the comment just before the query adds valuable context, so I'd
say leaving the comments in is ok.

It looks like if we did want to suppress that, the right fix is to
make gram.y track statement start locations more honestly, as in
0002 attached (0001 is the same as before). This'd add a few cycles
per grammar nonterminal reduction, which is kind of annoying but
probably is negligible in the grand scheme of things. Still, I'd
not propose it just for this. But if memory serves, we've had
previous complaints about pg_stat_statements failing to strip
leading comments from queries, and this'd fix that. I think it
likely also improves error cursor positioning for cases involving
empty productions --- I'm a tad surprised that no other regression
cases changed.

regards, tom lane

Attachments:

v2-0001-better-extension-error-reporting.patchtext/x-diff; charset=us-ascii; name=v2-0001-better-extension-error-reporting.patchDownload
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index fab59ad5f6..0fae1332d2 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -54,6 +54,7 @@
 #include "funcapi.h"
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
+#include "nodes/queryjumble.h"
 #include "storage/fd.h"
 #include "tcop/utility.h"
 #include "utils/acl.h"
@@ -107,6 +108,17 @@ typedef struct ExtensionVersionInfo
 	struct ExtensionVersionInfo *previous;	/* current best predecessor */
 } ExtensionVersionInfo;
 
+/*
+ * Information for script_error_callback()
+ */
+typedef struct
+{
+	const char *sql;			/* entire script file contents */
+	const char *filename;		/* script file pathname */
+	ParseLoc	stmt_location;	/* current stmt start loc, or -1 if unknown */
+	ParseLoc	stmt_len;		/* length in bytes; 0 means "rest of string" */
+} script_error_callback_arg;
+
 /* Local functions */
 static List *find_update_path(List *evi_list,
 							  ExtensionVersionInfo *evi_start,
@@ -670,9 +682,60 @@ read_extension_script_file(const ExtensionControlFile *control,
 	return dest_str;
 }
 
+/*
+ * error context callback for failures in script-file execution
+ */
+static void
+script_error_callback(void *arg)
+{
+	script_error_callback_arg *callback_arg = (script_error_callback_arg *) arg;
+	int			syntaxerrposition;
+	const char *lastslash;
+
+	/* If it's a syntax error, convert to internal syntax error report */
+	syntaxerrposition = geterrposition();
+	if (syntaxerrposition > 0)
+	{
+		/*
+		 * We must report the whole string because otherwise details such as
+		 * psql's line number report would be wrong.
+		 */
+		errposition(0);
+		internalerrposition(syntaxerrposition);
+		internalerrquery(callback_arg->sql);
+	}
+	else if (callback_arg->stmt_location >= 0)
+	{
+		/*
+		 * Since no syntax cursor will be shown, it's okay and helpful to trim
+		 * the reported query string to just the current statement.
+		 */
+		const char *query = callback_arg->sql;
+		int			location = callback_arg->stmt_location;
+		int			len = callback_arg->stmt_len;
+
+		query = CleanQuerytext(query, &location, &len);
+		internalerrquery(pnstrdup(query, len));
+	}
+
+	/*
+	 * Trim the reported file name to remove the path.  We know that
+	 * get_extension_script_filename() inserted a '/', regardless of whether
+	 * we're on Windows.
+	 */
+	lastslash = strrchr(callback_arg->filename, '/');
+	if (lastslash)
+		lastslash++;
+	else
+		lastslash = callback_arg->filename; /* shouldn't happen, but cope */
+	errcontext("extension script file \"%s\"", lastslash);
+}
+
 /*
  * Execute given SQL string.
  *
+ * The filename the string came from is also provided, for error reporting.
+ *
  * Note: it's tempting to just use SPI to execute the string, but that does
  * not work very well.  The really serious problem is that SPI will parse,
  * analyze, and plan the whole string before executing any of it; of course
@@ -682,12 +745,27 @@ read_extension_script_file(const ExtensionControlFile *control,
  * could be very long.
  */
 static void
-execute_sql_string(const char *sql)
+execute_sql_string(const char *sql, const char *filename)
 {
+	script_error_callback_arg callback_arg;
+	ErrorContextCallback scripterrcontext;
 	List	   *raw_parsetree_list;
 	DestReceiver *dest;
 	ListCell   *lc1;
 
+	/*
+	 * Setup error traceback support for ereport().
+	 */
+	callback_arg.sql = sql;
+	callback_arg.filename = filename;
+	callback_arg.stmt_location = -1;
+	callback_arg.stmt_len = -1;
+
+	scripterrcontext.callback = script_error_callback;
+	scripterrcontext.arg = (void *) &callback_arg;
+	scripterrcontext.previous = error_context_stack;
+	error_context_stack = &scripterrcontext;
+
 	/*
 	 * Parse the SQL string into a list of raw parse trees.
 	 */
@@ -709,6 +787,10 @@ execute_sql_string(const char *sql)
 		List	   *stmt_list;
 		ListCell   *lc2;
 
+		/* Report location of this query for error context callback */
+		callback_arg.stmt_location = parsetree->stmt_location;
+		callback_arg.stmt_len = parsetree->stmt_len;
+
 		/*
 		 * We do the work for each parsetree in a short-lived context, to
 		 * limit the memory used when there are many commands in the string.
@@ -778,6 +860,8 @@ execute_sql_string(const char *sql)
 		MemoryContextDelete(per_parsetree_context);
 	}
 
+	error_context_stack = scripterrcontext.previous;
+
 	/* Be sure to advance the command counter after the last script command */
 	CommandCounterIncrement();
 }
@@ -1054,7 +1138,7 @@ execute_extension_script(Oid extensionOid, ExtensionControlFile *control,
 		/* And now back to C string */
 		c_sql = text_to_cstring(DatumGetTextPP(t_sql));
 
-		execute_sql_string(c_sql);
+		execute_sql_string(c_sql, filename);
 	}
 	PG_FINALLY();
 	{
diff --git a/src/test/modules/test_extensions/expected/test_extensions.out b/src/test/modules/test_extensions/expected/test_extensions.out
index f357cc21aa..b6370b3b4c 100644
--- a/src/test/modules/test_extensions/expected/test_extensions.out
+++ b/src/test/modules/test_extensions/expected/test_extensions.out
@@ -295,6 +295,16 @@ CREATE FUNCTION ext_cor_func() RETURNS text
 CREATE EXTENSION test_ext_cor;  -- fail
 ERROR:  function ext_cor_func() is not a member of extension "test_ext_cor"
 DETAIL:  An extension is not allowed to replace an object that it does not own.
+QUERY:  /* src/test/modules/test_extensions/test_ext_cor--1.0.sql */
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+
+
+-- It's generally bad style to use CREATE OR REPLACE unnecessarily.
+-- Test what happens if an extension does it anyway.
+
+CREATE OR REPLACE FUNCTION ext_cor_func() RETURNS text
+  AS $$ SELECT 'ext_cor_func: from extension'::text $$ LANGUAGE sql
+CONTEXT:  extension script file "test_ext_cor--1.0.sql"
 SELECT ext_cor_func();
       ext_cor_func      
 ------------------------
@@ -307,6 +317,9 @@ CREATE VIEW ext_cor_view AS
 CREATE EXTENSION test_ext_cor;  -- fail
 ERROR:  view ext_cor_view is not a member of extension "test_ext_cor"
 DETAIL:  An extension is not allowed to replace an object that it does not own.
+QUERY:  CREATE OR REPLACE VIEW ext_cor_view AS
+  SELECT 'ext_cor_view: from extension'::text AS col
+CONTEXT:  extension script file "test_ext_cor--1.0.sql"
 SELECT ext_cor_func();
 ERROR:  function ext_cor_func() does not exist
 LINE 1: SELECT ext_cor_func();
@@ -323,6 +336,11 @@ CREATE TYPE test_ext_type;
 CREATE EXTENSION test_ext_cor;  -- fail
 ERROR:  type test_ext_type is not a member of extension "test_ext_cor"
 DETAIL:  An extension is not allowed to replace an object that it does not own.
+QUERY:  -- These are for testing replacement of a shell type/operator, which works
+-- enough like an implicit OR REPLACE to be important to check.
+
+CREATE TYPE test_ext_type AS ENUM('x', 'y')
+CONTEXT:  extension script file "test_ext_cor--1.0.sql"
 DROP TYPE test_ext_type;
 -- this makes a shell "point <<@@ polygon" operator too
 CREATE OPERATOR @@>> ( PROCEDURE = poly_contain_pt,
@@ -331,6 +349,9 @@ CREATE OPERATOR @@>> ( PROCEDURE = poly_contain_pt,
 CREATE EXTENSION test_ext_cor;  -- fail
 ERROR:  operator <<@@(point,polygon) is not a member of extension "test_ext_cor"
 DETAIL:  An extension is not allowed to replace an object that it does not own.
+QUERY:  CREATE OPERATOR <<@@ ( PROCEDURE = pt_contained_poly,
+  LEFTARG = point, RIGHTARG = polygon )
+CONTEXT:  extension script file "test_ext_cor--1.0.sql"
 DROP OPERATOR <<@@ (point, polygon);
 CREATE EXTENSION test_ext_cor;  -- now it should work
 SELECT ext_cor_func();
@@ -379,37 +400,61 @@ CREATE COLLATION ext_cine_coll
 CREATE EXTENSION test_ext_cine;  -- fail
 ERROR:  collation ext_cine_coll is not a member of extension "test_ext_cine"
 DETAIL:  An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one that it already owns.
+QUERY:  /* src/test/modules/test_extensions/test_ext_cine--1.0.sql */
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+
+
+--
+-- CREATE IF NOT EXISTS is an entirely unsound thing for an extension
+-- to be doing, but let's at least plug the major security hole in it.
+--
+
+CREATE COLLATION IF NOT EXISTS ext_cine_coll
+  ( LC_COLLATE = "POSIX", LC_CTYPE = "POSIX" )
+CONTEXT:  extension script file "test_ext_cine--1.0.sql"
 DROP COLLATION ext_cine_coll;
 CREATE MATERIALIZED VIEW ext_cine_mv AS SELECT 11 AS f1;
 CREATE EXTENSION test_ext_cine;  -- fail
 ERROR:  materialized view ext_cine_mv is not a member of extension "test_ext_cine"
 DETAIL:  An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one that it already owns.
+QUERY:  CREATE MATERIALIZED VIEW IF NOT EXISTS ext_cine_mv AS SELECT 42 AS f1
+CONTEXT:  extension script file "test_ext_cine--1.0.sql"
 DROP MATERIALIZED VIEW ext_cine_mv;
 CREATE FOREIGN DATA WRAPPER dummy;
 CREATE SERVER ext_cine_srv FOREIGN DATA WRAPPER dummy;
 CREATE EXTENSION test_ext_cine;  -- fail
 ERROR:  server ext_cine_srv is not a member of extension "test_ext_cine"
 DETAIL:  An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one that it already owns.
+QUERY:  CREATE SERVER IF NOT EXISTS ext_cine_srv FOREIGN DATA WRAPPER ext_cine_fdw
+CONTEXT:  extension script file "test_ext_cine--1.0.sql"
 DROP SERVER ext_cine_srv;
 CREATE SCHEMA ext_cine_schema;
 CREATE EXTENSION test_ext_cine;  -- fail
 ERROR:  schema ext_cine_schema is not a member of extension "test_ext_cine"
 DETAIL:  An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one that it already owns.
+QUERY:  CREATE SCHEMA IF NOT EXISTS ext_cine_schema
+CONTEXT:  extension script file "test_ext_cine--1.0.sql"
 DROP SCHEMA ext_cine_schema;
 CREATE SEQUENCE ext_cine_seq;
 CREATE EXTENSION test_ext_cine;  -- fail
 ERROR:  sequence ext_cine_seq is not a member of extension "test_ext_cine"
 DETAIL:  An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one that it already owns.
+QUERY:  CREATE SEQUENCE IF NOT EXISTS ext_cine_seq
+CONTEXT:  extension script file "test_ext_cine--1.0.sql"
 DROP SEQUENCE ext_cine_seq;
 CREATE TABLE ext_cine_tab1 (x int);
 CREATE EXTENSION test_ext_cine;  -- fail
 ERROR:  table ext_cine_tab1 is not a member of extension "test_ext_cine"
 DETAIL:  An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one that it already owns.
+QUERY:  CREATE TABLE IF NOT EXISTS ext_cine_tab1 (x int)
+CONTEXT:  extension script file "test_ext_cine--1.0.sql"
 DROP TABLE ext_cine_tab1;
 CREATE TABLE ext_cine_tab2 AS SELECT 42 AS y;
 CREATE EXTENSION test_ext_cine;  -- fail
 ERROR:  table ext_cine_tab2 is not a member of extension "test_ext_cine"
 DETAIL:  An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one that it already owns.
+QUERY:  CREATE TABLE IF NOT EXISTS ext_cine_tab2 AS SELECT 42 AS y
+CONTEXT:  extension script file "test_ext_cine--1.0.sql"
 DROP TABLE ext_cine_tab2;
 CREATE EXTENSION test_ext_cine;
 \dx+ test_ext_cine
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index b6135f0347..9620d3e9d2 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -3882,6 +3882,7 @@ saophash_hash
 save_buffer
 scram_state
 scram_state_enum
+script_error_callback_arg
 security_class_t
 sem_t
 sepgsql_context_info_t
v2-0002-handle-statement-locations-better.patchtext/x-diff; charset=us-ascii; name=v2-0002-handle-statement-locations-better.patchDownload
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index b1d4642c59..4d5f197a1c 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -67,39 +67,23 @@
 
 
 /*
- * Location tracking support --- simpler than bison's default, since we only
- * want to track the start position not the end position of each nonterminal.
+ * Location tracking support.  Unlike bison's default, we only want
+ * to track the start position not the end position of each nonterminal.
+ * Nonterminals that reduce to empty receive position "-1".
  */
 #define YYLLOC_DEFAULT(Current, Rhs, N) \
 	do { \
-		if ((N) > 0) \
-			(Current) = (Rhs)[1]; \
-		else \
-			(Current) = (-1); \
+		(Current) = (-1); \
+		for (int _i_ = 1; _i_ <= (N); _i_++) \
+		{ \
+			if ((Rhs)[_i_] >= 0) \
+			{ \
+				(Current) = (Rhs)[_i_]; \
+				break; \
+			} \
+		} \
 	} while (0)
 
-/*
- * The above macro assigns -1 (unknown) as the parse location of any
- * nonterminal that was reduced from an empty rule, or whose leftmost
- * component was reduced from an empty rule.  This is problematic
- * for nonterminals defined like
- *		OptFooList: / * EMPTY * / { ... } | OptFooList Foo { ... } ;
- * because we'll set -1 as the location during the first reduction and then
- * copy it during each subsequent reduction, leaving us with -1 for the
- * location even when the list is not empty.  To fix that, do this in the
- * action for the nonempty rule(s):
- *		if (@$ < 0) @$ = @2;
- * (Although we have many nonterminals that follow this pattern, we only
- * bother with fixing @$ like this when the nonterminal's parse location
- * is actually referenced in some rule.)
- *
- * A cleaner answer would be to make YYLLOC_DEFAULT scan all the Rhs
- * locations until it's found one that's not -1.  Then we'd get a correct
- * location for any nonterminal that isn't entirely empty.  But this way
- * would add overhead to every rule reduction, and so far there's not been
- * a compelling reason to pay that overhead.
- */
-
 /*
  * Bison doesn't allocate anything that needs to live across parser calls,
  * so we can easily have it use palloc instead of malloc.  This prevents
@@ -930,7 +914,7 @@ parse_toplevel:
 			| MODE_PLPGSQL_EXPR PLpgSQL_Expr
 			{
 				pg_yyget_extra(yyscanner)->parsetree =
-					list_make1(makeRawStmt($2, 0));
+					list_make1(makeRawStmt($2, @2));
 			}
 			| MODE_PLPGSQL_ASSIGN1 PLAssignStmt
 			{
@@ -938,7 +922,7 @@ parse_toplevel:
 
 				n->nnames = 1;
 				pg_yyget_extra(yyscanner)->parsetree =
-					list_make1(makeRawStmt((Node *) n, 0));
+					list_make1(makeRawStmt((Node *) n, @2));
 			}
 			| MODE_PLPGSQL_ASSIGN2 PLAssignStmt
 			{
@@ -946,7 +930,7 @@ parse_toplevel:
 
 				n->nnames = 2;
 				pg_yyget_extra(yyscanner)->parsetree =
-					list_make1(makeRawStmt((Node *) n, 0));
+					list_make1(makeRawStmt((Node *) n, @2));
 			}
 			| MODE_PLPGSQL_ASSIGN3 PLAssignStmt
 			{
@@ -954,18 +938,13 @@ parse_toplevel:
 
 				n->nnames = 3;
 				pg_yyget_extra(yyscanner)->parsetree =
-					list_make1(makeRawStmt((Node *) n, 0));
+					list_make1(makeRawStmt((Node *) n, @2));
 			}
 		;
 
 /*
  * At top level, we wrap each stmt with a RawStmt node carrying start location
- * and length of the stmt's text.  Notice that the start loc/len are driven
- * entirely from semicolon locations (@2).  It would seem natural to use
- * @1 or @3 to get the true start location of a stmt, but that doesn't work
- * for statements that can start with empty nonterminals (opt_with_clause is
- * the main offender here); as noted in the comments for YYLLOC_DEFAULT,
- * we'd get -1 for the location in such cases.
+ * and length of the stmt's text.
  * We also take care to discard empty statements entirely.
  */
 stmtmulti:	stmtmulti ';' toplevel_stmt
@@ -976,14 +955,14 @@ stmtmulti:	stmtmulti ';' toplevel_stmt
 						updateRawStmtEnd(llast_node(RawStmt, $1), @2);
 					}
 					if ($3 != NULL)
-						$$ = lappend($1, makeRawStmt($3, @2 + 1));
+						$$ = lappend($1, makeRawStmt($3, @3));
 					else
 						$$ = $1;
 				}
 			| toplevel_stmt
 				{
 					if ($1 != NULL)
-						$$ = list_make1(makeRawStmt($1, 0));
+						$$ = list_make1(makeRawStmt($1, @1));
 					else
 						$$ = NIL;
 				}
@@ -1584,8 +1563,6 @@ CreateSchemaStmt:
 OptSchemaEltList:
 			OptSchemaEltList schema_stmt
 				{
-					if (@$ < 0)			/* see comments for YYLLOC_DEFAULT */
-						@$ = @2;
 					$$ = lappend($1, $2);
 				}
 			| /* EMPTY */
diff --git a/src/test/modules/test_extensions/expected/test_extensions.out b/src/test/modules/test_extensions/expected/test_extensions.out
index b6370b3b4c..f77760d58e 100644
--- a/src/test/modules/test_extensions/expected/test_extensions.out
+++ b/src/test/modules/test_extensions/expected/test_extensions.out
@@ -295,14 +295,7 @@ CREATE FUNCTION ext_cor_func() RETURNS text
 CREATE EXTENSION test_ext_cor;  -- fail
 ERROR:  function ext_cor_func() is not a member of extension "test_ext_cor"
 DETAIL:  An extension is not allowed to replace an object that it does not own.
-QUERY:  /* src/test/modules/test_extensions/test_ext_cor--1.0.sql */
--- complain if script is sourced in psql, rather than via CREATE EXTENSION
-
-
--- It's generally bad style to use CREATE OR REPLACE unnecessarily.
--- Test what happens if an extension does it anyway.
-
-CREATE OR REPLACE FUNCTION ext_cor_func() RETURNS text
+QUERY:  CREATE OR REPLACE FUNCTION ext_cor_func() RETURNS text
   AS $$ SELECT 'ext_cor_func: from extension'::text $$ LANGUAGE sql
 CONTEXT:  extension script file "test_ext_cor--1.0.sql"
 SELECT ext_cor_func();
@@ -336,10 +329,7 @@ CREATE TYPE test_ext_type;
 CREATE EXTENSION test_ext_cor;  -- fail
 ERROR:  type test_ext_type is not a member of extension "test_ext_cor"
 DETAIL:  An extension is not allowed to replace an object that it does not own.
-QUERY:  -- These are for testing replacement of a shell type/operator, which works
--- enough like an implicit OR REPLACE to be important to check.
-
-CREATE TYPE test_ext_type AS ENUM('x', 'y')
+QUERY:  CREATE TYPE test_ext_type AS ENUM('x', 'y')
 CONTEXT:  extension script file "test_ext_cor--1.0.sql"
 DROP TYPE test_ext_type;
 -- this makes a shell "point <<@@ polygon" operator too
@@ -400,16 +390,7 @@ CREATE COLLATION ext_cine_coll
 CREATE EXTENSION test_ext_cine;  -- fail
 ERROR:  collation ext_cine_coll is not a member of extension "test_ext_cine"
 DETAIL:  An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one that it already owns.
-QUERY:  /* src/test/modules/test_extensions/test_ext_cine--1.0.sql */
--- complain if script is sourced in psql, rather than via CREATE EXTENSION
-
-
---
--- CREATE IF NOT EXISTS is an entirely unsound thing for an extension
--- to be doing, but let's at least plug the major security hole in it.
---
-
-CREATE COLLATION IF NOT EXISTS ext_cine_coll
+QUERY:  CREATE COLLATION IF NOT EXISTS ext_cine_coll
   ( LC_COLLATE = "POSIX", LC_CTYPE = "POSIX" )
 CONTEXT:  extension script file "test_ext_cine--1.0.sql"
 DROP COLLATION ext_cine_coll;
#30Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#29)
2 attachment(s)
Re: Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)

I wrote:

It looks like if we did want to suppress that, the right fix is to
make gram.y track statement start locations more honestly, as in
0002 attached (0001 is the same as before).

I did a little bit of further work on this:

* I ran some performance checks and convinced myself that indeed
the more complex definition of YYLLOC_DEFAULT has no measurable
cost compared to the overall runtime of raw_parser(), never mind
later processing. So I see no reason not to go ahead with that
change. I swapped the two patches to make that 0001, and added
a regression test illustrating its effect on pg_stat_statements.
(Without the gram.y change, the added slash-star comment shows
up in the pg_stat_statements output, which is surely odd.)

* I concluded that the best way to report the individual statement
when we're able to do that is to include it in an errcontext()
message, similar to what spi.c's _SPI_error_callback does.
Otherwise it interacts badly if some more-tightly-nested error
context function has already set up an "internal error query",
as for example SQL function errors will do if you enable
check_function_bodies = on.

So here's v3, now with commit messages and regression tests.
I feel like this is out of the proof-of-concept stage and
might now actually be committable. There's still a question
of whether reporting the whole script as the query is OK when
we have a syntax error, but I have no good ideas as to how to
make that terser. And I think you're right that we shouldn't let
perfection stand in the way of making things noticeably better.

regards, tom lane

Attachments:

v3-0001-Improve-parser-s-reporting-of-statement-start-loc.patchtext/x-diff; charset=us-ascii; name*0=v3-0001-Improve-parser-s-reporting-of-statement-start-loc.p; name*1=atchDownload
From f6022f120c6c8e16f33a7b7d220d0f3a8b7ebf4e Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Sat, 28 Sep 2024 13:48:39 -0400
Subject: [PATCH v3 1/2] Improve parser's reporting of statement start
 locations.

Up to now, the parser's reporting of a statement's stmt_location
included any preceding whitespace or comments.  This isn't really
desirable but was done to avoid accounting honestly for nonterminals
that reduce to empty.  It causes problems for pg_stat_statements,
which partially compensates by manually stripping whitespace, but
is not bright enough to strip /*-style comments.  There will be
more problems with an upcoming patch to improve reporting of errors
in extension scripts, so it's time to do something about this.

The thing we have to do to make it work right is to adjust
YYLLOC_DEFAULT to scan the inputs of each production to find the
first one that has a valid location (i.e., did not reduce to
empty).  In theory this adds a little bit of per-reduction overhead,
but in practice it's negligible.  I checked by measuring the time
to run raw_parser() on the contents of information_schema.sql, and
there was basically no change.

Having done that, we can rely on any nonterminal that didn't reduce
to completely empty to have a correct starting location, and we don't
need the kluges the stmtmulti production formerly used.

This should have a side benefit of allowing parse error reports to
include an error position in some cases where they formerly failed to
do so, due to trying to report the position of an empty nonterminal.
I did not go looking for an example though.  The one previously known
case where that could happen (OptSchemaEltList) no longer needs the
kluge it had; but I rather doubt that that was the only case.

Discussion: https://postgr.es/m/ZvV1ClhnbJLCz7Sm@msg.df7cb.de
---
 .../pg_stat_statements/expected/select.out    |  5 +-
 contrib/pg_stat_statements/sql/select.sql     |  3 +-
 src/backend/nodes/queryjumblefuncs.c          |  6 ++
 src/backend/parser/gram.y                     | 66 +++++++------------
 4 files changed, 34 insertions(+), 46 deletions(-)

diff --git a/contrib/pg_stat_statements/expected/select.out b/contrib/pg_stat_statements/expected/select.out
index dd6c756f67..e0e2fa265c 100644
--- a/contrib/pg_stat_statements/expected/select.out
+++ b/contrib/pg_stat_statements/expected/select.out
@@ -19,8 +19,9 @@ SELECT 1 AS "int";
    1
 (1 row)
 
+/* this comment should not appear in the output */
 SELECT 'hello'
-  -- multiline
+  -- but this one will appear
   AS "text";
  text  
 -------
@@ -129,7 +130,7 @@ SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C";
 -------+------+------------------------------------------------------------------------------
      1 |    1 | PREPARE pgss_test (int) AS SELECT $1, $2 LIMIT $3
      4 |    4 | SELECT $1                                                                   +
-       |      |   -- multiline                                                              +
+       |      |   -- but this one will appear                                               +
        |      |   AS "text"
      2 |    2 | SELECT $1 + $2
      3 |    3 | SELECT $1 + $2 + $3 AS "add"
diff --git a/contrib/pg_stat_statements/sql/select.sql b/contrib/pg_stat_statements/sql/select.sql
index eb45cb81ad..e0be58d5e2 100644
--- a/contrib/pg_stat_statements/sql/select.sql
+++ b/contrib/pg_stat_statements/sql/select.sql
@@ -12,8 +12,9 @@ SELECT pg_stat_statements_reset() IS NOT NULL AS t;
 --
 SELECT 1 AS "int";
 
+/* this comment should not appear in the output */
 SELECT 'hello'
-  -- multiline
+  -- but this one will appear
   AS "text";
 
 SELECT 'world' AS "text";
diff --git a/src/backend/nodes/queryjumblefuncs.c b/src/backend/nodes/queryjumblefuncs.c
index 129fb44709..f019f6388d 100644
--- a/src/backend/nodes/queryjumblefuncs.c
+++ b/src/backend/nodes/queryjumblefuncs.c
@@ -89,6 +89,12 @@ CleanQuerytext(const char *query, int *location, int *len)
 	/*
 	 * Discard leading and trailing whitespace, too.  Use scanner_isspace()
 	 * not libc's isspace(), because we want to match the lexer's behavior.
+	 *
+	 * Note: the parser now strips leading comments and whitespace from the
+	 * reported stmt_location, so this first loop will only iterate in the
+	 * unusual case that the location didn't propagate to here.  But the
+	 * statement length will extend to the end-of-string or terminating
+	 * semicolon, so the second loop often does something useful.
 	 */
 	while (query_len > 0 && scanner_isspace(query[0]))
 		query++, query_location++, query_len--;
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index b1d4642c59..c0faacfac0 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -67,39 +67,25 @@
 
 
 /*
- * Location tracking support --- simpler than bison's default, since we only
- * want to track the start position not the end position of each nonterminal.
+ * Location tracking support.  Unlike bison's default, we only want
+ * to track the start position not the end position of each nonterminal.
+ * Nonterminals that reduce to empty receive position "-1".  Since a
+ * production's leading RHS nonterminal(s) may have reduced to empty,
+ * we have to scan to find the first one that's not -1.
  */
 #define YYLLOC_DEFAULT(Current, Rhs, N) \
 	do { \
-		if ((N) > 0) \
-			(Current) = (Rhs)[1]; \
-		else \
-			(Current) = (-1); \
+		(Current) = (-1); \
+		for (int _i = 1; _i <= (N); _i++) \
+		{ \
+			if ((Rhs)[_i] >= 0) \
+			{ \
+				(Current) = (Rhs)[_i]; \
+				break; \
+			} \
+		} \
 	} while (0)
 
-/*
- * The above macro assigns -1 (unknown) as the parse location of any
- * nonterminal that was reduced from an empty rule, or whose leftmost
- * component was reduced from an empty rule.  This is problematic
- * for nonterminals defined like
- *		OptFooList: / * EMPTY * / { ... } | OptFooList Foo { ... } ;
- * because we'll set -1 as the location during the first reduction and then
- * copy it during each subsequent reduction, leaving us with -1 for the
- * location even when the list is not empty.  To fix that, do this in the
- * action for the nonempty rule(s):
- *		if (@$ < 0) @$ = @2;
- * (Although we have many nonterminals that follow this pattern, we only
- * bother with fixing @$ like this when the nonterminal's parse location
- * is actually referenced in some rule.)
- *
- * A cleaner answer would be to make YYLLOC_DEFAULT scan all the Rhs
- * locations until it's found one that's not -1.  Then we'd get a correct
- * location for any nonterminal that isn't entirely empty.  But this way
- * would add overhead to every rule reduction, and so far there's not been
- * a compelling reason to pay that overhead.
- */
-
 /*
  * Bison doesn't allocate anything that needs to live across parser calls,
  * so we can easily have it use palloc instead of malloc.  This prevents
@@ -930,7 +916,7 @@ parse_toplevel:
 			| MODE_PLPGSQL_EXPR PLpgSQL_Expr
 			{
 				pg_yyget_extra(yyscanner)->parsetree =
-					list_make1(makeRawStmt($2, 0));
+					list_make1(makeRawStmt($2, @2));
 			}
 			| MODE_PLPGSQL_ASSIGN1 PLAssignStmt
 			{
@@ -938,7 +924,7 @@ parse_toplevel:
 
 				n->nnames = 1;
 				pg_yyget_extra(yyscanner)->parsetree =
-					list_make1(makeRawStmt((Node *) n, 0));
+					list_make1(makeRawStmt((Node *) n, @2));
 			}
 			| MODE_PLPGSQL_ASSIGN2 PLAssignStmt
 			{
@@ -946,7 +932,7 @@ parse_toplevel:
 
 				n->nnames = 2;
 				pg_yyget_extra(yyscanner)->parsetree =
-					list_make1(makeRawStmt((Node *) n, 0));
+					list_make1(makeRawStmt((Node *) n, @2));
 			}
 			| MODE_PLPGSQL_ASSIGN3 PLAssignStmt
 			{
@@ -954,19 +940,15 @@ parse_toplevel:
 
 				n->nnames = 3;
 				pg_yyget_extra(yyscanner)->parsetree =
-					list_make1(makeRawStmt((Node *) n, 0));
+					list_make1(makeRawStmt((Node *) n, @2));
 			}
 		;
 
 /*
  * At top level, we wrap each stmt with a RawStmt node carrying start location
- * and length of the stmt's text.  Notice that the start loc/len are driven
- * entirely from semicolon locations (@2).  It would seem natural to use
- * @1 or @3 to get the true start location of a stmt, but that doesn't work
- * for statements that can start with empty nonterminals (opt_with_clause is
- * the main offender here); as noted in the comments for YYLLOC_DEFAULT,
- * we'd get -1 for the location in such cases.
- * We also take care to discard empty statements entirely.
+ * and length of the stmt's text.
+ * We also take care to discard empty statements entirely (which among other
+ * things dodges the problem of assigning them a location).
  */
 stmtmulti:	stmtmulti ';' toplevel_stmt
 				{
@@ -976,14 +958,14 @@ stmtmulti:	stmtmulti ';' toplevel_stmt
 						updateRawStmtEnd(llast_node(RawStmt, $1), @2);
 					}
 					if ($3 != NULL)
-						$$ = lappend($1, makeRawStmt($3, @2 + 1));
+						$$ = lappend($1, makeRawStmt($3, @3));
 					else
 						$$ = $1;
 				}
 			| toplevel_stmt
 				{
 					if ($1 != NULL)
-						$$ = list_make1(makeRawStmt($1, 0));
+						$$ = list_make1(makeRawStmt($1, @1));
 					else
 						$$ = NIL;
 				}
@@ -1584,8 +1566,6 @@ CreateSchemaStmt:
 OptSchemaEltList:
 			OptSchemaEltList schema_stmt
 				{
-					if (@$ < 0)			/* see comments for YYLLOC_DEFAULT */
-						@$ = @2;
 					$$ = lappend($1, $2);
 				}
 			| /* EMPTY */
-- 
2.43.5

v3-0002-Improve-reporting-of-errors-in-extension-script-f.patchtext/x-diff; charset=us-ascii; name*0=v3-0002-Improve-reporting-of-errors-in-extension-script-f.p; name*1=atchDownload
From eb0f55f02fc54e4777cf84b92838c4cb9c749855 Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Sat, 28 Sep 2024 14:59:26 -0400
Subject: [PATCH v3 2/2] Improve reporting of errors in extension script files.

Previously, CREATE/ALTER EXTENSION gave basically no useful
context about errors reported while executing script files.
I think the idea was that you could run the same commands
manually to see the error, but that's often quite inconvenient.
Let's improve that.

In the case of a syntax error (or anything reported with an error
cursor), just report the whole script file as the "internal query".
This might be annoyingly long, but it seems more difficult than
it's worth to try to shorten it to something relevant.  There are
interactions with libpq's reportErrorPosition logic to worry about,
for example: it seems like a good idea that the line number that that
code reports match up with the true line number within the script.
Possibly someone will come back to this area later.

Without an error cursor, if we have gotten past raw parsing (which
we probably have), we can report just the current SQL statement as
an item of error context.

In any case also report the script file name as error context,
since it might not be entirely obvious which of a series of
update scripts failed.

The error-context code path is already exercised by some
test_extensions test cases, but add tests for the syntax-error
path.

Discussion: https://postgr.es/m/ZvV1ClhnbJLCz7Sm@msg.df7cb.de
---
 src/backend/commands/extension.c              | 91 ++++++++++++++++++-
 src/test/modules/test_extensions/Makefile     |  4 +-
 .../expected/test_extensions.out              | 52 +++++++++++
 src/test/modules/test_extensions/meson.build  |  2 +
 .../test_extensions/sql/test_extensions.sql   |  4 +
 .../test_ext7--2.0--2.1bad.sql                | 11 +++
 .../test_ext7--2.0--2.2bad.sql                | 13 +++
 src/tools/pgindent/typedefs.list              |  1 +
 8 files changed, 175 insertions(+), 3 deletions(-)
 create mode 100644 src/test/modules/test_extensions/test_ext7--2.0--2.1bad.sql
 create mode 100644 src/test/modules/test_extensions/test_ext7--2.0--2.2bad.sql

diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index fab59ad5f6..ed5ab2c556 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -54,6 +54,7 @@
 #include "funcapi.h"
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
+#include "nodes/queryjumble.h"
 #include "storage/fd.h"
 #include "tcop/utility.h"
 #include "utils/acl.h"
@@ -107,6 +108,17 @@ typedef struct ExtensionVersionInfo
 	struct ExtensionVersionInfo *previous;	/* current best predecessor */
 } ExtensionVersionInfo;
 
+/*
+ * Information for script_error_callback()
+ */
+typedef struct
+{
+	const char *sql;			/* entire script file contents */
+	const char *filename;		/* script file pathname */
+	ParseLoc	stmt_location;	/* current stmt start loc, or -1 if unknown */
+	ParseLoc	stmt_len;		/* length in bytes; 0 means "rest of string" */
+} script_error_callback_arg;
+
 /* Local functions */
 static List *find_update_path(List *evi_list,
 							  ExtensionVersionInfo *evi_start,
@@ -670,9 +682,63 @@ read_extension_script_file(const ExtensionControlFile *control,
 	return dest_str;
 }
 
+/*
+ * error context callback for failures in script-file execution
+ */
+static void
+script_error_callback(void *arg)
+{
+	script_error_callback_arg *callback_arg = (script_error_callback_arg *) arg;
+	int			syntaxerrposition;
+	const char *lastslash;
+
+	/*
+	 * If there is a syntax error position, convert to internal syntax error;
+	 * otherwise report the current query as an item of context stack
+	 */
+	syntaxerrposition = geterrposition();
+	if (syntaxerrposition > 0)
+	{
+		/*
+		 * We must report the whole string because otherwise details such as
+		 * psql's line number report would be wrong.
+		 */
+		errposition(0);
+		internalerrposition(syntaxerrposition);
+		internalerrquery(callback_arg->sql);
+	}
+	else if (callback_arg->stmt_location >= 0)
+	{
+		/*
+		 * Since no syntax cursor will be shown, it's okay and helpful to trim
+		 * the reported query string to just the current statement.
+		 */
+		const char *query = callback_arg->sql;
+		int			location = callback_arg->stmt_location;
+		int			len = callback_arg->stmt_len;
+
+		query = CleanQuerytext(query, &location, &len);
+		errcontext("SQL statement \"%.*s\"", len, query);
+	}
+
+	/*
+	 * Trim the reported file name to remove the path.  We know that
+	 * get_extension_script_filename() inserted a '/', regardless of whether
+	 * we're on Windows.
+	 */
+	lastslash = strrchr(callback_arg->filename, '/');
+	if (lastslash)
+		lastslash++;
+	else
+		lastslash = callback_arg->filename; /* shouldn't happen, but cope */
+	errcontext("extension script file \"%s\"", lastslash);
+}
+
 /*
  * Execute given SQL string.
  *
+ * The filename the string came from is also provided, for error reporting.
+ *
  * Note: it's tempting to just use SPI to execute the string, but that does
  * not work very well.  The really serious problem is that SPI will parse,
  * analyze, and plan the whole string before executing any of it; of course
@@ -682,12 +748,27 @@ read_extension_script_file(const ExtensionControlFile *control,
  * could be very long.
  */
 static void
-execute_sql_string(const char *sql)
+execute_sql_string(const char *sql, const char *filename)
 {
+	script_error_callback_arg callback_arg;
+	ErrorContextCallback scripterrcontext;
 	List	   *raw_parsetree_list;
 	DestReceiver *dest;
 	ListCell   *lc1;
 
+	/*
+	 * Setup error traceback support for ereport().
+	 */
+	callback_arg.sql = sql;
+	callback_arg.filename = filename;
+	callback_arg.stmt_location = -1;
+	callback_arg.stmt_len = -1;
+
+	scripterrcontext.callback = script_error_callback;
+	scripterrcontext.arg = (void *) &callback_arg;
+	scripterrcontext.previous = error_context_stack;
+	error_context_stack = &scripterrcontext;
+
 	/*
 	 * Parse the SQL string into a list of raw parse trees.
 	 */
@@ -709,6 +790,10 @@ execute_sql_string(const char *sql)
 		List	   *stmt_list;
 		ListCell   *lc2;
 
+		/* Report location of this query for error context callback */
+		callback_arg.stmt_location = parsetree->stmt_location;
+		callback_arg.stmt_len = parsetree->stmt_len;
+
 		/*
 		 * We do the work for each parsetree in a short-lived context, to
 		 * limit the memory used when there are many commands in the string.
@@ -778,6 +863,8 @@ execute_sql_string(const char *sql)
 		MemoryContextDelete(per_parsetree_context);
 	}
 
+	error_context_stack = scripterrcontext.previous;
+
 	/* Be sure to advance the command counter after the last script command */
 	CommandCounterIncrement();
 }
@@ -1054,7 +1141,7 @@ execute_extension_script(Oid extensionOid, ExtensionControlFile *control,
 		/* And now back to C string */
 		c_sql = text_to_cstring(DatumGetTextPP(t_sql));
 
-		execute_sql_string(c_sql);
+		execute_sql_string(c_sql, filename);
 	}
 	PG_FINALLY();
 	{
diff --git a/src/test/modules/test_extensions/Makefile b/src/test/modules/test_extensions/Makefile
index 05272e6a40..1dbec14cba 100644
--- a/src/test/modules/test_extensions/Makefile
+++ b/src/test/modules/test_extensions/Makefile
@@ -13,7 +13,9 @@ EXTENSION = test_ext1 test_ext2 test_ext3 test_ext4 test_ext5 test_ext6 \
 
 DATA = test_ext1--1.0.sql test_ext2--1.0.sql test_ext3--1.0.sql \
        test_ext4--1.0.sql test_ext5--1.0.sql test_ext6--1.0.sql \
-       test_ext7--1.0.sql test_ext7--1.0--2.0.sql test_ext8--1.0.sql \
+       test_ext7--1.0.sql test_ext7--1.0--2.0.sql \
+       test_ext7--2.0--2.1bad.sql test_ext7--2.0--2.2bad.sql \
+       test_ext8--1.0.sql \
        test_ext9--1.0.sql \
        test_ext_cine--1.0.sql test_ext_cine--1.0--1.1.sql \
        test_ext_cor--1.0.sql \
diff --git a/src/test/modules/test_extensions/expected/test_extensions.out b/src/test/modules/test_extensions/expected/test_extensions.out
index f357cc21aa..dfe0596888 100644
--- a/src/test/modules/test_extensions/expected/test_extensions.out
+++ b/src/test/modules/test_extensions/expected/test_extensions.out
@@ -72,6 +72,32 @@ Objects in extension "test_ext7"
  type ext7_table2[]
 (4 rows)
 
+-- test reporting of errors in extension scripts
+alter extension test_ext7 update to '2.1bad';
+ERROR:  syntax error at or near "FUNCTIN"
+LINE 7: CREATE FUNCTIN my_erroneous_func(int) RETURNS int LANGUAGE S...
+               ^
+QUERY:  /* src/test/modules/test_extensions/test_ext7--2.0--2.1bad.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+
+
+-- test reporting of a simple syntax error in an extension script
+CREATE FUNCTIN my_erroneous_func(int) RETURNS int LANGUAGE SQL
+AS $$ SELECT $1 + 1 $$;
+
+CREATE FUNCTION my_other_func(int) RETURNS int LANGUAGE SQL
+AS $$ SELECT $1 + 1 $$;
+
+CONTEXT:  extension script file "test_ext7--2.0--2.1bad.sql"
+alter extension test_ext7 update to '2.2bad';
+ERROR:  syntax error at or near ","
+LINE 1:  SELECT $1 + , 1 
+                     ^
+QUERY:   SELECT $1 + , 1 
+CONTEXT:  SQL statement "CREATE FUNCTION my_erroneous_func(int) RETURNS int LANGUAGE SQL
+AS $$ SELECT $1 + , 1 $$"
+extension script file "test_ext7--2.0--2.2bad.sql"
 -- test handling of temp objects created by extensions
 create extension test_ext8;
 -- \dx+ would expose a variable pg_temp_nn schema name, so we can't use it here
@@ -295,6 +321,9 @@ CREATE FUNCTION ext_cor_func() RETURNS text
 CREATE EXTENSION test_ext_cor;  -- fail
 ERROR:  function ext_cor_func() is not a member of extension "test_ext_cor"
 DETAIL:  An extension is not allowed to replace an object that it does not own.
+CONTEXT:  SQL statement "CREATE OR REPLACE FUNCTION ext_cor_func() RETURNS text
+  AS $$ SELECT 'ext_cor_func: from extension'::text $$ LANGUAGE sql"
+extension script file "test_ext_cor--1.0.sql"
 SELECT ext_cor_func();
       ext_cor_func      
 ------------------------
@@ -307,6 +336,9 @@ CREATE VIEW ext_cor_view AS
 CREATE EXTENSION test_ext_cor;  -- fail
 ERROR:  view ext_cor_view is not a member of extension "test_ext_cor"
 DETAIL:  An extension is not allowed to replace an object that it does not own.
+CONTEXT:  SQL statement "CREATE OR REPLACE VIEW ext_cor_view AS
+  SELECT 'ext_cor_view: from extension'::text AS col"
+extension script file "test_ext_cor--1.0.sql"
 SELECT ext_cor_func();
 ERROR:  function ext_cor_func() does not exist
 LINE 1: SELECT ext_cor_func();
@@ -323,6 +355,8 @@ CREATE TYPE test_ext_type;
 CREATE EXTENSION test_ext_cor;  -- fail
 ERROR:  type test_ext_type is not a member of extension "test_ext_cor"
 DETAIL:  An extension is not allowed to replace an object that it does not own.
+CONTEXT:  SQL statement "CREATE TYPE test_ext_type AS ENUM('x', 'y')"
+extension script file "test_ext_cor--1.0.sql"
 DROP TYPE test_ext_type;
 -- this makes a shell "point <<@@ polygon" operator too
 CREATE OPERATOR @@>> ( PROCEDURE = poly_contain_pt,
@@ -331,6 +365,9 @@ CREATE OPERATOR @@>> ( PROCEDURE = poly_contain_pt,
 CREATE EXTENSION test_ext_cor;  -- fail
 ERROR:  operator <<@@(point,polygon) is not a member of extension "test_ext_cor"
 DETAIL:  An extension is not allowed to replace an object that it does not own.
+CONTEXT:  SQL statement "CREATE OPERATOR <<@@ ( PROCEDURE = pt_contained_poly,
+  LEFTARG = point, RIGHTARG = polygon )"
+extension script file "test_ext_cor--1.0.sql"
 DROP OPERATOR <<@@ (point, polygon);
 CREATE EXTENSION test_ext_cor;  -- now it should work
 SELECT ext_cor_func();
@@ -379,37 +416,52 @@ CREATE COLLATION ext_cine_coll
 CREATE EXTENSION test_ext_cine;  -- fail
 ERROR:  collation ext_cine_coll is not a member of extension "test_ext_cine"
 DETAIL:  An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one that it already owns.
+CONTEXT:  SQL statement "CREATE COLLATION IF NOT EXISTS ext_cine_coll
+  ( LC_COLLATE = "POSIX", LC_CTYPE = "POSIX" )"
+extension script file "test_ext_cine--1.0.sql"
 DROP COLLATION ext_cine_coll;
 CREATE MATERIALIZED VIEW ext_cine_mv AS SELECT 11 AS f1;
 CREATE EXTENSION test_ext_cine;  -- fail
 ERROR:  materialized view ext_cine_mv is not a member of extension "test_ext_cine"
 DETAIL:  An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one that it already owns.
+CONTEXT:  SQL statement "CREATE MATERIALIZED VIEW IF NOT EXISTS ext_cine_mv AS SELECT 42 AS f1"
+extension script file "test_ext_cine--1.0.sql"
 DROP MATERIALIZED VIEW ext_cine_mv;
 CREATE FOREIGN DATA WRAPPER dummy;
 CREATE SERVER ext_cine_srv FOREIGN DATA WRAPPER dummy;
 CREATE EXTENSION test_ext_cine;  -- fail
 ERROR:  server ext_cine_srv is not a member of extension "test_ext_cine"
 DETAIL:  An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one that it already owns.
+CONTEXT:  SQL statement "CREATE SERVER IF NOT EXISTS ext_cine_srv FOREIGN DATA WRAPPER ext_cine_fdw"
+extension script file "test_ext_cine--1.0.sql"
 DROP SERVER ext_cine_srv;
 CREATE SCHEMA ext_cine_schema;
 CREATE EXTENSION test_ext_cine;  -- fail
 ERROR:  schema ext_cine_schema is not a member of extension "test_ext_cine"
 DETAIL:  An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one that it already owns.
+CONTEXT:  SQL statement "CREATE SCHEMA IF NOT EXISTS ext_cine_schema"
+extension script file "test_ext_cine--1.0.sql"
 DROP SCHEMA ext_cine_schema;
 CREATE SEQUENCE ext_cine_seq;
 CREATE EXTENSION test_ext_cine;  -- fail
 ERROR:  sequence ext_cine_seq is not a member of extension "test_ext_cine"
 DETAIL:  An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one that it already owns.
+CONTEXT:  SQL statement "CREATE SEQUENCE IF NOT EXISTS ext_cine_seq"
+extension script file "test_ext_cine--1.0.sql"
 DROP SEQUENCE ext_cine_seq;
 CREATE TABLE ext_cine_tab1 (x int);
 CREATE EXTENSION test_ext_cine;  -- fail
 ERROR:  table ext_cine_tab1 is not a member of extension "test_ext_cine"
 DETAIL:  An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one that it already owns.
+CONTEXT:  SQL statement "CREATE TABLE IF NOT EXISTS ext_cine_tab1 (x int)"
+extension script file "test_ext_cine--1.0.sql"
 DROP TABLE ext_cine_tab1;
 CREATE TABLE ext_cine_tab2 AS SELECT 42 AS y;
 CREATE EXTENSION test_ext_cine;  -- fail
 ERROR:  table ext_cine_tab2 is not a member of extension "test_ext_cine"
 DETAIL:  An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one that it already owns.
+CONTEXT:  SQL statement "CREATE TABLE IF NOT EXISTS ext_cine_tab2 AS SELECT 42 AS y"
+extension script file "test_ext_cine--1.0.sql"
 DROP TABLE ext_cine_tab2;
 CREATE EXTENSION test_ext_cine;
 \dx+ test_ext_cine
diff --git a/src/test/modules/test_extensions/meson.build b/src/test/modules/test_extensions/meson.build
index c5f3424da5..2b31cbf425 100644
--- a/src/test/modules/test_extensions/meson.build
+++ b/src/test/modules/test_extensions/meson.build
@@ -15,6 +15,8 @@ test_install_data += files(
   'test_ext6.control',
   'test_ext7--1.0--2.0.sql',
   'test_ext7--1.0.sql',
+  'test_ext7--2.0--2.1bad.sql',
+  'test_ext7--2.0--2.2bad.sql',
   'test_ext7.control',
   'test_ext8--1.0.sql',
   'test_ext8.control',
diff --git a/src/test/modules/test_extensions/sql/test_extensions.sql b/src/test/modules/test_extensions/sql/test_extensions.sql
index 642c82ff5d..b5878f6f80 100644
--- a/src/test/modules/test_extensions/sql/test_extensions.sql
+++ b/src/test/modules/test_extensions/sql/test_extensions.sql
@@ -28,6 +28,10 @@ create extension test_ext7;
 alter extension test_ext7 update to '2.0';
 \dx+ test_ext7
 
+-- test reporting of errors in extension scripts
+alter extension test_ext7 update to '2.1bad';
+alter extension test_ext7 update to '2.2bad';
+
 -- test handling of temp objects created by extensions
 create extension test_ext8;
 
diff --git a/src/test/modules/test_extensions/test_ext7--2.0--2.1bad.sql b/src/test/modules/test_extensions/test_ext7--2.0--2.1bad.sql
new file mode 100644
index 0000000000..910b1d4f38
--- /dev/null
+++ b/src/test/modules/test_extensions/test_ext7--2.0--2.1bad.sql
@@ -0,0 +1,11 @@
+/* src/test/modules/test_extensions/test_ext7--2.0--2.1bad.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION test_ext7 UPDATE TO '2.1bad'" to load this file. \quit
+
+-- test reporting of a simple syntax error in an extension script
+CREATE FUNCTIN my_erroneous_func(int) RETURNS int LANGUAGE SQL
+AS $$ SELECT $1 + 1 $$;
+
+CREATE FUNCTION my_other_func(int) RETURNS int LANGUAGE SQL
+AS $$ SELECT $1 + 1 $$;
diff --git a/src/test/modules/test_extensions/test_ext7--2.0--2.2bad.sql b/src/test/modules/test_extensions/test_ext7--2.0--2.2bad.sql
new file mode 100644
index 0000000000..19aa9bf264
--- /dev/null
+++ b/src/test/modules/test_extensions/test_ext7--2.0--2.2bad.sql
@@ -0,0 +1,13 @@
+/* src/test/modules/test_extensions/test_ext7--2.0--2.2bad.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION test_ext7 UPDATE TO '2.2bad'" to load this file. \quit
+
+-- test reporting of a nested syntax error in an extension script
+SET LOCAL check_function_bodies = on;
+
+CREATE FUNCTION my_erroneous_func(int) RETURNS int LANGUAGE SQL
+AS $$ SELECT $1 + , 1 $$;
+
+CREATE FUNCTION my_other_func(int) RETURNS int LANGUAGE SQL
+AS $$ SELECT $1 + 1 $$;
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 5fabb127d7..964f51f429 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -3883,6 +3883,7 @@ saophash_hash
 save_buffer
 scram_state
 scram_state_enum
+script_error_callback_arg
 security_class_t
 sem_t
 sepgsql_context_info_t
-- 
2.43.5

#31Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#30)
2 attachment(s)
Re: Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)

I wrote:

... There's still a question
of whether reporting the whole script as the query is OK when
we have a syntax error, but I have no good ideas as to how to
make that terser.

I had an idea about this: we can use a pretty simple heuristic
such as "break at semicolon-newline sequences". That could fail
and show you just a fragment of a statement, but that still seems
better than showing a whole extension script. We can ameliorate
the problem that we might not show enough to clearly identify
what failed by including a separate line number counter.
In the attached v4 I included that in the context line that
reports the script file, eg

+CONTEXT:  SQL statement "CREATE OR REPLACE FUNCTION ext_cor_func() RETURNS text
+  AS $$ SELECT 'ext_cor_func: from extension'::text $$ LANGUAGE sql"
+extension script file "test_ext_cor--1.0.sql", near line 8

This way seems a whole lot more usable when dealing with a
large extension script.

regards, tom lane

Attachments:

v4-0001-Improve-parser-s-reporting-of-statement-start-loc.patchtext/x-diff; charset=us-ascii; name*0=v4-0001-Improve-parser-s-reporting-of-statement-start-loc.p; name*1=atchDownload
From 6a8e6f66bd1abdb3c52b543651eec50ed4ceb071 Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Tue, 8 Oct 2024 14:36:15 -0400
Subject: [PATCH v4 1/2] Improve parser's reporting of statement start
 locations.

Up to now, the parser's reporting of a statement's stmt_location
included any preceding whitespace or comments.  This isn't really
desirable but was done to avoid accounting honestly for nonterminals
that reduce to empty.  It causes problems for pg_stat_statements,
which partially compensates by manually stripping whitespace, but
is not bright enough to strip /*-style comments.  There will be
more problems with an upcoming patch to improve reporting of errors
in extension scripts, so it's time to do something about this.

The thing we have to do to make it work right is to adjust
YYLLOC_DEFAULT to scan the inputs of each production to find the
first one that has a valid location (i.e., did not reduce to
empty).  In theory this adds a little bit of per-reduction overhead,
but in practice it's negligible.  I checked by measuring the time
to run raw_parser() on the contents of information_schema.sql, and
there was basically no change.

Having done that, we can rely on any nonterminal that didn't reduce
to completely empty to have a correct starting location, and we don't
need the kluges the stmtmulti production formerly used.

This should have a side benefit of allowing parse error reports to
include an error position in some cases where they formerly failed to
do so, due to trying to report the position of an empty nonterminal.
I did not go looking for an example though.  The one previously known
case where that could happen (OptSchemaEltList) no longer needs the
kluge it had; but I rather doubt that that was the only case.

Discussion: https://postgr.es/m/ZvV1ClhnbJLCz7Sm@msg.df7cb.de
---
 .../pg_stat_statements/expected/select.out    |  5 +-
 contrib/pg_stat_statements/sql/select.sql     |  3 +-
 src/backend/nodes/queryjumblefuncs.c          |  6 ++
 src/backend/parser/gram.y                     | 66 +++++++------------
 4 files changed, 34 insertions(+), 46 deletions(-)

diff --git a/contrib/pg_stat_statements/expected/select.out b/contrib/pg_stat_statements/expected/select.out
index dd6c756f67..e0e2fa265c 100644
--- a/contrib/pg_stat_statements/expected/select.out
+++ b/contrib/pg_stat_statements/expected/select.out
@@ -19,8 +19,9 @@ SELECT 1 AS "int";
    1
 (1 row)
 
+/* this comment should not appear in the output */
 SELECT 'hello'
-  -- multiline
+  -- but this one will appear
   AS "text";
  text  
 -------
@@ -129,7 +130,7 @@ SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C";
 -------+------+------------------------------------------------------------------------------
      1 |    1 | PREPARE pgss_test (int) AS SELECT $1, $2 LIMIT $3
      4 |    4 | SELECT $1                                                                   +
-       |      |   -- multiline                                                              +
+       |      |   -- but this one will appear                                               +
        |      |   AS "text"
      2 |    2 | SELECT $1 + $2
      3 |    3 | SELECT $1 + $2 + $3 AS "add"
diff --git a/contrib/pg_stat_statements/sql/select.sql b/contrib/pg_stat_statements/sql/select.sql
index eb45cb81ad..e0be58d5e2 100644
--- a/contrib/pg_stat_statements/sql/select.sql
+++ b/contrib/pg_stat_statements/sql/select.sql
@@ -12,8 +12,9 @@ SELECT pg_stat_statements_reset() IS NOT NULL AS t;
 --
 SELECT 1 AS "int";
 
+/* this comment should not appear in the output */
 SELECT 'hello'
-  -- multiline
+  -- but this one will appear
   AS "text";
 
 SELECT 'world' AS "text";
diff --git a/src/backend/nodes/queryjumblefuncs.c b/src/backend/nodes/queryjumblefuncs.c
index 5e43fd9229..e8bf95690b 100644
--- a/src/backend/nodes/queryjumblefuncs.c
+++ b/src/backend/nodes/queryjumblefuncs.c
@@ -90,6 +90,12 @@ CleanQuerytext(const char *query, int *location, int *len)
 	/*
 	 * Discard leading and trailing whitespace, too.  Use scanner_isspace()
 	 * not libc's isspace(), because we want to match the lexer's behavior.
+	 *
+	 * Note: the parser now strips leading comments and whitespace from the
+	 * reported stmt_location, so this first loop will only iterate in the
+	 * unusual case that the location didn't propagate to here.  But the
+	 * statement length will extend to the end-of-string or terminating
+	 * semicolon, so the second loop often does something useful.
 	 */
 	while (query_len > 0 && scanner_isspace(query[0]))
 		query++, query_location++, query_len--;
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 4aa8646af7..4bab2117d9 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -67,39 +67,25 @@
 
 
 /*
- * Location tracking support --- simpler than bison's default, since we only
- * want to track the start position not the end position of each nonterminal.
+ * Location tracking support.  Unlike bison's default, we only want
+ * to track the start position not the end position of each nonterminal.
+ * Nonterminals that reduce to empty receive position "-1".  Since a
+ * production's leading RHS nonterminal(s) may have reduced to empty,
+ * we have to scan to find the first one that's not -1.
  */
 #define YYLLOC_DEFAULT(Current, Rhs, N) \
 	do { \
-		if ((N) > 0) \
-			(Current) = (Rhs)[1]; \
-		else \
-			(Current) = (-1); \
+		(Current) = (-1); \
+		for (int _i = 1; _i <= (N); _i++) \
+		{ \
+			if ((Rhs)[_i] >= 0) \
+			{ \
+				(Current) = (Rhs)[_i]; \
+				break; \
+			} \
+		} \
 	} while (0)
 
-/*
- * The above macro assigns -1 (unknown) as the parse location of any
- * nonterminal that was reduced from an empty rule, or whose leftmost
- * component was reduced from an empty rule.  This is problematic
- * for nonterminals defined like
- *		OptFooList: / * EMPTY * / { ... } | OptFooList Foo { ... } ;
- * because we'll set -1 as the location during the first reduction and then
- * copy it during each subsequent reduction, leaving us with -1 for the
- * location even when the list is not empty.  To fix that, do this in the
- * action for the nonempty rule(s):
- *		if (@$ < 0) @$ = @2;
- * (Although we have many nonterminals that follow this pattern, we only
- * bother with fixing @$ like this when the nonterminal's parse location
- * is actually referenced in some rule.)
- *
- * A cleaner answer would be to make YYLLOC_DEFAULT scan all the Rhs
- * locations until it's found one that's not -1.  Then we'd get a correct
- * location for any nonterminal that isn't entirely empty.  But this way
- * would add overhead to every rule reduction, and so far there's not been
- * a compelling reason to pay that overhead.
- */
-
 /*
  * Bison doesn't allocate anything that needs to live across parser calls,
  * so we can easily have it use palloc instead of malloc.  This prevents
@@ -930,7 +916,7 @@ parse_toplevel:
 			| MODE_PLPGSQL_EXPR PLpgSQL_Expr
 			{
 				pg_yyget_extra(yyscanner)->parsetree =
-					list_make1(makeRawStmt($2, 0));
+					list_make1(makeRawStmt($2, @2));
 			}
 			| MODE_PLPGSQL_ASSIGN1 PLAssignStmt
 			{
@@ -938,7 +924,7 @@ parse_toplevel:
 
 				n->nnames = 1;
 				pg_yyget_extra(yyscanner)->parsetree =
-					list_make1(makeRawStmt((Node *) n, 0));
+					list_make1(makeRawStmt((Node *) n, @2));
 			}
 			| MODE_PLPGSQL_ASSIGN2 PLAssignStmt
 			{
@@ -946,7 +932,7 @@ parse_toplevel:
 
 				n->nnames = 2;
 				pg_yyget_extra(yyscanner)->parsetree =
-					list_make1(makeRawStmt((Node *) n, 0));
+					list_make1(makeRawStmt((Node *) n, @2));
 			}
 			| MODE_PLPGSQL_ASSIGN3 PLAssignStmt
 			{
@@ -954,19 +940,15 @@ parse_toplevel:
 
 				n->nnames = 3;
 				pg_yyget_extra(yyscanner)->parsetree =
-					list_make1(makeRawStmt((Node *) n, 0));
+					list_make1(makeRawStmt((Node *) n, @2));
 			}
 		;
 
 /*
  * At top level, we wrap each stmt with a RawStmt node carrying start location
- * and length of the stmt's text.  Notice that the start loc/len are driven
- * entirely from semicolon locations (@2).  It would seem natural to use
- * @1 or @3 to get the true start location of a stmt, but that doesn't work
- * for statements that can start with empty nonterminals (opt_with_clause is
- * the main offender here); as noted in the comments for YYLLOC_DEFAULT,
- * we'd get -1 for the location in such cases.
- * We also take care to discard empty statements entirely.
+ * and length of the stmt's text.
+ * We also take care to discard empty statements entirely (which among other
+ * things dodges the problem of assigning them a location).
  */
 stmtmulti:	stmtmulti ';' toplevel_stmt
 				{
@@ -976,14 +958,14 @@ stmtmulti:	stmtmulti ';' toplevel_stmt
 						updateRawStmtEnd(llast_node(RawStmt, $1), @2);
 					}
 					if ($3 != NULL)
-						$$ = lappend($1, makeRawStmt($3, @2 + 1));
+						$$ = lappend($1, makeRawStmt($3, @3));
 					else
 						$$ = $1;
 				}
 			| toplevel_stmt
 				{
 					if ($1 != NULL)
-						$$ = list_make1(makeRawStmt($1, 0));
+						$$ = list_make1(makeRawStmt($1, @1));
 					else
 						$$ = NIL;
 				}
@@ -1584,8 +1566,6 @@ CreateSchemaStmt:
 OptSchemaEltList:
 			OptSchemaEltList schema_stmt
 				{
-					if (@$ < 0)			/* see comments for YYLLOC_DEFAULT */
-						@$ = @2;
 					$$ = lappend($1, $2);
 				}
 			| /* EMPTY */
-- 
2.43.5

v4-0002-Improve-reporting-of-errors-in-extension-script-f.patchtext/x-diff; charset=us-ascii; name*0=v4-0002-Improve-reporting-of-errors-in-extension-script-f.p; name*1=atchDownload
From ab3e000e063f5291aee1b1b8914d029bdf79b90e Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Tue, 8 Oct 2024 16:02:57 -0400
Subject: [PATCH v4 2/2] Improve reporting of errors in extension script files.

Previously, CREATE/ALTER EXTENSION gave basically no useful
context about errors reported while executing script files.
I think the idea was that you could run the same commands
manually to see the error, but that's often quite inconvenient.
Let's improve that.

If we get an error during raw parsing, we won't have a current
statement identified by a RawStmt node, but we should always get
a syntax error position.  Show the portion of the script from
the last semicolon-newline before the error position to the first
one after it.  There are cases where this might show only a
fragment of a statement, but that should be uncommon, and it
seems better than showing the whole script file.

Without an error cursor, if we have gotten past raw parsing (which
we probably have), we can report just the current SQL statement as
an item of error context.

In any case also report the script file name as error context,
since it might not be entirely obvious which of a series of
update scripts failed.  We can also show an approximate script
line number in case whatever we printed of the query isn't
sufficiently identifiable.

The error-context code path is already exercised by some
test_extensions test cases, but add tests for the syntax-error
path.

Discussion: https://postgr.es/m/ZvV1ClhnbJLCz7Sm@msg.df7cb.de
---
 src/backend/commands/extension.c              | 167 +++++++++++++++++-
 src/test/modules/test_extensions/Makefile     |   4 +-
 .../expected/test_extensions.out              |  42 +++++
 src/test/modules/test_extensions/meson.build  |   2 +
 .../test_extensions/sql/test_extensions.sql   |   4 +
 .../test_ext7--2.0--2.1bad.sql                |  14 ++
 .../test_ext7--2.0--2.2bad.sql                |  13 ++
 src/tools/pgindent/typedefs.list              |   1 +
 8 files changed, 244 insertions(+), 3 deletions(-)
 create mode 100644 src/test/modules/test_extensions/test_ext7--2.0--2.1bad.sql
 create mode 100644 src/test/modules/test_extensions/test_ext7--2.0--2.2bad.sql

diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index fab59ad5f6..86ea9cd9da 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -54,6 +54,7 @@
 #include "funcapi.h"
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
+#include "nodes/queryjumble.h"
 #include "storage/fd.h"
 #include "tcop/utility.h"
 #include "utils/acl.h"
@@ -107,6 +108,17 @@ typedef struct ExtensionVersionInfo
 	struct ExtensionVersionInfo *previous;	/* current best predecessor */
 } ExtensionVersionInfo;
 
+/*
+ * Information for script_error_callback()
+ */
+typedef struct
+{
+	const char *sql;			/* entire script file contents */
+	const char *filename;		/* script file pathname */
+	ParseLoc	stmt_location;	/* current stmt start loc, or -1 if unknown */
+	ParseLoc	stmt_len;		/* length in bytes; 0 means "rest of string" */
+} script_error_callback_arg;
+
 /* Local functions */
 static List *find_update_path(List *evi_list,
 							  ExtensionVersionInfo *evi_start,
@@ -670,9 +682,139 @@ read_extension_script_file(const ExtensionControlFile *control,
 	return dest_str;
 }
 
+/*
+ * error context callback for failures in script-file execution
+ */
+static void
+script_error_callback(void *arg)
+{
+	script_error_callback_arg *callback_arg = (script_error_callback_arg *) arg;
+	const char *query = callback_arg->sql;
+	int			location = callback_arg->stmt_location;
+	int			len = callback_arg->stmt_len;
+	int			syntaxerrposition;
+	const char *lastslash;
+
+	/*
+	 * If there is a syntax error position, convert to internal syntax error;
+	 * otherwise report the current query as an item of context stack.
+	 *
+	 * Note: we'll provide no context except the filename if there's neither
+	 * an error position nor any known current query.  That shouldn't happen
+	 * though: all errors reported during raw parsing should come with an
+	 * error position.
+	 */
+	syntaxerrposition = geterrposition();
+	if (syntaxerrposition > 0)
+	{
+		/*
+		 * If we do not know the bounds of the current statement (as would
+		 * happen for an error occurring during initial raw parsing), we have
+		 * to use a heuristic to decide how much of the script to show.  We'll
+		 * also use the heuristic in the unlikely case that syntaxerrposition
+		 * is outside what we think the statement bounds are.
+		 */
+		if (location < 0 || syntaxerrposition < location ||
+			(len > 0 && syntaxerrposition > location + len))
+		{
+			/*
+			 * Our heuristic is pretty simple: look for semicolon-newline
+			 * sequences, and break at the last one strictly before
+			 * syntaxerrposition and the first one strictly after.  It's
+			 * certainly possible to fool this with semicolon-newline embedded
+			 * in a string literal, but it seems better to do this than to
+			 * show the entire extension script.
+			 */
+			int			slen = strlen(query);
+
+			location = len = 0;
+			for (int loc = 0; loc < slen; loc++)
+			{
+				if (query[loc] != ';')
+					continue;
+				if (query[loc + 1] == '\r')
+					loc++;
+				if (query[loc + 1] == '\n')
+				{
+					int			bkpt = loc + 2;
+
+					if (bkpt < syntaxerrposition)
+						location = bkpt;
+					else if (bkpt > syntaxerrposition)
+					{
+						len = bkpt - location;
+						break;	/* no need to keep searching */
+					}
+				}
+			}
+		}
+
+		/* Trim leading/trailing whitespace, for consistency */
+		query = CleanQuerytext(query, &location, &len);
+
+		/*
+		 * Adjust syntaxerrposition.  It shouldn't be pointing into the
+		 * whitespace we just trimmed, but cope if it is.
+		 */
+		syntaxerrposition -= location;
+		if (syntaxerrposition < 0)
+			syntaxerrposition = 0;
+		else if (syntaxerrposition > len)
+			syntaxerrposition = len;
+
+		/* And report. */
+		errposition(0);
+		internalerrposition(syntaxerrposition);
+		internalerrquery(pnstrdup(query, len));
+	}
+	else if (location >= 0)
+	{
+		/*
+		 * Since no syntax cursor will be shown, it's okay and helpful to trim
+		 * the reported query string to just the current statement.
+		 */
+		query = CleanQuerytext(query, &location, &len);
+		errcontext("SQL statement \"%.*s\"", len, query);
+	}
+
+	/*
+	 * Trim the reported file name to remove the path.  We know that
+	 * get_extension_script_filename() inserted a '/', regardless of whether
+	 * we're on Windows.
+	 */
+	lastslash = strrchr(callback_arg->filename, '/');
+	if (lastslash)
+		lastslash++;
+	else
+		lastslash = callback_arg->filename; /* shouldn't happen, but cope */
+
+	/*
+	 * If we have a location (which, as said above, we really always should)
+	 * then report a line number to aid in localizing problems in big scripts.
+	 */
+	if (location >= 0)
+	{
+		int			linenumber = 1;
+
+		for (query = callback_arg->sql; *query; query++)
+		{
+			if (--location < 0)
+				break;
+			if (*query == '\n')
+				linenumber++;
+		}
+		errcontext("extension script file \"%s\", near line %d",
+				   lastslash, linenumber);
+	}
+	else
+		errcontext("extension script file \"%s\"", lastslash);
+}
+
 /*
  * Execute given SQL string.
  *
+ * The filename the string came from is also provided, for error reporting.
+ *
  * Note: it's tempting to just use SPI to execute the string, but that does
  * not work very well.  The really serious problem is that SPI will parse,
  * analyze, and plan the whole string before executing any of it; of course
@@ -682,12 +824,27 @@ read_extension_script_file(const ExtensionControlFile *control,
  * could be very long.
  */
 static void
-execute_sql_string(const char *sql)
+execute_sql_string(const char *sql, const char *filename)
 {
+	script_error_callback_arg callback_arg;
+	ErrorContextCallback scripterrcontext;
 	List	   *raw_parsetree_list;
 	DestReceiver *dest;
 	ListCell   *lc1;
 
+	/*
+	 * Setup error traceback support for ereport().
+	 */
+	callback_arg.sql = sql;
+	callback_arg.filename = filename;
+	callback_arg.stmt_location = -1;
+	callback_arg.stmt_len = -1;
+
+	scripterrcontext.callback = script_error_callback;
+	scripterrcontext.arg = (void *) &callback_arg;
+	scripterrcontext.previous = error_context_stack;
+	error_context_stack = &scripterrcontext;
+
 	/*
 	 * Parse the SQL string into a list of raw parse trees.
 	 */
@@ -709,6 +866,10 @@ execute_sql_string(const char *sql)
 		List	   *stmt_list;
 		ListCell   *lc2;
 
+		/* Report location of this query for error context callback */
+		callback_arg.stmt_location = parsetree->stmt_location;
+		callback_arg.stmt_len = parsetree->stmt_len;
+
 		/*
 		 * We do the work for each parsetree in a short-lived context, to
 		 * limit the memory used when there are many commands in the string.
@@ -778,6 +939,8 @@ execute_sql_string(const char *sql)
 		MemoryContextDelete(per_parsetree_context);
 	}
 
+	error_context_stack = scripterrcontext.previous;
+
 	/* Be sure to advance the command counter after the last script command */
 	CommandCounterIncrement();
 }
@@ -1054,7 +1217,7 @@ execute_extension_script(Oid extensionOid, ExtensionControlFile *control,
 		/* And now back to C string */
 		c_sql = text_to_cstring(DatumGetTextPP(t_sql));
 
-		execute_sql_string(c_sql);
+		execute_sql_string(c_sql, filename);
 	}
 	PG_FINALLY();
 	{
diff --git a/src/test/modules/test_extensions/Makefile b/src/test/modules/test_extensions/Makefile
index 05272e6a40..1dbec14cba 100644
--- a/src/test/modules/test_extensions/Makefile
+++ b/src/test/modules/test_extensions/Makefile
@@ -13,7 +13,9 @@ EXTENSION = test_ext1 test_ext2 test_ext3 test_ext4 test_ext5 test_ext6 \
 
 DATA = test_ext1--1.0.sql test_ext2--1.0.sql test_ext3--1.0.sql \
        test_ext4--1.0.sql test_ext5--1.0.sql test_ext6--1.0.sql \
-       test_ext7--1.0.sql test_ext7--1.0--2.0.sql test_ext8--1.0.sql \
+       test_ext7--1.0.sql test_ext7--1.0--2.0.sql \
+       test_ext7--2.0--2.1bad.sql test_ext7--2.0--2.2bad.sql \
+       test_ext8--1.0.sql \
        test_ext9--1.0.sql \
        test_ext_cine--1.0.sql test_ext_cine--1.0--1.1.sql \
        test_ext_cor--1.0.sql \
diff --git a/src/test/modules/test_extensions/expected/test_extensions.out b/src/test/modules/test_extensions/expected/test_extensions.out
index f357cc21aa..d5388a1fec 100644
--- a/src/test/modules/test_extensions/expected/test_extensions.out
+++ b/src/test/modules/test_extensions/expected/test_extensions.out
@@ -72,6 +72,22 @@ Objects in extension "test_ext7"
  type ext7_table2[]
 (4 rows)
 
+-- test reporting of errors in extension scripts
+alter extension test_ext7 update to '2.1bad';
+ERROR:  syntax error at or near "FUNCTIN"
+LINE 1: CREATE FUNCTIN my_erroneous_func(int) RETURNS int LANGUAGE S...
+               ^
+QUERY:  CREATE FUNCTIN my_erroneous_func(int) RETURNS int LANGUAGE SQL
+AS $$ SELECT $1 + 1 $$;
+CONTEXT:  extension script file "test_ext7--2.0--2.1bad.sql", near line 10
+alter extension test_ext7 update to '2.2bad';
+ERROR:  syntax error at or near ","
+LINE 1:  SELECT $1 + , 1 
+                     ^
+QUERY:   SELECT $1 + , 1 
+CONTEXT:  SQL statement "CREATE FUNCTION my_erroneous_func(int) RETURNS int LANGUAGE SQL
+AS $$ SELECT $1 + , 1 $$"
+extension script file "test_ext7--2.0--2.2bad.sql", near line 9
 -- test handling of temp objects created by extensions
 create extension test_ext8;
 -- \dx+ would expose a variable pg_temp_nn schema name, so we can't use it here
@@ -295,6 +311,9 @@ CREATE FUNCTION ext_cor_func() RETURNS text
 CREATE EXTENSION test_ext_cor;  -- fail
 ERROR:  function ext_cor_func() is not a member of extension "test_ext_cor"
 DETAIL:  An extension is not allowed to replace an object that it does not own.
+CONTEXT:  SQL statement "CREATE OR REPLACE FUNCTION ext_cor_func() RETURNS text
+  AS $$ SELECT 'ext_cor_func: from extension'::text $$ LANGUAGE sql"
+extension script file "test_ext_cor--1.0.sql", near line 8
 SELECT ext_cor_func();
       ext_cor_func      
 ------------------------
@@ -307,6 +326,9 @@ CREATE VIEW ext_cor_view AS
 CREATE EXTENSION test_ext_cor;  -- fail
 ERROR:  view ext_cor_view is not a member of extension "test_ext_cor"
 DETAIL:  An extension is not allowed to replace an object that it does not own.
+CONTEXT:  SQL statement "CREATE OR REPLACE VIEW ext_cor_view AS
+  SELECT 'ext_cor_view: from extension'::text AS col"
+extension script file "test_ext_cor--1.0.sql", near line 11
 SELECT ext_cor_func();
 ERROR:  function ext_cor_func() does not exist
 LINE 1: SELECT ext_cor_func();
@@ -323,6 +345,8 @@ CREATE TYPE test_ext_type;
 CREATE EXTENSION test_ext_cor;  -- fail
 ERROR:  type test_ext_type is not a member of extension "test_ext_cor"
 DETAIL:  An extension is not allowed to replace an object that it does not own.
+CONTEXT:  SQL statement "CREATE TYPE test_ext_type AS ENUM('x', 'y')"
+extension script file "test_ext_cor--1.0.sql", near line 17
 DROP TYPE test_ext_type;
 -- this makes a shell "point <<@@ polygon" operator too
 CREATE OPERATOR @@>> ( PROCEDURE = poly_contain_pt,
@@ -331,6 +355,9 @@ CREATE OPERATOR @@>> ( PROCEDURE = poly_contain_pt,
 CREATE EXTENSION test_ext_cor;  -- fail
 ERROR:  operator <<@@(point,polygon) is not a member of extension "test_ext_cor"
 DETAIL:  An extension is not allowed to replace an object that it does not own.
+CONTEXT:  SQL statement "CREATE OPERATOR <<@@ ( PROCEDURE = pt_contained_poly,
+  LEFTARG = point, RIGHTARG = polygon )"
+extension script file "test_ext_cor--1.0.sql", near line 19
 DROP OPERATOR <<@@ (point, polygon);
 CREATE EXTENSION test_ext_cor;  -- now it should work
 SELECT ext_cor_func();
@@ -379,37 +406,52 @@ CREATE COLLATION ext_cine_coll
 CREATE EXTENSION test_ext_cine;  -- fail
 ERROR:  collation ext_cine_coll is not a member of extension "test_ext_cine"
 DETAIL:  An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one that it already owns.
+CONTEXT:  SQL statement "CREATE COLLATION IF NOT EXISTS ext_cine_coll
+  ( LC_COLLATE = "POSIX", LC_CTYPE = "POSIX" )"
+extension script file "test_ext_cine--1.0.sql", near line 10
 DROP COLLATION ext_cine_coll;
 CREATE MATERIALIZED VIEW ext_cine_mv AS SELECT 11 AS f1;
 CREATE EXTENSION test_ext_cine;  -- fail
 ERROR:  materialized view ext_cine_mv is not a member of extension "test_ext_cine"
 DETAIL:  An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one that it already owns.
+CONTEXT:  SQL statement "CREATE MATERIALIZED VIEW IF NOT EXISTS ext_cine_mv AS SELECT 42 AS f1"
+extension script file "test_ext_cine--1.0.sql", near line 13
 DROP MATERIALIZED VIEW ext_cine_mv;
 CREATE FOREIGN DATA WRAPPER dummy;
 CREATE SERVER ext_cine_srv FOREIGN DATA WRAPPER dummy;
 CREATE EXTENSION test_ext_cine;  -- fail
 ERROR:  server ext_cine_srv is not a member of extension "test_ext_cine"
 DETAIL:  An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one that it already owns.
+CONTEXT:  SQL statement "CREATE SERVER IF NOT EXISTS ext_cine_srv FOREIGN DATA WRAPPER ext_cine_fdw"
+extension script file "test_ext_cine--1.0.sql", near line 17
 DROP SERVER ext_cine_srv;
 CREATE SCHEMA ext_cine_schema;
 CREATE EXTENSION test_ext_cine;  -- fail
 ERROR:  schema ext_cine_schema is not a member of extension "test_ext_cine"
 DETAIL:  An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one that it already owns.
+CONTEXT:  SQL statement "CREATE SCHEMA IF NOT EXISTS ext_cine_schema"
+extension script file "test_ext_cine--1.0.sql", near line 19
 DROP SCHEMA ext_cine_schema;
 CREATE SEQUENCE ext_cine_seq;
 CREATE EXTENSION test_ext_cine;  -- fail
 ERROR:  sequence ext_cine_seq is not a member of extension "test_ext_cine"
 DETAIL:  An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one that it already owns.
+CONTEXT:  SQL statement "CREATE SEQUENCE IF NOT EXISTS ext_cine_seq"
+extension script file "test_ext_cine--1.0.sql", near line 21
 DROP SEQUENCE ext_cine_seq;
 CREATE TABLE ext_cine_tab1 (x int);
 CREATE EXTENSION test_ext_cine;  -- fail
 ERROR:  table ext_cine_tab1 is not a member of extension "test_ext_cine"
 DETAIL:  An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one that it already owns.
+CONTEXT:  SQL statement "CREATE TABLE IF NOT EXISTS ext_cine_tab1 (x int)"
+extension script file "test_ext_cine--1.0.sql", near line 23
 DROP TABLE ext_cine_tab1;
 CREATE TABLE ext_cine_tab2 AS SELECT 42 AS y;
 CREATE EXTENSION test_ext_cine;  -- fail
 ERROR:  table ext_cine_tab2 is not a member of extension "test_ext_cine"
 DETAIL:  An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one that it already owns.
+CONTEXT:  SQL statement "CREATE TABLE IF NOT EXISTS ext_cine_tab2 AS SELECT 42 AS y"
+extension script file "test_ext_cine--1.0.sql", near line 25
 DROP TABLE ext_cine_tab2;
 CREATE EXTENSION test_ext_cine;
 \dx+ test_ext_cine
diff --git a/src/test/modules/test_extensions/meson.build b/src/test/modules/test_extensions/meson.build
index c5f3424da5..2b31cbf425 100644
--- a/src/test/modules/test_extensions/meson.build
+++ b/src/test/modules/test_extensions/meson.build
@@ -15,6 +15,8 @@ test_install_data += files(
   'test_ext6.control',
   'test_ext7--1.0--2.0.sql',
   'test_ext7--1.0.sql',
+  'test_ext7--2.0--2.1bad.sql',
+  'test_ext7--2.0--2.2bad.sql',
   'test_ext7.control',
   'test_ext8--1.0.sql',
   'test_ext8.control',
diff --git a/src/test/modules/test_extensions/sql/test_extensions.sql b/src/test/modules/test_extensions/sql/test_extensions.sql
index 642c82ff5d..b5878f6f80 100644
--- a/src/test/modules/test_extensions/sql/test_extensions.sql
+++ b/src/test/modules/test_extensions/sql/test_extensions.sql
@@ -28,6 +28,10 @@ create extension test_ext7;
 alter extension test_ext7 update to '2.0';
 \dx+ test_ext7
 
+-- test reporting of errors in extension scripts
+alter extension test_ext7 update to '2.1bad';
+alter extension test_ext7 update to '2.2bad';
+
 -- test handling of temp objects created by extensions
 create extension test_ext8;
 
diff --git a/src/test/modules/test_extensions/test_ext7--2.0--2.1bad.sql b/src/test/modules/test_extensions/test_ext7--2.0--2.1bad.sql
new file mode 100644
index 0000000000..a055a2dbb2
--- /dev/null
+++ b/src/test/modules/test_extensions/test_ext7--2.0--2.1bad.sql
@@ -0,0 +1,14 @@
+/* src/test/modules/test_extensions/test_ext7--2.0--2.1bad.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION test_ext7 UPDATE TO '2.1bad'" to load this file. \quit
+
+-- test reporting of a simple syntax error in an extension script
+CREATE FUNCTION my_okay_func(int) RETURNS int LANGUAGE SQL
+AS $$ SELECT $1 + 1 $$;
+
+CREATE FUNCTIN my_erroneous_func(int) RETURNS int LANGUAGE SQL
+AS $$ SELECT $1 + 1 $$;
+
+CREATE FUNCTION my_other_func(int) RETURNS int LANGUAGE SQL
+AS $$ SELECT $1 + 1 $$;
diff --git a/src/test/modules/test_extensions/test_ext7--2.0--2.2bad.sql b/src/test/modules/test_extensions/test_ext7--2.0--2.2bad.sql
new file mode 100644
index 0000000000..19aa9bf264
--- /dev/null
+++ b/src/test/modules/test_extensions/test_ext7--2.0--2.2bad.sql
@@ -0,0 +1,13 @@
+/* src/test/modules/test_extensions/test_ext7--2.0--2.2bad.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION test_ext7 UPDATE TO '2.2bad'" to load this file. \quit
+
+-- test reporting of a nested syntax error in an extension script
+SET LOCAL check_function_bodies = on;
+
+CREATE FUNCTION my_erroneous_func(int) RETURNS int LANGUAGE SQL
+AS $$ SELECT $1 + , 1 $$;
+
+CREATE FUNCTION my_other_func(int) RETURNS int LANGUAGE SQL
+AS $$ SELECT $1 + 1 $$;
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index a65e1c07c5..b9dde65150 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -3884,6 +3884,7 @@ saophash_hash
 save_buffer
 scram_state
 scram_state_enum
+script_error_callback_arg
 security_class_t
 sem_t
 sepgsql_context_info_t
-- 
2.43.5

#32Pavel Stehule
pavel.stehule@gmail.com
In reply to: Tom Lane (#31)
Re: Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)

Hi

út 8. 10. 2024 v 22:18 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:

I wrote:

... There's still a question
of whether reporting the whole script as the query is OK when
we have a syntax error, but I have no good ideas as to how to
make that terser.

I had an idea about this: we can use a pretty simple heuristic
such as "break at semicolon-newline sequences". That could fail
and show you just a fragment of a statement, but that still seems
better than showing a whole extension script. We can ameliorate
the problem that we might not show enough to clearly identify
what failed by including a separate line number counter.
In the attached v4 I included that in the context line that
reports the script file, eg

+CONTEXT:  SQL statement "CREATE OR REPLACE FUNCTION ext_cor_func()
RETURNS text
+  AS $$ SELECT 'ext_cor_func: from extension'::text $$ LANGUAGE sql"
+extension script file "test_ext_cor--1.0.sql", near line 8

This way seems a whole lot more usable when dealing with a
large extension script.

I tested it and it is working nicely. I tested it against Orafce and I
found an interesting point. The body of plpgsql functions is not checked.

Do you know the reason?

Regards

Pavel

Show quoted text

regards, tom lane

#33Tom Lane
tgl@sss.pgh.pa.us
In reply to: Pavel Stehule (#32)
Re: Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)

Pavel Stehule <pavel.stehule@gmail.com> writes:

I tested it and it is working nicely. I tested it against Orafce and I
found an interesting point. The body of plpgsql functions is not checked.
Do you know the reason?

In execute_extension_script():

/*
* Similarly disable check_function_bodies, to ensure that SQL functions
* won't be parsed during creation.
*/
if (check_function_bodies)
(void) set_config_option("check_function_bodies", "off",
PGC_USERSET, PGC_S_SESSION,
GUC_ACTION_SAVE, true, 0, false);

I wondered if we should reconsider that, but I'm afraid we'd be more
likely to break working extensions than to do anything helpful.
An extension author who wants that can do what I did in the patch's
test cases: manually turn check_function_bodies on in the extension
script.

regards, tom lane

#34Pavel Stehule
pavel.stehule@gmail.com
In reply to: Tom Lane (#33)
Re: Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)

pá 11. 10. 2024 v 18:08 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:

Pavel Stehule <pavel.stehule@gmail.com> writes:

I tested it and it is working nicely. I tested it against Orafce and I
found an interesting point. The body of plpgsql functions is not checked.
Do you know the reason?

In execute_extension_script():

/*
* Similarly disable check_function_bodies, to ensure that SQL
functions
* won't be parsed during creation.
*/
if (check_function_bodies)
(void) set_config_option("check_function_bodies", "off",
PGC_USERSET, PGC_S_SESSION,
GUC_ACTION_SAVE, true, 0, false);

I wondered if we should reconsider that, but I'm afraid we'd be more
likely to break working extensions than to do anything helpful.
An extension author who wants that can do what I did in the patch's
test cases: manually turn check_function_bodies on in the extension
script.

ok,

Pavel

Show quoted text

regards, tom lane

#35jian he
jian.universality@gmail.com
In reply to: Tom Lane (#31)
Re: Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)

On Wed, Oct 9, 2024 at 4:18 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

In the attached v4

in the upper code two branch, both will call CleanQuerytext
so in script_error_callback

+ /*
+ * If we have a location (which, as said above, we really always should)
+ * then report a line number to aid in localizing problems in big scripts.
+ */
+ if (location >= 0)
+ {
+ int linenumber = 1;
+
+ for (query = callback_arg->sql; *query; query++)
+ {
+ if (--location < 0)
+ break;
+ if (*query == '\n')
+ linenumber++;
+ }
+ errcontext("extension script file \"%s\", near line %d",
+   lastslash, linenumber);
+ }
+ else
+ errcontext("extension script file \"%s\"", lastslash);
+ /*
+ * If we have a location (which, as said above, we really always should)
+ * then report a line number to aid in localizing problems in big scripts.
+ */
+ if (location >= 0)
so this part will always be true?
#36Pavel Stehule
pavel.stehule@gmail.com
In reply to: jian he (#35)
Re: Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)

so 12. 10. 2024 v 9:33 odesílatel jian he <jian.universality@gmail.com>
napsal:

On Wed, Oct 9, 2024 at 4:18 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

In the attached v4

in the upper code two branch, both will call CleanQuerytext
so in script_error_callback

+ /*
+ * If we have a location (which, as said above, we really always should)
+ * then report a line number to aid in localizing problems in big scripts.
+ */
+ if (location >= 0)
+ {
+ int linenumber = 1;
+
+ for (query = callback_arg->sql; *query; query++)
+ {
+ if (--location < 0)
+ break;
+ if (*query == '\n')
+ linenumber++;
+ }
+ errcontext("extension script file \"%s\", near line %d",
+   lastslash, linenumber);
+ }
+ else
+ errcontext("extension script file \"%s\"", lastslash);
+ /*
+ * If we have a location (which, as said above, we really always should)
+ * then report a line number to aid in localizing problems in big scripts.
+ */
+ if (location >= 0)
so this part will always be true?

yes, after CleanQuerytext the location should not be -1 ever

Regards

Pavel

#37Tom Lane
tgl@sss.pgh.pa.us
In reply to: Pavel Stehule (#36)
Re: Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)

Pavel Stehule <pavel.stehule@gmail.com> writes:

so 12. 10. 2024 v 9:33 odesílatel jian he <jian.universality@gmail.com>
napsal:

+ /*
+ * If we have a location (which, as said above, we really always should)
+ * then report a line number to aid in localizing problems in big scripts.
+ */
+ if (location >= 0)
so this part will always be true?

yes, after CleanQuerytext the location should not be -1 ever

Right, but we might not have entered either of those previous
if-blocks. The question here is whether the raw parser (gram.y)
ever throws an error that doesn't include a cursor position. IMO it
shouldn't, but a quick look through gram.y finds a few ereports that
lack parser_errposition. We could go fix those, and probably should,
but imagining that none will ever be introduced again seems like
folly.

regards, tom lane

#38jian he
jian.universality@gmail.com
In reply to: Tom Lane (#37)
1 attachment(s)
Re: Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)

On Mon, Oct 14, 2024 at 1:13 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Pavel Stehule <pavel.stehule@gmail.com> writes:

so 12. 10. 2024 v 9:33 odesílatel jian he <jian.universality@gmail.com>
napsal:

+ /*
+ * If we have a location (which, as said above, we really always should)
+ * then report a line number to aid in localizing problems in big scripts.
+ */
+ if (location >= 0)
so this part will always be true?

yes, after CleanQuerytext the location should not be -1 ever

Right, but we might not have entered either of those previous
if-blocks.

in src/backend/parser/gram.y
your makeRawStmt changes (v4) seem to guarantee that
RawStmt->stmt_location >= 0.
other places {DefineView,DoCopy,PrepareQuery} use makeNode(RawStmt),
In these cases, I am not so sure RawStmt->stmt_location >=0 is always true.

in execute_sql_string

raw_parsetree_list = pg_parse_query(sql);
dest = CreateDestReceiver(DestNone);
foreach(lc1, raw_parsetree_list)
{
RawStmt *parsetree = lfirst_node(RawStmt, lc1);
MemoryContext per_parsetree_context,
oldcontext;
List *stmt_list;
ListCell *lc2;
callback_arg.stmt_location = parsetree->stmt_location;
callback_arg.stmt_len = parsetree->stmt_len;
per_parsetree_context =
AllocSetContextCreate(CurrentMemoryContext,
"execute_sql_string per-statement context",
ALLOCSET_DEFAULT_SIZES);
oldcontext = MemoryContextSwitchTo(per_parsetree_context);
CommandCounterIncrement();
stmt_list = pg_analyze_and_rewrite_fixedparams(parsetree,
sql,
NULL,
0,
NULL);

Based on the above code, we do
`callback_arg.stmt_location = parsetree->stmt_location;`
pg_parse_query(sql) doesn't use script_error_callback.

So if we are in script_error_callback
`int location = callback_arg->stmt_location;`
location >= 0 will be always true?

The question here is whether the raw parser (gram.y)
ever throws an error that doesn't include a cursor position. IMO it
shouldn't, but a quick look through gram.y finds a few ereports that
lack parser_errposition. We could go fix those, and probably should,
but imagining that none will ever be introduced again seems like
folly.

I don't know how to add the error position inside the function
insertSelectOptions.
maybe we can add
`parser_errposition(exprLocation(limitClause->limitCount))));`
but limitCount position is a nearby position.
I am also not sure about func mergeTableFuncParameters.

for other places in gram.y, I've added error positions for ereport
that lack it , please check the attached.

Attachments:

v4-0001-add-error-position-in-variable-places-in-gram..no-cfbotapplication/octet-stream; name=v4-0001-add-error-position-in-variable-places-in-gram..no-cfbotDownload
From d4a90372f489100ee655228b6a7e596878c63dec Mon Sep 17 00:00:00 2001
From: jian he <jian.universality@gmail.com>
Date: Mon, 14 Oct 2024 11:20:44 +0800
Subject: [PATCH v4 1/1] add error position in variable places in gram.y

---
 src/backend/parser/gram.y                  | 57 ++++++++++------------
 src/test/regress/expected/create_table.out |  2 +
 src/test/regress/expected/foreign_key.out  |  4 ++
 src/test/regress/expected/sqljson.out      |  4 ++
 src/test/regress/sql/foreign_key.sql       |  1 +
 5 files changed, 37 insertions(+), 31 deletions(-)

diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 4bab2117d9..6c9460adf2 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -198,7 +198,6 @@ static void SplitColQualList(List *qualList,
 static void processCASbits(int cas_bits, int location, const char *constrType,
 			   bool *deferrable, bool *initdeferred, bool *not_valid,
 			   bool *no_inherit, core_yyscan_t yyscanner);
-static PartitionStrategy parsePartitionStrategy(char *strategy);
 static void preprocess_pubobj_list(List *pubobjspec_list,
 								   core_yyscan_t yyscanner);
 static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
@@ -3148,11 +3147,14 @@ PartitionBoundSpec:
 					if (n->modulus == -1)
 						ereport(ERROR,
 								(errcode(ERRCODE_SYNTAX_ERROR),
-								 errmsg("modulus for hash partition must be specified")));
+								 errmsg("modulus for hash partition must be specified"),
+								 parser_errposition(@4)));
+
 					if (n->remainder == -1)
 						ereport(ERROR,
 								(errcode(ERRCODE_SYNTAX_ERROR),
-								 errmsg("remainder for hash partition must be specified")));
+								 errmsg("remainder for hash partition must be specified"),
+								 parser_errposition(@4)));
 
 					n->location = @3;
 
@@ -4530,7 +4532,18 @@ PartitionSpec: PARTITION BY ColId '(' part_params ')'
 				{
 					PartitionSpec *n = makeNode(PartitionSpec);
 
-					n->strategy = parsePartitionStrategy($3);
+					if (!pg_strcasecmp($3, "list"))
+						n->strategy = PARTITION_STRATEGY_LIST;
+					else if (!pg_strcasecmp($3, "range"))
+						n->strategy = PARTITION_STRATEGY_RANGE;
+					else if (!pg_strcasecmp($3, "hash"))
+						n->strategy = PARTITION_STRATEGY_HASH;
+					else
+						ereport(ERROR,
+								(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+								 errmsg("unrecognized partitioning strategy \"%s\"", $3),
+								 parser_errposition(@3)));
+
 					n->partParams = $5;
 					n->location = @1;
 
@@ -5964,7 +5977,8 @@ CreateTrigStmt:
 					if (n->replace) /* not supported, see CreateTrigger */
 						ereport(ERROR,
 								(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-								 errmsg("CREATE OR REPLACE CONSTRAINT TRIGGER is not supported")));
+								 errmsg("CREATE OR REPLACE CONSTRAINT TRIGGER is not supported"),
+								 parser_errposition(@2)));
 					n->isconstraint = true;
 					n->trigname = $5;
 					n->relation = $9;
@@ -6249,7 +6263,8 @@ CreateAssertionStmt:
 				{
 					ereport(ERROR,
 							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-							 errmsg("CREATE ASSERTION is not yet implemented")));
+							 errmsg("CREATE ASSERTION is not yet implemented"),
+							 parser_errposition(@2)));
 
 					$$ = NULL;
 				}
@@ -16929,8 +16944,9 @@ json_format_clause:
 						encoding = JS_ENC_UTF32;
 					else
 						ereport(ERROR,
-								errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-								errmsg("unrecognized JSON encoding: %s", $4));
+								(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+								errmsg("unrecognized JSON encoding: %s", $4),
+								parser_errposition(@4)));
 
 					$$ = (Node *) makeJsonFormat(JS_FORMAT_JSON, encoding, @1);
 				}
@@ -17432,7 +17448,8 @@ PLpgSQL_Expr: opt_distinct_clause opt_target_list
 							$9->limitOption == LIMIT_OPTION_WITH_TIES)
 							ereport(ERROR,
 									(errcode(ERRCODE_SYNTAX_ERROR),
-									 errmsg("WITH TIES cannot be specified without ORDER BY clause")));
+									 errmsg("WITH TIES cannot be specified without ORDER BY clause"),
+									 parser_errposition(@9)));
 						n->limitOption = $9->limitOption;
 					}
 					n->lockingClause = $10;
@@ -19349,28 +19366,6 @@ processCASbits(int cas_bits, int location, const char *constrType,
 	}
 }
 
-/*
- * Parse a user-supplied partition strategy string into parse node
- * PartitionStrategy representation, or die trying.
- */
-static PartitionStrategy
-parsePartitionStrategy(char *strategy)
-{
-	if (pg_strcasecmp(strategy, "list") == 0)
-		return PARTITION_STRATEGY_LIST;
-	else if (pg_strcasecmp(strategy, "range") == 0)
-		return PARTITION_STRATEGY_RANGE;
-	else if (pg_strcasecmp(strategy, "hash") == 0)
-		return PARTITION_STRATEGY_HASH;
-
-	ereport(ERROR,
-			(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-			 errmsg("unrecognized partitioning strategy \"%s\"",
-					strategy)));
-	return PARTITION_STRATEGY_LIST;		/* keep compiler quiet */
-
-}
-
 /*
  * Process pubobjspec_list to check for errors in any of the objects and
  * convert PUBLICATIONOBJ_CONTINUATION into appropriate PublicationObjSpecType.
diff --git a/src/test/regress/expected/create_table.out b/src/test/regress/expected/create_table.out
index c45e02d42f..57a24050ab 100644
--- a/src/test/regress/expected/create_table.out
+++ b/src/test/regress/expected/create_table.out
@@ -198,6 +198,8 @@ CREATE TABLE partitioned (
     a int
 ) PARTITION BY MAGIC (a);
 ERROR:  unrecognized partitioning strategy "magic"
+LINE 3: ) PARTITION BY MAGIC (a);
+                       ^
 -- specified column must be present in the table
 CREATE TABLE partitioned (
 	a int
diff --git a/src/test/regress/expected/foreign_key.out b/src/test/regress/expected/foreign_key.out
index 8c04a24b37..fe2a19e196 100644
--- a/src/test/regress/expected/foreign_key.out
+++ b/src/test/regress/expected/foreign_key.out
@@ -2916,6 +2916,10 @@ $$;
 CREATE TRIGGER trig_upd_pk AFTER UPDATE ON fkpart11.pk FOR EACH ROW EXECUTE FUNCTION fkpart11.print_row();
 CREATE TRIGGER trig_del_pk AFTER DELETE ON fkpart11.pk FOR EACH ROW EXECUTE FUNCTION fkpart11.print_row();
 CREATE TRIGGER trig_ins_pk AFTER INSERT ON fkpart11.pk FOR EACH ROW EXECUTE FUNCTION fkpart11.print_row();
+CREATE OR REPLACE CONSTRAINT TRIGGER trig_upd_fk_parted AFTER UPDATE ON fkpart11.fk_parted INITIALLY DEFERRED FOR EACH ROW EXECUTE FUNCTION fkpart11.print_row();
+ERROR:  CREATE OR REPLACE CONSTRAINT TRIGGER is not supported
+LINE 1: CREATE OR REPLACE CONSTRAINT TRIGGER trig_upd_fk_parted AFTE...
+               ^
 CREATE CONSTRAINT TRIGGER trig_upd_fk_parted AFTER UPDATE ON fkpart11.fk_parted INITIALLY DEFERRED FOR EACH ROW EXECUTE FUNCTION fkpart11.print_row();
 CREATE CONSTRAINT TRIGGER trig_del_fk_parted AFTER DELETE ON fkpart11.fk_parted INITIALLY DEFERRED FOR EACH ROW EXECUTE FUNCTION fkpart11.print_row();
 CREATE CONSTRAINT TRIGGER trig_ins_fk_parted AFTER INSERT ON fkpart11.fk_parted INITIALLY DEFERRED FOR EACH ROW EXECUTE FUNCTION fkpart11.print_row();
diff --git a/src/test/regress/expected/sqljson.out b/src/test/regress/expected/sqljson.out
index 9adfbe15f6..cf89133aff 100644
--- a/src/test/regress/expected/sqljson.out
+++ b/src/test/regress/expected/sqljson.out
@@ -337,6 +337,8 @@ LINE 1: SELECT JSON_OBJECT(RETURNING text FORMAT JSON ENCODING UTF8)...
                                           ^
 SELECT JSON_OBJECT(RETURNING text FORMAT JSON ENCODING INVALID_ENCODING);
 ERROR:  unrecognized JSON encoding: invalid_encoding
+LINE 1: ...T JSON_OBJECT(RETURNING text FORMAT JSON ENCODING INVALID_EN...
+                                                             ^
 SELECT JSON_OBJECT(RETURNING bytea);
  json_object 
 -------------
@@ -620,6 +622,8 @@ LINE 1: SELECT JSON_ARRAY(RETURNING text FORMAT JSON ENCODING UTF8);
                                          ^
 SELECT JSON_ARRAY(RETURNING text FORMAT JSON ENCODING INVALID_ENCODING);
 ERROR:  unrecognized JSON encoding: invalid_encoding
+LINE 1: ...CT JSON_ARRAY(RETURNING text FORMAT JSON ENCODING INVALID_EN...
+                                                             ^
 SELECT JSON_ARRAY(RETURNING bytea);
  json_array 
 ------------
diff --git a/src/test/regress/sql/foreign_key.sql b/src/test/regress/sql/foreign_key.sql
index d1aac5357f..d79b9523fb 100644
--- a/src/test/regress/sql/foreign_key.sql
+++ b/src/test/regress/sql/foreign_key.sql
@@ -2079,6 +2079,7 @@ $$;
 CREATE TRIGGER trig_upd_pk AFTER UPDATE ON fkpart11.pk FOR EACH ROW EXECUTE FUNCTION fkpart11.print_row();
 CREATE TRIGGER trig_del_pk AFTER DELETE ON fkpart11.pk FOR EACH ROW EXECUTE FUNCTION fkpart11.print_row();
 CREATE TRIGGER trig_ins_pk AFTER INSERT ON fkpart11.pk FOR EACH ROW EXECUTE FUNCTION fkpart11.print_row();
+CREATE OR REPLACE CONSTRAINT TRIGGER trig_upd_fk_parted AFTER UPDATE ON fkpart11.fk_parted INITIALLY DEFERRED FOR EACH ROW EXECUTE FUNCTION fkpart11.print_row();
 CREATE CONSTRAINT TRIGGER trig_upd_fk_parted AFTER UPDATE ON fkpart11.fk_parted INITIALLY DEFERRED FOR EACH ROW EXECUTE FUNCTION fkpart11.print_row();
 CREATE CONSTRAINT TRIGGER trig_del_fk_parted AFTER DELETE ON fkpart11.fk_parted INITIALLY DEFERRED FOR EACH ROW EXECUTE FUNCTION fkpart11.print_row();
 CREATE CONSTRAINT TRIGGER trig_ins_fk_parted AFTER INSERT ON fkpart11.fk_parted INITIALLY DEFERRED FOR EACH ROW EXECUTE FUNCTION fkpart11.print_row();
-- 
2.34.1

#39Tom Lane
tgl@sss.pgh.pa.us
In reply to: jian he (#38)
Re: Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)

jian he <jian.universality@gmail.com> writes:

So if we are in script_error_callback
`int location = callback_arg->stmt_location;`
location >= 0 will be always true?

No, not if the grammar threw an error.

regards, tom lane

#40Pavel Stehule
pavel.stehule@gmail.com
In reply to: jian he (#38)
Re: Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)

po 14. 10. 2024 v 5:38 odesílatel jian he <jian.universality@gmail.com>
napsal:

On Mon, Oct 14, 2024 at 1:13 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Pavel Stehule <pavel.stehule@gmail.com> writes:

so 12. 10. 2024 v 9:33 odesílatel jian he <jian.universality@gmail.com

napsal:

+ /*
+ * If we have a location (which, as said above, we really always

should)

+ * then report a line number to aid in localizing problems in big

scripts.

+ */
+ if (location >= 0)
so this part will always be true?

yes, after CleanQuerytext the location should not be -1 ever

Right, but we might not have entered either of those previous
if-blocks.

in src/backend/parser/gram.y
your makeRawStmt changes (v4) seem to guarantee that
RawStmt->stmt_location >= 0.
other places {DefineView,DoCopy,PrepareQuery} use makeNode(RawStmt),
In these cases, I am not so sure RawStmt->stmt_location >=0 is always
true.

in execute_sql_string

raw_parsetree_list = pg_parse_query(sql);
dest = CreateDestReceiver(DestNone);
foreach(lc1, raw_parsetree_list)
{
RawStmt *parsetree = lfirst_node(RawStmt, lc1);
MemoryContext per_parsetree_context,
oldcontext;
List *stmt_list;
ListCell *lc2;
callback_arg.stmt_location = parsetree->stmt_location;
callback_arg.stmt_len = parsetree->stmt_len;
per_parsetree_context =
AllocSetContextCreate(CurrentMemoryContext,
"execute_sql_string per-statement
context",
ALLOCSET_DEFAULT_SIZES);
oldcontext = MemoryContextSwitchTo(per_parsetree_context);
CommandCounterIncrement();
stmt_list = pg_analyze_and_rewrite_fixedparams(parsetree,
sql,
NULL,
0,
NULL);

Based on the above code, we do
`callback_arg.stmt_location = parsetree->stmt_location;`
pg_parse_query(sql) doesn't use script_error_callback.

So if we are in script_error_callback
`int location = callback_arg->stmt_location;`
location >= 0 will be always true?

The question here is whether the raw parser (gram.y)
ever throws an error that doesn't include a cursor position. IMO it
shouldn't, but a quick look through gram.y finds a few ereports that
lack parser_errposition. We could go fix those, and probably should,
but imagining that none will ever be introduced again seems like
folly.

I don't know how to add the error position inside the function
insertSelectOptions.
maybe we can add
`parser_errposition(exprLocation(limitClause->limitCount))));`
but limitCount position is a nearby position.
I am also not sure about func mergeTableFuncParameters.

for other places in gram.y, I've added error positions for ereport
that lack it , please check the attached.

I think this can be a separate commitfest issue.

Regards

Pavel

#41Tom Lane
tgl@sss.pgh.pa.us
In reply to: jian he (#38)
Re: Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)

jian he <jian.universality@gmail.com> writes:

On Mon, Oct 14, 2024 at 1:13 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Right, but we might not have entered either of those previous
if-blocks.

in src/backend/parser/gram.y
your makeRawStmt changes (v4) seem to guarantee that
RawStmt->stmt_location >= 0.

Yes, I would expect that any RawStmt we see here will have valid
stmt_location. What you seem to be missing is that an error could
be thrown from

raw_parsetree_list = pg_parse_query(sql);

before execute_sql_string reaches its loop over RawStmts. In that
case we'll reach script_error_callback with callback_arg.stmt_location
still being -1.

pg_parse_query(sql) doesn't use script_error_callback.

Eh? We've put that on the error context callback stack.
It is not pg_parse_query's decision whether it will be called.

regards, tom lane

#42jian he
jian.universality@gmail.com
In reply to: Tom Lane (#41)
1 attachment(s)
Re: Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)

On Tue, Oct 15, 2024 at 12:45 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

jian he <jian.universality@gmail.com> writes:

On Mon, Oct 14, 2024 at 1:13 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Right, but we might not have entered either of those previous
if-blocks.

in src/backend/parser/gram.y
your makeRawStmt changes (v4) seem to guarantee that
RawStmt->stmt_location >= 0.

Yes, I would expect that any RawStmt we see here will have valid
stmt_location. What you seem to be missing is that an error could
be thrown from

raw_parsetree_list = pg_parse_query(sql);

before execute_sql_string reaches its loop over RawStmts. In that
case we'll reach script_error_callback with callback_arg.stmt_location
still being -1.

Thanks for your reply.
I think I got it.

if (location < 0 || syntaxerrposition < location ||
(len > 0 && syntaxerrposition > location + len))
{
int slen = strlen(query);
location = len = 0;
.....
}
else
elog(INFO, "should not reached here");

just found out the"elog(INFO, "should not reached here");" part never reached.
in script_error_callback, for syntax error (error while parsing the raw string.)
we can bookkeeping the line number.

so we don't need to loop it again for syntax error in:
for (query = callback_arg->sql; *query; query++)
{
if (--location < 0)
break;
if (*query == '\n')
linenumber++;
}
since we already did it in loop
"for (int loc = 0; loc < slen; loc++)"

i guess, we don't need performance in script_error_callback,
but in script_error_callback arrange code seperate syntax error(raw
parser) and other error seems good.
please check the attached minor refactor.

Attachments:

extension.c.difftext/x-patch; charset=US-ASCII; name=extension.c.diffDownload
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 86ea9cd9da..ff7d77f8e9 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -692,6 +692,8 @@ script_error_callback(void *arg)
 	const char *query = callback_arg->sql;
 	int			location = callback_arg->stmt_location;
 	int			len = callback_arg->stmt_len;
+	bool		raw_syntaxerr = false;
+	int			syntaxerrline_no = 0;
 	int			syntaxerrposition;
 	const char *lastslash;
 
@@ -730,6 +732,8 @@ script_error_callback(void *arg)
 			location = len = 0;
 			for (int loc = 0; loc < slen; loc++)
 			{
+				if (query[loc] == '\n')
+					syntaxerrline_no++;
 				if (query[loc] != ';')
 					continue;
 				if (query[loc + 1] == '\r')
@@ -766,6 +770,7 @@ script_error_callback(void *arg)
 		errposition(0);
 		internalerrposition(syntaxerrposition);
 		internalerrquery(pnstrdup(query, len));
+		raw_syntaxerr = true;
 	}
 	else if (location >= 0)
 	{
@@ -794,17 +799,22 @@ script_error_callback(void *arg)
 	 */
 	if (location >= 0)
 	{
-		int			linenumber = 1;
-
-		for (query = callback_arg->sql; *query; query++)
+		if (!raw_syntaxerr)
 		{
-			if (--location < 0)
-				break;
-			if (*query == '\n')
-				linenumber++;
+			int			linenumber = 1;
+			for (query = callback_arg->sql; *query; query++)
+			{
+				if (--location < 0)
+					break;
+				if (*query == '\n')
+					linenumber++;
+			}
+			errcontext("extension script file \"%s\", near line %d",
+					lastslash, linenumber);
 		}
-		errcontext("extension script file \"%s\", near line %d",
-				   lastslash, linenumber);
+		else
+			errcontext("extension script file \"%s\", near line %d",
+					lastslash, syntaxerrline_no);
 	}
 	else
 		errcontext("extension script file \"%s\"", lastslash);
#43Tom Lane
tgl@sss.pgh.pa.us
In reply to: jian he (#42)
Re: Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)

jian he <jian.universality@gmail.com> writes:

just found out the"elog(INFO, "should not reached here");" part never reached.

You didn't check any of the cases we were discussing I guess?
(That is, places in gram.y that throw an error without a
parser_errposition call.)

Note that even if we fix all of those and keep them fixed, we still
couldn't assume the case is unreachable, because gram.y isn't
self-contained. For instance, if we hit out-of-memory during raw
parsing, the OOM error out of mcxt.c isn't going to provide a syntax
error position. I'm not too concerned about doing better than what
the patch does now (i.e. nothing) in such edge cases, but we can't
do worse.

i guess, we don't need performance in script_error_callback,
but in script_error_callback arrange code seperate syntax error(raw
parser) and other error seems good.
please check the attached minor refactor.

I do not think that's an improvement. It's more complicated and
less readable, and I don't see why we need to squeeze more
performance out of this error-reporting path that should never
be taken in production.

regards, tom lane

#44Pavel Stehule
pavel.stehule@gmail.com
In reply to: Pavel Stehule (#34)
Re: Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)

Hi

pá 11. 10. 2024 v 19:39 odesílatel Pavel Stehule <pavel.stehule@gmail.com>
napsal:

pá 11. 10. 2024 v 18:08 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:

Pavel Stehule <pavel.stehule@gmail.com> writes:

I tested it and it is working nicely. I tested it against Orafce and I
found an interesting point. The body of plpgsql functions is not

checked.

Do you know the reason?

In execute_extension_script():

/*
* Similarly disable check_function_bodies, to ensure that SQL
functions
* won't be parsed during creation.
*/
if (check_function_bodies)
(void) set_config_option("check_function_bodies", "off",
PGC_USERSET, PGC_S_SESSION,
GUC_ACTION_SAVE, true, 0, false);

I wondered if we should reconsider that, but I'm afraid we'd be more
likely to break working extensions than to do anything helpful.
An extension author who wants that can do what I did in the patch's
test cases: manually turn check_function_bodies on in the extension
script.

ok,

I tested this patch and I didn't find any issue. The possibility to show
errors inside extensions more precisely is very useful.

compilation without problems, all tests passed

I'll mark this patch as ready for committer.

Regards

Pavel

Show quoted text

Pavel

regards, tom lane

#45Tom Lane
tgl@sss.pgh.pa.us
In reply to: Pavel Stehule (#44)
Re: Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)

Pavel Stehule <pavel.stehule@gmail.com> writes:

I'll mark this patch as ready for committer.

Pushed then. Thanks for reviewing!

regards, tom lane

#46Pavel Stehule
pavel.stehule@gmail.com
In reply to: Tom Lane (#45)
Re: Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)

út 22. 10. 2024 v 17:37 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:

Pavel Stehule <pavel.stehule@gmail.com> writes:

I'll mark this patch as ready for committer.

Pushed then. Thanks for reviewing!

Thank you for this patch. It is really practical

Regards

Pavel

Show quoted text

regards, tom lane

#47Tom Lane
tgl@sss.pgh.pa.us
In reply to: Pavel Stehule (#46)
Re: Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)

In the no-good-deed-goes-unpunished department: buildfarm member
hamerkop doesn't like this patch [1]https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hamerkop&amp;dt=2024-10-23%2011%3A00%3A37. The diffs look like

@@ -77,7 +77,7 @@
 ERROR:  syntax error at or near "FUNCTIN"
 LINE 1: CREATE FUNCTIN my_erroneous_func(int) RETURNS int LANGUAGE S...
                ^
-QUERY:  CREATE FUNCTIN my_erroneous_func(int) RETURNS int LANGUAGE SQL
+QUERY:  CREATE FUNCTIN my_erroneous_func(int) RETURNS int LANGUAGE SQL
 AS $$ SELECT $1 + 1 $$;
 CONTEXT:  extension script file "test_ext7--2.0--2.1bad.sql", near line 10
 alter extension test_ext7 update to '2.2bad';

It's hard to be totally sure from the web page, but I suppose what
is happening is that the newline within the quoted query fragment
is represented as "\r\n" not just "\n". (I wonder why the cfbot
failed to detect this; there must be more moving parts involved
than just "it's Windows".)

The reason why this is happening seems fairly clear: extension.c's
read_whole_file() opens the script file with PG_BINARY_R, preventing
Windows' libc from hiding DOS-style newlines from us, even though the
git checkout on that machine is evidently using DOS newlines.

That seems like a rather odd choice. Elsewhere in the same module,
parse_extension_control_file() opens control files with plain "r",
so that those act differently. I checked the git history and did
not learn much. The original extensions commit d9572c4e3 implemented
reading with a call to read_binary_file(), and we seem to have just
carried that behavioral decision forward through various refactorings.
I don't recall if there was an intentional choice to use binary read
or that was just a random decision to use an available function.

So what I'd like to do to fix this is to change

-	if ((file = AllocateFile(filename, PG_BINARY_R)) == NULL)
+	if ((file = AllocateFile(filename, "r")) == NULL)

The argument against that is it creates a nonzero chance of breaking
somebody's extension script file on Windows. But there's a
counter-argument that it might *prevent* bugs on Windows, by making
script behavior more nearly identical to what it is on not-Windows.
So I think that's kind of a wash.

Other approaches we could take with perhaps-smaller blast radii
include making script_error_callback() trim \r's out of the quoted
text (ugly) or creating a variant expected-file (hard to maintain,
and I wonder if it'd survive git-related newline munging).

Thoughts?

regards, tom lane

[1]: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hamerkop&amp;dt=2024-10-23%2011%3A00%3A37

#48Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#47)
Re: Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)

I wrote:

In the no-good-deed-goes-unpunished department: buildfarm member
hamerkop doesn't like this patch [1].  The diffs look like
...
So what I'd like to do to fix this is to change
-	if ((file = AllocateFile(filename, PG_BINARY_R)) == NULL)
+	if ((file = AllocateFile(filename, "r")) == NULL)

Well, that didn't fix it :-(. I went so far as to extract the raw log
files from the buildfarm database, and what they show is that there is
absolutely no difference between the lines diff is claiming are
different:

-QUERY:  CREATE FUNCTIN my_erroneous_func(int) RETURNS int LANGUAGE SQL\r\n
+QUERY:  CREATE FUNCTIN my_erroneous_func(int) RETURNS int LANGUAGE SQL\r\n

It's the same both before and after 924e03917, which made the code
change depicted above, so that didn't help.

So I'm pretty baffled. I suppose the expected and result files must
actually be different, and something in subsequent processing is
losing the difference before it gets to the buildfarm database.
But I don't have the ability to debug that from here. Does anyone
with access to hamerkop want to poke into this?

Without additional information, the only thing I can think of that
I have any confidence will eliminate these failures is to reformat
the affected test cases so that they produce just a single line of
output. That's kind of annoying from a functionality-coverage point
of view, but I'm not sure avoiding it is worth moving mountains for.

In any case, I'm disinclined to revert 924e03917. It seems like a
good change on balance, even if it failed to fix whatever is
happening on hamerkop.

regards, tom lane

#49Pavel Stehule
pavel.stehule@gmail.com
In reply to: Tom Lane (#48)
Re: Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)

ne 27. 10. 2024 v 18:42 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:

I wrote:

In the no-good-deed-goes-unpunished department: buildfarm member
hamerkop doesn't like this patch [1].  The diffs look like
...
So what I'd like to do to fix this is to change
-     if ((file = AllocateFile(filename, PG_BINARY_R)) == NULL)
+     if ((file = AllocateFile(filename, "r")) == NULL)

Well, that didn't fix it :-(. I went so far as to extract the raw log
files from the buildfarm database, and what they show is that there is
absolutely no difference between the lines diff is claiming are
different:

-QUERY:  CREATE FUNCTIN my_erroneous_func(int) RETURNS int LANGUAGE SQL\r\n
+QUERY:  CREATE FUNCTIN my_erroneous_func(int) RETURNS int LANGUAGE SQL\r\n

It's the same both before and after 924e03917, which made the code
change depicted above, so that didn't help.

So I'm pretty baffled. I suppose the expected and result files must
actually be different, and something in subsequent processing is
losing the difference before it gets to the buildfarm database.
But I don't have the ability to debug that from here. Does anyone
with access to hamerkop want to poke into this?

Without additional information, the only thing I can think of that
I have any confidence will eliminate these failures is to reformat
the affected test cases so that they produce just a single line of
output. That's kind of annoying from a functionality-coverage point
of view, but I'm not sure avoiding it is worth moving mountains for.

In any case, I'm disinclined to revert 924e03917. It seems like a
good change on balance, even if it failed to fix whatever is
happening on hamerkop.

+1

This is very useful feature

Pavel

Show quoted text

regards, tom lane

#50Alexander Lakhin
exclusion@gmail.com
In reply to: Tom Lane (#48)
Re: Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)

Hello Tom,

27.10.2024 20:41, Tom Lane wrote:

I wrote:

In the no-good-deed-goes-unpunished department: buildfarm member
hamerkop doesn't like this patch [1].  The diffs look like
...
So what I'd like to do to fix this is to change
-	if ((file = AllocateFile(filename, PG_BINARY_R)) == NULL)
+	if ((file = AllocateFile(filename, "r")) == NULL)

Well, that didn't fix it :-(. I went so far as to extract the raw log
files from the buildfarm database, and what they show is that there is
absolutely no difference between the lines diff is claiming are
different:

-QUERY:  CREATE FUNCTIN my_erroneous_func(int) RETURNS int LANGUAGE SQL\r\n
+QUERY:  CREATE FUNCTIN my_erroneous_func(int) RETURNS int LANGUAGE SQL\r\n

It's the same both before and after 924e03917, which made the code
change depicted above, so that didn't help.

So I'm pretty baffled. I suppose the expected and result files must
actually be different, and something in subsequent processing is
losing the difference before it gets to the buildfarm database.
But I don't have the ability to debug that from here. Does anyone
with access to hamerkop want to poke into this?

Without additional information, the only thing I can think of that
I have any confidence will eliminate these failures is to reformat
the affected test cases so that they produce just a single line of
output. That's kind of annoying from a functionality-coverage point
of view, but I'm not sure avoiding it is worth moving mountains for.

I've managed to reproduce the issue with the core.autocrlf=true git setting
(which sets CR+LF line ending in test_ext7--2.0--2.1bad.sql) and with diff
from msys:
C:\msys64\usr\bin\diff.exe --version
diff (GNU diffutils) 3.8

(Gnu/Win32 Diff [1]https://gnuwin32.sourceforge.net/packages/diffutils.htm doesn't detect those EOL differences and thus the test
doesn't fail.)

I can really see different line endings with hexdump:
hexdump -C ...testrun\test_extensions\regress\regression.diffs
00000230  20 20 20 20 20 20 20 20  5e 0a 2d 51 55 45 52 59  | ^.-QUERY|
00000240  3a 20 20 43 52 45 41 54  45 20 46 55 4e 43 54 49  |: CREATE FUNCTI|
00000250  4e 20 6d 79 5f 65 72 72  6f 6e 65 6f 75 73 5f 66  |N my_erroneous_f|
00000260  75 6e 63 28 69 6e 74 29  20 52 45 54 55 52 4e 53 |unc(int) RETURNS|
00000270  20 69 6e 74 20 4c 41 4e  47 55 41 47 45 20 53 51  | int LANGUAGE SQ|
00000280  4c 0a 2b 51 55 45 52 59  3a 20 20 43 52 45 41 54 |L.+QUERY:  CREAT|
00000290  45 20 46 55 4e 43 54 49  4e 20 6d 79 5f 65 72 72  |E FUNCTIN my_err|
000002a0  6f 6e 65 6f 75 73 5f 66  75 6e 63 28 69 6e 74 29 |oneous_func(int)|
000002b0  20 52 45 54 55 52 4e 53  20 69 6e 74 20 4c 41 4e  | RETURNS int LAN|
000002c0  47 55 41 47 45 20 53 51  4c 0d 0a 20 41 53 20 24  |GUAGE SQL.. AS $|

hexdump -C .../testrun/test_extensions/regress/results/test_extensions.out | grep -C5 FUNCTIN
00000b80  20 5e 0d 0a 51 55 45 52  59 3a 20 20 43 52 45 41  | ^..QUERY:  CREA|
00000b90  54 45 20 46 55 4e 43 54  49 4e 20 6d 79 5f 65 72  |TE FUNCTIN my_er|
00000ba0  72 6f 6e 65 6f 75 73 5f  66 75 6e 63 28 69 6e 74 |roneous_func(int|
00000bb0  29 20 52 45 54 55 52 4e  53 20 69 6e 74 20 4c 41  |) RETURNS int LA|
00000bc0  4e 47 55 41 47 45 20 53  51 4c 0d 0d 0a 41 53 20  |NGUAGE SQL...AS |

whilst
hexdump -C .../src/test/modules/test_extensions/expected/test_extensions.out | grep -C5 FUNCTIN
00000b80  20 5e 0d 0a 51 55 45 52  59 3a 20 20 43 52 45 41  | ^..QUERY:  CREA|
00000b90  54 45 20 46 55 4e 43 54  49 4e 20 6d 79 5f 65 72  |TE FUNCTIN my_er|
00000ba0  72 6f 6e 65 6f 75 73 5f  66 75 6e 63 28 69 6e 74 |roneous_func(int|
00000bb0  29 20 52 45 54 55 52 4e  53 20 69 6e 74 20 4c 41  |) RETURNS int LA|
00000bc0  4e 47 55 41 47 45 20 53  51 4c 0d 0a 41 53 20 24  |NGUAGE SQL..AS $|

It looks like --strip-trailing-cr doesn't work as desired for this diff version.

I've also dumped buf in read_whole_file() and found that in both
PG_BINARY_R and "r" modes the 0d 0a ending is preserved. But it changed
to 0a with the "rt" mode (see [1]https://gnuwin32.sourceforge.net/packages/diffutils.htm), and it makes the test (and the whole
`meson test`) pass for me.

[1]: https://gnuwin32.sourceforge.net/packages/diffutils.htm
[2]: https://learn.microsoft.com/en-us/cpp/c-runtime-library/file-translation-constants?view=msvc-170

Best regards,
Alexander

#51Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alexander Lakhin (#50)
Re: Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)

Alexander Lakhin <exclusion@gmail.com> writes:

27.10.2024 20:41, Tom Lane wrote:

In the no-good-deed-goes-unpunished department: buildfarm member
hamerkop doesn't like this patch [1]. The diffs look like

I've managed to reproduce the issue with the core.autocrlf=true git setting
(which sets CR+LF line ending in test_ext7--2.0--2.1bad.sql) and with diff
from msys:
C:\msys64\usr\bin\diff.exe --version
diff (GNU diffutils) 3.8
(Gnu/Win32 Diff [1] doesn't detect those EOL differences and thus the test
doesn't fail.)

Thank you for looking at this!

I can really see different line endings with hexdump:
hexdump -C .../testrun/test_extensions/regress/results/test_extensions.out | grep -C5 FUNCTIN
00000b80  20 5e 0d 0a 51 55 45 52  59 3a 20 20 43 52 45 41  | ^..QUERY:  CREA|
00000b90  54 45 20 46 55 4e 43 54  49 4e 20 6d 79 5f 65 72  |TE FUNCTIN my_er|
00000ba0  72 6f 6e 65 6f 75 73 5f  66 75 6e 63 28 69 6e 74 |roneous_func(int|
00000bb0  29 20 52 45 54 55 52 4e  53 20 69 6e 74 20 4c 41  |) RETURNS int LA|
00000bc0  4e 47 55 41 47 45 20 53  51 4c 0d 0d 0a 41 53 20  |NGUAGE SQL...AS |

Wow. How are we producing \r\r\n? I'm not hugely surprised that some
versions of diff might see that as different from a single newline
even with --strip-trailing-cr --- formally, I think that's supposed
to make \r\n match \n, but the extra \r doesn't fit that pattern.

I've also dumped buf in read_whole_file() and found that in both
PG_BINARY_R and "r" modes the 0d 0a ending is preserved. But it changed
to 0a with the "rt" mode (see [1]), and it makes the test (and the whole
`meson test`) pass for me.

Interesting. I believe we decided years ago that we didn't need to
use "rt" mode because that was the default on Windows, but was that
a misreading of the documentation? The link you provided doesn't
give any hint that there are more than two behaviors.

However ... the link you provided also mentions that text mode
includes treating control-Z as EOF; which I'd forgotten, but it
does make it less safe than I thought to use text mode for
reading script files.

What I'm now thinking is that we should revert 924e03917 after
all (that is, go back to using PG_BINARY_R) and instead make
read_whole_file manually squash \r\n to \n if we're on Windows.
Ugly, but I have yet to find anything about that platform that
isn't.

regards, tom lane

#52Alexander Lakhin
exclusion@gmail.com
In reply to: Tom Lane (#51)
Re: Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)

28.10.2024 19:06, Tom Lane wrote:

I've also dumped buf in read_whole_file() and found that in both
PG_BINARY_R and "r" modes the 0d 0a ending is preserved. But it changed
to 0a with the "rt" mode (see [1]), and it makes the test (and the whole
`meson test`) pass for me.

Interesting. I believe we decided years ago that we didn't need to
use "rt" mode because that was the default on Windows, but was that
a misreading of the documentation? The link you provided doesn't
give any hint that there are more than two behaviors.

However ... the link you provided also mentions that text mode
includes treating control-Z as EOF; which I'd forgotten, but it
does make it less safe than I thought to use text mode for
reading script files.

I think that this other behavior can be explained by pgwin32_fopen()/
pgwin32_open() coding (O_TEXT assumed implicitly only #ifdef FRONTEND).

Anyway, as you noticed, \x1A injected into test_ext....sql really leads to
the file contents truncation on read (with "rt"), so I agree that using the
text/translation mode here is not an improvement.

What I'm now thinking is that we should revert 924e03917 after
all (that is, go back to using PG_BINARY_R) and instead make
read_whole_file manually squash \r\n to \n if we're on Windows.
Ugly, but I have yet to find anything about that platform that
isn't.

Yes, I think this should help.

Best regards,
Alexander