Small doubt on update a partition when some rows need to move among partition

Started by movead.li@highgo.caover 5 years ago2 messages
#1movead.li@highgo.ca
movead.li@highgo.ca
1 attachment(s)

Hello,

I am trying to handle the limit that we can't do a tuple move caused by BEFORE TRIGGER,
during which I get two doubt points:

The first issue:
In ExecBRUpdateTriggers() or ExecBRInsertTriggers() function why we need to check the
result slot after every trigger call. If we should check the result slot after all triggers are
called.

For example, we have a table t1(i int, j int), and a BEFORE INSERT TRIGGER on t1 make i - 1,
and another BEFORE INSERT TRIGGER on t1 make i + 1. If the first trigger causes a partition
move, then the insert query will be interrupted. However, it will not change partition after
all triggers are called.

The second issue:
I read the code for partition move caused by an update, it deletes tuple in an old partition
and inserts a new tuple in a partition. But during the insert, it triggers the trigger on the new
partition, so the result value may be changed again, I want to know if it's intended way? In
my mind, if an insert produced by partition move should not trigger before trigger again.

I make an initial patch as my thought, sorry if I missing some of the historical discussion.

Regards,
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
EMAIL: mailto:movead(dot)li(at)highgo(dot)ca

Attachments:

partition_move_issues.patchapplication/octet-stream; name=partition_move_issues.patchDownload
diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
index 672fccff5b..a3650b54c7 100644
--- a/src/backend/commands/trigger.c
+++ b/src/backend/commands/trigger.c
@@ -2197,6 +2197,7 @@ ExecBRInsertTriggers(EState *estate, ResultRelInfo *relinfo,
 	bool		should_free;
 	TriggerData LocTriggerData = {0};
 	int			i;
+	bool		partition_check = false;
 
 	LocTriggerData.type = T_TriggerData;
 	LocTriggerData.tg_event = TRIGGER_EVENT_INSERT |
@@ -2238,20 +2239,7 @@ ExecBRInsertTriggers(EState *estate, ResultRelInfo *relinfo,
 		{
 			ExecForceStoreHeapTuple(newtuple, slot, false);
 
-			/*
-			 * After a tuple in a partition goes through a trigger, the user
-			 * could have changed the partition key enough that the tuple no
-			 * longer fits the partition.  Verify that.
-			 */
-			if (trigger->tgisclone &&
-				!ExecPartitionCheck(relinfo, slot, estate, false))
-				ereport(ERROR,
-						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-						 errmsg("moving row to another partition during a BEFORE FOR EACH ROW trigger is not supported"),
-						 errdetail("Before executing trigger \"%s\", the row was to be in partition \"%s.%s\".",
-								   trigger->tgname,
-								   get_namespace_name(RelationGetNamespace(relinfo->ri_RelationDesc)),
-								   RelationGetRelationName(relinfo->ri_RelationDesc))));
+			partition_check = trigger->tgisclone;
 
 			if (should_free)
 				heap_freetuple(oldtuple);
@@ -2261,6 +2249,19 @@ ExecBRInsertTriggers(EState *estate, ResultRelInfo *relinfo,
 		}
 	}
 
+	/*
+	 * After a tuple in a partition goes through a trigger, the user
+	 * could have changed the partition key enough that the tuple no
+	 * longer fits the partition.  Verify that.
+	 */
+	if (partition_check && !ExecPartitionCheck(relinfo, slot, estate, false))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+					errmsg("moving row to another partition during a BEFORE FOR EACH ROW trigger is not supported"),
+					errdetail("After executing triggers, the row was to be in partition \"%s.%s\".",
+							get_namespace_name(RelationGetNamespace(relinfo->ri_RelationDesc)),
+							RelationGetRelationName(relinfo->ri_RelationDesc))));
+
 	return true;
 }
 
@@ -2656,6 +2657,7 @@ ExecBRUpdateTriggers(EState *estate, EPQState *epqstate,
 	int			i;
 	Bitmapset  *updatedCols;
 	LockTupleMode lockmode;
+	bool		partition_check = false;
 
 	/* Determine lock mode to use */
 	lockmode = ExecUpdateLockMode(estate, relinfo);
@@ -2747,15 +2749,7 @@ ExecBRUpdateTriggers(EState *estate, EPQState *epqstate,
 		{
 			ExecForceStoreHeapTuple(newtuple, newslot, false);
 
-			if (trigger->tgisclone &&
-				!ExecPartitionCheck(relinfo, newslot, estate, false))
-				ereport(ERROR,
-						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-						 errmsg("moving row to another partition during a BEFORE trigger is not supported"),
-						 errdetail("Before executing trigger \"%s\", the row was to be in partition \"%s.%s\".",
-								   trigger->tgname,
-								   get_namespace_name(RelationGetNamespace(relinfo->ri_RelationDesc)),
-								   RelationGetRelationName(relinfo->ri_RelationDesc))));
+			partition_check = trigger->tgisclone;
 
 			/*
 			 * If the tuple returned by the trigger / being stored, is the old
@@ -2773,6 +2767,15 @@ ExecBRUpdateTriggers(EState *estate, EPQState *epqstate,
 			newtuple = NULL;
 		}
 	}
+
+	if (partition_check && !ExecPartitionCheck(relinfo, newslot, estate, false))
+			ereport(ERROR,
+					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						errmsg("moving row to another partition during a BEFORE trigger is not supported"),
+						errdetail("After executing triggers, the row was to be in partition \"%s.%s\".",
+								get_namespace_name(RelationGetNamespace(relinfo->ri_RelationDesc)),
+								RelationGetRelationName(relinfo->ri_RelationDesc))));
+
 	if (should_free_trig)
 		heap_freetuple(trigtuple);
 
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 20a4c474cc..4108a0fe8c 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -376,7 +376,8 @@ ExecInsert(ModifyTableState *mtstate,
 		   TupleTableSlot *slot,
 		   TupleTableSlot *planSlot,
 		   EState *estate,
-		   bool canSetTag)
+		   bool canSetTag,
+		   bool tryBRTrigger)
 {
 	ResultRelInfo *resultRelInfo;
 	Relation	resultRelationDesc;
@@ -404,7 +405,8 @@ ExecInsert(ModifyTableState *mtstate,
 	 * side-effects that we can't cancel by just not inserting the tuple.
 	 */
 	if (resultRelInfo->ri_TrigDesc &&
-		resultRelInfo->ri_TrigDesc->trig_insert_before_row)
+		resultRelInfo->ri_TrigDesc->trig_insert_before_row &&
+		tryBRTrigger)
 	{
 		if (!ExecBRInsertTriggers(estate, resultRelInfo, slot))
 			return NULL;		/* "do nothing" */
@@ -1309,7 +1311,7 @@ lreplace:;
 										   mtstate->rootResultRelInfo, slot);
 
 			ret_slot = ExecInsert(mtstate, slot, planSlot,
-								  estate, canSetTag);
+								  estate, canSetTag, false);
 
 			/* Revert ExecPrepareTupleRouting's node change. */
 			estate->es_result_relation_info = resultRelInfo;
@@ -2244,7 +2246,7 @@ ExecModifyTable(PlanState *pstate)
 					slot = ExecPrepareTupleRouting(node, estate, proute,
 												   resultRelInfo, slot);
 				slot = ExecInsert(node, slot, planSlot,
-								  estate, node->canSetTag);
+								  estate, node->canSetTag, true);
 				/* Revert ExecPrepareTupleRouting's state change. */
 				if (proute)
 					estate->es_result_relation_info = resultRelInfo;
#2Ashutosh Bapat
ashutosh.bapat.oss@gmail.com
In reply to: movead.li@highgo.ca (#1)
Re: Small doubt on update a partition when some rows need to move among partition

On Thu, Aug 20, 2020 at 5:22 PM movead.li@highgo.ca <movead.li@highgo.ca> wrote:

Hello,

I am trying to handle the limit that we can't do a tuple move caused by BEFORE TRIGGER,
during which I get two doubt points:

The first issue:
In ExecBRUpdateTriggers() or ExecBRInsertTriggers() function why we need to check the
result slot after every trigger call. If we should check the result slot after all triggers are
called.

For example, we have a table t1(i int, j int), and a BEFORE INSERT TRIGGER on t1 make i - 1,
and another BEFORE INSERT TRIGGER on t1 make i + 1. If the first trigger causes a partition
move, then the insert query will be interrupted. However, it will not change partition after
all triggers are called.

This was discussed at
/messages/by-id/20200318210213.GA9781@alvherre.pgsql.

--
Best Wishes,
Ashutosh Bapat