[BUG] segfault during delete

Started by Drouvot, Bertrandalmost 5 years ago8 messages

Drouvot, Bertrand

bdrouvot@amazon.com

almost 5 years ago

1 attachment(s)

Hi hackers,

Here is a scenario that segfault during delete (with version >= 12):

create table parent (
col1 text primary key
);

create table child (
col1 text primary key,
FOREIGN KEY (col1) REFERENCES parent(col1) on delete cascade
);

CREATE or replace FUNCTION trigger_function()
RETURNS TRIGGER
LANGUAGE PLPGSQL
AS $$
BEGIN
raise notice 'trigger = %, old table = %',
TG_NAME,
(select string_agg(old_table::text, ', ' order by col1) from
old_table);
return NULL;
END;
$$
;

create trigger bdt_trigger after delete on child REFERENCING OLD TABLE
AS old_table for each statement EXECUTE function trigger_function();

alter table child add column col2 text not null default 'tutu';
insert into parent(col1) values ('1');
insert into child(col1) values ('1');
insert into parent(col1) values ('2');
insert into child(col1) values ('2');
delete from parent;

produces:

CREATE TABLE
CREATE TABLE
CREATE FUNCTION
CREATE TRIGGER
ALTER TABLE
INSERT 0 1
INSERT 0 1
INSERT 0 1
INSERT 0 1
psql:./segfault.repro.sql:35: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
psql:./segfault.repro.sql:35: fatal: connection to server was lost

The column being added to the child relation needs to have a default
value for the segfault to be triggered.

The stack is the following:

#0 0x000000000049aa59 in execute_attr_map_slot (attrMap=0x11736c8,
in_slot=0x1179c90, out_slot=0x1179da8) at tupconvert.c:193
#1 0x0000000000700ec0 in AfterTriggerSaveEvent (estate=0x1171820,
relinfo=0x1171cc8, event=1, row_trigger=true, oldslot=0x1179c90,
newslot=0x0, recheckIndexes=0x0, modifiedCols=0x0,
transition_capture=0x1172438) at trigger.c:5488
#2 0x00000000006fc6a8 in ExecARDeleteTriggers (estate=0x1171820,
relinfo=0x1171cc8, tupleid=0x7ffd2b7f2e40, fdw_trigtuple=0x0,
transition_capture=0x1172438) at trigger.c:2565
#3 0x0000000000770794 in ExecDelete (mtstate=0x1171a90,
resultRelInfo=0x1171cc8, tupleid=0x7ffd2b7f2e40, oldtuple=0x0,
planSlot=0x1173630, epqstate=0x1171b88, estate=0x1171820,
processReturning=true, canSetTag=true, changingPart=false,
tupleDeleted=0x0, epqreturnslot=0x0)
at nodeModifyTable.c:1128
#4 0x00000000007724cc in ExecModifyTable (pstate=0x1171a90) at
nodeModifyTable.c:2259

I had a look at it, and it looks to me that the slot being reused in
AfterTriggerSaveEvent() (when (map != NULL) and not (!storeslot)) has
previously been damaged by FreeExecutorState() in standard_ExecutorEnd().

Then I ended up with the enclosed patch proposal that does not make the
repro segfaulting and that is passing make check too.

Not sure that what is being done in the patch is the right/best approach
to solve the issue though.

Bertrand

PS: the same repro with a foreign key with update cascade, an update
trigger and an update on the col1 column would segfault too (but does
not with the enclosed patch proposal).

Attachments:

v1-0001-segfault-after-trigger.patchtext/plain; charset=UTF-8; name=v1-0001-segfault-after-trigger.patch; x-mac-creator=0; x-mac-type=0Download

 src/backend/commands/trigger.c    |  6 ++++++
 src/backend/executor/execTuples.c | 44 +++++++++++++++++++++++++++++++++++++++
 src/include/executor/tuptable.h   |  2 ++
 3 files changed, 52 insertions(+)
diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
index 8908847c6c..a6c464b120 100644
--- a/src/backend/commands/trigger.c
+++ b/src/backend/commands/trigger.c
@@ -5475,8 +5475,11 @@ AfterTriggerSaveEvent(EState *estate, ResultRelInfo *relinfo,
 			if (map != NULL)
 			{
 				TupleTableSlot *storeslot;
+				MemoryContext oldcontext;
+				oldcontext = MemoryContextSwitchTo(TopTransactionContext);
 
 				storeslot = transition_capture->tcs_private->storeslot;
+
 				if (!storeslot)
 				{
 					storeslot = ExecAllocTableSlot(&estate->es_tupleTable,
@@ -5484,7 +5487,10 @@ AfterTriggerSaveEvent(EState *estate, ResultRelInfo *relinfo,
 												   &TTSOpsVirtual);
 					transition_capture->tcs_private->storeslot = storeslot;
 				}
+				else
+					InitTupleTableSlot(&estate->es_tupleTable, map->outdesc, &TTSOpsVirtual, storeslot);
 
+				MemoryContextSwitchTo(oldcontext);
 				execute_attr_map_slot(map->attrMap, oldslot, storeslot);
 				tuplestore_puttupleslot(old_tuplestore, storeslot);
 			}
diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c
index 73c35df9c9..5f31287529 100644
--- a/src/backend/executor/execTuples.c
+++ b/src/backend/executor/execTuples.c
@@ -1131,6 +1131,50 @@ MakeTupleTableSlot(TupleDesc tupleDesc,
 	return slot;
 }
 
+/* --------------------------------
+ *		InitTupleTableSlot
+ *
+ *		Basic routine to make an empty TupleTableSlot for an existing slot of given
+ *		TupleTableSlotType. If tupleDesc is specified the slot's descriptor is
+ *		fixed for its lifetime, gaining some efficiency. If that's
+ *		undesirable, pass NULL.
+ * --------------------------------
+ */
+void
+InitTupleTableSlot(List** tupleTable, TupleDesc tupleDesc,
+				   const TupleTableSlotOps *tts_ops, TupleTableSlot *slot)
+{
+	Size		basesz;
+
+	basesz = tts_ops->base_slot_size;
+
+	/* const for optimization purposes, OK to modify at allocation time */
+	*((const TupleTableSlotOps **) &slot->tts_ops) = tts_ops;
+	slot->type = T_TupleTableSlot;
+	slot->tts_flags |= TTS_FLAG_EMPTY;
+	if (tupleDesc != NULL)
+		slot->tts_flags |= TTS_FLAG_FIXED;
+	slot->tts_tupleDescriptor = tupleDesc;
+	slot->tts_mcxt = CurrentMemoryContext;
+	slot->tts_nvalid = 0;
+
+	if (tupleDesc != NULL)
+	{
+		slot->tts_values = (Datum *)
+			(((char *) slot)
+			 + MAXALIGN(basesz));
+		slot->tts_isnull = (bool *)
+			(((char *) slot)
+			 + MAXALIGN(basesz)
+			 + MAXALIGN(tupleDesc->natts * sizeof(Datum)));
+	}
+
+	/*
+	 * And allow slot type specific initialization.
+	 */
+	slot->tts_ops->init(slot);
+}
+
 /* --------------------------------
  *		ExecAllocTableSlot
  *
diff --git a/src/include/executor/tuptable.h b/src/include/executor/tuptable.h
index 679e57fbdd..6c82883877 100644
--- a/src/include/executor/tuptable.h
+++ b/src/include/executor/tuptable.h
@@ -295,6 +295,8 @@ typedef struct MinimalTupleTableSlot
 /* in executor/execTuples.c */
 extern TupleTableSlot *MakeTupleTableSlot(TupleDesc tupleDesc,
 										  const TupleTableSlotOps *tts_ops);
+extern void InitTupleTableSlot(List **tupleTable, TupleDesc tupleDesc,
+										  const TupleTableSlotOps *tts_ops, TupleTableSlot *slot);
 extern TupleTableSlot *ExecAllocTableSlot(List **tupleTable, TupleDesc desc,
 										  const TupleTableSlotOps *tts_ops);
 extern void ExecResetTupleTable(List *tupleTable, bool shouldFree);

Amit Langote

amitlangote09@gmail.com

almost 5 years ago

In reply to: Drouvot, Bertrand (#1)

1 attachment(s)

Re: [BUG] segfault during delete

Hi Bertrand,

On Wed, Feb 24, 2021 at 5:56 AM Drouvot, Bertrand <bdrouvot@amazon.com> wrote:

Hi hackers,

Here is a scenario that segfault during delete (with version >= 12):

create table parent (
col1 text primary key
);

create table child (
col1 text primary key,
FOREIGN KEY (col1) REFERENCES parent(col1) on delete cascade
);

CREATE or replace FUNCTION trigger_function()
RETURNS TRIGGER
LANGUAGE PLPGSQL
AS $$
BEGIN
raise notice 'trigger = %, old table = %',
TG_NAME,
(select string_agg(old_table::text, ', ' order by col1) from old_table);
return NULL;
END;
$$
;

create trigger bdt_trigger after delete on child REFERENCING OLD TABLE AS old_table for each statement EXECUTE function trigger_function();

alter table child add column col2 text not null default 'tutu';
insert into parent(col1) values ('1');
insert into child(col1) values ('1');
insert into parent(col1) values ('2');
insert into child(col1) values ('2');
delete from parent;

produces:

CREATE TABLE
CREATE TABLE
CREATE FUNCTION
CREATE TRIGGER
ALTER TABLE
INSERT 0 1
INSERT 0 1
INSERT 0 1
INSERT 0 1
psql:./segfault.repro.sql:35: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
psql:./segfault.repro.sql:35: fatal: connection to server was lost

The column being added to the child relation needs to have a default value for the segfault to be triggered.

The stack is the following:

#0 0x000000000049aa59 in execute_attr_map_slot (attrMap=0x11736c8, in_slot=0x1179c90, out_slot=0x1179da8) at tupconvert.c:193
#1 0x0000000000700ec0 in AfterTriggerSaveEvent (estate=0x1171820, relinfo=0x1171cc8, event=1, row_trigger=true, oldslot=0x1179c90, newslot=0x0, recheckIndexes=0x0, modifiedCols=0x0, transition_capture=0x1172438) at trigger.c:5488
#2 0x00000000006fc6a8 in ExecARDeleteTriggers (estate=0x1171820, relinfo=0x1171cc8, tupleid=0x7ffd2b7f2e40, fdw_trigtuple=0x0, transition_capture=0x1172438) at trigger.c:2565
#3 0x0000000000770794 in ExecDelete (mtstate=0x1171a90, resultRelInfo=0x1171cc8, tupleid=0x7ffd2b7f2e40, oldtuple=0x0, planSlot=0x1173630, epqstate=0x1171b88, estate=0x1171820, processReturning=true, canSetTag=true, changingPart=false, tupleDeleted=0x0, epqreturnslot=0x0)
at nodeModifyTable.c:1128
#4 0x00000000007724cc in ExecModifyTable (pstate=0x1171a90) at nodeModifyTable.c:2259

I had a look at it, and it looks to me that the slot being reused in AfterTriggerSaveEvent() (when (map != NULL) and not (!storeslot)) has previously been damaged by FreeExecutorState() in standard_ExecutorEnd().

Right.

Then I ended up with the enclosed patch proposal that does not make the repro segfaulting and that is passing make check too.

Not sure that what is being done in the patch is the right/best approach to solve the issue though.

Hmm, I don't think we should be *freshly* allocating the
TupleTableSlot every time. Not having to do that is why the commit
ff11e7f4b9ae0 seems to have invented AfterTriggersTableData.storeslot
in the first place, that is, to cache the slot once created.

Problem with the current way as you've discovered is that the slot
gets allocated in the execution-span memory context, whereas the
AfterTriggersTableData instance, of which the slot is a part, is
supposed to last the entire (sub-)transaction. So the correct fix I
think is to allocate the slot to be stored in
AfterTriggersTableData.storeslot in the transaction-span memory
context as well.

Having taken care of that, another problem with the current way is
that it adds the slot to es_tupleTable by calling
ExecAllocTupleSlot(), which means that the slot will be released when
ExecResetTupleTable() is called on es_tupleTable as part of
ExecEndPlan(). That would defeat the point of allocating the slot in
the transaction-span context. So let's use MakeSingleTableSlot() to
allocate a standalone slot to be stored in
AfterTriggersTableData.storeslot and have AfterTriggerFreeQuery() call
ExecDropSingleTupleTableSlot() to release it.

PS: the same repro with a foreign key with update cascade, an update trigger and an update on the col1 column would segfault too (but does not with the enclosed patch proposal).

Actually, we also need to fix similar code in the block for populating
the transition NEW TABLE, because without doing so that block too can
crash similarly:

if (!TupIsNull(newslot) &&
((event == TRIGGER_EVENT_INSERT && insert_new_table) ||
(event == TRIGGER_EVENT_UPDATE && update_new_table)))

I've attached a patch with my suggested fixes and also test cases.
Please take a look.

--
Amit Langote
EDB: http://www.enterprisedb.com

Attachments:

v2-0001-Fix-use-after-tree-bug-with-AfterTriggersTableDat.patchapplication/octet-stream; name=v2-0001-Fix-use-after-tree-bug-with-AfterTriggersTableDat.patchDownload

From 42451e1c5d84ba00ace2490fd245d354185c5cd3 Mon Sep 17 00:00:00 2001
From: amitlan <amitlangote09@gmail.com>
Date: Wed, 24 Feb 2021 16:58:48 +0900
Subject: [PATCH v2] Fix use-after-tree bug with
 AfterTriggersTableData.storeslot

AfterTriggerSaveEvent() wrongly allocates the slot in execution-span
memory context, whereas the correct thing is to allocate it in
a transaction-span context, because that's where the enclosing
AfterTriggersTableData instance belongs.

Author: Bertrand Drouvot
Author: Amit Langote
---
 src/backend/commands/trigger.c         | 59 ++++++++++++++++--------
 src/test/regress/expected/triggers.out | 58 +++++++++++++++++++++++
 src/test/regress/sql/triggers.sql      | 64 ++++++++++++++++++++++++++
 3 files changed, 162 insertions(+), 19 deletions(-)

diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
index 8908847c6c..7275d5b4cd 100644
--- a/src/backend/commands/trigger.c
+++ b/src/backend/commands/trigger.c
@@ -3536,6 +3536,8 @@ static void AfterTriggerExecute(EState *estate,
 								TupleTableSlot *trig_tuple_slot2);
 static AfterTriggersTableData *GetAfterTriggersTableData(Oid relid,
 														 CmdType cmdType);
+static TupleTableSlot *GetAfterTriggerTableStoreSlot(AfterTriggersTableData *table,
+							  TupleDesc tupdesc);
 static void AfterTriggerFreeQuery(AfterTriggersQueryData *qs);
 static SetConstraintState SetConstraintStateCreate(int numalloc);
 static SetConstraintState SetConstraintStateCopy(SetConstraintState state);
@@ -4336,6 +4338,38 @@ GetAfterTriggersTableData(Oid relid, CmdType cmdType)
 	return table;
 }
 
+/*
+ * Returns the TupleTableSlot suitable for holding the tuples to be put
+ * into transition table tuplestores.
+ */
+static TupleTableSlot *
+GetAfterTriggerTableStoreSlot(AfterTriggersTableData *table,
+							  TupleDesc tupdesc)
+{
+	TupleTableSlot *storeslot = table->storeslot;
+
+	/* Create it if not already done. */
+	if (!storeslot)
+	{
+		/*
+		 * Make the slot valid until end of subtransaction.  We really only
+		 * need it until AfterTriggerEndQuery().
+		 */
+		MemoryContext	oldcxt = MemoryContextSwitchTo(CurTransactionContext);
+
+		storeslot = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
+
+		/*
+		 * Remember it for future requests.  Will be released in
+		 * AfterTriggerFreeQuery().
+		 */
+		table->storeslot = storeslot;
+		MemoryContextSwitchTo(oldcxt);
+	}
+
+	return storeslot;
+}
+
 
 /*
  * MakeTransitionCaptureState
@@ -4625,6 +4659,8 @@ AfterTriggerFreeQuery(AfterTriggersQueryData *qs)
 		table->new_tuplestore = NULL;
 		if (ts)
 			tuplestore_end(ts);
+		if (table->storeslot)
+			ExecDropSingleTupleTableSlot(table->storeslot);
 	}
 
 	/*
@@ -5474,17 +5510,10 @@ AfterTriggerSaveEvent(EState *estate, ResultRelInfo *relinfo,
 
 			if (map != NULL)
 			{
+				AfterTriggersTableData *table = transition_capture->tcs_private;
 				TupleTableSlot *storeslot;
 
-				storeslot = transition_capture->tcs_private->storeslot;
-				if (!storeslot)
-				{
-					storeslot = ExecAllocTableSlot(&estate->es_tupleTable,
-												   map->outdesc,
-												   &TTSOpsVirtual);
-					transition_capture->tcs_private->storeslot = storeslot;
-				}
-
+				storeslot = GetAfterTriggerTableStoreSlot(table, map->outdesc);
 				execute_attr_map_slot(map->attrMap, oldslot, storeslot);
 				tuplestore_puttupleslot(old_tuplestore, storeslot);
 			}
@@ -5504,18 +5533,10 @@ AfterTriggerSaveEvent(EState *estate, ResultRelInfo *relinfo,
 										original_insert_tuple);
 			else if (map != NULL)
 			{
+				AfterTriggersTableData *table = transition_capture->tcs_private;
 				TupleTableSlot *storeslot;
 
-				storeslot = transition_capture->tcs_private->storeslot;
-
-				if (!storeslot)
-				{
-					storeslot = ExecAllocTableSlot(&estate->es_tupleTable,
-												   map->outdesc,
-												   &TTSOpsVirtual);
-					transition_capture->tcs_private->storeslot = storeslot;
-				}
-
+				storeslot = GetAfterTriggerTableStoreSlot(table, map->outdesc);
 				execute_attr_map_slot(map->attrMap, newslot, storeslot);
 				tuplestore_puttupleslot(new_tuplestore, storeslot);
 			}
diff --git a/src/test/regress/expected/triggers.out b/src/test/regress/expected/triggers.out
index b263002293..6832a30cfa 100644
--- a/src/test/regress/expected/triggers.out
+++ b/src/test/regress/expected/triggers.out
@@ -3290,3 +3290,61 @@ create trigger aft_row after insert or update on trigger_parted
 create table trigger_parted_p1 partition of trigger_parted for values in (1)
   partition by list (a);
 create table trigger_parted_p1_1 partition of trigger_parted_p1 for values in (1);
+-- test cases for a bug around AfterTriggersTableData.storeslot
+create table convslot_test_parent (col1 text primary key);
+create table convslot_test_child (col1 text primary key,
+	foreign key (col1) references convslot_test_parent(col1) on delete cascade on update cascade
+);
+alter table convslot_test_child add column col2 text not null default 'tutu';
+insert into convslot_test_parent(col1) values ('1');
+insert into convslot_test_child(col1) values ('1');
+insert into convslot_test_parent(col1) values ('3');
+insert into convslot_test_child(col1) values ('3');
+create or replace function trigger_function1()
+returns trigger
+language plpgsql
+AS $$
+begin
+raise notice 'trigger = %, old_table = %',
+          TG_NAME,
+          (select string_agg(old_table::text, ', ' order by col1) from old_table);
+return null;
+end; $$;
+create or replace function trigger_function2()
+returns trigger
+language plpgsql
+AS $$
+begin
+raise notice 'trigger = %, new table = %',
+          TG_NAME,
+          (select string_agg(new_table::text, ', ' order by col1) from new_table);
+return null;
+end; $$;
+create trigger but_trigger after update on convslot_test_child
+referencing new table as new_table
+for each statement execute function trigger_function2();
+update convslot_test_parent set col1 = col1 || '1';
+NOTICE:  trigger = but_trigger, new table = (11,tutu), (31,tutu)
+create or replace function trigger_function3()
+returns trigger
+language plpgsql
+AS $$
+begin
+raise notice 'trigger = %, old_table = %, new table = %',
+          TG_NAME,
+          (select string_agg(old_table::text, ', ' order by col1) from old_table),
+          (select string_agg(new_table::text, ', ' order by col1) from new_table);
+return null;
+end; $$;
+create trigger but_trigger2 after update on convslot_test_child
+referencing old table as old_table new table as new_table
+for each statement execute function trigger_function3();
+update convslot_test_parent set col1 = col1 || '1';
+NOTICE:  trigger = but_trigger, new table = (111,tutu), (311,tutu)
+NOTICE:  trigger = but_trigger2, old_table = (11,tutu), (31,tutu), new table = (111,tutu), (311,tutu)
+create trigger bdt_trigger after delete on convslot_test_child
+referencing old table as old_table
+for each statement execute function trigger_function1();
+delete from convslot_test_parent;
+NOTICE:  trigger = bdt_trigger, old_table = (111,tutu), (311,tutu)
+drop table convslot_test_child, convslot_test_parent;
diff --git a/src/test/regress/sql/triggers.sql b/src/test/regress/sql/triggers.sql
index 01f66af699..dac59a5003 100644
--- a/src/test/regress/sql/triggers.sql
+++ b/src/test/regress/sql/triggers.sql
@@ -2470,3 +2470,67 @@ create trigger aft_row after insert or update on trigger_parted
 create table trigger_parted_p1 partition of trigger_parted for values in (1)
   partition by list (a);
 create table trigger_parted_p1_1 partition of trigger_parted_p1 for values in (1);
+
+-- test cases for a bug around AfterTriggersTableData.storeslot
+create table convslot_test_parent (col1 text primary key);
+create table convslot_test_child (col1 text primary key,
+	foreign key (col1) references convslot_test_parent(col1) on delete cascade on update cascade
+);
+
+alter table convslot_test_child add column col2 text not null default 'tutu';
+insert into convslot_test_parent(col1) values ('1');
+insert into convslot_test_child(col1) values ('1');
+insert into convslot_test_parent(col1) values ('3');
+insert into convslot_test_child(col1) values ('3');
+
+create or replace function trigger_function1()
+returns trigger
+language plpgsql
+AS $$
+begin
+raise notice 'trigger = %, old_table = %',
+          TG_NAME,
+          (select string_agg(old_table::text, ', ' order by col1) from old_table);
+return null;
+end; $$;
+
+create or replace function trigger_function2()
+returns trigger
+language plpgsql
+AS $$
+begin
+raise notice 'trigger = %, new table = %',
+          TG_NAME,
+          (select string_agg(new_table::text, ', ' order by col1) from new_table);
+return null;
+end; $$;
+
+create trigger but_trigger after update on convslot_test_child
+referencing new table as new_table
+for each statement execute function trigger_function2();
+
+update convslot_test_parent set col1 = col1 || '1';
+
+create or replace function trigger_function3()
+returns trigger
+language plpgsql
+AS $$
+begin
+raise notice 'trigger = %, old_table = %, new table = %',
+          TG_NAME,
+          (select string_agg(old_table::text, ', ' order by col1) from old_table),
+          (select string_agg(new_table::text, ', ' order by col1) from new_table);
+return null;
+end; $$;
+
+create trigger but_trigger2 after update on convslot_test_child
+referencing old table as old_table new table as new_table
+for each statement execute function trigger_function3();
+update convslot_test_parent set col1 = col1 || '1';
+
+create trigger bdt_trigger after delete on convslot_test_child
+referencing old table as old_table
+for each statement execute function trigger_function1();
+delete from convslot_test_parent;
+
+drop table convslot_test_child, convslot_test_parent;
-- 
2.24.1

Drouvot, Bertrand

bdrouvot@amazon.com

almost 5 years ago

In reply to: Amit Langote (#2)

Re: [BUG] segfault during delete

Hi Amit,

On 2/24/21 9:12 AM, Amit Langote wrote:

Hi Bertrand,

On Wed, Feb 24, 2021 at 5:56 AM Drouvot, Bertrand <bdrouvot@amazon.com> wrote:

Hi hackers,

Here is a scenario that segfault during delete (with version >= 12):

create table parent (
col1 text primary key
);

create table child (
col1 text primary key,
FOREIGN KEY (col1) REFERENCES parent(col1) on delete cascade
);

CREATE or replace FUNCTION trigger_function()
RETURNS TRIGGER
LANGUAGE PLPGSQL
AS $$
BEGIN
raise notice 'trigger = %, old table = %',
TG_NAME,
(select string_agg(old_table::text, ', ' order by col1) from old_table);
return NULL;
END;
$$
;

create trigger bdt_trigger after delete on child REFERENCING OLD TABLE AS old_table for each statement EXECUTE function trigger_function();

alter table child add column col2 text not null default 'tutu';
insert into parent(col1) values ('1');
insert into child(col1) values ('1');
insert into parent(col1) values ('2');
insert into child(col1) values ('2');
delete from parent;

produces:

CREATE TABLE
CREATE TABLE
CREATE FUNCTION
CREATE TRIGGER
ALTER TABLE
INSERT 0 1
INSERT 0 1
INSERT 0 1
INSERT 0 1
psql:./segfault.repro.sql:35: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
psql:./segfault.repro.sql:35: fatal: connection to server was lost

The column being added to the child relation needs to have a default value for the segfault to be triggered.

The stack is the following:

#0 0x000000000049aa59 in execute_attr_map_slot (attrMap=0x11736c8, in_slot=0x1179c90, out_slot=0x1179da8) at tupconvert.c:193
#1 0x0000000000700ec0 in AfterTriggerSaveEvent (estate=0x1171820, relinfo=0x1171cc8, event=1, row_trigger=true, oldslot=0x1179c90, newslot=0x0, recheckIndexes=0x0, modifiedCols=0x0, transition_capture=0x1172438) at trigger.c:5488
#2 0x00000000006fc6a8 in ExecARDeleteTriggers (estate=0x1171820, relinfo=0x1171cc8, tupleid=0x7ffd2b7f2e40, fdw_trigtuple=0x0, transition_capture=0x1172438) at trigger.c:2565
#3 0x0000000000770794 in ExecDelete (mtstate=0x1171a90, resultRelInfo=0x1171cc8, tupleid=0x7ffd2b7f2e40, oldtuple=0x0, planSlot=0x1173630, epqstate=0x1171b88, estate=0x1171820, processReturning=true, canSetTag=true, changingPart=false, tupleDeleted=0x0, epqreturnslot=0x0)
at nodeModifyTable.c:1128
#4 0x00000000007724cc in ExecModifyTable (pstate=0x1171a90) at nodeModifyTable.c:2259

I had a look at it, and it looks to me that the slot being reused in AfterTriggerSaveEvent() (when (map != NULL) and not (!storeslot)) has previously been damaged by FreeExecutorState() in standard_ExecutorEnd().

Right.

Thanks for reviewing the issue and the patch!

Then I ended up with the enclosed patch proposal that does not make the repro segfaulting and that is passing make check too.

Not sure that what is being done in the patch is the right/best approach to solve the issue though.

Hmm, I don't think we should be *freshly* allocating the
TupleTableSlot every time. Not having to do that is why the commit
ff11e7f4b9ae0 seems to have invented AfterTriggersTableData.storeslot
in the first place, that is, to cache the slot once created.

Oh, ok thanks for the explanation.

Problem with the current way as you've discovered is that the slot
gets allocated in the execution-span memory context, whereas the
AfterTriggersTableData instance, of which the slot is a part, is
supposed to last the entire (sub-)transaction.

Right.

So the correct fix I
think is to allocate the slot to be stored in
AfterTriggersTableData.storeslot in the transaction-span memory
context as well.

Right, that makes more sense that what my patch was doing.

Having taken care of that, another problem with the current way is
that it adds the slot to es_tupleTable by calling
ExecAllocTupleSlot(), which means that the slot will be released when
ExecResetTupleTable() is called on es_tupleTable as part of
ExecEndPlan(). That would defeat the point of allocating the slot in
the transaction-span context. So let's use MakeSingleTableSlot() to
allocate a standalone slot to be stored in
AfterTriggersTableData.storeslot and have AfterTriggerFreeQuery() call
ExecDropSingleTupleTableSlot() to release it.

PS: the same repro with a foreign key with update cascade, an update trigger and an update on the col1 column would segfault too (but does not with the enclosed patch proposal).

Actually, we also need to fix similar code in the block for populating
the transition NEW TABLE, because without doing so that block too can
crash similarly:

if (!TupIsNull(newslot) &&
((event == TRIGGER_EVENT_INSERT && insert_new_table) ||
(event == TRIGGER_EVENT_UPDATE && update_new_table)))

I've attached a patch with my suggested fixes and also test cases.
Please take a look.

I had a look and it looks good to me. Also the new regression tests are
doing it right and are segfaulting without your patch.

That's all good to me.

That fix should be back ported to 12 and 13, do you agree?

Thanks a lot for your help and explanations!

Bertrand

Amit Langote

amitlangote09@gmail.com

almost 5 years ago

In reply to: Drouvot, Bertrand (#3)

Re: [BUG] segfault during delete

On Wed, Feb 24, 2021 at 6:12 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote:

On 2/24/21 9:12 AM, Amit Langote wrote:

I've attached a patch with my suggested fixes and also test cases.
Please take a look.

I had a look and it looks good to me. Also the new regression tests are
doing it right and are segfaulting without your patch.

That's all good to me.

Thanks.

That fix should be back ported to 12 and 13, do you agree?

Yeah, I think so.

--
Amit Langote
EDB: http://www.enterprisedb.com

Drouvot, Bertrand

bdrouvot@amazon.com

almost 5 years ago

In reply to: Amit Langote (#4)

Re: [BUG] segfault during delete

On 2/24/21 1:59 PM, Amit Langote wrote:

On Wed, Feb 24, 2021 at 6:12 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote:

On 2/24/21 9:12 AM, Amit Langote wrote:

I've attached a patch with my suggested fixes and also test cases.
Please take a look.

I had a look and it looks good to me. Also the new regression tests are
doing it right and are segfaulting without your patch.

That's all good to me.

Thanks.

That fix should be back ported to 12 and 13, do you agree?

Yeah, I think so.

great, patch added to the CF.

Bertrand

Álvaro Herrera

alvherre@alvh.no-ip.org

almost 5 years ago

In reply to: Amit Langote (#4)

Re: [BUG] segfault during delete

Thanks Amit for working on this fix! It seems correct to me, so I pushed it with trivial changes. Thanks Bertrand for reporting the problem.

In addition to backpatching the code fix to pg12, I backpatched the test case to pg11. It worked fine for me (with no code changes), but it seems good to have it there just to make sure the buildfarm agrees with us on this.

Amit Langote

amitlangote09@gmail.com

almost 5 years ago

In reply to: Álvaro Herrera (#6)

Re: [BUG] segfault during delete

On Sun, Feb 28, 2021 at 6:14 AM Álvaro Herrera <alvherre@alvh.no-ip.org> wrote:

Thanks Amit for working on this fix! It seems correct to me, so I pushed it with trivial changes. Thanks Bertrand for reporting the problem.

Thanks Alvaro.

In addition to backpatching the code fix to pg12, I backpatched the test case to pg11. It worked fine for me (with no code changes), but it seems good to have it there just to make sure the buildfarm agrees with us on this.

Ah, good call.

--
Amit Langote
EDB: http://www.enterprisedb.com

Drouvot, Bertrand

bdrouvot@amazon.com

almost 5 years ago

In reply to: Amit Langote (#7)

Re: [BUG] segfault during delete

Thanks Amit and Alvaro for having helped to fix this bug so quickly.

Bertrand

Show quoted text

On 3/1/21 3:22 AM, Amit Langote wrote:

CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.

On Sun, Feb 28, 2021 at 6:14 AM Álvaro Herrera <alvherre@alvh.no-ip.org> wrote:

Thanks Amit for working on this fix! It seems correct to me, so I pushed it with trivial changes. Thanks Bertrand for reporting the problem.

Thanks Alvaro.

In addition to backpatching the code fix to pg12, I backpatched the test case to pg11. It worked fine for me (with no code changes), but it seems good to have it there just to make sure the buildfarm agrees with us on this.

Ah, good call.

--
Amit Langote
EDB: http://www.enterprisedb.com