sublink [exists (select xxx group by grouping sets ())] causes an assertion error
Hello postgres hackers:
I recently noticed that the following SQL can lead to an assertion failure in PG 14 and older versions. Here is an example:
postgres=> CREATE TABLE t1 (a int);
CREATE TABLE
postgres=> INSERT INTO t1 VALUES (1);
INSERT 0 1
postgres=> SELECT EXISTS ( SELECT * FROM t1 GROUP BY GROUPING SETS ((a), generate_series (1, 262144)) ) AS result;
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.

Here is the backtrace (release v14.11):
#0  0x00007fdc4d58d387 in raise () from /lib64/libc.so.6
#1  0x00007fdc4d58ea78 in abort () from /lib64/libc.so.6
#2  0x00000000009479aa in ExceptionalCondition (conditionName=conditionName@entry=0xb27bf0 "!lt->writing || lt->buffer == NULL", errorType=errorType@entry=0x99a00b "FailedAssertion", fileName=fileName@entry=0xb27989 "logtape.c", lineNumber=lineNumber@entry=1279) at assert.c:69
#3  0x000000000097715b in LogicalTapeSetBlocks (lts=<optimized out>) at logtape.c:1279
#4  0x00000000009793ab in tuplesort_free (state=0x1ee30e0) at tuplesort.c:1402
#5  0x000000000097f2b9 in tuplesort_end (state=0x1ee30e0) at tuplesort.c:1468
#6  0x00000000006cd3a1 in ExecEndAgg (node=0x1ecf2c8) at nodeAgg.c:4401
#7  0x00000000006be3b1 in ExecEndNode (node=<optimized out>) at execProcnode.c:733
#8  0x00000000006b8514 in ExecEndPlan (estate=0x1ecf050, planstate=<optimized out>) at execMain.c:1416
#9  standard_ExecutorEnd (queryDesc=0x1e18af0) at execMain.c:493
#10 0x0000000000673779 in PortalCleanup (portal=<optimized out>) at portalcmds.c:299
#11 0x0000000000972b34 in PortalDrop (portal=0x1e586c0, isTopCommit=<optimized out>) at portalmem.c:502
#12 0x0000000000834140 in exec_simple_query (query_string=0x1df5020 "SELECT\nEXISTS (\nSELECT\n* FROM t1\nGROUP BY\nGROUPING SETS ((a), generate_series (1, 26214400))\n) AS result;") at postgres.c:1223
#13 0x0000000000835e37 in PostgresMain (argc=argc@entry=1, argv=argv@entry=0x7fff746c4470, dbname=<optimized out>, username=<optimized out>) at postgres.c:4513
#14 0x00000000007acdcb in BackendRun (port=<optimized out>, port=<optimized out>) at postmaster.c:4540
#15 BackendStartup (port=<optimized out>) at postmaster.c:4262
#16 ServerLoop () at postmaster.c:1748
#17 0x00000000007adc45 in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x1defb50) at postmaster.c:1420
#18 0x000000000050a544 in main (argc=3, argv=0x1defb50) at main.c:209
The reason seems to be that the grouping sets in this EXISTS sublink require multiple aggregation phases. During execution, once ExecAgg returns a valid tuple slot, ExecSubPlan does not call the executor node again (indeed, there is no need to). That leaves sort_in and sort_out in an unexpected state, so they fail the assertion check reached from ExecEndAgg.
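For reference, here is a rough sketch of the check that trips, paraphrased from the v14-era logtape.c (not the exact source):

/*
 * ExecEndAgg() -> tuplesort_end() -> tuplesort_free() -> LogicalTapeSetBlocks().
 * If the EXISTS sublink abandoned the Agg node after its first tuple, a spill
 * tape can still be in the middle of a write, so the assertion fires.
 */
long
LogicalTapeSetBlocks(LogicalTapeSet *lts)
{
#ifdef USE_ASSERT_CHECKING
	for (int i = 0; i < lts->nTapes; i++)
	{
		LogicalTape *lt = &lts->tapes[i];

		Assert(!lt->writing || lt->buffer == NULL);	/* <- fails here */
	}
#endif
	return lts->nBlocksWritten - lts->nHoleBlocks;
}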
I haven't thought of a good solution yet. In my opinion, it is unreasonable to add handling for specific types of executor nodes in ExecSubPlan, but there is also no way for ExecAgg to know whether it will be executed again.
Best regards,
Tinghai Zhao
Attachments:
0001-fix-crash-in-exists-sublink.patch (application/octet-stream)
From cc9327b8f207c70fa35ba4bc20fac8ddbeb4eab1 Mon Sep 17 00:00:00 2001
From: "zhaotinghai.zth" <zhaotinghai.zth@alibaba-inc.com>
Date: Fri, 22 Mar 2024 16:41:31 +0800
Subject: [PATCH] patch
---
src/backend/executor/nodeSubplan.c | 9 +++++++++
src/backend/utils/sort/logtape.c | 15 +++++++++++++++
src/backend/utils/sort/tuplesort.c | 9 +++++++++
src/include/utils/logtape.h | 1 +
src/include/utils/tuplesort.h | 1 +
5 files changed, 35 insertions(+)
diff --git a/src/backend/executor/nodeSubplan.c b/src/backend/executor/nodeSubplan.c
index 45f8ef518f..fb6d575885 100644
--- a/src/backend/executor/nodeSubplan.c
+++ b/src/backend/executor/nodeSubplan.c
@@ -1147,6 +1147,15 @@ ExecSetParamPlan(SubPlanState *node, ExprContext *econtext)
prm->value = BoolGetDatum(true);
prm->isnull = false;
found = true;
+
+ if (IsA(planstate, AggState))
+ {
+ AggState *state = (AggState *) planstate;
+
+ tuplesort_cleanup_tape(state->sort_in);
+ tuplesort_cleanup_tape(state->sort_out);
+ }
+
break;
}
diff --git a/src/backend/utils/sort/logtape.c b/src/backend/utils/sort/logtape.c
index c361e47cee..180ca1892b 100644
--- a/src/backend/utils/sort/logtape.c
+++ b/src/backend/utils/sort/logtape.c
@@ -1281,3 +1281,18 @@ LogicalTapeSetBlocks(LogicalTapeSet *lts)
#endif
return lts->nBlocksWritten - lts->nHoleBlocks;
}
+
+void
+LogicalTapeFreeMem(LogicalTapeSet *lts)
+{
+ for (int i = 0; i < lts->nTapes; i++)
+ {
+ LogicalTape *lt = &lts->tapes[i];
+
+ if (lt->buffer)
+ {
+ pfree(lt->buffer);
+ lt->buffer = NULL;
+ }
+ }
+}
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 9f1df87438..6050efceb8 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -4606,6 +4606,15 @@ tuplesort_attach_shared(Sharedsort *shared, dsm_segment *seg)
SharedFileSetAttach(&shared->fileset, seg);
}
+void
+tuplesort_cleanup_tape(Tuplesortstate *state)
+{
+ if (state == NULL)
+ return;
+
+ LogicalTapeFreeMem(state->tapeset);
+}
+
/*
* worker_get_identifier - Assign and return ordinal identifier for worker
*
diff --git a/src/include/utils/logtape.h b/src/include/utils/logtape.h
index 85d2e03c63..e5601f1a1e 100644
--- a/src/include/utils/logtape.h
+++ b/src/include/utils/logtape.h
@@ -75,6 +75,7 @@ extern void LogicalTapeSeek(LogicalTapeSet *lts, int tapenum,
long blocknum, int offset);
extern void LogicalTapeTell(LogicalTapeSet *lts, int tapenum,
long *blocknum, int *offset);
+extern void LogicalTapeFreeMem(LogicalTapeSet *lts);
extern long LogicalTapeSetBlocks(LogicalTapeSet *lts);
#endif /* LOGTAPE_H */
diff --git a/src/include/utils/tuplesort.h b/src/include/utils/tuplesort.h
index f94949370b..f726540d84 100644
--- a/src/include/utils/tuplesort.h
+++ b/src/include/utils/tuplesort.h
@@ -266,6 +266,7 @@ extern Size tuplesort_estimate_shared(int nworkers);
extern void tuplesort_initialize_shared(Sharedsort *shared, int nWorkers,
dsm_segment *seg);
extern void tuplesort_attach_shared(Sharedsort *shared, dsm_segment *seg);
+extern void tuplesort_cleanup_tape(Tuplesortstate *state);
/*
* These routines may only be called if randomAccess was specified 'true'.
--
2.39.3
"=?UTF-8?B?6LW15bqt5rW3KOW6reeroCk=?=" <zhaotinghai.zth@alibaba-inc.com> writes:
I recently noticed that the following SQL can lead to an assertion failure in PG 14 and older versions. Here is an example:
postgres=> CREATE TABLE t1 (a int);
postgres=> INSERT INTO t1 VALUES (1);
postgres=> SELECT EXISTS ( SELECT * FROM t1 GROUP BY GROUPING SETS ((a), generate_series (1, 262144)) ) AS result;
server closed the connection unexpectedly
In current v14 this produces:
TRAP: FailedAssertion("!lt->writing || lt->buffer == NULL", File: "logtape.c", Line: 1279, PID: 928622)
Thanks for the report. I did some bisecting and found that the crash
appears at Jeff's commit c8aeaf3ab (which introduced this assertion)
and disappears at Heikki's c4649cce3 (which removed it). So I would
say that the problem is "this assertion is wrong", and we should fix
the problem by fixing the assertion, not by hacking around in distant
calling code. On the whole, since this code has been dead for
several versions, I'd be inclined to just remove the assertion.
I think it's quite risky because of the possibility that we reach
this function during post-transaction-abort cleanup, when there's
no very good reason to assume that the tapeset's been closed down
cleanly. (To be clear, that's not what's happening in the given
test case ... but I fear that it could.)
regards, tom lane
On Fri, 2024-03-22 at 12:28 -0400, Tom Lane wrote:
Thanks for the report. I did some bisecting and found that the crash
appears at Jeff's commit c8aeaf3ab (which introduced this assertion)
and disappears at Heikki's c4649cce3 (which removed it). So I would
say that the problem is "this assertion is wrong", and we should fix
the problem by fixing the assertion, not by hacking around in distant
calling code. On the whole, since this code has been dead for
several versions, I'd be inclined to just remove the assertion.
c4649cce3 didn't add additional calls to LogicalTapeSetBlocks(), so I'm
not sure if the removal of the Assert was related to his changes, or if
he just realized the assertion was wrong and removed it along the way?
Also, without the assertion, the word "should" in the comment is
ambiguous (does it mean "must not", or something else?), and that
comment still exists in master. Do we care about the calculation
being wrong if there's an unfinished write? If not, I'll just
clarify that the calculation doesn't take still-buffered data into
account. If we do care, then something might need to be fixed.
Regards,
Jeff Davis
Jeff Davis <pgsql@j-davis.com> writes:
On Fri, 2024-03-22 at 12:28 -0400, Tom Lane wrote:
Thanks for the report. I did some bisecting and found that the crash
appears at Jeff's commit c8aeaf3ab (which introduced this assertion)
and disappears at Heikki's c4649cce3 (which removed it). So I would
say that the problem is "this assertion is wrong", and we should fix
the problem by fixing the assertion, not by hacking around in distant
calling code. On the whole, since this code has been dead for
several versions, I'd be inclined to just remove the assertion.
c4649cce3 didn't add additional calls to LogicalTapeSetBlocks(), so I'm
not sure if the removal of the Assert was related to his changes, or if
he just realized the assertion was wrong and removed it along the way?
My guess is he just zapped it because the code block was dependent
on the "tape" abstraction which was going away. Heikki?
Also, without the assertion, the word "should" in the comment is
ambiguous (does it mean "must not", or something else?), and that
comment still exists in master. Do we care about the calculation
being wrong if there's an unfinished write?
I think it's actually fine. The callers of LogicalTapeSetBlocks only
use the number for statistics or trace reporting, so precision isn't
critical in the first place, but I think what they care about is the
amount of data that's really been written out to files. The
comment should be clarified, for sure.
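For instance, something along these lines (illustrative wording only,
nothing I've actually committed):

/*
 * LogicalTapeSetBlocks: note that only blocks already written out to the
 * underlying file(s) are counted; data still sitting in a tape's write
 * buffer is not included, so the result can undercount slightly while a
 * write is in progress.  Callers only use this number for statistics and
 * trace output, so that imprecision is acceptable.
 */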
regards, tom lane