Memory leak in incremental sort re-scan

Started by Laurenz Albeover 2 years ago10 messages
#1Laurenz Albe
laurenz.albe@cybertec.at

ExecIncrementalSort() calls tuplesort_begin_common(), which creates the "TupleSort main"
and "TupleSort sort" memory contexts, and ExecEndIncrementalSort() calls tuplesort_end(),
which destroys them.
But ExecReScanIncrementalSort() only resets the memory contexts. Since the next call to
ExecIncrementalSort() will create them again, we end up leaking these contexts for every
re-scan.

Here is a reproducer with the regression test database:

SET enable_sort = off;
SET enable_hashjoin = off;
SET enable_mergejoin = off;
SET enable_material = off;

SELECT t.unique2, t2.r
FROM tenk1 AS t
JOIN (SELECT unique1,
row_number() OVER (ORDER BY hundred, thousand) AS r
FROM tenk1
OFFSET 0) AS t2
ON t.unique1 + 0 = t2.unique1
WHERE t.unique1 < 1000;

The execution plan:

Nested Loop
Join Filter: ((t.unique1 + 0) = tenk1.unique1)
-> Bitmap Heap Scan on tenk1 t
Recheck Cond: (unique1 < 1000)
-> Bitmap Index Scan on tenk1_unique1
Index Cond: (unique1 < 1000)
-> WindowAgg
-> Incremental Sort
Sort Key: tenk1.hundred, tenk1.thousand
Presorted Key: tenk1.hundred
-> Index Scan using tenk1_hundred on tenk1

A memory context dump at the end of the execution looks like this:

ExecutorState: 262144 total in 6 blocks; 74136 free (29 chunks); 188008 used
TupleSort main: 32832 total in 2 blocks; 7320 free (0 chunks); 25512 used
TupleSort sort: 8192 total in 1 blocks; 7928 free (0 chunks); 264 used
Caller tuples: 8192 total in 1 blocks (0 chunks); 7984 free (0 chunks); 208 used
TupleSort main: 32832 total in 2 blocks; 7256 free (0 chunks); 25576 used
TupleSort sort: 8192 total in 1 blocks; 7928 free (0 chunks); 264 used
Caller tuples: 8192 total in 1 blocks (0 chunks); 7984 free (0 chunks); 208 used
TupleSort main: 32832 total in 2 blocks; 7320 free (0 chunks); 25512 used
TupleSort sort: 8192 total in 1 blocks; 7928 free (0 chunks); 264 used
Caller tuples: 8192 total in 1 blocks (0 chunks); 7984 free (0 chunks); 208 used
[many more]
1903 more child contexts containing 93452928 total in 7597 blocks; 44073240 free (0 chunks); 49379688 used

The following patch fixes the problem for me:

--- a/src/backend/executor/nodeIncrementalSort.c
+++ b/src/backend/executor/nodeIncrementalSort.c
@@ -1145,21 +1145,16 @@ ExecReScanIncrementalSort(IncrementalSortState *node)
    node->execution_status = INCSORT_LOADFULLSORT;
    /*
-    * If we've set up either of the sort states yet, we need to reset them.
-    * We could end them and null out the pointers, but there's no reason to
-    * repay the setup cost, and because ExecIncrementalSort guards presorted
-    * column functions by checking to see if the full sort state has been
-    * initialized yet, setting the sort states to null here might actually
-    * cause a leak.
+    * Release tuplesort resources.
     */
    if (node->fullsort_state != NULL)
    {
-       tuplesort_reset(node->fullsort_state);
+       tuplesort_end(node->fullsort_state);
        node->fullsort_state = NULL;
    }
    if (node->prefixsort_state != NULL)
    {
-       tuplesort_reset(node->prefixsort_state);
+       tuplesort_end(node->prefixsort_state);
        node->prefixsort_state = NULL;
    }

The original comment hints that this might mot be the correct thing to do...

Yours,
Laurenz Albe

#2Tomas Vondra
tomas.vondra@enterprisedb.com
In reply to: Laurenz Albe (#1)
Re: Memory leak in incremental sort re-scan

Hi,

On 6/15/23 13:48, Laurenz Albe wrote:

ExecIncrementalSort() calls tuplesort_begin_common(), which creates the "TupleSort main"
and "TupleSort sort" memory contexts, and ExecEndIncrementalSort() calls tuplesort_end(),
which destroys them.
But ExecReScanIncrementalSort() only resets the memory contexts. Since the next call to
ExecIncrementalSort() will create them again, we end up leaking these contexts for every
re-scan.

Here is a reproducer with the regression test database:

SET enable_sort = off;
SET enable_hashjoin = off;
SET enable_mergejoin = off;
SET enable_material = off;

SELECT t.unique2, t2.r
FROM tenk1 AS t
JOIN (SELECT unique1,
row_number() OVER (ORDER BY hundred, thousand) AS r
FROM tenk1
OFFSET 0) AS t2
ON t.unique1 + 0 = t2.unique1
WHERE t.unique1 < 1000;

The execution plan:

Nested Loop
Join Filter: ((t.unique1 + 0) = tenk1.unique1)
-> Bitmap Heap Scan on tenk1 t
Recheck Cond: (unique1 < 1000)
-> Bitmap Index Scan on tenk1_unique1
Index Cond: (unique1 < 1000)
-> WindowAgg
-> Incremental Sort
Sort Key: tenk1.hundred, tenk1.thousand
Presorted Key: tenk1.hundred
-> Index Scan using tenk1_hundred on tenk1

A memory context dump at the end of the execution looks like this:

ExecutorState: 262144 total in 6 blocks; 74136 free (29 chunks); 188008 used
TupleSort main: 32832 total in 2 blocks; 7320 free (0 chunks); 25512 used
TupleSort sort: 8192 total in 1 blocks; 7928 free (0 chunks); 264 used
Caller tuples: 8192 total in 1 blocks (0 chunks); 7984 free (0 chunks); 208 used
TupleSort main: 32832 total in 2 blocks; 7256 free (0 chunks); 25576 used
TupleSort sort: 8192 total in 1 blocks; 7928 free (0 chunks); 264 used
Caller tuples: 8192 total in 1 blocks (0 chunks); 7984 free (0 chunks); 208 used
TupleSort main: 32832 total in 2 blocks; 7320 free (0 chunks); 25512 used
TupleSort sort: 8192 total in 1 blocks; 7928 free (0 chunks); 264 used
Caller tuples: 8192 total in 1 blocks (0 chunks); 7984 free (0 chunks); 208 used
[many more]
1903 more child contexts containing 93452928 total in 7597 blocks; 44073240 free (0 chunks); 49379688 used

The following patch fixes the problem for me:

--- a/src/backend/executor/nodeIncrementalSort.c
+++ b/src/backend/executor/nodeIncrementalSort.c
@@ -1145,21 +1145,16 @@ ExecReScanIncrementalSort(IncrementalSortState *node)
node->execution_status = INCSORT_LOADFULLSORT;
/*
-    * If we've set up either of the sort states yet, we need to reset them.
-    * We could end them and null out the pointers, but there's no reason to
-    * repay the setup cost, and because ExecIncrementalSort guards presorted
-    * column functions by checking to see if the full sort state has been
-    * initialized yet, setting the sort states to null here might actually
-    * cause a leak.
+    * Release tuplesort resources.
*/
if (node->fullsort_state != NULL)
{
-       tuplesort_reset(node->fullsort_state);
+       tuplesort_end(node->fullsort_state);
node->fullsort_state = NULL;
}
if (node->prefixsort_state != NULL)
{
-       tuplesort_reset(node->prefixsort_state);
+       tuplesort_end(node->prefixsort_state);
node->prefixsort_state = NULL;
}

The original comment hints that this might mot be the correct thing to do...

I think it's correct, but I need to look at the code more closely - it's
been a while. The code is a bit silly, as it resets the tuplesort and
then throws away all the pointers - so what could the _end() break?

AFAICS the comment says that we can't just do tuplesort_reset and keep
the pointers, because some other code depends on them being NULL.

In hindsight, that's a bit awkward - it'd probably be better to have a
separate flag, which would allow us to just reset the tuplesort.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tomas Vondra (#2)
1 attachment(s)
Re: Memory leak in incremental sort re-scan

Tomas Vondra <tomas.vondra@enterprisedb.com> writes:

On 6/15/23 13:48, Laurenz Albe wrote:

ExecIncrementalSort() calls tuplesort_begin_common(), which creates the "TupleSort main"
and "TupleSort sort" memory contexts, and ExecEndIncrementalSort() calls tuplesort_end(),
which destroys them.
But ExecReScanIncrementalSort() only resets the memory contexts.

I think it's correct, but I need to look at the code more closely - it's
been a while. The code is a bit silly, as it resets the tuplesort and
then throws away all the pointers - so what could the _end() break?

The report at [1]/messages/by-id/db03c582-086d-e7cd-d4a1-3bc722f81765@inf.ethz.ch seems to be the same issue of ExecReScanIncrementalSort
leaking memory. I applied Laurenz's fix, and that greatly reduces the
speed of leak but doesn't remove the problem entirely. It looks like
the remaining issue is that the data computed by preparePresortedCols() is
recomputed each time we rescan the node. This seems entirely gratuitous,
because there's nothing in that that could change across rescans.
I see zero leakage in that example after applying the attached quick
hack. (It might be better to make the check in the caller, or to just
move the call to ExecInitIncrementalSort.)

regards, tom lane

[1]: /messages/by-id/db03c582-086d-e7cd-d4a1-3bc722f81765@inf.ethz.ch

Attachments:

stop-incremental-sort-rescan-leaks.patchtext/x-diff; charset=us-ascii; name=stop-incremental-sort-rescan-leaks.patchDownload
diff --git a/src/backend/executor/nodeIncrementalSort.c b/src/backend/executor/nodeIncrementalSort.c
index 34257ce34b..655b1c30e1 100644
--- a/src/backend/executor/nodeIncrementalSort.c
+++ b/src/backend/executor/nodeIncrementalSort.c
@@ -166,6 +166,9 @@ preparePresortedCols(IncrementalSortState *node)
 {
 	IncrementalSort *plannode = castNode(IncrementalSort, node->ss.ps.plan);
 
+	if (node->presorted_keys)
+		return;
+
 	node->presorted_keys =
 		(PresortedKeyData *) palloc(plannode->nPresortedCols *
 									sizeof(PresortedKeyData));
@@ -1140,7 +1143,6 @@ ExecReScanIncrementalSort(IncrementalSortState *node)
 	node->outerNodeDone = false;
 	node->n_fullsort_remaining = 0;
 	node->bound_Done = 0;
-	node->presorted_keys = NULL;
 
 	node->execution_status = INCSORT_LOADFULLSORT;
 
@@ -1154,12 +1156,12 @@ ExecReScanIncrementalSort(IncrementalSortState *node)
 	 */
 	if (node->fullsort_state != NULL)
 	{
-		tuplesort_reset(node->fullsort_state);
+		tuplesort_end(node->fullsort_state);
 		node->fullsort_state = NULL;
 	}
 	if (node->prefixsort_state != NULL)
 	{
-		tuplesort_reset(node->prefixsort_state);
+		tuplesort_end(node->prefixsort_state);
 		node->prefixsort_state = NULL;
 	}
 
#4Tomas Vondra
tomas.vondra@enterprisedb.com
In reply to: Tom Lane (#3)
Re: Memory leak in incremental sort re-scan

On 6/15/23 22:11, Tom Lane wrote:

Tomas Vondra <tomas.vondra@enterprisedb.com> writes:

On 6/15/23 13:48, Laurenz Albe wrote:

ExecIncrementalSort() calls tuplesort_begin_common(), which creates the "TupleSort main"
and "TupleSort sort" memory contexts, and ExecEndIncrementalSort() calls tuplesort_end(),
which destroys them.
But ExecReScanIncrementalSort() only resets the memory contexts.

I think it's correct, but I need to look at the code more closely - it's
been a while. The code is a bit silly, as it resets the tuplesort and
then throws away all the pointers - so what could the _end() break?

The report at [1] seems to be the same issue of ExecReScanIncrementalSort
leaking memory.

Funny how these reports often come in pairs ...

I applied Laurenz's fix, and that greatly reduces the
speed of leak but doesn't remove the problem entirely. It looks like
the remaining issue is that the data computed by preparePresortedCols() is
recomputed each time we rescan the node. This seems entirely gratuitous,
because there's nothing in that that could change across rescans.

Yeah, I was wondering about that too when I skimmed over that code
earlier today.

I see zero leakage in that example after applying the attached quick
hack. (It might be better to make the check in the caller, or to just
move the call to ExecInitIncrementalSort.)

Thanks for looking. Are you planning to work on this and push the fix,
or do you want me to finish this up?

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tomas Vondra (#4)
Re: Memory leak in incremental sort re-scan

Tomas Vondra <tomas.vondra@enterprisedb.com> writes:

On 6/15/23 22:11, Tom Lane wrote:

I see zero leakage in that example after applying the attached quick
hack. (It might be better to make the check in the caller, or to just
move the call to ExecInitIncrementalSort.)

Thanks for looking. Are you planning to work on this and push the fix,
or do you want me to finish this up?

I'm happy to let you take it -- got lots of other stuff on my plate.

regards, tom lane

#6Tomas Vondra
tomas.vondra@enterprisedb.com
In reply to: Tom Lane (#5)
Re: Memory leak in incremental sort re-scan

On 6/15/23 22:36, Tom Lane wrote:

Tomas Vondra <tomas.vondra@enterprisedb.com> writes:

On 6/15/23 22:11, Tom Lane wrote:

I see zero leakage in that example after applying the attached quick
hack. (It might be better to make the check in the caller, or to just
move the call to ExecInitIncrementalSort.)

Thanks for looking. Are you planning to work on this and push the fix,
or do you want me to finish this up?

I'm happy to let you take it -- got lots of other stuff on my plate.

OK, will do.

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#7James Coleman
jtc331@gmail.com
In reply to: Tomas Vondra (#6)
1 attachment(s)
Re: Memory leak in incremental sort re-scan

On Thu, Jun 15, 2023 at 6:35 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:

On 6/15/23 22:36, Tom Lane wrote:

Tomas Vondra <tomas.vondra@enterprisedb.com> writes:

On 6/15/23 22:11, Tom Lane wrote:

I see zero leakage in that example after applying the attached quick
hack. (It might be better to make the check in the caller, or to just
move the call to ExecInitIncrementalSort.)

Thanks for looking. Are you planning to work on this and push the fix,
or do you want me to finish this up?

I'm happy to let you take it -- got lots of other stuff on my plate.

OK, will do.

I think the attached is enough to fix it -- rather than nulling out
the sort states in rescan, we can reset them (as the comment says),
but not set them to null (we also have the same mistake with
presorted_keys). That avoids unnecessary recreation of the sort
states, but it also fixes the problem Tom noted as well: the call to
preparePresortedCols() is already guarded by a test on fullsort_state
being NULL, so with this change we also won't unnecessarily redo that
work.

Regards,
James Coleman

Attachments:

v2-0001-Fix-memory-leak-in-incremental-sort-rescan.patchapplication/octet-stream; name=v2-0001-Fix-memory-leak-in-incremental-sort-rescan.patchDownload
From e5848bc3f9de7becc2a5273ff0e0d4685f7905fa Mon Sep 17 00:00:00 2001
From: jcoleman <jtc331@gmail.com>
Date: Wed, 21 Jun 2023 14:31:44 -0400
Subject: [PATCH v2] Fix memory leak in incremental sort rescan

There are multiple issues here that this solves:
1. The sort states are incorrectly set to NULL during a rescan (despite
   the comment saying otherwise), and so we never end them.
2. presorted_keys is also incorrectly set to NULL during a rescan.
2. Because fullsort_state is improperly NULL we additionally call
   preparePresortedCols more than once, leaking memory there as well.
---
 src/backend/executor/nodeIncrementalSort.c | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/src/backend/executor/nodeIncrementalSort.c b/src/backend/executor/nodeIncrementalSort.c
index 34257ce34b..7683e3341c 100644
--- a/src/backend/executor/nodeIncrementalSort.c
+++ b/src/backend/executor/nodeIncrementalSort.c
@@ -1140,7 +1140,6 @@ ExecReScanIncrementalSort(IncrementalSortState *node)
 	node->outerNodeDone = false;
 	node->n_fullsort_remaining = 0;
 	node->bound_Done = 0;
-	node->presorted_keys = NULL;
 
 	node->execution_status = INCSORT_LOADFULLSORT;
 
@@ -1153,15 +1152,9 @@ ExecReScanIncrementalSort(IncrementalSortState *node)
 	 * cause a leak.
 	 */
 	if (node->fullsort_state != NULL)
-	{
 		tuplesort_reset(node->fullsort_state);
-		node->fullsort_state = NULL;
-	}
 	if (node->prefixsort_state != NULL)
-	{
 		tuplesort_reset(node->prefixsort_state);
-		node->prefixsort_state = NULL;
-	}
 
 	/*
 	 * If chgParam of subnode is not null, then the plan will be re-scanned by
-- 
2.39.2 (Apple Git-143)

#8Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Tomas Vondra (#6)
Re: Memory leak in incremental sort re-scan

On Fri, 2023-06-16 at 00:34 +0200, Tomas Vondra wrote:

On 6/15/23 22:36, Tom Lane wrote:

Tomas Vondra <tomas.vondra@enterprisedb.com> writes:

On 6/15/23 22:11, Tom Lane wrote:

I see zero leakage in that example after applying the attached quick
hack.  (It might be better to make the check in the caller, or to just
move the call to ExecInitIncrementalSort.)

Thanks for looking. Are you planning to work on this and push the fix,
or do you want me to finish this up?

I'm happy to let you take it -- got lots of other stuff on my plate.

OK, will do.

It would be cool if we could get that into the next minor release in August.

Yours,
Laurenz Albe

#9Tomas Vondra
tomas.vondra@enterprisedb.com
In reply to: Laurenz Albe (#8)
Re: Memory leak in incremental sort re-scan

On 6/29/23 13:49, Laurenz Albe wrote:

On Fri, 2023-06-16 at 00:34 +0200, Tomas Vondra wrote:

On 6/15/23 22:36, Tom Lane wrote:

Tomas Vondra <tomas.vondra@enterprisedb.com> writes:

On 6/15/23 22:11, Tom Lane wrote:

I see zero leakage in that example after applying the attached quick
hack.  (It might be better to make the check in the caller, or to just
move the call to ExecInitIncrementalSort.)

Thanks for looking. Are you planning to work on this and push the fix,
or do you want me to finish this up?

I'm happy to let you take it -- got lots of other stuff on my plate.

OK, will do.

It would be cool if we could get that into the next minor release in August.

FWIW I've pushed the fix prepared by James a couple days ago. Thanks for
the report!

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#10Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Tomas Vondra (#9)
Re: Memory leak in incremental sort re-scan

On Sun, 2023-07-02 at 20:13 +0200, Tomas Vondra wrote:

FWIW I've pushed the fix prepared by James a couple days ago. Thanks for
the report!

Thanks, and sorry for being pushy.

Yours,
Laurenz Albe