Memory leak with CALL to Procedure with COMMIT.

Started by Prabhat Sahuover 7 years ago13 messages
#1Prabhat Sahu
prabhat.sahu@enterprisedb.com

Hi Hackers,

While testing with PG procedure, I found a memory leak on HEAD, with below
steps:

postgres=# CREATE OR REPLACE PROCEDURE proc1(v1 INOUT INT)
AS $$
BEGIN
commit;
END; $$ LANGUAGE plpgsql;
CREATE PROCEDURE

postgres=# call proc1(10);
WARNING: Snapshot reference leak: Snapshot 0x23678e8 still referenced
v1
----
10
(1 row)

--

With Regards,

Prabhat Kumar Sahu
Skype ID: prabhat.sahu1984
EnterpriseDB Corporation

The Postgres Database Company

#2Michael Paquier
michael@paquier.xyz
In reply to: Prabhat Sahu (#1)
Re: Memory leak with CALL to Procedure with COMMIT.

On Mon, Jul 23, 2018 at 12:19:12PM +0530, Prabhat Sahu wrote:

While testing with PG procedure, I found a memory leak on HEAD, with below
steps:

postgres=# CREATE OR REPLACE PROCEDURE proc1(v1 INOUT INT)
AS $$
BEGIN
commit;
END; $$ LANGUAGE plpgsql;
CREATE PROCEDURE

postgres=# call proc1(10);
WARNING: Snapshot reference leak: Snapshot 0x23678e8 still referenced
v1
----
10
(1 row)

I can reproduce this issue on HEAD and v11, so an open item is added.
Peter, could you look at it?
--
Michael

#3Jonathan S. Katz
jkatz@postgresql.org
In reply to: Michael Paquier (#2)
Re: Memory leak with CALL to Procedure with COMMIT.

On Jul 23, 2018, at 3:06 AM, Michael Paquier <michael@paquier.xyz> wrote:

On Mon, Jul 23, 2018 at 12:19:12PM +0530, Prabhat Sahu wrote:

While testing with PG procedure, I found a memory leak on HEAD, with below
steps:

postgres=# CREATE OR REPLACE PROCEDURE proc1(v1 INOUT INT)
AS $$
BEGIN
commit;
END; $$ LANGUAGE plpgsql;
CREATE PROCEDURE

postgres=# call proc1(10);
WARNING: Snapshot reference leak: Snapshot 0x23678e8 still referenced
v1
----
10
(1 row)

I can reproduce this issue on HEAD and v11, so an open item is added.
Peter, could you look at it?

I tested and was able to reproduce this on head. I also tried a few other other
and was able to reproduce it when the procedure contained a few read-only
statements prior to commit, where the argument passed in was designated "INOUT."

Scenarios 1 & 2 show the leak whereas 3 & 4 do not.

/** Scenario 1: Original scenario */
CREATE OR REPLACE PROCEDURE proc1(v1 INOUT INT)
AS $$
BEGIN
commit;
END; $$ LANGUAGE plpgsql;

CALL proc1(10);

WARNING: Snapshot reference leak: Snapshot 0x7f9519826d18 still referenced
CONTEXT: PL/pgSQL function proc1(integer) line 3 at COMMIT
v1
----
10
(1 row)

/** Scenario 2: call "perform" prior to the commit */
CREATE OR REPLACE PROCEDURE proc2(v1 INOUT INT)
AS $$
BEGIN
PERFORM v1;
COMMIT;
END; $$ LANGUAGE plpgsql;

CALL proc2(10);
WARNING: Snapshot reference leak: Snapshot 0x7f9519826d18 still referenced
CONTEXT: PL/pgSQL function proc2(integer) line 4 at COMMIT
v1
----
10
(1 row)

/** Scenario 3: argument is only IN */
CREATE OR REPLACE PROCEDURE proc3(v1 IN INT)
AS $$
BEGIN
PERFORM v1;
COMMIT;
END; $$ LANGUAGE plpgsql;

CALL proc3(10);
CALL

/** Scenario 4: Same as #2 but with a ROLLBACK */
CREATE OR REPLACE PROCEDURE proc4(v1 INOUT INT)
AS $$
BEGIN
PERFORM v1;
ROLLBACK;
END; $$ LANGUAGE plpgsql;

CALL proc4(10);
CALL

Jonathan

#4Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Jonathan S. Katz (#3)
Re: Memory leak with CALL to Procedure with COMMIT.

The commit that started this is

commit 59a85323d9d5927a852939fa93f09d142c72c91a
Author: Peter Eisentraut <peter_e@gmx.net>
Date: Mon Jul 9 13:58:08 2018

Add UtilityReturnsTuples() support for CALL

This ensures that prepared statements for CALL can return tuples.

The change whether a statement returns tuples or not causes some
different paths being taking deep in the portal code that set snapshot
in different ways. I haven't fully understood them yet. I plan to post
more tomorrow.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#5Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Peter Eisentraut (#4)
Re: Memory leak with CALL to Procedure with COMMIT.

On 14/08/2018 15:35, Peter Eisentraut wrote:

The commit that started this is

commit 59a85323d9d5927a852939fa93f09d142c72c91a
Author: Peter Eisentraut <peter_e@gmx.net>
Date: Mon Jul 9 13:58:08 2018

Add UtilityReturnsTuples() support for CALL

This ensures that prepared statements for CALL can return tuples.

The change whether a statement returns tuples or not causes some
different paths being taking deep in the portal code that set snapshot
in different ways. I haven't fully understood them yet.

The problem arises with the combination of CALL with output parameters
and doing a COMMIT inside the procedure.

When a CALL has output parameters, the portal uses the strategy
PORTAL_UTIL_SELECT instead of PORTAL_MULTI_QUERY. Using
PORTAL_UTIL_SELECT causes the portal's snapshot to be registered with
the current resource owner (portal->holdSnapshot). I'm not sure why
this is done for one kind of portal strategy but not the other.

Then, when the COMMIT inside the procedure is run, the current resource
owner is released and it complains that a snapshot is still registered
there.

There are, technically, three ways to fix this: silence the warning,
unregister the snapshot explicitly, or don't register the snapshot to
begin with.

Silencing the warning, other than by just deleting it, might require
passing in some more context information into ResourceOwnerRelease().
(The existing isTopLevel isn't quite the right thing.)

Unregistering the snapshot explicitly looks tricky because in
SPI_commit() or thereabouts we have no context information about which
snapshot might have been registered where.

Not registering the snapshot to begin with seems dubious, but see my
question above about why this is not done in the case of PORTAL_MULTI_QUERY.

Thoughts?

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#6Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Peter Eisentraut (#5)
Re: Memory leak with CALL to Procedure with COMMIT.

On 2018-Aug-16, Peter Eisentraut wrote:

There are, technically, three ways to fix this: silence the warning,
unregister the snapshot explicitly, or don't register the snapshot to
begin with.

Silencing the warning, other than by just deleting it, might require
passing in some more context information into ResourceOwnerRelease().
(The existing isTopLevel isn't quite the right thing.)

Unregistering the snapshot explicitly looks tricky because in
SPI_commit() or thereabouts we have no context information about which
snapshot might have been registered where.

Hmm, this got me thinking whether the current resource owner setup for a
procedure is appropriate. Maybe the problem is that resowners are still
thought of in terms of transactions plus portals, so that if
transactions are done then everything is over; maybe we need to teach
them that procedures can outlive transactions, so you'd have a resowner
that's global to the procedure and then each transaction resowner is a
child of that one?

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#6)
Re: Memory leak with CALL to Procedure with COMMIT.

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

Hmm, this got me thinking whether the current resource owner setup for a
procedure is appropriate. Maybe the problem is that resowners are still
thought of in terms of transactions plus portals, so that if
transactions are done then everything is over; maybe we need to teach
them that procedures can outlive transactions, so you'd have a resowner
that's global to the procedure and then each transaction resowner is a
child of that one?

The procedure still has to be running inside a query, and therefore
inside a portal, so the portal's resowner ought to be sufficiently
long-lived for any resources that ought to be procedure-lifetime.
So I doubt we need any more resowners. It's certainly possible that
something somewhere is assigning a particular resource to the wrong
resowner, of course.

regards, tom lane

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#5)
Re: Memory leak with CALL to Procedure with COMMIT.

Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:

The problem arises with the combination of CALL with output parameters
and doing a COMMIT inside the procedure.

When a CALL has output parameters, the portal uses the strategy
PORTAL_UTIL_SELECT instead of PORTAL_MULTI_QUERY. Using
PORTAL_UTIL_SELECT causes the portal's snapshot to be registered with
the current resource owner (portal->holdSnapshot). I'm not sure why
this is done for one kind of portal strategy but not the other.

I'm a bit confused by that statement. AFAICS, for both PortalRunUtility
and PortalRunMulti, setHoldSnapshot is only passed as true by
FillPortalStore, so registering the snapshot should happen (or not) the
same way for either portal execution strategy. What scenarios are you
comparing here, exactly?

In the long run where we want to think about allowing multiple rowsets to
be returned out of a procedure, it's fairly likely that PORTAL_UTIL_SELECT
isn't going to work anyway. Maybe we should be thinking about inventing a
different portal execution strategy for CALL. But I'm not sure we need
that just yet.

regards, tom lane

#9Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Tom Lane (#8)
Re: Memory leak with CALL to Procedure with COMMIT.

On 16/08/2018 19:26, Tom Lane wrote:

When a CALL has output parameters, the portal uses the strategy
PORTAL_UTIL_SELECT instead of PORTAL_MULTI_QUERY. Using
PORTAL_UTIL_SELECT causes the portal's snapshot to be registered with
the current resource owner (portal->holdSnapshot). I'm not sure why
this is done for one kind of portal strategy but not the other.

I'm a bit confused by that statement. AFAICS, for both PortalRunUtility
and PortalRunMulti, setHoldSnapshot is only passed as true by
FillPortalStore, so registering the snapshot should happen (or not) the
same way for either portal execution strategy. What scenarios are you
comparing here, exactly?

The call to PortalRunMulti() in FillPortalStore() is for
PORTAL_ONE_RETURNING and PORTAL_ONE_MOD_WITH, not for
PORTAL_MULTI_QUERY. For PORTAL_MULTI_QUERY, FillPortalStore() isn't
called; PortalRun() calls PortalRunMulti() directly with setHoldSnapshot
= false.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#10Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Peter Eisentraut (#9)
1 attachment(s)
Re: Memory leak with CALL to Procedure with COMMIT.

I think I've found a reasonable fix for this.

The problem arises with the combination of CALL with output parameters
and doing a COMMIT inside the procedure. When a CALL has output
parameters, the portal uses the strategy PORTAL_UTIL_SELECT instead of
PORTAL_MULTI_QUERY. Using PORTAL_UTIL_SELECT causes the portal's
snapshot to be registered with the current resource
owner (portal->holdSnapshot); see
9ee1cf04ab6bcefe03a11837b53f29ca9dc24c7a for the reason.

Normally, PortalDrop() unregisters the snapshot. If not, then
ResourceOwnerRelease() will print a warning about a snapshot leak on
transaction commit. A transaction commit normally drops all
portals (PreCommit_Portals()), except the active portal. So in case of
the active portal, we need to manually release the snapshot to avoid the
warning.

PreCommit_Portals() already contains some code that deals specially with
the active portal versus resource owners, so it seems reasonable to add
there.

I think this problem could theoretically apply to any
multiple-transaction utility command. For example, if we had a variant
of VACUUM that returned tuples, it would produce the same warning.
(This patch would fix that as well.)

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

0001-Fix-snapshot-leak-warning-for-some-procedures.patchtext/plain; charset=UTF-8; name=0001-Fix-snapshot-leak-warning-for-some-procedures.patch; x-mac-creator=0; x-mac-type=0Download
From d91f42f024afa7bc2764045b615296a87c4033b8 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter_e@gmx.net>
Date: Thu, 23 Aug 2018 15:13:48 +0200
Subject: [PATCH] Fix snapshot leak warning for some procedures

The problem arises with the combination of CALL with output parameters
and doing a COMMIT inside the procedure.  When a CALL has output
parameters, the portal uses the strategy PORTAL_UTIL_SELECT instead of
PORTAL_MULTI_QUERY.  Using PORTAL_UTIL_SELECT causes the portal's
snapshot to be registered with the current resource
owner (portal->holdSnapshot); see
9ee1cf04ab6bcefe03a11837b53f29ca9dc24c7a for the reason.

Normally, PortalDrop() unregisters the snapshot.  If not, then
ResourceOwnerRelease() will print a warning about a snapshot leak on
transaction commit.  A transaction commit normally drops all
portals (PreCommit_Portals()), except the active portal.  So in case of
the active portal, we need to manually release the snapshot to avoid the
warning.

Reported-by: Prabhat Sahu <prabhat.sahu@enterprisedb.com>
---
 src/backend/utils/mmgr/portalmem.c            | 14 +++++++--
 .../src/expected/plpgsql_transaction.out      | 30 +++++++++++++++++++
 .../plpgsql/src/sql/plpgsql_transaction.sql   | 25 ++++++++++++++++
 3 files changed, 67 insertions(+), 2 deletions(-)

diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index 04ea32f49f..d34cab0eb8 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -689,13 +689,23 @@ PreCommit_Portals(bool isPrepare)
 
 		/*
 		 * Do not touch active portals --- this can only happen in the case of
-		 * a multi-transaction utility command, such as VACUUM.
+		 * a multi-transaction utility command, such as VACUUM, or a commit in
+		 * a procedure.
 		 *
 		 * Note however that any resource owner attached to such a portal is
-		 * still going to go away, so don't leave a dangling pointer.
+		 * still going to go away, so don't leave a dangling pointer.  Also
+		 * unregister any snapshots held by the portal, mainly to avoid
+		 * snapshot leak warnings from ResourceOwnerRelease().
 		 */
 		if (portal->status == PORTAL_ACTIVE)
 		{
+			if (portal->holdSnapshot)
+			{
+				if (portal->resowner)
+					UnregisterSnapshotFromOwner(portal->holdSnapshot,
+												portal->resowner);
+				portal->holdSnapshot = NULL;
+			}
 			portal->resowner = NULL;
 			continue;
 		}
diff --git a/src/pl/plpgsql/src/expected/plpgsql_transaction.out b/src/pl/plpgsql/src/expected/plpgsql_transaction.out
index 0b5a039b89..77a83adab5 100644
--- a/src/pl/plpgsql/src/expected/plpgsql_transaction.out
+++ b/src/pl/plpgsql/src/expected/plpgsql_transaction.out
@@ -463,6 +463,36 @@ SELECT * FROM test2;
  42
 (1 row)
 
+-- Test transaction in procedure with output parameters.  This uses a
+-- different portal strategy and different code paths in pquery.c.
+CREATE PROCEDURE transaction_test10a(INOUT x int)
+LANGUAGE plpgsql
+AS $$
+BEGIN
+  x := x + 1;
+  COMMIT;
+END;
+$$;
+CALL transaction_test10a(10);
+ x  
+----
+ 11
+(1 row)
+
+CREATE PROCEDURE transaction_test10b(INOUT x int)
+LANGUAGE plpgsql
+AS $$
+BEGIN
+  x := x - 1;
+  ROLLBACK;
+END;
+$$;
+CALL transaction_test10b(10);
+ x 
+---
+ 9
+(1 row)
+
 DROP TABLE test1;
 DROP TABLE test2;
 DROP TABLE test3;
diff --git a/src/pl/plpgsql/src/sql/plpgsql_transaction.sql b/src/pl/plpgsql/src/sql/plpgsql_transaction.sql
index 236db9bf2b..0ed9ab873a 100644
--- a/src/pl/plpgsql/src/sql/plpgsql_transaction.sql
+++ b/src/pl/plpgsql/src/sql/plpgsql_transaction.sql
@@ -387,6 +387,31 @@ CREATE PROCEDURE transaction_test9()
 SELECT * FROM test2;
 
 
+-- Test transaction in procedure with output parameters.  This uses a
+-- different portal strategy and different code paths in pquery.c.
+CREATE PROCEDURE transaction_test10a(INOUT x int)
+LANGUAGE plpgsql
+AS $$
+BEGIN
+  x := x + 1;
+  COMMIT;
+END;
+$$;
+
+CALL transaction_test10a(10);
+
+CREATE PROCEDURE transaction_test10b(INOUT x int)
+LANGUAGE plpgsql
+AS $$
+BEGIN
+  x := x - 1;
+  ROLLBACK;
+END;
+$$;
+
+CALL transaction_test10b(10);
+
+
 DROP TABLE test1;
 DROP TABLE test2;
 DROP TABLE test3;
-- 
2.18.0

#11Jonathan S. Katz
jkatz@postgresql.org
In reply to: Peter Eisentraut (#10)
Re: Memory leak with CALL to Procedure with COMMIT.

On Aug 23, 2018, at 9:34 AM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:

I think I've found a reasonable fix for this.

The problem arises with the combination of CALL with output parameters
and doing a COMMIT inside the procedure. When a CALL has output
parameters, the portal uses the strategy PORTAL_UTIL_SELECT instead of
PORTAL_MULTI_QUERY. Using PORTAL_UTIL_SELECT causes the portal's
snapshot to be registered with the current resource
owner (portal->holdSnapshot); see
9ee1cf04ab6bcefe03a11837b53f29ca9dc24c7a for the reason.

Normally, PortalDrop() unregisters the snapshot. If not, then
ResourceOwnerRelease() will print a warning about a snapshot leak on
transaction commit. A transaction commit normally drops all
portals (PreCommit_Portals()), except the active portal. So in case of
the active portal, we need to manually release the snapshot to avoid the
warning.

PreCommit_Portals() already contains some code that deals specially with
the active portal versus resource owners, so it seems reasonable to add
there.

I think this problem could theoretically apply to any
multiple-transaction utility command. For example, if we had a variant
of VACUUM that returned tuples, it would produce the same warning.
(This patch would fix that as well.)

Applied the patch against head. make check passed, all the tests I ran
earlier in this thread passed as well.

Reviewed the code - looks to match above reasoning, and comments
are clear.

From my perspective it looks good, but happy to hear from someone
more familiar than me with this area of the code.

Thanks,

Jonathan

#12Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Jonathan S. Katz (#11)
Re: Memory leak with CALL to Procedure with COMMIT.

On 23/08/2018 17:35, Jonathan S. Katz wrote:

Applied the patch against head. make check passed, all the tests I ran
earlier in this thread passed as well.

Reviewed the code - looks to match above reasoning, and comments
are clear.

From my perspective it looks good, but happy to hear from someone
more familiar than me with this area of the code.

committed

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#13Jonathan S. Katz
jkatz@postgresql.org
In reply to: Peter Eisentraut (#12)
Re: Memory leak with CALL to Procedure with COMMIT.

On Aug 27, 2018, at 5:13 PM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:

On 23/08/2018 17:35, Jonathan S. Katz wrote:

Applied the patch against head. make check passed, all the tests I ran
earlier in this thread passed as well.

Reviewed the code - looks to match above reasoning, and comments
are clear.

From my perspective it looks good, but happy to hear from someone
more familiar than me with this area of the code.

committed

Awesome, thank you! I’ve removed this from the Open Items list.

Jonathan