BUG #5842: Memory leak in PL/Python when taking slices of results

Started by Daniel Popowichalmost 15 years ago6 messages
#1Daniel Popowich
danielpopowich@gmail.com

The following bug has been logged online:

Bug reference: 5842
Logged by: Daniel Popowich
Email address: danielpopowich@gmail.com
PostgreSQL version: 8.4.6
Operating system: x86_64-pc-linux-gnu (Ubuntu 10.04.1)
Description: Memory leak in PL/Python when taking slices of results
Details:

There is a memory leak in PL/Python when taking slices of results. This was
first discussed in pgsql-general:

http://archives.postgresql.org/pgsql-general/2011-01/msg00367.php

Thanks to Alex Hunsaker for pinpointing the problem to slices. The
following code (a modification of Alex's) demonstrates the problem well...in
a database with plpythonu installed:

-- leaks big time
CREATE or replace FUNCTION py_leak() RETURNS void
LANGUAGE plpythonu
AS $$
results = plpy.execute("""select generate_series(0, 1000000)""")
slice_creates_leak = results[:]
for r in slice_creates_leak:
pass
return
$$;

-- does not leak
CREATE or replace FUNCTION py_no_leak() RETURNS void
LANGUAGE plpythonu
AS $$
results = plpy.execute("""select generate_series(0, 1000000)""")
for noleak in results:
pass
return
$$;

I traced the bug to PLy_result_slice() in src/pl/plpython/plpython.c. That
function calls the python API function PyList_GetSlice() and erroneously
increments the reference count before returning the result to the caller.
PyList_GetSlice returns a *new* reference, not a borrowed one, so it should
just return the object as-is.

A patch is attached below.

Cheers,

Dan Popowich

----------------------------------------------------------------------

*** src/pl/plpython/plpython.c~	2010-12-13 21:59:19.000000000 -0500
--- src/pl/plpython/plpython.c	2011-01-18 11:18:28.857831733 -0500
***************
*** 2328,2341 ****
  static PyObject *
  PLy_result_slice(PyObject *arg, Py_ssize_t lidx, Py_ssize_t hidx)
  {
- 	PyObject   *rv;
  	PLyResultObject *ob = (PLyResultObject *) arg;

! rv = PyList_GetSlice(ob->rows, lidx, hidx);
! if (rv == NULL)
! return NULL;
! Py_INCREF(rv);
! return rv;
}

  static int
--- 2328,2336 ----
  static PyObject *
  PLy_result_slice(PyObject *arg, Py_ssize_t lidx, Py_ssize_t hidx)
  {
  	PLyResultObject *ob = (PLyResultObject *) arg;

! return PyList_GetSlice(ob->rows, lidx, hidx);
}

static int

#2Bruce Momjian
bruce@momjian.us
In reply to: Daniel Popowich (#1)
Re: [BUGS] BUG #5842: Memory leak in PL/Python when taking slices of results

What has been done with this report/fix?

---------------------------------------------------------------------------

Daniel Popowich wrote:

The following bug has been logged online:

Bug reference: 5842
Logged by: Daniel Popowich
Email address: danielpopowich@gmail.com
PostgreSQL version: 8.4.6
Operating system: x86_64-pc-linux-gnu (Ubuntu 10.04.1)
Description: Memory leak in PL/Python when taking slices of results
Details:

There is a memory leak in PL/Python when taking slices of results. This was
first discussed in pgsql-general:

http://archives.postgresql.org/pgsql-general/2011-01/msg00367.php

Thanks to Alex Hunsaker for pinpointing the problem to slices. The
following code (a modification of Alex's) demonstrates the problem well...in
a database with plpythonu installed:

-- leaks big time
CREATE or replace FUNCTION py_leak() RETURNS void
LANGUAGE plpythonu
AS $$
results = plpy.execute("""select generate_series(0, 1000000)""")
slice_creates_leak = results[:]
for r in slice_creates_leak:
pass
return
$$;

-- does not leak
CREATE or replace FUNCTION py_no_leak() RETURNS void
LANGUAGE plpythonu
AS $$
results = plpy.execute("""select generate_series(0, 1000000)""")
for noleak in results:
pass
return
$$;

I traced the bug to PLy_result_slice() in src/pl/plpython/plpython.c. That
function calls the python API function PyList_GetSlice() and erroneously
increments the reference count before returning the result to the caller.
PyList_GetSlice returns a *new* reference, not a borrowed one, so it should
just return the object as-is.

A patch is attached below.

Cheers,

Dan Popowich

----------------------------------------------------------------------

*** src/pl/plpython/plpython.c~	2010-12-13 21:59:19.000000000 -0500
--- src/pl/plpython/plpython.c	2011-01-18 11:18:28.857831733 -0500
***************
*** 2328,2341 ****
static PyObject *
PLy_result_slice(PyObject *arg, Py_ssize_t lidx, Py_ssize_t hidx)
{
- 	PyObject   *rv;
PLyResultObject *ob = (PLyResultObject *) arg;

! rv = PyList_GetSlice(ob->rows, lidx, hidx);
! if (rv == NULL)
! return NULL;
! Py_INCREF(rv);
! return rv;
}

static int
--- 2328,2336 ----
static PyObject *
PLy_result_slice(PyObject *arg, Py_ssize_t lidx, Py_ssize_t hidx)
{
PLyResultObject *ob = (PLyResultObject *) arg;

! return PyList_GetSlice(ob->rows, lidx, hidx);
}

static int

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

#3Robert Haas
robertmhaas@gmail.com
In reply to: Bruce Momjian (#2)
Re: Re: [BUGS] BUG #5842: Memory leak in PL/Python when taking slices of results

On Fri, Mar 11, 2011 at 6:02 AM, Bruce Momjian <bruce@momjian.us> wrote:

What has been done with this report/fix?

AFAIK, nothing. Added to 9.1 open items list.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#4Marko Kreen
markokr@gmail.com
In reply to: Robert Haas (#3)
Re: Re: [BUGS] BUG #5842: Memory leak in PL/Python when taking slices of results

On Thu, Mar 17, 2011 at 4:40 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Mar 11, 2011 at 6:02 AM, Bruce Momjian <bruce@momjian.us> wrote:

What has been done with this report/fix?

AFAIK, nothing.  Added to 9.1 open items list.

The patch seems to do the right thing.

--
marko

#5Alvaro Herrera
alvherre@commandprompt.com
In reply to: Marko Kreen (#4)
Re: Re: [BUGS] BUG #5842: Memory leak in PL/Python when taking slices of results

Excerpts from Marko Kreen's message of jue mar 17 12:01:13 -0300 2011:

On Thu, Mar 17, 2011 at 4:40 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Mar 11, 2011 at 6:02 AM, Bruce Momjian <bruce@momjian.us> wrote:

What has been done with this report/fix?

AFAIK, nothing.  Added to 9.1 open items list.

The patch seems to do the right thing.

Looking into this. AFAICT this needs to be backported.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#6Alvaro Herrera
alvherre@commandprompt.com
In reply to: Daniel Popowich (#1)
Re: BUG #5842: Memory leak in PL/Python when taking slices of results

Excerpts from Daniel Popowich's message of mar ene 18 13:30:04 -0300 2011:

I traced the bug to PLy_result_slice() in src/pl/plpython/plpython.c. That
function calls the python API function PyList_GetSlice() and erroneously
increments the reference count before returning the result to the caller.
PyList_GetSlice returns a *new* reference, not a borrowed one, so it should
just return the object as-is.

A patch is attached below.

Thanks, applied and backpatched all the way back to 8.2.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support