BUG #14403: Large numbers of CREATE OR UPDATE function causes memory exhaustion

Started by Will Pearsonover 9 years ago3 messagesbugs
Jump to latest
#1Will Pearson
will.pearson@digital.cabinet-office.gov.uk

The following bug has been logged on the website:

Bug reference: 14403
Logged by: Will Pearson
Email address: will.pearson@digital.cabinet-office.gov.uk
PostgreSQL version: 9.6.1
Operating system: OSX El Capitan (also in Linux Containers/RDS)
Description:

Summary:

We had a problem in production where RDS failed over availability zones. We
later found this was due to a memory overflow. Doing more digging it seems
to due to large numbers of CREATE OR UPDATE functions done by sequelize (a
node ORM) during session creation. We've mitigated the problem.

We've now isolated the problem from our code and have created the following
python script doing standard sql. But we could do with more understanding
of postgresql internals how to dig further (and any tuning possibilities).

----
Python script to replicate the problems

import psycopg2
import sys

def create_or_replace (n, cur, conn):
print ("Doing action " + str(n))
cur.execute("CREATE OR REPLACE FUNCTION pg_temp.testfunc(OUT response
test, OUT sequelize_caught_exception text) RETURNS RECORD AS $func_a7" +
str(n) + "$ BEGIN INSERT INTO test (\"id\") VALUES (" + str(n) + ")
RETURNING * INTO response; EXCEPTION WHEN unique_violation THEN GET STACKED
DIAGNOSTICS sequelize_caught_exception = PG_EXCEPTION_DETAIL; END $func_a7"
+ str(n) + "$ LANGUAGE plpgsql;")
cur.execute("SELECT (testfunc.response).*,
testfunc.sequelize_caught_exception FROM pg_temp.testfunc();")
cur.execute("DROP FUNCTION IF EXISTS pg_temp.testfunc();")
conn.commit()

def lots_of_create_or_replace(t, cur, conn) :
print ("Doing lots of actions for t" + str(t))
for n in range(1, 100000000):
create_or_replace(n+t, cur, conn)
print "Finished"

conn = psycopg2.connect("dbname =testdb")

cur = conn.cursor()
cur.execute("DELETE FROM test;")
lots_of_create_or_replace(int(sys.argv[1]), cur,conn)

#Usage#
Create the required tables and database then run it with a starting point.
The code is designed so that you can run multiple copies at different
points.

Look at the memory usage go up on the python processes.

Killing the python process doing the updates means the memory is freed. On
some tests against our dev environments it appeared that simply stopping
doing the requests to create sessions allowed the system to recover the
memory.

----
More details:

Seen in RDS (9.4.4)
locally in Mac 9.5.4
On Linux in a container 9.4.4

Sequelize code that shows it in the wild:

https://github.com/sequelize/sequelize/blob/95d3a5713499ab90b19d7a312b8cb878b26eb1b4/lib/dialects/abstract/query-generator.js#L154

----

Hypotheses I've looked at - Reason I don't think that is the problem:

Functions aren't being cleaned up properly - \df doesn't show more functions
during the run. Although a colleague points out that these temp functions
are per connection, so wouldn't show up. We're not sure how to see them.
No vacuums happening - There are vacuums happening during memory usage
increase.

Current hypothesis:

The process that frees up memory for temporary functions doesn't have a
chance to act because of the consistent number of inserts and deletions
pre-empting it. It is not a normal vacuum to deal with it.

Thanks for any help,

Will Pearson
WebOps

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Will Pearson (#1)
Re: BUG #14403: Large numbers of CREATE OR UPDATE function causes memory exhaustion

will.pearson@digital.cabinet-office.gov.uk writes:

[ lots-n-lots-of CREATE FUNCTION/execute function/DROP FUNCTION eat memory ]

I think probably what's going on here is that plpgsql is creating cache
entries for these functions on first execution, and not reclaiming them
before end of session.

I'm not terribly excited about adding overhead to make it keep track of
DROP FUNCTION operations, because this coding style seems less than great
anyway. Have you considered using DO blocks instead of short-lived
functions?

regards, tom lane

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#3Will Pearson
will.pearson@digital.cabinet-office.gov.uk
In reply to: Tom Lane (#2)
Re: BUG #14403: Large numbers of CREATE OR UPDATE function causes memory exhaustion

Hi Tom,

DO blocks do seem to do what we want. We'll talk with the sequelize
maintainers to try and get a patched based on this incorporated.
Thanks!

I didn't show the server crashes that we managed to create with the
above behaviour. We think it managed to make our RDS availability
zone failover.

How feasible would it be to try and free the cache entries at the
point of a failed memory allocation before exiting (or maybe
periodically)? I can imagine people using this functionality in a
saner way than us and a malicious user hammering that sane usage and
causing problems.

Thanks again for your help,

Will Pearson

```WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back
the current transaction and exit, because another server process
exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and
repeat your command.
DEBUG:  proc_exit(-1): 0 callbacks to make
DEBUG:  shmem_exit(-1): 0 before_shmem_exit callbacks to make
DEBUG:  shmem_exit(-1): 0 on_shmem_exit callbacks to make
DEBUG:  proc_exit(-1): 0 callbacks to make
DEBUG:  shmem_exit(-1): 0 on_shmem_exit callbacks to make
DEBUG:  proc_exit(-1): 0 callbacks to make
DEBUG:  writing stats file "pg_stat/db_16400.stat"
DEBUG:  shmem_exit(-1): 0 on_shmem_exit callbacks to make
DEBUG:  proc_exit(-1): 0 callbacks to make
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back
the current transaction and exit, because another server process
exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and
repeat your command.
DEBUG:  shmem_exit(-1): 0 before_shmem_exit callbacks to make
DEBUG:  shmem_exit(-1): 0 on_shmem_exit callbacks to make
DEBUG:  proc_exit(-1): 0 callbacks to make
DEBUG:  reaping dead processes
DEBUG:  shmem_exit(-1): 0 before_shmem_exit callbacks to make
DEBUG:  server process (PID 71) exited with exit code 2
DETAIL:  Failed process was running: SELECT "id", "username",
"createdAt", "updatedAt" FROM "People" AS "Person" WHERE
"Person"."username" = 'aaa-1973721468-bb@example.com' LIMIT 1;
DEBUG:  shmem_exit(-1): 0 on_shmem_exit callbacks to make
DEBUG:  proc_exit(-1): 0 callbacks to make
DEBUG:  removing temporary stats file "pg_stat_tmp/db_16400.stat"
DEBUG:  reaping dead processes
DEBUG:  server process (PID 69) exited with exit code 2
DETAIL:  Failed process was running: SELECT "id", "username",
"createdAt", "updatedAt" FROM "People" AS "Person" WHERE
"Person"."username" = 'aaa-604124768-bb@example.com' LIMIT 1;
DEBUG:  writing stats file "pg_stat/db_0.stat"
DEBUG:  removing temporary stats file "pg_stat_tmp/db_0.stat"
DEBUG:  reaping dead processes
DEBUG:  shmem_exit(-1): 0 before_shmem_exit callbacks to make
DEBUG:  reaping dead processes
DEBUG:  server process (PID 68) was terminated by signal 9: Killed
DETAIL:  Failed process was running: CREATE OR REPLACE FUNCTION
pg_temp.testfunc(OUT response "People", OUT sequelize_caught_exception
text) RETURNS RECORD AS $func_46fb58d873b34b40b4ace61f7ac30040$ BEGIN
INSERT INTO "People" ("id","username","createdAt","updatedAt") VALUES
(DEFAULT,'aaa-1616469740-bb@example.com','2016-10-27 14:15:07.166
+00:00','2016-10-27 14:15:07.166 +00:00') RETURNING * INTO response;
EXCEPTION WHEN unique_violation THEN GET STACKED DIAGNOSTICS
sequelize_caught_exception = PG_EXCEPTION_DETAIL; END
$func_46fb58d873b34b40b4ace61f7ac30040$ LANGUAGE plpgsql; SELECT
(testfunc.response).*, testfunc.sequelize_caught_exception FROM
pg_temp.testfunc(); DROP FUNCTION IF EXISTS pg_temp.testfunc();
DEBUG:  shmem_exit(-1): 0 on_shmem_exit callbacks to make
DEBUG:  proc_exit(-1): 0 callbacks to make
DEBUG:  reaping dead processes
DEBUG:  reaping dead processes
DEBUG:  server process (PID 24) exited with exit code 2
DETAIL:  Failed process was running: select count(*) from "People";
DEBUG:  reaping dead processes
DEBUG:  sending signal 9 to process 70
DEBUG:  reaping dead processes
DEBUG:  server process (PID 70) was terminated by signal 9: Killed
DETAIL:  Failed process was running: CREATE OR REPLACE FUNCTION
pg_temp.testfunc(OUT response "People", OUT sequelize_caught_exception
text) RETURNS RECORD AS $func_21fe0323d32f40bb817efe5a470becc9$ BEGIN
INSERT INTO "People" ("id","username","createdAt","updatedAt") VALUES
(DEFAULT,'aaa--1665959435-bb@example.com','2016-10-27 14:15:07.168
+00:00','2016-10-27 14:15:07.168 +00:00') RETURNING * INTO response;
EXCEPTION WHEN unique_violation THEN GET STACKED DIAGNOSTICS
sequelize_caught_exception = PG_EXCEPTION_DETAIL; END
$func_21fe0323d32f40bb817efe5a470becc9$ LANGUAGE plpgsql; SELECT
(testfunc.response).*, testfunc.sequelize_caught_exception FROM
pg_temp.testfunc(); DROP FUNCTION IF EXISTS pg_temp.testfunc();
LOG:  all server processes terminated; reinitializing

On 28 October 2016 at 19:34, Tom Lane <tgl@sss.pgh.pa.us> wrote:

will.pearson@digital.cabinet-office.gov.uk writes:

[ lots-n-lots-of CREATE FUNCTION/execute function/DROP FUNCTION eat memory ]

I think probably what's going on here is that plpgsql is creating cache
entries for these functions on first execution, and not reclaiming them
before end of session.

I'm not terribly excited about adding overhead to make it keep track of
DROP FUNCTION operations, because this coding style seems less than great
anyway. Have you considered using DO blocks instead of short-lived
functions?

regards, tom lane

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs