BUG #6763: Severe memory leak with arrays and hstore

Started by Любен Каравелов · over 13 years ago · 15 messages · bugs

The following bug has been logged on the website:

Bug reference: 6763
Logged by: Luben Karavelov
Email address: karavelov@mail.bg
PostgreSQL version: 9.1.4
Operating system: Debian Linux
Description:

I was trying to migrate a big ER table (user preferences) into a new table.

Old table:

CREATE TABLE old_prefs (
    user_id integer NOT NULL,
    name    varchar NOT NULL,
    value   varchar NOT NULL
);

New table:

CREATE TABLE preferences (
    user_id integer PRIMARY KEY,
    prefs   hstore
);
The query I have tried to use is:

INSERT INTO preferences
SELECT user_id,hstore(array_agg(name), array_agg(value))
FROM old_prefs
GROUP BY user_id;
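For reference, the two-array form of hstore() used here pairs keys with values positionally, one pair per array index; a quick standalone check (assuming the hstore extension is installed):

```sql
-- hstore(keys text[], values text[]) zips the two arrays into key/value pairs
SELECT hstore(ARRAY['theme', 'lang'], ARRAY['dark', 'bg']);
-- produces an hstore containing "theme"=>"dark" and "lang"=>"bg"
```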

But the postgres backend consumed all the available memory (6G free + 4G
swap) and was finally killed by the kernel.

It's an 8G RAM machine, and here are the memory options from the config:

shared_buffers = 1800MB
temp_buffers = 16MB
work_mem = 64MB
maintenance_work_mem = 256MB
max_stack_depth = 2MB

In the end I managed to migrate it in batches of 100-200k user ids,
disconnecting after each query in order to free the backend and the leaked
memory.
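The workaround described above can be sketched as a psql script; the table names are from the report, and the batch boundaries are illustrative. psql's `\c` metacommand reconnects, so each batch runs in a fresh backend and the OS reclaims whatever memory the previous backend accumulated:

```sql
-- Batch 1: first range of user ids
INSERT INTO preferences
SELECT user_id, hstore(array_agg(name), array_agg(value))
FROM old_prefs
WHERE user_id < 200000
GROUP BY user_id;

-- Reconnect: the old backend exits, releasing its memory
\c

-- Batch 2: next range, and so on up the key space
INSERT INTO preferences
SELECT user_id, hstore(array_agg(name), array_agg(value))
FROM old_prefs
WHERE user_id >= 200000 AND user_id < 400000
GROUP BY user_id;
\c
```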

Best regards
Luben

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Любен Каравелов (#1)
Re: BUG #6763: Severe memory leak with arrays and hstore

karavelov@mail.bg writes:

The query I have tried to use is:

INSERT INTO preferences
SELECT user_id,hstore(array_agg(name), array_agg(value))
FROM old_prefs
GROUP BY user_id;

But the postgres backend consumed all the available memory (6G free + 4G
swap) and finally was killed by the kernel.

Is there any reason to think this is an actual leak, and not just "you
were trying to compute an impractically large value"?

regards, tom lane

#3Craig Ringer
craig@2ndquadrant.com
In reply to: Любен Каравелов (#1)
Re: BUG #6763: Severe memory leak with arrays and hstore

On 07/26/2012 09:32 AM, karavelov@mail.bg wrote:

In the end I managed to migrate it in batches of 100-200k user ids,
disconnecting after each query in order to free the backend and the leaked
memory.

If you do it in batches, but you do NOT disconnect and reconnect, does
the backend continue to grow?

What's the output of:

SELECT count(sub.user_id), to_char(AVG(sub.n_prefs), '99999.99') FROM (
SELECT user_id, count(name) AS n_prefs FROM old_prefs GROUP BY user_id)
AS sub;

and

SELECT pg_size_pretty(pg_total_relation_size('old_prefs'));

?

--
Craig Ringer

#4Любен Каравелов
karavelov@mail.bg
In reply to: Craig Ringer (#3)
Re: BUG #6763: Severe memory leak with arrays and hstore

[The message body arrived as a single garbled run of text; it was reposted with proper formatting in message #5.]

#5Любен Каравелов
karavelov@mail.bg
In reply to: Craig Ringer (#3)
Re: BUG #6763: Severe memory leak with arrays and hstore

On Jul 26, 2012, at 11:17 AM, Craig Ringer wrote:

On 07/26/2012 09:32 AM, karavelov@mail.bg wrote:

In the end I managed to migrate it in batches of 100-200k user ids,
disconnecting after each query in order to free the backend and the leaked
memory.

If you do it in batches, but you do NOT disconnect and reconnect, does the backend continue to grow?

What's the output of:

SELECT count(sub.user_id), to_char(AVG(sub.n_prefs), '99999.99') FROM (
SELECT user_id, count(name) AS n_prefs FROM old_prefs GROUP BY user_id) AS sub;

and

SELECT pg_size_pretty(pg_total_relation_size('old_prefs'));

?

--
Craig Ringer

Sorry for the broken formatting in the last message. Here it is again:

OK, I will do the procedure again, taking notes at each step.

First, here are the results of the queries you asked:

pg=> SELECT count(sub.user_id), to_char(AVG(sub.n_prefs), '99999.99') FROM (
SELECT user_id, count(name) AS n_prefs FROM old_prefs GROUP BY user_id) AS sub;

count | to_char
---------+-----------
1257262 | 2.26
(1 row)

pg=> SELECT pg_size_pretty(pg_total_relation_size('old_prefs'));
pg_size_pretty
----------------
264 MB
(1 row)

pg=> \d old_prefs
Table "public.old_prefs"
Column | Type | Modifiers
---------+-------------------+-----------
user_id | integer | not null
name | character varying | not null
value | character varying | not null
Indexes:
"old_prefs_user_id_ids" btree (user_id)

Also, there are at most 34 rows per user_id in old_prefs.

Here is the new table I just created:

pg=> \d new_preferences
Table "public.new_preferences"
Column | Type | Modifiers
---------+---------+-----------
user_id | integer | not null
prefs | hstore |
Indexes:
"new_preferences_pkey" PRIMARY KEY, btree (user_id)
Foreign-key constraints:
"new_preferences_user_id_fkey" FOREIGN KEY (user_id) REFERENCES users(user_id) ON DELETE CASCADE

Here is the newly connected backend:

root@pg:/var/log# ps axu | egrep '10.0.2.71|USER' | grep -v grep
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
postgres 19121 0.0 0.0 2266944 3448 ? Ss 15:23 0:00 postgres: pg pg 10.0.2.71(51734) idle

Migrating the first 200k of the users to the new schema:

pg=> select count(*) from old_prefs where user_id<200000;
count
--------
174767
(1 row)

pg=> INSERT INTO new_preferences SELECT user_id,hstore(array_agg(name), array_agg(value)) FROM old_prefs WHERE user_id<200000 GROUP BY user_id;
INSERT 0 48993
pg=> commit;
COMMIT

Here is the backend:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
postgres 19121 0.8 7.1 3081772 582712 ? Ss 15:23 0:02 postgres: pg pg 10.0.2.71(51734) idle

Migrating another batch of users:

pg=> select count(*) from old_prefs where user_id>=200000 and user_id<600000;
count
--------
193824
(1 row)

pg=> INSERT INTO new_preferences SELECT user_id,hstore(array_agg(name), array_agg(value)) FROM old_prefs WHERE user_id>=200000 AND user_id<600000 GROUP BY user_id;
INSERT 0 54157
pg=> commit;
COMMIT

USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
postgres 19121 1.1 8.5 3176164 697444 ? Ss 15:23 0:05 postgres: pg pg 10.0.2.71(51734) idle

Another batch:
pg=> select count(*) from old_prefs where user_id>=600000 and user_id<1100000;
count
--------
190504
(1 row)

pg=> INSERT INTO new_preferences SELECT user_id,hstore(array_agg(name), array_agg(value)) FROM old_prefs WHERE user_id>=600000 AND user_id<1100000 GROUP BY user_id;
INSERT 0 56199
pg=> commit;
COMMIT

USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
postgres 19121 0.7 9.6 3210224 791404 ? Ss 15:23 0:08 postgres: pg pg 10.0.2.71(51734) idle

Another batch:

pg=> select count(*) from old_prefs where user_id>=1100000 and user_id<1600000;
count
--------
194965
(1 row)

pg=> INSERT INTO new_preferences SELECT user_id,hstore(array_agg(name), array_agg(value)) FROM old_prefs WHERE user_id>=1100000 AND user_id<1600000 GROUP BY user_id;
INSERT 0 60257
pg=> commit;
COMMIT

USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
postgres 19121 0.9 10.8 3277412 889860 ? Ss 15:23 0:11 postgres: pg pg 10.0.2.71(51734) idle

So the Pg backend keeps growing by about 100M per 200k rows from the old table, which become 50-60k rows in the new table.
Proceeding with another batch:

pg=> select count(*) from old_prefs where user_id>=1600000 and user_id<2400000;
count
--------
170858
(1 row)

Time: 83,994 ms
pg=> INSERT INTO new_preferences SELECT user_id,hstore(array_agg(name), array_agg(value)) FROM old_prefs WHERE user_id>=1600000 AND user_id<2400000 GROUP BY user_id;
INSERT 0 55447
pg=> commit;
COMMIT

USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
postgres 19121 0.9 11.5 3277412 945560 ? Ss 15:23 0:15 postgres: pg pg 10.0.2.71(51734) idle

Another batch:

pg=> select count(*) from old_prefs where user_id>=2400000 and user_id<3400000;
count
--------
200614
(1 row)

Time: 83,409 ms
pg=> INSERT INTO new_preferences SELECT user_id,hstore(array_agg(name), array_agg(value)) FROM old_prefs WHERE user_id>=2400000 AND user_id<3400000 GROUP BY user_id;
INSERT 0 87940
pg=> commit;
COMMIT

USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
postgres 19121 1.2 16.2 3736968 1331796 ? Ss 15:23 0:20 postgres: pg pg 10.0.2.71(51734) idle

Another batch:

pg=> select count(*) from old_prefs where user_id>=3400000 and user_id<3800000;
count
--------
161390
(1 row)

pg=> INSERT INTO new_preferences SELECT user_id,hstore(array_agg(name), array_agg(value)) FROM old_prefs WHERE user_id>=3400000 AND user_id<3800000 GROUP BY user_id;
ERROR: insert or update on table "new_preferences" violates foreign key constraint "new_preferences_user_id_fkey"
DETAIL: Key (user_id)=(3615131) is not present in table "users".
pg=> rollback;
ROLLBACK

Oops... have to clean up old_prefs; some users were deleted in the meantime:
pg=> delete from old_prefs where user_id not in (select user_id from users);
DELETE 7
pg=> commit;
COMMIT

USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
postgres 19121 1.4 26.4 4469520 2157588 ? Ss 15:23 0:29 postgres: pg pg 10.0.2.71(51734) idle

Nearly 1G of growth on a rolled-back transaction...

pg=> INSERT INTO new_preferences SELECT user_id,hstore(array_agg(name), array_agg(value)) FROM old_prefs WHERE user_id>=3400000 AND user_id<3800000 GROUP BY user_id;
INSERT 0 131803
pg=> commit;
COMMIT

USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
postgres 19121 1.7 26.6 4479944 2180536 ? Ss 15:23 0:35 postgres: pg pg 10.0.2.71(51734) idle

Another batch, bigger this time:

pg=> select count(*) from old_prefs where user_id>=3800000 and user_id<4200000;
count
--------
327374
(1 row)

pg=> INSERT INTO new_preferences SELECT user_id,hstore(array_agg(name), array_agg(value)) FROM old_prefs WHERE user_id>=3800000 AND user_id<4200000 GROUP BY user_id;
INSERT 0 177044
pg=> commit;
COMMIT

USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
postgres 19121 1.9 33.1 5238968 2710756 ? Ss 15:23 0:45 postgres: pg pg 10.0.2.71(51734) idle

Another big batch:

pg=> select count(*) from old_prefs where user_id>=4200000 and user_id<4400000;
count
--------
375352
(1 row)

pg=> INSERT INTO new_preferences SELECT user_id,hstore(array_agg(name), array_agg(value)) FROM old_prefs WHERE user_id>=4200000 AND user_id<4400000 GROUP BY user_id;
INSERT 0 189095
pg=> commit;
COMMIT

USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
postgres 19121 2.2 35.7 5438412 2918720 ? Ss 15:23 0:55 postgres: pg pg 10.0.2.71(51734) idle

Now a smaller batch:

pg=> select count(*) from old_prefs where user_id>=4400000 and user_id<4500000;
count
--------
219249
(1 row)

pg=> INSERT INTO new_preferences SELECT user_id,hstore(array_agg(name), array_agg(value)) FROM old_prefs WHERE user_id>=4400000 AND user_id<4500000 GROUP BY user_id;
INSERT 0 99782
pg=> commit;
COMMIT

RSS keeps growing:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
postgres 19121 2.2 36.0 5438412 2943784 ? Ss 15:23 1:00 postgres: pg pg 10.0.2.71(51734) idle

Let's see if a bigger batch will pass:

pg=> select count(*) from old_prefs where user_id>=4500000;
count
--------
631911
(1 row)

pg=> INSERT INTO new_preferences SELECT user_id,hstore(array_agg(name), array_agg(value)) FROM old_prefs WHERE user_id>=4500000 GROUP BY user_id;
INSERT 0 296541
pg=> commit;
COMMIT

OK, this time it passed, but the backend is over 4G:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
postgres 19121 2.2 50.0 7227968 4088928 ? Ss 15:23 1:17 postgres: pg pg 10.0.2.71(51734) idle

Some observations:

1. Backend does not free allocated memory between transactions.
2. Rolled back transactions also leak memory.
3. Leaked memory is not linear in the work done: two transactions with 200k keys each leak less than one transaction with 400k keys.

Regarding Tom's question:
The old_prefs table does not fit in work_mem but is quite small relative to the total RAM. Isn't work_mem a limit on the memory each backend can allocate for
sorting, grouping and aggregation? My understanding is that a larger allocation will spill to disk rather than kill the server. I could be wrong though.
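One way to check where the aggregation's memory actually goes is to run the query under EXPLAIN ANALYZE (a sketch; the exact plan node names and option support vary with the server version):

```sql
SET work_mem = '64MB';
EXPLAIN (ANALYZE, BUFFERS)
SELECT user_id, hstore(array_agg(name), array_agg(value))
FROM old_prefs
GROUP BY user_id;
-- A HashAggregate node here keeps all groups in memory at once.
-- work_mem bounds the sort/hash working memory the planner budgets for,
-- but per-row allocations made inside functions such as hstore() and
-- array_agg() are not covered by that limit.
```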

Thanks in advance and best regards

--
Luben Karavelov

#6Craig Ringer
craig@2ndquadrant.com
In reply to: Любен Каравелов (#4)
Re: BUG #6763: Severe memory leak with arrays and hstore

Woah. Your email client did something insane, and I cannot read your
message. See below:


On 07/26/2012 09:37 PM, karavelov@mail.bg wrote:

[The quoted message arrived as a single garbled run of text; see the properly formatted repost in #5.]

#7Craig Ringer
craig@2ndquadrant.com
In reply to: Любен Каравелов (#5)
Re: BUG #6763: Severe memory leak with arrays and hstore

Hi all

Here's a fully self contained, automated test case that demonstrates the
leak.

Example output on my system, pasted as quote to stop Thunderbird
mangling it:


$ ./hstore-leak-demo.sh
NOTICE: extension "hstore" already exists, skipping
CREATE EXTENSION
DROP TABLE
CREATE TABLE
CREATE TABLE
INSERT 0 100000
INSERT 0 80175
INSERT 0 72267
INSERT 0 50649
INSERT 0 30430
avg | max
--------------------+-----
3.3352100000000000 | 20
(1 row)

Backend PID is: 3167
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
postgres 3167 0.0 0.0 504276 4368 ? Ss 10:29 0:00 postgres: craig regress [local] idle
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
postgres 3167 91.0 13.0 2163384 1055652 ? Ss 10:29 0:00 postgres: craig regress [local] idle
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
postgres 3167 77.5 13.2 2163408 1071496 ? Ss 10:29 0:01 postgres: craig regress [local] idle
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
postgres 3167 108 13.4 2163408 1084056 ? Ss 10:29 0:02 postgres: craig regress [local] idle
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
postgres 3167 93.0 13.5 2163408 1092244 ? Ss 10:29 0:02 postgres: craig regress [local] idle
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
postgres 3167 86.0 13.6 2163408 1100444 ? Ss 10:29 0:03 postgres: craig regress [local] idle
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
postgres 3167 101 13.7 2163408 1108704 ? Ss 10:29 0:04 postgres: craig regress [local] idle
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
postgres 3167 93.8 13.8 2163408 1116896 ? Ss 10:29 0:04 postgres: craig regress [local] idle
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
postgres 3167 88.8 13.9 2163408 1125048 ? Ss 10:29 0:05 postgres: craig regress [local] idle
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
postgres 3167 99.1 14.0 2163408 1133228 ? Ss 10:29 0:05 postgres: craig regress [local] idle

Attachments:

hstore-leak-demo.sh (application/x-shellscript)
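The attached script itself is not preserved in this archive. Judging from the output above, the setup portion presumably looks something like the following; every name, size, and expression here is a guess reconstructed from the printed results:

```sql
CREATE EXTENSION IF NOT EXISTS hstore;

DROP TABLE IF EXISTS old_prefs;
CREATE TABLE old_prefs (user_id integer, name text, value text);
CREATE TABLE preferences (user_id integer, prefs hstore);

-- Several rounds of random inserts; later rounds insert fewer rows,
-- giving a skewed prefs-per-user distribution like the avg/max shown.
INSERT INTO old_prefs
SELECT (random() * 100000)::int,
       'pref_' || (random() * 20)::int,
       md5(random()::text)
FROM generate_series(1, 100000);

-- Distribution check, matching the "avg | max" output in the message
SELECT avg(n), max(n)
FROM (SELECT user_id, count(*) AS n FROM old_prefs GROUP BY user_id) s;
```

The script then apparently repeats the INSERT ... hstore(array_agg(name), array_agg(value)) ... GROUP BY user_id statement in a loop, printing the backend's ps line after each iteration.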
#8Craig Ringer
craig@2ndquadrant.com
In reply to: Любен Каравелов (#5)
Re: BUG #6763: Severe memory leak with arrays and hstore

On 07/26/2012 09:55 PM, luben karavelov wrote:

OK, I will do the procedure again, taking notes at each step.

Thank you for taking the time to do this in detail.

First, here are the results of the queries you asked:
count | to_char
---------+-----------
1257262 | 2.26

pg_size_pretty
----------------
264 MB

OK, none of that looks obviously insanely huge to me. That's a lot of
hstores, but with your query I wouldn't expect them to all sit around in
memory, and nothing individually is particularly huge.

Also, there are at most 34 rows per user_id in old_prefs.

Thanks, I forgot to ask that. Again, nothing particularly big.

USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
postgres 19121 0.0 0.0 2266944 3448 ? Ss 15:23 0:00 postgres: pg pg 10.0.2.71(51734) idle
postgres 19121 0.8 7.1 3081772 582712 ? Ss 15:23 0:02 postgres: pg pg 10.0.2.71(51734) idle
postgres 19121 1.1 8.5 3176164 697444 ? Ss 15:23 0:05 postgres: pg pg 10.0.2.71(51734) idle
postgres 19121 0.7 9.6 3210224 791404 ? Ss 15:23 0:08 postgres: pg pg 10.0.2.71(51734) idle
postgres 19121 0.9 10.8 3277412 889860 ? Ss 15:23 0:11 postgres: pg pg 10.0.2.71(51734) idle
postgres 19121 0.9 11.5 3277412 945560 ? Ss 15:23 0:15 postgres: pg pg 10.0.2.71(51734) idle
postgres 19121 1.2 16.2 3736968 1331796 ? Ss 15:23 0:20 postgres: pg pg 10.0.2.71(51734) idle

1. Backend does not free allocated memory between transactions.
2. Rolled back transactions also leak memory.
3. Leaked memory is not linear to work done - 2 transactions with 200k keys will leak less than 1 transaction with 400k keys

Ouch. Sure looks like a leak to me, yeah.

Thank you for taking the time to do this. It's common to see "leaks"
reported here that are really just queries that require lots of memory,
so the first response tends to be somewhat cautious.

This doesn't look like one of those reports.

I don't know if I can be much use with the actual tracking down of the
leak, but there certainly appears to be one, and you've provided a
pretty clear illustration of it.

--
Craig Ringer

#9Craig Ringer
craig@2ndquadrant.com
In reply to: Craig Ringer (#7)
Re: BUG #6763: Severe memory leak with arrays and hstore

On 07/27/2012 10:30 AM, Craig Ringer wrote:

Hi all

Here's a fully self contained, automated test case that demonstrates
the leak.

Gah. Except it doesn't now, as shown by the text I pasted. WTF?

I was *definitely* seeing this on my system. What's changed?

Will follow up.

#10Craig Ringer
craig@2ndquadrant.com
In reply to: Craig Ringer (#9)
Re: BUG #6763: Severe memory leak with arrays and hstore

OK, it's certainly leaking, but not in the same drastic way I was able
to reproduce manually a couple of times earlier. Self-contained test
case attached.

$ bash hstore-leak-demo.sh
NOTICE: extension "hstore" already exists, skipping
CREATE EXTENSION
DROP TABLE
CREATE TABLE
CREATE TABLE
INSERT 0 100000
INSERT 0 80014
INSERT 0 72434
INSERT 0 50340
INSERT 0 30077
avg | max
--------------------+-----
3.3286500000000000 | 21
(1 row)

Backend PID is: 4823
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
Before first postgres 4823 0.0 0.0 504276 4312 ? Ss 11:19 0:00 postgres: craig regress [local] idle
Iteration 1 postgres 4823 0.0 0.3 536908 25416 ? Rs 11:19 0:00 postgres: craig regress [local] INSERT
Iteration 2 postgres 4823 33.0 13.1 2163384 1056560 ? Rs 11:19 0:00 postgres: craig regress [local] INSERT
Iteration 3 postgres 4823 56.0 13.3 2163408 1072300 ? Rs 11:19 0:01 postgres: craig regress [local] INSERT
Iteration 4 postgres 4823 58.7 13.4 2163408 1084936 ? Rs 11:19 0:02 postgres: craig regress [local] INSERT
Iteration 20 postgres 4823 85.3 14.3 2173776 1156784 ? Rs 11:19 0:13 postgres: craig regress [local] INSERT
Iteration 40 postgres 4823 92.0 16.3 2176848 1314700 ? Rs 11:19 0:28 postgres: craig regress [local] INSERT
Iteration 60 postgres 4823 94.1 16.4 2173776 1322208 ? Rs 11:19 0:43 postgres: craig regress [local] INSERT
Iteration 80 postgres 4823 96.0 16.4 2173776 1323768 ? Rs 11:19 0:58 postgres: craig regress [local] INSERT
Iteration 100 postgres 4823 95.7 16.5 2176848 1329880 ? Rs 11:19 1:14 postgres: craig regress [local] INSERT
Iteration 120 postgres 4823 97.1 16.4 2176848 1329132 ? Rs 11:19 1:31 postgres: craig regress [local] INSERT
Iteration 140 postgres 4823 96.8 16.4 2176848 1329524 ? Rs 11:19 1:48 postgres: craig regress [local] INSERT

Attachments:

hstore-leak-demo.sh (application/x-shellscript)
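[The attached script is not preserved in this archive. A minimal SQL sketch of the kind of repro it performs - build a key/value table, then rerun the aggregate INSERT while watching the backend's RSS with `ps` from another terminal - might look like the following; the row counts and loop discipline here are assumptions, not the actual attachment:]

```sql
-- Hypothetical reconstruction of the repro setup (not the original script).
CREATE EXTENSION IF NOT EXISTS hstore;

DROP TABLE IF EXISTS old_prefs, preferences;

CREATE TABLE old_prefs (
    user_id integer NOT NULL,
    name    varchar NOT NULL,
    value   varchar NOT NULL
);

CREATE TABLE preferences (
    user_id integer PRIMARY KEY,
    prefs   hstore
);

-- 100k users with 5 prefs each = 500k rows.
INSERT INTO old_prefs
SELECT u, 'pref_' || p, md5(u::text || p::text)
FROM generate_series(1, 100000) u,
     generate_series(1, 5) p;

-- The statement under test; between iterations, empty the target table
-- so the INSERT can be repeated against the same backend:
TRUNCATE preferences;

INSERT INTO preferences
SELECT user_id, hstore(array_agg(name), array_agg(value))
FROM old_prefs
GROUP BY user_id;
```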
#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Craig Ringer (#10)
Re: BUG #6763: Severe memory leak with arrays and hstore

Craig Ringer <ringerc@ringerc.id.au> writes:

OK, it's certainly leaking, but not in the same drastic way I was able
to reproduce manually a couple of times earlier. Self-contained test
case attached.

Using HEAD with stock parameters, I don't see any significant change in
allocated address space (VSZ): it sits right around 170MB. The reported
resident set size (RSS) starts from very little and rises to about
140MB, but I think that's just an artifact of the process touching more
and more of the shared-buffers array as it runs. The actual backend
memory consumption seems to be just a few meg.

I can get it to blow out memory if I set work_mem large enough to
persuade the planner to use hash aggregation (whereupon it tries to run
all the array_agg aggregates in parallel). However, that requires
work_mem set to a couple of GB, and I don't think it's really a bug when
the backend goes ahead and uses a couple of GB after I told it it could.

It's possible that the OP's problem boiled down to the planner making
a drastic underestimate of the number of GROUP BY groups, which could
mislead it into applying hash aggregation when there's not room; or
if the space used per aggregate was a lot more than the 8K that the
planner assumes when dealing with array_agg. But neither of those
errors seems to be happening in this example case.

regards, tom lane
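[Tom's hypothesis - the planner underestimating the number of GROUP BY groups and picking hash aggregation - can be checked directly by comparing the plan's row estimate against the real group count. A sketch using the thread's table names:]

```sql
-- The planner's estimated group count is the "rows=" figure on the
-- HashAggregate / GroupAggregate node:
EXPLAIN
SELECT user_id, hstore(array_agg(name), array_agg(value))
FROM old_prefs
GROUP BY user_id;

-- Compare it with the actual number of groups:
SELECT count(DISTINCT user_id) FROM old_prefs;

-- If the estimate is badly off, refreshing statistics usually fixes it:
ANALYZE old_prefs;
```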

#12Craig Ringer
craig@2ndquadrant.com
In reply to: Tom Lane (#11)
Re: BUG #6763: Severe memory leak with arrays and hstore

On 07/27/2012 01:47 PM, Tom Lane wrote:

Craig Ringer <ringerc@ringerc.id.au> writes:

OK, it's certainly leaking, but not in the same drastic way I was able
to reproduce manually a couple of times earlier. Self-contained test
case attached.

Using HEAD with stock parameters, I don't see any significant change in
allocated address space (VSZ): it sits right around 170MB. The reported
resident set size (RSS) starts from very little and rises to about
140MB, but I think that's just an artifact of the process touching more
and more of the shared-buffers array as it runs.

Gah. I should know better than that. Sorry.

This makes me wonder if the "leak-like" pattern I saw earlier was just a
similar growth in shared_buffers, and carried on more steeply rather
than tapering off because I was working with a smaller data set.

--
Craig Ringer

#13Craig Ringer
craig@2ndquadrant.com
In reply to: Craig Ringer (#8)
Re: BUG #6763: Severe memory leak with arrays and hstore

On 07/27/2012 10:31 AM, Craig Ringer wrote:

Ouch. Sure looks like a leak to me, yeah.

... but it turns out I'm not thinking very straight. I forgot to check
the size of your `shared_buffers' or `work_mem' and forgot to get you to
report `free -m' output after each run to measure _real_ memory use.

During testing I did get a backend crash when running an INSERT - which
I didn't expect given that I have only 20MB of work_mem and 256MB of
shared_buffers. I was surprised by that, as I would not have expected
that query to require a huge gob of RAM. I didn't dig much further, as
I'm on a swapless system with overcommit enabled ('cos the Java VM
doesn't work with overcommit off), which isn't exactly a recommended Pg
configuration.

--
Craig Ringer

#14luben karavelov
karavelov@mail.bg
In reply to: Tom Lane (#11)
Re: BUG #6763: Severe memory leak with arrays and hstore

On Jul 27, 2012, at 8:47 AM, Tom Lane wrote:

Craig Ringer <ringerc@ringerc.id.au> writes:

OK, it's certainly leaking, but not in the same drastic way I was able
to reproduce manually a couple of times earlier. Self-contained test
case attached.

Using HEAD with stock parameters, I don't see any significant change in
allocated address space (VSZ): it sits right around 170MB. The reported
resident set size (RSS) starts from very little and rises to about
140MB, but I think that's just an artifact of the process touching more
and more of the shared-buffers array as it runs. The actual backend
memory consumption seems to be just a few meg.

I can get it to blow out memory if I set work_mem large enough to
persuade the planner to use hash aggregation (whereupon it tries to run
all the array_agg aggregates in parallel). However, that requires
work_mem set to a couple of GB, and I don't think it's really a bug when
the backend goes ahead and uses a couple of GB after I told it it could.

It's possible that the OP's problem boiled down to the planner making
a drastic underestimate of the number of GROUP BY groups, which could
mislead it into applying hash aggregation when there's not room; or
if the space used per aggregate was a lot more than the 8K that the
planner assumes when dealing with array_agg. But neither of those
errors seems to be happening in this example case.

regards, tom lane

It's good that the bug is not in HEAD. I was testing on 9.1.4. The RSS is definitely not just touched shared buffers: shared_buffers is 1.8G and the backend RSS got to 4G. My work_mem setting is 64MB, so it's quite conservative - the server is tuned for max concurrency, not for max throughput per single query.

Here is the plan of the insert:

=> explain INSERT INTO new_preferences SELECT user_id,hstore(array_agg(name), array_agg(value)) FROM old_prefs WHERE user_id<200000 GROUP BY user_id;
QUERY PLAN
----------------------------------------------------------------------------------------------------------
 Insert on new_preferences  (cost=65615.89..65617.73 rows=67 width=36)
   ->  HashAggregate  (cost=65615.89..65617.06 rows=67 width=68)
         ->  Bitmap Heap Scan on old_prefs  (cost=17645.25..56555.65 rows=1208032 width=68)
               Recheck Cond: (user_id < 200000)
               ->  Bitmap Index Scan on old_prefs_user_id_ids  (cost=0.00..17343.24 rows=1208032 width=0)
                     Index Cond: (user_id < 200000)
(6 rows)

So, it is using hash aggregate as you have suggested. I have tried the query with disabled hash aggregate and it consumes a lot less memory - single query to migrate the whole table finishes with 900M RSS.

After "ANALYZE old_prefs" the planner chooses GroupAggregate instead of HashAggregate - you were right about the missing statistics on old_prefs.
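[For anyone hitting the same situation on 9.1, the workaround described here boils down to the following session-level sketch:]

```sql
-- Give the planner statistics so it can estimate the number of
-- GROUP BY groups (and hence the memory a HashAggregate would need):
ANALYZE old_prefs;

-- Or, as a blunt instrument, disable hash aggregation for this session
-- so the planner falls back to sort + GroupAggregate, which streams
-- groups instead of holding them all in memory at once:
SET enable_hashagg = off;

INSERT INTO preferences
SELECT user_id, hstore(array_agg(name), array_agg(value))
FROM old_prefs
GROUP BY user_id;

RESET enable_hashagg;
```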

Thank you for figuring out this case.

Best regards

--
luben karavelov

#15Craig Ringer
craig@2ndquadrant.com
In reply to: Любен Каравелов (#14)
Re: BUG #6763: Severe memory leak with arrays and hstore

On 07/27/2012 07:52 PM, luben karavelov wrote:

It's good that the bug is not in HEAD. I was testing on 9.1.4.

So was I, and while I thought I'd reproduced it I now suspect I was just
seeing shared_buffers touching.

Are you able to produce a self-contained SQL test that demonstrates the
leak?

Does the test program I posted behave differently on your system?

--
Craig Ringer