work_mem RAM Accounting in PostgreSQL
Hello Everyone,
I have some questions regarding how the "work_mem" parameter affects the
overall RAM usage of PostgreSQL processes within a physical host or
container.
Each backend process during SQL execution may allocate N * "work_mem"
simultaneously. For example, if "work_mem" is set to 32MB and N=5 (i.e., 5
simultaneous and/or sequential sort (ORDER BY) and hash operations), and the initial
RAM usage (RSS - Resident Set Size) of the backend is 10MB, I would expect
the backend process to use 160MB (32MB * 5) + 10MB, resulting in a total
RAM usage of 170MB.
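As a sanity check, the back-of-the-envelope arithmetic above can be written out (illustrative numbers only, not measured values):

```python
# Back-of-the-envelope peak RSS estimate for one backend (illustrative):
work_mem_mb = 32   # work_mem setting
n_nodes = 5        # simultaneous sort/hash nodes in one query
base_rss_mb = 10   # backend RSS before the query starts

expected_peak_mb = n_nodes * work_mem_mb + base_rss_mb
print(expected_peak_mb)  # 170
```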
My questions are as follows:
1. What happens to the allocated "work_mem" after the execution of query
nodes? Are these memory allocations freed?
2. If they are freed, do they remain in the RSS of the PostgreSQL backend?
3. From various sources, I understand that these allocations are freed
after each node execution due to memory contexts, but they might remain in
some sort of backend memory pool for future reuse. Is this correct?
4. If so, will this memory be accounted for as used RAM on my
Linux/Container system after the backend returns to an idle state (e.g.,
connection pooling)?
Additionally: If the above is true, and my PostgreSQL host or container is
limited to 16GB of RAM, what would happen if I have 100 pooled connections,
each gradually allocating those 160MB? Will this memory be reclaimed (if I
understood it correctly as a kind of inactive anon mem), or will the OOM
Killer be triggered at some point (because it is real allocated memory)?
Thank you for your insights.
Best regards,
AlexL
Java Dev
On Wed, 2024-11-13 at 21:09 +0200, Alexandru Lazarev wrote:
I have some questions regarding how the "work_mem" parameter affects the overall RAM
usage of PostgreSQL processes within a physical host or container.
Each backend process during SQL execution may allocate N * "work_mem" simultaneously.
For example, if "work_mem" is set to 32MB and N=5 (i.e. 5 simultaneous and/or sequential
ORDER and hash operations), and the initial RAM usage (RSS - Resident Set Size) of the
backend is 10MB, I would expect the backend process to use 160MB (32MB * 5) + 10MB,
resulting in a total RAM usage of 170MB.
The limit for a hash is hash_mem_multiplier * work_mem.
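For example, with the defaults in recent versions (hash_mem_multiplier defaults to 2.0 since PostgreSQL 15; it was 1.0 before):

```python
# Memory limit for a single hash node = hash_mem_multiplier * work_mem
work_mem_mb = 32
hash_mem_multiplier = 2.0  # PostgreSQL 15+ default

hash_limit_mb = hash_mem_multiplier * work_mem_mb
print(hash_limit_mb)  # 64.0
```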
My questions are as follows:
1. What happens to the allocated "work_mem" after the execution of query nodes? Are
these memory allocations freed?
Yes.
2. If they are freed, do they remain in the RSS of the PostgreSQL backend?
They may, because the C library can choose not to actually free all the memory,
but retain some to serve future malloc() requests more efficiently.
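On Linux with glibc you can even ask the allocator explicitly to hand retained free memory back to the OS. A sketch (glibc-specific; PostgreSQL itself does not normally do this — it is shown only to illustrate that the C library, not PostgreSQL, decides when freed memory leaves the process):

```python
import ctypes
import ctypes.util

# glibc-specific: malloc_trim(0) asks the C library to return freed memory
# it has retained to the OS. Other libcs (e.g. musl) lack the symbol, so we
# probe for it first instead of assuming it exists.
released = None
libc_path = ctypes.util.find_library("c")
if libc_path:
    libc = ctypes.CDLL(libc_path)
    if hasattr(libc, "malloc_trim"):
        libc.malloc_trim.restype = ctypes.c_int
        released = libc.malloc_trim(0)  # 1 if memory was released, else 0
print(released)
```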
3. From various sources, I understand that these allocations are freed after
each node execution due to memory contexts, but they might remain in some sort
of backend memory pool for future reuse. Is this correct?
I am not sure what you mean, but perhaps what I wrote above.
4. If so, will this memory be accounted for as used RAM on my Linux/Container
system after the backend returns to an idle state (e.g., connection pooling)?
Certainly.
Additionally: If the above is true, and my PostgreSQL host or container is limited
to 16GB of RAM, what would happen if I have 100 pooled connections, each gradually
allocating those 160MB? Will this memory be reclaimed (if I understood it correctly
as a kind of inactive anon mem), or will the OOM Killer be triggered at some point
(because it is real allocated memory)?
The backends won't keep that much memory allocated, so you need not worry.
100 connections are a lot. With efficient pooling, you could have fewer connections
and use your resources more efficiently.
Yours,
Laurenz Albe
Hi @Laurenz Albe <laurenz.albe@cybertec.at> & PG Community,
Highly appreciate your response. But I have some additional questions
(inline)
On Thu, Nov 14, 2024 at 4:40 PM Laurenz Albe <laurenz.albe@cybertec.at>
wrote:
On Wed, 2024-11-13 at 21:09 +0200, Alexandru Lazarev wrote:
I have some questions regarding how the "work_mem" parameter affects the
overall RAM
usage of PostgreSQL processes within a physical host or container.
Each backend process during SQL execution may allocate N * "work_mem"
simultaneously.
For example, if "work_mem" is set to 32MB and N=5 (i.e. 5 simultaneous
and/or sequential
ORDER and hash operations), and the initial RAM usage (RSS - Resident
Set Size) of the
backend is 10MB, I would expect the backend process to use 160MB (32MB *
5) + 10MB,
resulting in a total RAM usage of 170MB.
The limit for a hash is hash_mem_multiplier * work_mem.
Yes, I know, I considered it above in "N".
My questions are as follows:
1. What happens to the allocated "work_mem" after the execution of query
nodes? Are
these memory allocations freed?
Yes.
2. If they are freed, do they remain in the RSS of the PostgreSQL
backend?
They may, because the C library can choose not to actually free all the
memory,
but retain some to serve future malloc() requests more efficiently.
This part is important for me to understand (my apologies, I am not a C
programmer, and I find it difficult to read the PG sources :)): I wanted to
understand whether PostgreSQL itself does some kind of pooling of allocated
memory here. So, since the memory is freed, it is up to the C library and
the underlying OS whether it is returned to the OS or stays reserved
somehow - did I get that correctly? If so, then most probably this
"reserved" memory should be reclaimed under memory-pressure conditions
(when other backend processes try to greedily allocate big chunks of memory)?
3. From various sources, I understand that these allocations are freed
after
each node execution due to memory contexts, but they might remain in
some sort
of backend memory pool for future reuse. Is this correct?
I am not sure what you mean, but perhaps what I wrote above.
I repeated myself, so yes, you wrote it above.
4. If so, will this memory be accounted for as used RAM on my
Linux/Container
system after the backend returns to an idle state (e.g., connection
pooling)?
Certainly.
Additionally: If the above is true, and my PostgreSQL host or container
is limited
to 16GB of RAM, what would happen if I have 100 pooled connections, each
gradually
allocating those 160MB? Will this memory be reclaimed (if I understood
it correctly
as a kind of inactive anon mem), or will the OOM Killer be triggered at
some point
(because it is real allocated memory)?
The backends won't keep that much memory allocated, so you need not worry.
Let me reformulate the use case a bit differently - I would highly
appreciate any community input:
1) Let's say 60 connections did some intensive memory-consuming operations,
each one allocating up to 200MB of work_mem, then they finished and
returned to the 'idle' state.
2) After that, the remaining 40 connections start doing "work_mem"-consuming
operations in parallel, each one allocating up to 300MB. Will the
N*"work_mem" RAM allocated in step #1 be reclaimed at the point of step #2,
when multiple backends need to aggressively allocate K*"work_mem" memory?
Thank You
100 connections are a lot. With efficient pooling, you could have fewer
connections
and use your resources more efficiently.
Yours,
Laurenz Albe
On Mon, 2024-11-18 at 18:11 +0200, Alexandru Lazarev wrote:
Highly appreciate your response. But I have some additional questions (inline)
On Thu, Nov 14, 2024 at 4:40 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Wed, 2024-11-13 at 21:09 +0200, Alexandru Lazarev wrote:
My questions are as follows:
1. What happens to the allocated "work_mem" after the execution of query nodes? Are
these memory allocations freed?
Yes.
2. If they are freed, do they remain in the RSS of the PostgreSQL backend?
They may, because the C library can choose not to actually free all the memory,
but retain some to serve future malloc() requests more efficiently.
This part is important for me to understand (my apologies, I am not a C programmer,
and I find it difficult to read the PG sources :)): I wanted to understand whether
PostgreSQL itself does some kind of pooling of allocated memory here. So, since the
memory is freed, it is up to the C library and the underlying OS whether it is
returned to the OS or stays reserved somehow - did I get that correctly? If so,
then most probably this "reserved" memory should be reclaimed under memory-pressure
conditions (when other backend processes try to greedily allocate big chunks of memory)?
Idle PostgreSQL backends won't keep much memory allocated under normal conditions -
mainly the catalog cache, prepared statements, private memory buffering temporary
tables, and potentially a materialized WITH HOLD cursor.
Yes, it is up to the C library to keep some free'd memory allocated for reuse.
This memory won't be reclaimed when there is memory pressure, since it belongs to
the process.
The backends won't keep that much memory allocated, so you need not worry.
Let me reformulate the use-case a bit differently - I would highly appreciate any community inputs:
1) Let's say 60 connections did some intensive memory-consuming operations, each one
allocating up to 200MB of work_mem, then they finished and returned to the 'idle' state.
The lion's share of that memory will be freed when processing is done.
2) After that, the remaining 40 connections start doing "work_mem"-consuming operations
in parallel, each one allocating up to 300MB. Will the N*"work_mem" RAM allocated
in step #1 be reclaimed at the point of step #2, when multiple backends need to
aggressively allocate K*"work_mem" memory?
Most of the memory is already reclaimed when the first connections are done, so you
shouldn't be in trouble.
All this is somewhat vague, since it depends on the C library implementation.
I recommend that you start tuning with a value for work_mem that is low enough
to be very certain that you won't go out of memory.
Then monitor temporary files. If too many get created, increase work_mem, all the
while watching your memory consumption to avoid going OOM.
As soon as only a few temporary files are created, you are probably good.
If you cannot get there, try to tune your queries.
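The tuning loop above can be sketched as a simple heuristic. This is purely illustrative - the function, its parameters, and all thresholds are made up for this sketch, not a PostgreSQL API (in practice you would read temp-file counts from pg_stat_database and memory headroom from your monitoring):

```python
def suggest_work_mem(work_mem_mb, temp_files_per_hour, free_ram_mb, active_backends):
    """Illustrative heuristic for the tuning loop described above.

    Raise work_mem while queries still spill to temporary files, but only
    if every active backend could plausibly use the doubled value without
    exhausting free RAM. All thresholds here are invented for illustration.
    """
    headroom_needed_mb = active_backends * work_mem_mb * 2
    if temp_files_per_hour > 10 and free_ram_mb > headroom_needed_mb:
        return work_mem_mb * 2   # still spilling and we have headroom: raise it
    return work_mem_mb           # few spills (or no headroom): leave it alone

# Many spills, plenty of headroom -> double it:
print(suggest_work_mem(32, temp_files_per_hour=500, free_ram_mb=8192, active_backends=20))  # 64
# Almost no spills -> keep the current value:
print(suggest_work_mem(64, temp_files_per_hour=2, free_ram_mb=8192, active_backends=20))    # 64
```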
The king's way to avoid problems in this area is to use a moderately sized
connection pool. Then connections don't sit idle for long and cannot hog much
memory.
Yours,
Laurenz Albe