Add the ability to limit the amount of memory that can be allocated to backends.
Hi Hackers,
Add the ability to limit the amount of memory that can be allocated to
backends.
This builds on the work that adds backend memory allocated to
pg_stat_activity
/messages/by-id/67bb5c15c0489cb499723b0340f16e10c22485ec.camel@crunchydata.com
Both patches are attached.
Add GUC variable max_total_backend_memory.
Specifies a limit to the amount of memory (MB) that may be allocated to
backends in total (i.e. this is not a per-user or per-backend limit).
If unset, or set to 0, it is disabled. It is intended as a resource to
help avoid the OOM killer. A backend request that would push the total
over the limit will be denied with an out of memory error, causing that
backend's current query/transaction to fail. Due to the dynamic nature
of memory allocations, this limit is not exact. If the total is within
1.5MB of the limit and two backends request 1MB each at the same time,
both may be allocated, exceeding the limit. Further requests will not
be allocated until the total drops below the limit. Keep this in mind
when setting this value to avoid the OOM killer. Currently, this limit
does not affect auxiliary backend processes; the list of non-affected
processes is open for discussion as to what should/should not be
included. Backend memory allocations are displayed in the
pg_stat_activity view.
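The limit check described above can be sketched in standalone C. This is a simplified illustration, not the patch's actual code; the names (max_total_backend_mem_mb, total_backend_mem_bytes) are invented for the example:

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative globals standing in for the GUC and the tracked total. */
static uint64_t max_total_backend_mem_mb = 0;   /* 0 = feature disabled */
static uint64_t total_backend_mem_bytes = 0;    /* sum over all backends */

/*
 * Return true when granting "request" bytes would push the cluster-wide
 * total over the configured limit. With the limit disabled (0), always
 * allow the allocation.
 */
static bool
exceeds_max_total_backend_mem(uint64_t request)
{
    uint64_t    limit_bytes;

    if (max_total_backend_mem_mb == 0)
        return false;

    limit_bytes = max_total_backend_mem_mb * 1024 * 1024;
    return total_backend_mem_bytes + request > limit_bytes;
}
```

A denied request surfaces as an out of memory error in the requesting backend, which is what makes the limit inexact under concurrency: two backends can pass the check simultaneously before either updates the total.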
--
Reid Thompson
Senior Software Engineer
Crunchy Data, Inc.
reid.thompson@crunchydata.com
www.crunchydata.com
On Wed, Aug 31, 2022 at 12:50:19PM -0400, Reid Thompson wrote:
Hi Hackers,
Add the ability to limit the amount of memory that can be allocated to
backends. This builds on the work that adds backend memory allocated to
pg_stat_activity
/messages/by-id/67bb5c15c0489cb499723b0340f16e10c22485ec.camel@crunchydata.com
Both patches are attached.
You should name the patches with different prefixes, like
001, 002, 003. Otherwise, cfbot may try to apply them in the wrong order.
git format-patch is the usual tool for that.
+ Specifies a limit to the amount of memory (MB) that may be allocated to
MB are just the default unit, right ?
The user should be allowed to write max_total_backend_memory='2GB'
+ backends in total (i.e. this is not a per user or per backend limit).
+ If unset, or set to 0 it is disabled. A backend request that would push
+ the total over the limit will be denied with an out of memory error
+ causing that backends current query/transaction to fail. Due to the dynamic
backend's
+ nature of memory allocations, this limit is not exact. If within 1.5MB of
+ the limit and two backends request 1MB each at the same time both may be
+ allocated exceeding the limit. Further requests will not be allocated until
allocated, and exceed the limit
+bool
+exceeds_max_total_bkend_mem(uint64 allocation_request)
+{
+    bool    result = false;
+
+    if (MyAuxProcType != NotAnAuxProcess)
+        return result;
The double negative is confusing, so could use a comment.
+    /* Convert max_total_bkend_mem to bytes for comparison */
+    if (max_total_bkend_mem &&
+        pgstat_get_all_backend_memory_allocated() +
+        allocation_request > (uint64)max_total_bkend_mem * 1024 * 1024)
+    {
+        /*
+         * Explicitely identify the OOM being a result of this
+         * configuration parameter vs a system failure to allocate OOM.
+         */
+        elog(WARNING,
+            "request will exceed postgresql.conf defined max_total_backend_memory limit (%lu > %lu)",
+            pgstat_get_all_backend_memory_allocated() +
+            allocation_request, (uint64)max_total_bkend_mem * 1024 * 1024);
I think it should be ereport() rather than elog(), which is
internal-only, and not-translated.
+ {"max_total_backend_memory", PGC_SIGHUP, RESOURCES_MEM, + gettext_noop("Restrict total backend memory allocations to this max."), + gettext_noop("0 turns this feature off."), + GUC_UNIT_MB + }, + &max_total_bkend_mem, + 0, 0, INT_MAX, + NULL, NULL, NULL
I think this needs a maximum like INT_MAX/1024/1024
+uint64
+pgstat_get_all_backend_memory_allocated(void)
+{
...
+    for (i = 1; i <= NumBackendStatSlots; i++)
+    {
It's looping over every backend for each allocation.
Do you know if there's any performance impact of that ?
I think it may be necessary to track the current allocation size in
shared memory (with atomic increments?). Maybe decrements would need to
be exactly accounted for, or otherwise Assert() that the value is not
negative. I don't know how expensive it'd be to have conditionals for
each decrement, but maybe the value would only be decremented at
strategic times, like at transaction commit or backend shutdown.
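A minimal sketch of that idea using C11 atomics (a process-global counter standing in for shared memory; the names are illustrative). Note that the reserve-then-back-out pattern can still transiently overshoot, much like the patch's documented 1.5MB slack:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* In PostgreSQL this counter would live in shared memory; here it is a
 * plain process-global atomic for illustration. */
static _Atomic uint64_t total_allocated = 0;

/*
 * Try to reserve "bytes" against "limit"; back out on failure so the
 * counter never over-reports for long.
 */
static bool
reserve_memory(uint64_t bytes, uint64_t limit)
{
    uint64_t    newtotal = atomic_fetch_add(&total_allocated, bytes) + bytes;

    if (newtotal > limit)
    {
        atomic_fetch_sub(&total_allocated, bytes); /* undo reservation */
        return false;
    }
    return true;
}

/* Exact decrement; a real implementation would Assert() that the
 * counter cannot wrap below zero. */
static void
release_memory(uint64_t bytes)
{
    atomic_fetch_sub(&total_allocated, bytes);
}
```

This avoids the per-allocation loop over every backend slot: the check is one atomic add plus a compare. The open question Justin raises is whether exact decrements are affordable, or whether they should be batched at transaction commit or backend exit.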
--
Justin
At Wed, 31 Aug 2022 12:50:19 -0400, Reid Thompson <reid.thompson@crunchydata.com> wrote in
Hi Hackers,
Add the ability to limit the amount of memory that can be allocated to
backends.
The patch seems to limit both memory-context allocations and DSM
allocations happening in a specific process by the same budget. In the
first place I don't think it's sensible to cap the amount of DSM
allocations by a per-process budget.
DSM is used by the pgstats subsystem. There can be cases where pgstat
complains about denial of a DSM allocation after the budget has been
exhausted by memory-context allocations, or where every command
complains about denial of a memory-context allocation once the
per-process budget is exhausted by DSM allocations. That doesn't seem reasonable.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Hi,
On 8/31/22 6:50 PM, Reid Thompson wrote:
Hi Hackers,
Add the ability to limit the amount of memory that can be allocated to
backends.
Thanks for the patch.
+ 1 on the idea.
Specifies a limit to the amount of memory (MB) that may be allocated to
backends in total (i.e. this is not a per user or per backend limit).
If unset, or set to 0 it is disabled. It is intended as a resource to
help avoid the OOM killer. A backend request that would push the total
over the limit will be denied with an out of memory error causing that
backends current query/transaction to fail.
I'm not sure we are choosing the right victims here (aka the ones that
are doing the request that will push the total over the limit).
Imagine an extreme case where a single backend consumes say 99% of the
limit, shouldn't it be the one to be "punished"? (and somehow forced to
give the memory back).
The problem that I see with the current approach is that a "bad" backend
could impact all the others and continue to do so.
What about punishing, say, the highest consumer? What do you think? (just
speaking about the general idea here, not about the implementation)
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Thu, 1 Sept 2022 at 04:52, Reid Thompson
<reid.thompson@crunchydata.com> wrote:
Add the ability to limit the amount of memory that can be allocated to
backends.
Are you aware that relcache entries are stored in backend local memory
and that once we've added a relcache entry for a relation that we have
no current code which attempts to reduce the memory consumption used
by cache entries when there's memory pressure?
It seems to me that if we had this feature as you propose that a
backend could hit the limit and stay there just from the memory
requirements of the relation cache after some number of tables have
been accessed from the given backend. It's not hard to imagine a
situation where the palloc() would start to fail during parse, which
might make it quite infuriating for anyone trying to do something
like:
SET max_total_backend_memory TO 0;
or
ALTER SYSTEM SET max_total_backend_memory TO 0;
I think a better solution to this problem would be to have "memory
grants", where we configure some amount of "pool" memory that backends
are allowed to use for queries. The planner would have to add the
expected number of work_mem that the given query is expected to use
and before that query starts, the executor would have to "checkout"
that amount of memory from the pool and return it when finished. If
there is not enough memory in the pool then the query would have to
wait until enough memory is available. This creates a deadlocking
hazard that the deadlock detector would need to be made aware of.
I know Thomas Munro has mentioned this "memory grant" or "memory pool"
feature to me previously and I think he even has some work in progress
code for it. It's a very tricky problem, however, as aside from the
deadlocking issue, it requires working out how much memory a given
plan will use concurrently. That's not as simple as counting the nodes
that use work_mem and summing those up.
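For illustration only, the checkout/return shape of such a grant pool might look like the following single-threaded C sketch (the types and functions here are invented; a real implementation would wait for memory to become available and cooperate with the deadlock detector rather than fail):

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * A toy grant pool: before execution, a query checks out the total
 * work_mem the planner expects it to use, and returns it when done.
 */
typedef struct GrantPool
{
    uint64_t    capacity;   /* configured pool size in bytes */
    uint64_t    available;  /* bytes not currently checked out */
} GrantPool;

static bool
pool_checkout(GrantPool *pool, uint64_t bytes)
{
    if (bytes > pool->available)
        return false;       /* real impl: sleep until memory is free */
    pool->available -= bytes;
    return true;
}

static void
pool_return(GrantPool *pool, uint64_t bytes)
{
    pool->available += bytes;
    if (pool->available > pool->capacity)
        pool->available = pool->capacity;   /* guard against over-return */
}
```

The hard parts David identifies are exactly what this sketch omits: estimating how many work_mem-sized buffers a plan uses concurrently, and detecting deadlocks among queries waiting on the pool.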
There is some discussion about the feature in [1]. I was unable to
find what Thomas mentioned on the list about this. I've included him
here in case he has any extra information to share.

[1] /messages/by-id/20220713222342.GE18011@telsasoft.com
David
On Wed, 2022-08-31 at 12:34 -0500, Justin Pryzby wrote:
You should name the patches with different prefixes, like
001,002,003 Otherwise, cfbot may try to apply them in the wrong
order.
git format-patch is the usual tool for that.
Thanks for the pointer. My experience with git in the past has been
minimal and basic.
+ Specifies a limit to the amount of memory (MB) that may be allocated to

MB are just the default unit, right ?
The user should be allowed to write max_total_backend_memory='2GB'
Correct. Default units are MB. Other unit types are converted to MB.
+ causing that backends current query/transaction to fail.
backend's
+ allocated exceeding the limit. Further requests will not
allocated, and exceed the limit
+ if (MyAuxProcType != NotAnAuxProcess)
The double negative is confusing, so could use a comment.
+ elog(WARNING,
I think it should be ereport() rather than elog(), which is
internal-only, and not-translated.
Corrected/added the above items. Attached patches with the corrections.
+ 0, 0, INT_MAX,
+ NULL, NULL, NULL

I think this needs a maximum like INT_MAX/1024/1024
Is this noting that we'd set a ceiling of 2048MB?
+ for (i = 1; i <= NumBackendStatSlots; i++)
+ {

It's looping over every backend for each allocation.
Do you know if there's any performance impact of that ?
I'm not very familiar with how to test performance impact; I'm open to
suggestions. I have performed the below pgbench tests and noted the basic
tps differences in the table.
Test 1:
branch master
CFLAGS="-I/usr/include/python3.8/ " /home/rthompso/src/git/postgres/configure --silent --prefix=/home/rthompso/src/git/postgres/install/master --with-openssl --with-tcl --with-tclconfig=/usr/lib/tcl8.6 --with-perl --with-libxml --with-libxslt --with-python --with-gssapi --with-systemd --with-ldap --enable-nls
make -s -j12 && make -s install
initdb
default postgresql.conf settings
init pgbench pgbench -U rthompso -p 5433 -h localhost -i -s 50 testpgbench
10 iterations
for ctr in {1..10}; do { time pgbench -p 5433 -h localhost -c 10 -j 10 -t 50000 testpgbench; } 2>&1 | tee -a pgstatsResultsNoLimitSet; done
Test 2:
branch pg-stat-activity-backend-memory-allocated
CFLAGS="-I/usr/include/python3.8/ " /home/rthompso/src/git/postgres/configure --silent --prefix=/home/rthompso/src/git/postgres/install/pg-stats-memory/ --with-openssl --with-tcl --with-tclconfig=/usr/lib/tcl8.6 --with-perl --with-libxml --with-libxslt --with-python --with-gssapi --with-systemd --with-ldap --enable-nls
make -s -j12 && make -s install
initdb
default postgresql.conf settings
init pgbench pgbench -U rthompso -p 5433 -h localhost -i -s 50
testpgbench
10 iterations
for ctr in {1..10}; do { time pgbench -p 5433 -h localhost -c 10 -j 10 -t 50000 testpgbench; } 2>&1 | tee -a pgstatsResultsPg-stats-memory; done
Test 3:
branch dev-max-memory
CFLAGS="-I/usr/include/python3.8/ " /home/rthompso/src/git/postgres/configure --silent --prefix=/home/rthompso/src/git/postgres/install/dev-max-memory/ --with-openssl --with-tcl --with-tclconfig=/usr/lib/tcl8.6 --with-perl --with-libxml --with-libxslt --with-python --with-gssapi --with-systemd --with-ldap --enable-nls
make -s -j12 && make -s install
initdb
default postgresql.conf settings
init pgbench pgbench -U rthompso -p 5433 -h localhost -i -s 50 testpgbench
10 iterations
for ctr in {1..10}; do { time pgbench -p 5433 -h localhost -c 10 -j 10 -t 50000 testpgbench; } 2>&1 | tee -a pgstatsResultsDev-max-memory; done
Test 4:
branch dev-max-memory
CFLAGS="-I/usr/include/python3.8/ " /home/rthompso/src/git/postgres/configure --silent --prefix=/home/rthompso/src/git/postgres/install/dev-max-memory/ --with-openssl --with-tcl --with-tclconfig=/usr/lib/tcl8.6 --with-perl --with-libxml --with-libxslt --with-python --with-gssapi --with-systemd --with-ldap --enable-nls
make -s -j12 && make -s install
initdb
non-default postgresql.conf setting for max_total_backend_memory = 100MB
init pgbench pgbench -U rthompso -p 5433 -h localhost -i -s 50 testpgbench
10 iterations
for ctr in {1..10}; do { time pgbench -p 5433 -h localhost -c 10 -j 10 -t 50000 testpgbench; } 2>&1 | tee -a pgstatsResultsDev-max-memory100MB; done
Laptop
11th Gen Intel(R) Core(TM) i7-11850H @ 2.50GHz 8 Cores 16 threads
32GB RAM
SSD drive
Averages from the 10 runs and tps difference over the 10 runs
|------------------+------------------+------------------------+-------------------+------------------+-------------------+---------------+------------------|
| Test Run | Master | Track Memory Allocated | Diff from Master | Max Mem off | Diff from Master | Max Mem 100MB | Diff from Master |
| Set 1 | Test 1 | Test 2 | | Test 3 | | Test 4 | |
| latency average | 2.43390909090909 | 2.44327272727273 | | 2.44381818181818 | | 2.6843 | |
| tps inc conn est | 3398.99291372727 | 3385.40984336364 | -13.583070363637 | 3385.08184309091 | -13.9110706363631 | 3729.5363413 | 330.54342757273 |
| tps exc conn est | 3399.12185727273 | 3385.52527490909 | -13.5965823636366 | 3385.22100872727 | -13.9008485454547 | 3729.7097607 | 330.58790342727 |
|------------------+------------------+------------------------+-------------------+------------------+-------------------+---------------+------------------|
| Set 2 | | | | | | | |
| latency average | 2.691 | 2.6895 | 2 | 2.69 | 3 | 2.6827 | 4 |
| tps inc conn est | 3719.56 | 3721.7587106 | 2.1987106 | 3720.3 | .74 | 3730.86 | 11.30 |
| tps exc conn est | 3719.71 | 3721.9268465 | 2.2168465 | 3720.47 | .76 | 3731.02 | 11.31 |
|------------------+------------------+------------------------+-------------------+------------------+-------------------+---------------+------------------|
I think it may be necessary to track the current allocation size in
shared memory (with atomic increments?). Maybe decrements would need
to
be exactly accounted for, or otherwise Assert() that the value is not
negative. I don't know how expensive it'd be to have conditionals
for
each decrement, but maybe the value would only be decremented at
strategic times, like at transaction commit or backend shutdown.
--
Reid Thompson
Senior Software Engineer
Crunchy Data, Inc.
reid.thompson@crunchydata.com
www.crunchydata.com
Attachments:
0002-Add-the-ability-to-limit-the-amount-of-memory-that-c.patch (text/x-patch, +195/-1)
0001-Add-tracking-of-backend-memory-allocated-to-pg_stat_.patch (text/x-patch, +269/-10)
On Thu, 2022-09-01 at 11:48 +0900, Kyotaro Horiguchi wrote:
The patch seems to limit both of memory-context allocations and DSM
allocations happen on a specific process by the same budget. In the
first place I don't think it's sensible to cap the amount of DSM
allocations by per-process budget.

DSM is used by pgstats subsystem. There can be cases where pgstat
complains for denial of DSM allocation after the budget has been
exhausted by memory-context allocations, or every command complains
for denial of memory-context allocation after once the per-process
budget is exhausted by DSM allocations. That doesn't seem
reasonable.
regards.
It's intended as a mechanism for administrators to limit total
postgresql memory consumption to avoid the OOM killer causing a crash
and restart, or to ensure that resources are available for other
processes on shared hosts, etc. It limits all types of allocations in
order to accomplish this. Our documentation will note this, so that
administrators that have the need to set it are aware that it can
affect all non-auxiliary processes and what the effect is.
On Fri, 2022-09-02 at 09:30 +0200, Drouvot, Bertrand wrote:
Hi,
I'm not sure we are choosing the right victims here (aka the ones
that are doing the request that will push the total over the limit).

Imagine an extreme case where a single backend consumes say 99% of
the limit, shouldn't it be the one to be "punished"? (and somehow forced
to give the memory back).

The problem that i see with the current approach is that a "bad"
backend could impact all the others and continue to do so.

what about punishing say the highest consumer , what do you think?
(just speaking about the general idea here, not about the implementation)
Initially, we believe that punishing the detector is reasonable if we
can help administrators avoid the OOM killer/resource starvation. But
we can and should expand on this idea.
Another thought is, rather than just failing the query/transaction, we
could have the affected backend do a clean exit, freeing all its resources.
--
Reid Thompson
Senior Software Engineer
Crunchy Data, Inc.
reid.thompson@crunchydata.com
www.crunchydata.com
Greetings,
* David Rowley (dgrowleyml@gmail.com) wrote:
On Thu, 1 Sept 2022 at 04:52, Reid Thompson
<reid.thompson@crunchydata.com> wrote:

Add the ability to limit the amount of memory that can be allocated to
backends.

Are you aware that relcache entries are stored in backend local memory
and that once we've added a relcache entry for a relation that we have
no current code which attempts to reduce the memory consumption used
by cache entries when there's memory pressure?
Short answer to this is yes, and that's an issue, but it isn't this
patch's problem to deal with; that's an issue that the relcache system
needs to be changed to address.
It seems to me that if we had this feature as you propose that a
backend could hit the limit and stay there just from the memory
requirements of the relation cache after some number of tables have
been accessed from the given backend. It's not hard to imagine a
situation where the palloc() would start to fail during parse, which
might make it quite infuriating for anyone trying to do something
like:
Agreed that this could happen, but I don't imagine it to be super likely,
and even if it does, this is probably a better position to be in as the
backend could then be disconnected from and would then go away and its
memory free'd, unlike the current OOM-killer situation where we crash
and go through recovery. We should note this in the documentation
though, sure, so that administrators understand how this can occur and
can take action to address it.
I think a better solution to this problem would be to have "memory
grants", where we configure some amount of "pool" memory that backends
are allowed to use for queries. The planner would have to add the
expected number of work_mem that the given query is expected to use
and before that query starts, the executor would have to "checkout"
that amount of memory from the pool and return it when finished. If
there is not enough memory in the pool then the query would have to
wait until enough memory is available. This creates a deadlocking
hazard that the deadlock detector would need to be made aware of.
Sure, that also sounds great and a query acceptance system would be
wonderful. If someone is working on that with an expectation of it
landing before v16, great. Otherwise, I don't see it as relevant to
the question about if we should include this feature or not, and I'm not
even sure that we'd refuse this feature even if we already had an
acceptance system as a stop-gap should we guess wrong and not realize it
until it's too late.
Thanks,
Stephen
On Sat, Sep 03, 2022 at 11:40:03PM -0400, Reid Thompson wrote:
+ 0, 0, INT_MAX,
+ NULL, NULL, NULL

I think this needs a maximum like INT_MAX/1024/1024
Is this noting that we'd set a ceiling of 2048MB?
The reason is that you're later multiplying it by 1024*1024, so you need
to limit it to avoid overflowing. Compare with
min_dynamic_shared_memory, Log_RotationSize, maintenance_work_mem,
autovacuum_work_mem.
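To make the overflow concrete, here is a small standalone C sketch (illustrative names) of why the MB-to-bytes conversion must widen before multiplying, or else the GUC would need a ceiling of INT_MAX/1024/1024:

```c
#include <stdint.h>
#include <limits.h>

/*
 * If the MB value were multiplied in (signed) int arithmetic, the GUC
 * maximum would have to be INT_MAX/1024/1024 (i.e. 2047 MB) to avoid
 * overflow. Widening to a 64-bit type *before* multiplying lifts that
 * ceiling, which is what the patch's
 * "(uint64) max_total_bkend_mem * 1024 * 1024" cast does.
 */
static uint64_t
mb_to_bytes(int mb)
{
    return (uint64_t) mb * 1024 * 1024;     /* widen first, then multiply */
}
```

With the widening cast, INT_MAX MB converts to the 2251799812636672-byte figure quoted later in the thread, without overflow.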
typo: Explicitely
+ errmsg("request will exceed postgresql.conf defined max_total_backend_memory limit (%lu > %lu)",
I wouldn't mention postgresql.conf - it could be in
postgresql.auto.conf, or an include file, or a -c parameter.
Suggest: allocation would exceed max_total_backend_memory limit...
+ ereport(LOG, errmsg("decrease reduces reported backend memory allocated below zero; setting reported to 0"));
Suggest: deallocation would decrease backend memory below zero;
+ {"max_total_backend_memory", PGC_SIGHUP, RESOURCES_MEM,
Should this be PGC_SU_BACKEND to allow a superuser to set a higher
limit (or no limit)?
There's compilation warning under mingw cross compile due to
sizeof(long). See d914eb347 and other recent commits which I guess is
the current way to handle this.
http://cfbot.cputube.org/reid-thompson.html
For performance test, you'd want to check what happens with a large
number of max_connections (and maybe a large number of clients). TPS
isn't the only thing that matters. For example, a utility command might
sometimes do a lot of allocations (or deallocations), or a
"parameterized nested loop" may loop over many outer tuples and
reset for each. There's also a lot of places that reset to a
"per-tuple" context. I started looking at its performance, but nothing
to show yet.
Would you keep people copied on your replies ("reply all") ? Otherwise
I (at least) may miss them. I think that's what's typical on these
lists (and the list tool is smart enough not to send duplicates to
people who are direct recipients).
--
Justin
On Fri, 2022-09-09 at 12:14 -0500, Justin Pryzby wrote:
On Sat, Sep 03, 2022 at 11:40:03PM -0400, Reid Thompson wrote:
+ 0, 0, INT_MAX,
+ NULL, NULL, NULL

I think this needs a maximum like INT_MAX/1024/1024
Is this noting that we'd set a ceiling of 2048MB?
The reason is that you're later multiplying it by 1024*1024, so you
need
to limit it to avoid overflowing. Compare with
min_dynamic_shared_memory, Log_RotationSize, maintenance_work_mem,
autovacuum_work_mem.
What I originally attempted to implement is:
GUC "max_total_backend_memory" max value as INT_MAX = 2147483647 MB
(2251799812636672 bytes). And the other variables and comparisons as
bytes represented as uint64 to avoid overflow.
Is this invalid?
typo: Explicitely
corrected
+ errmsg("request will exceed postgresql.conf defined max_total_backend_memory limit (%lu > %lu)",

I wouldn't mention postgresql.conf - it could be in
postgresql.auto.conf, or an include file, or a -c parameter.
Suggest: allocation would exceed max_total_backend_memory limit...
updated
+ ereport(LOG, errmsg("decrease reduces reported backend memory allocated below zero; setting reported to 0"));

Suggest: deallocation would decrease backend memory below zero;
updated
+ {"max_total_backend_memory", PGC_SIGHUP, RESOURCES_MEM,
Should this be PGC_SU_BACKEND to allow a superuser to set a higher
limit (or no limit)?
Sounds good to me. I'll update to that.
Would PGC_SUSET be too open?
There's compilation warning under mingw cross compile due to
sizeof(long). See d914eb347 and other recent commits which I guess
is
the current way to handle this.
http://cfbot.cputube.org/reid-thompson.html
updated %lu to %llu and changed cast from uint64 to
unsigned long long in the ereport call
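For reference, a standalone C sketch of the portability point: %lu is wrong wherever long is 32 bits (as on 64-bit Windows/mingw), so a 64-bit value should either be cast to unsigned long long and printed with %llu, as done here, or printed with PRIu64 from <inttypes.h>. The helper name is invented for the example:

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a 64-bit byte count portably. Casting to unsigned long long
 * and using %llu works on every platform where long long is 64 bits;
 * PRIu64 would be the <inttypes.h> alternative.
 */
static int
format_bytes(char *buf, size_t buflen, uint64_t v)
{
    return snprintf(buf, buflen, "%llu", (unsigned long long) v);
}
```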
For performance test, you'd want to check what happens with a large
number of max_connections (and maybe a large number of clients). TPS
isn't the only thing that matters. For example, a utility command
might
sometimes do a lot of allocations (or deallocations), or a
"parameterized nested loop" may loop over over many outer tuples and
reset for each. There's also a lot of places that reset to a
"per-tuple" context. I started looking at its performance, but
nothing
to show yet.
Thanks
Would you keep people copied on your replies ("reply all") ?
Otherwise
I (at least) may miss them. I think that's what's typical on these
lists (and the list tool is smart enough not to send duplicates to
people who are direct recipients).
Ok - will do, thanks.
--
Reid Thompson
Senior Software Engineer
Crunchy Data, Inc.
reid.thompson@crunchydata.com
www.crunchydata.com
Attachments:
0001-Add-tracking-of-backend-memory-allocated-to-pg_stat_.patch (text/x-patch, +269/-10)
0002-Add-the-ability-to-limit-the-amount-of-memory-that-c.patch (text/x-patch, +196/-2)
On Mon, Sep 12, 2022 at 8:30 PM Reid Thompson <reid.thompson@crunchydata.com>
wrote:
[...]

The patch does not apply; please rebase the patch.
patching file src/backend/utils/misc/guc.c
Hunk #1 FAILED at 3664.
1 out of 1 hunk FAILED -- saving rejects to file
src/backend/utils/misc/guc.c.rej
patching file src/backend/utils/misc/postgresql.conf.sample
--
Ibrar Ahmed
On Thu, 2022-09-15 at 12:07 +0400, Ibrar Ahmed wrote:
The patch does not apply; please rebase the patch.
patching file src/backend/utils/misc/guc.c
Hunk #1 FAILED at 3664.
1 out of 1 hunk FAILED -- saving rejects to file
src/backend/utils/misc/guc.c.rej
patching file src/backend/utils/misc/postgresql.conf.sample
rebased patches attached.
Thanks,
Reid
Attachments:
0001-Add-tracking-of-backend-memory-allocated-to-pg_stat_.patch (text/x-patch, +269/-10)
0002-Add-the-ability-to-limit-the-amount-of-memory-that-c.patch (text/x-patch, +199/-3)
Hello Reid,
could you rebase the patch again? It doesn't apply currently (http://cfbot.cputube.org/patch_40_3867.log). Thanks!
You mention that you want to prevent the compiler from getting cute.
I don't think these comments are exactly helpful in their current state; it's probably fine to just omit them.
I don't understand the purpose of the result variable in exceeds_max_total_bkend_mem. What purpose does it serve?
I really like the simplicity of the suggestion here to prevent oom.
I intend to play around with a lot of backends, once I get a rebased patch.
Regards
Arne
Hi Arne,
On Mon, 2022-10-24 at 15:27 +0000, Arne Roland wrote:
Hello Reid,
could you rebase the patch again? It doesn't apply currently
(http://cfbot.cputube.org/patch_40_3867.log). Thanks!
rebased patches attached.
You mention, that you want to prevent the compiler from getting cute.

I don't think this comments are exactly helpful in the current
state. I think probably fine to just omit them.
I attempted to follow previous convention when adding code and these
comments have been consistently applied throughout backend_status.c
where a volatile pointer is being used.
I don't understand the purpose of the result variable in
exceeds_max_total_bkend_mem. What purpose does it serve?

I really like the simplicity of the suggestion here to prevent oom.
If max_total_backend_memory is configured, exceeds_max_total_bkend_mem()
will return true if an allocation request will push total backend memory
allocated over the configured value.
exceeds_max_total_bkend_mem() is implemented in the various allocators
along the lines of
...snip...
/* Do not exceed maximum allowed memory allocation */
if (exceeds_max_total_bkend_mem('new request size'))
return NULL;
...snip...
Do not allocate the memory requested; return NULL instead. PG already
had code in place to handle NULL returns from allocation requests.
The allocation code in aset.c, slab.c, generation.c, dsm_impl.c utilizes
exceeds_max_total_bkend_mem()
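As a standalone sketch of that pattern (illustrative names and globals, not the patch's code): an allocation wrapper that checks the budget first and returns NULL, so the caller's existing NULL-handling paths report out of memory:

```c
#include <stdint.h>
#include <stdlib.h>

static uint64_t mem_limit_bytes = 0;    /* 0 = feature disabled */
static uint64_t mem_used_bytes = 0;     /* toy running total */

/*
 * Mirrors the patch's pattern: check the budget before allocating, and
 * hand back NULL when the request would exceed it, so the denial comes
 * from the configured limit rather than from the OS.
 */
static void *
limited_alloc(size_t size)
{
    if (mem_limit_bytes != 0 &&
        mem_used_bytes + size > mem_limit_bytes)
        return NULL;            /* denied by budget, not by malloc */

    mem_used_bytes += size;     /* toy accounting; never decremented here */
    return malloc(size);
}
```

The point of routing every allocator (aset.c, slab.c, generation.c, dsm_impl.c) through one predicate is that no new error-handling paths are needed: a budget denial looks to callers exactly like an ordinary allocation failure.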
max_total_backend_memory (integer)
Specifies a limit on the amount of memory (in MB) that may be allocated
to backends in total (i.e. this is not a per-user or per-backend limit).
If unset or set to 0, it is disabled. A backend request that would push
the total over the limit will be denied with an out-of-memory error,
causing that backend's current query/transaction to fail. Due to the
dynamic nature of memory allocations, this limit is not exact. If the
total is within 1.5MB of the limit and two backends request 1MB each at
the same time, both may be allocated, exceeding the limit. Further
requests will not be allocated until the total drops below the limit.
Keep this in mind when setting this value. This limit does not affect
auxiliary backend processes. Backend memory allocations
(backend_mem_allocated) are displayed in the pg_stat_activity view.
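For example, enabling a cap in postgresql.conf could look like this (the 2048 value is purely an illustration; per the description above the units are MB and 0 disables the limit):

```
# postgresql.conf
max_total_backend_memory = 2048   # cap total backend allocations at 2048MB
```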
I intend to play around with a lot of backends once I get a rebased patch.
Regards
Arne
Attachments:
0001-Add-tracking-of-backend-memory-allocated-to-pg_stat_.patch (+269/-10)
0002-Add-the-ability-to-limit-the-amount-of-memory-that-c.patch (+199/-3)
On Tue, 2022-10-25 at 11:49 -0400, Reid Thompson wrote:
Hi Arne,
On Mon, 2022-10-24 at 15:27 +0000, Arne Roland wrote:
Hello Reid,
could you rebase the patch again? It doesn't apply currently
(http://cfbot.cputube.org/patch_40_3867.log). Thanks!
rebased patches attached.
Rebased to current. Added a couple of changes per conversation with D
Christensen (include units in the field name; group the field with the
backend_xid and backend_xmin fields in the pg_stat_activity view, rather
than between query_id and query).
--
Reid Thompson
Senior Software Engineer
Crunchy Data, Inc.
reid.thompson@crunchydata.com
www.crunchydata.com
Attachments:
0002-Add-the-ability-to-limit-the-amount-of-memory-that-c.patch (+198/-2)
0001-Add-tracking-of-backend-memory-allocated-to-pg_stat_.patch (+270/-10)
On Thu, 2022-11-03 at 11:48 -0400, Reid Thompson wrote:
On Tue, 2022-10-25 at 11:49 -0400, Reid Thompson wrote:
Rebased to current. Add a couple changes per conversation with D
Christensen (include units in field name, group field with
backend_xid
and backend_xmin fields in pg_stat_activity view, rather than between
query_id and query)
rebased/patched to current master && current pg-stat-activity-backend-memory-allocated
--
Reid Thompson
Senior Software Engineer
Crunchy Data, Inc.
reid.thompson@crunchydata.com
www.crunchydata.com
Attachments:
0002-Add-the-ability-to-limit-the-amount-of-memory-that-c.patch (+197/-1)
0001-Add-tracking-of-backend-memory-allocated-to-pg_stat_.patch (+300/-10)
Hi,
On 2022-11-26 22:22:15 -0500, Reid Thompson wrote:
rebased/patched to current master && current pg-stat-activity-backend-memory-allocated
This version fails to build with msvc, and builds with warnings on other
platforms.
https://cirrus-ci.com/build/5410696721072128
msvc:
[20:26:51.286] c:\cirrus\src\include\utils/backend_status.h(40): error C2059: syntax error: 'constant'
mingw cross:
[20:26:26.358] from /usr/share/mingw-w64/include/winsock2.h:23,
[20:26:26.358] from ../../src/include/port/win32_port.h:60,
[20:26:26.358] from ../../src/include/port.h:24,
[20:26:26.358] from ../../src/include/c.h:1306,
[20:26:26.358] from ../../src/include/postgres.h:47,
[20:26:26.358] from controldata_utils.c:18:
[20:26:26.358] ../../src/include/utils/backend_status.h:40:2: error: expected identifier before numeric constant
[20:26:26.358] 40 | IGNORE,
[20:26:26.358] | ^~~~~~
[20:26:26.358] In file included from ../../src/include/postgres.h:48,
[20:26:26.358] from controldata_utils.c:18:
[20:26:26.358] ../../src/include/utils/backend_status.h: In function ‘pgstat_report_allocated_bytes’:
[20:26:26.358] ../../src/include/utils/backend_status.h:365:12: error: format ‘%ld’ expects argument of type ‘long int’, but argument 3 has type ‘uint64’ {aka ‘long long unsigned int’} [-Werror=format=]
[20:26:26.358] 365 | errmsg("Backend %d deallocated %ld bytes, exceeding the %ld bytes it is currently reporting allocated. Setting reported to 0.",
[20:26:26.358] | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[20:26:26.358] 366 | MyProcPid, allocated_bytes, *my_allocated_bytes));
[20:26:26.358] | ~~~~~~~~~~~~~~~
[20:26:26.358] | |
[20:26:26.358] | uint64 {aka long long unsigned int}
Due to Windows having a 32-bit long, you need to use %lld. Our custom
for dealing with that is to cast the argument to errmsg to long long
unsigned and use %llu.
Btw, given that the argument is uint64, it doesn't seem correct to use %ld,
that's signed. Not that it's going to matter, but ...
Greetings,
Andres Freund
On Tue, 2022-12-06 at 10:32 -0800, Andres Freund wrote:
Hi,
On 2022-11-26 22:22:15 -0500, Reid Thompson wrote:
rebased/patched to current master && current pg-stat-activity-backend-memory-allocated
This version fails to build with msvc, and builds with warnings on other platforms.
https://cirrus-ci.com/build/5410696721072128
msvc:
Andres Freund
updated patches
--
Reid Thompson
Senior Software Engineer
Crunchy Data, Inc.
reid.thompson@crunchydata.com
www.crunchydata.com
Attachments:
0002-Add-the-ability-to-limit-the-amount-of-memory-that-c.patch (+197/-1)
0001-Add-tracking-of-backend-memory-allocated-to-pg_stat_.patch (+300/-10)
On Fri, 9 Dec 2022 at 20:41, Reid Thompson
<reid.thompson@crunchydata.com> wrote:
On Tue, 2022-12-06 at 10:32 -0800, Andres Freund wrote:
Hi,
On 2022-11-26 22:22:15 -0500, Reid Thompson wrote:
rebased/patched to current master && current pg-stat-activity-backend-memory-allocated
This version fails to build with msvc, and builds with warnings on other platforms.
https://cirrus-ci.com/build/5410696721072128
msvc:
Andres Freund
updated patches
The patch does not apply on top of HEAD as in [1], please post a rebased patch:
=== Applying patches on top of PostgreSQL commit ID
92957ed98c5c565362ce665266132a7f08f6b0c0 ===
=== applying patch
./0001-Add-tracking-of-backend-memory-allocated-to-pg_stat_.patch
...
patching file src/backend/utils/mmgr/slab.c
Hunk #1 succeeded at 69 (offset 16 lines).
Hunk #2 succeeded at 414 (offset 175 lines).
Hunk #3 succeeded at 436 with fuzz 2 (offset 176 lines).
Hunk #4 FAILED at 286.
Hunk #5 succeeded at 488 (offset 186 lines).
Hunk #6 FAILED at 381.
Hunk #7 FAILED at 554.
3 out of 7 hunks FAILED -- saving rejects to file
src/backend/utils/mmgr/slab.c.rej
[1]: http://cfbot.cputube.org/patch_41_3867.log
Regards,
Vignesh