[DOC] Add detail regarding resource consumption wrt max_connections
The documentation for max_connections does not mention that merely setting
a higher value of max_connections causes PostgreSQL to use more resources.

While working with different customers, I noticed that several of them set
max_connections to very high numbers, even though they never expected to
actually have that many connections to their PostgreSQL instance.

In one extreme case, the user set max_connections to 200000 and was
befuddled that the instance was using more memory than another instance
serving the same number of actual connections.

This patch adds language to the documentation pointing out that a higher
value of max_connections leads to higher consumption of resources by
Postgres, adding one paragraph to doc/src/sgml/config.sgml:
<para>
PostgreSQL sizes certain resources based directly on the value of
<varname>max_connections</varname>. Increasing its value leads to
higher allocation of those resources, including shared memory.
</para>
Sincerely,
Roberto Mello
Attachments:
max-connections-guc-detail.patch
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index bd70ff2e4b..c3e40a548e 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -705,6 +705,12 @@ include_dir 'conf.d'
only be set at server start.
</para>
+ <para>
+ PostgreSQL sizes certain resources based directly on the value of
+ <varname>max_connections</varname>. Increasing its value leads to
+ higher allocation of those resources, including shared memory.
+ </para>
+
<para>
When running a standby server, you must set this parameter to the
same or higher value than on the primary server. Otherwise, queries
The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: tested, passed
Documentation: tested, passed
I think it is good to warn the user about the increased allocation of memory for certain parameters, so that they do not abuse them by setting them to huge numbers without knowing the consequences.

It is true that max_connections can increase the size of the proc array and other resources, which are allocated in the shared buffer, which also means less shared buffer to perform regular data operations. I am sure this is not the only parameter that affects memory allocation. "max_prepared_xacts" can also affect the shared memory allocation, so the same warning applies there as well. Maybe there are other parameters with similar effects.

Instead of stating that a higher max_connections results in higher allocation, it may be better to tell the user that if the value needs to be set much higher, they should consider increasing the "shared_buffers" setting as well.
thank you
-----------------------
Cary Huang
Highgo Software Canada
www.highgo.ca
On Fri, Jan 12, 2024 at 3:15 PM Cary Huang <cary.huang@highgo.ca> wrote:
Appreciate the review, Cary.
My goal was to inform the reader that there are implications to setting
max_connections higher. I've personally seen a user mindlessly set this to
50k connections, unaware it would cause unintended consequences.
I can add a suggestion for the user to consider increasing shared_buffers
in accordance with higher max_connections, but it would be better if there
was a "rule of thumb" guideline to go along. I'm open to suggestions.
I can revise with a similar warning in max_prepared_xacts as well.
Sincerely,
Roberto
On Sat, 2024-01-13 at 10:31 -0700, Roberto Mello wrote:
I can add a suggestion for the user to consider increasing
shared_buffers in accordance with higher max_connections, but it
would be better if there was a "rule of thumb" guideline to go along.
I'm open to suggestions.

I can revise with a similar warning in max_prepared_xacts as well.
Can a "close enough" rule of thumb be calculated from:

postgresql.conf -> log_min_messages = debug3

then starting postgresql with varying max_connections to get the
CreateSharedMemoryAndSemaphores() sizes and generate a rough equation?

From postgresql-12-main.log:

max_connections=100
2024-01-19 17:04:56.544 EST [2762535] DEBUG: invoking IpcMemoryCreate(size=149110784)
= 0.149110784 GB

max_connections=10000
2024-01-19 17:06:13.502 EST [2764895] DEBUG: invoking IpcMemoryCreate(size=644997120)
= 0.644997120 GB

max_connections=20000
2024-01-19 17:24:27.956 EST [2954550] DEBUG: invoking IpcMemoryCreate(size=1145774080)
= 1.145774080 GB

max_connections=50000
2024-01-19 17:07:27.716 EST [2767079] DEBUG: invoking IpcMemoryCreate(size=2591490048)
= 2.591490048 GB
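For what it's worth, a rough least-squares fit of the four measurements above (a sketch only; the numbers are specific to this PostgreSQL 12 build and its other settings) suggests on the order of 50 kB of shared memory per connection slot:

```python
# Least-squares fit of shared memory size vs. max_connections, using
# the four IpcMemoryCreate() sizes measured above (PostgreSQL 12).
conns = [100, 10_000, 20_000, 50_000]
sizes = [149_110_784, 644_997_120, 1_145_774_080, 2_591_490_048]

n = len(conns)
mean_x = sum(conns) / n
mean_y = sum(sizes) / n
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(conns, sizes)) \
      / sum((x - mean_x) ** 2 for x in conns)
intercept = mean_y - slope * mean_x

print(f"~{slope / 1024:.0f} kB of shared memory per connection slot")
print(f"~{intercept / 1024**2:.0f} MB base allocation")
```

This is only a rule of thumb: the actual per-connection cost depends on other settings, for example max_locks_per_transaction, which sizes the lock table together with max_connections.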
from lines 184-186
$ rg -B28 -A35 'invoking IpcMemoryCreate'
backend/storage/ipc/ipci.c
158-/*
159- * CreateSharedMemoryAndSemaphores
160- *		Creates and initializes shared memory and semaphores.
161- *
162- * This is called by the postmaster or by a standalone backend.
163- * It is also called by a backend forked from the postmaster in the
164- * EXEC_BACKEND case. In the latter case, the shared memory segment
165- * already exists and has been physically attached to, but we have to
166- * initialize pointers in local memory that reference the shared structures,
167- * because we didn't inherit the correct pointer values from the postmaster
168- * as we do in the fork() scenario. The easiest way to do that is to run
169- * through the same code as before. (Note that the called routines mostly
170- * check IsUnderPostmaster, rather than EXEC_BACKEND, to detect this case.
171- * This is a bit code-wasteful and could be cleaned up.)
172- */
173-void
174-CreateSharedMemoryAndSemaphores(void)
175-{
176-	PGShmemHeader *shim = NULL;
177-
178-	if (!IsUnderPostmaster)
179-	{
180-		PGShmemHeader *seghdr;
181-		Size		size;
182-		int			numSemas;
183-
184-		/* Compute the size of the shared-memory block */
185-		size = CalculateShmemSize(&numSemas);
186:		elog(DEBUG3, "invoking IpcMemoryCreate(size=%zu)", size);
187-
188-		/*
189-		 * Create the shmem segment
190-		 */
191-		seghdr = PGSharedMemoryCreate(size, &shim);
192-
193-		InitShmemAccess(seghdr);
194-
195-		/*
196-		 * Create semaphores
197-		 */
198-		PGReserveSemaphores(numSemas);
199-
200-		/*
201-		 * If spinlocks are disabled, initialize emulation layer (which
202-		 * depends on semaphores, so the order is important here).
203-		 */
204-#ifndef HAVE_SPINLOCKS
205-		SpinlockSemaInit();
206-#endif
207-	}
208-	else
209-	{
210-		/*
211-		 * We are reattaching to an existing shared memory segment. This
212-		 * should only be reached in the EXEC_BACKEND case.
213-		 */
214-#ifndef EXEC_BACKEND
215-		elog(PANIC, "should be attached to shared memory already");
216-#endif
217-	}
218-
219-	/*
220-	 * Set up shared memory allocation mechanism
221-	 */
On Fri, 2024-01-19 at 17:37 -0500, reid.thompson@crunchydata.com wrote:
Can a "close enough" rule of thumb be calculated from:
postgresql.conf -> log_min_messages = debug3
start postgresql with varying max_connections to get
CreateSharedMemoryAndSemaphores() sizes to generate a rough equation
Or maybe it would be sufficient to advise users to set log_min_messages =
debug3 on a test DB, start/stop it with varying values of
max_connections, and look at the differing values in the
DEBUG: invoking IpcMemoryCreate(size=...) log messages for themselves.
On Mon, Jan 22, 2024 at 8:58 AM <reid.thompson@crunchydata.com> wrote:
or maybe it would be sufficient to advise to set log_min_messages =
debug3 on a test DB and start/stop it with varying values of
max_connections and look at the differing values in
DEBUG: invoking IpcMemoryCreate(size=...) log messages for themselves.
I'm of the opinion that advice suggesting DBAs set things to DEBUG3
is unfriendly at best. If you really want to add more, there is an
existing unfriendly section of the docs at
https://www.postgresql.org/docs/devel/kernel-resources.html#LINUX-MEMORY-OVERCOMMIT
that mentions this problem, specifically:
"If PostgreSQL itself is the cause of the system running out of
memory, you can avoid the problem by changing your configuration. In
some cases, it may help to lower memory-related configuration
parameters, particularly shared_buffers, work_mem, and
hash_mem_multiplier. In other cases, the problem may be caused by
allowing too many connections to the database server itself. In many
cases, it may be better to reduce max_connections and instead make use
of external connection-pooling software."
I couldn't really find a spot to add in your additional info, but
maybe you can find a spot that fits? Or maybe a well written
walk-through of this would make for a good wiki page in case people
really want to dig in.
In any case, I think Roberto's original language is an improvement
over what we have now, so I'd probably recommend just going with that,
along with a similar note to max_prepared_xacts, and optionally a
pointer to the shared mem section of the docs.
Robert Treat
https://xzilla.net
Hi,
On Fri, Jan 12, 2024 at 10:14:38PM +0000, Cary Huang wrote:
I think it is good to warn the user about the increased allocation of
memory for certain parameters so that they do not abuse it by setting
it to a huge number without knowing the consequences.
Right, and I think it might be useful to log (i.e. at LOG, not DEBUG3
level, with a nicer message) the amount of memory we allocate on
startup. That is just one additional line per instance lifetime, but
might be quite useful to admins. Or maybe two lines, if we also log
whether we could allocate it as huge pages or not:
|2024-03-08 16:46:13.117 CET [237899] DEBUG: invoking IpcMemoryCreate(size=145145856)
|2024-03-08 16:46:13.117 CET [237899] DEBUG: mmap(146800640) with MAP_HUGETLB failed, huge pages disabled: Cannot allocate memory
It is true that max_connections can increase the size of proc array
and other resources, which are allocated in the shared buffer, which
also means less shared buffer to perform regular data operations.
AFAICT, those resources are allocated on top of shared_buffers, i.e. the
total allocated memory is shared_buffers + (some resources) *
max_connections + (other resources) * other_factors.
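That relationship can be sketched as a back-of-the-envelope model; the constants below are illustrative assumptions, roughly consistent with the sizes measured earlier in the thread, not values taken from the PostgreSQL source:

```python
# Rough model: total shared memory = shared_buffers + fixed overhead
# + a per-connection cost. The constants are illustrative assumptions,
# not authoritative PostgreSQL figures.
PER_CONNECTION_BYTES = 50 * 1024         # assumed ~50 kB per connection slot
FIXED_OVERHEAD_BYTES = 16 * 1024 * 1024  # assumed other fixed resources

def estimated_shared_memory(shared_buffers_bytes: int, max_connections: int) -> int:
    """Estimate total shared memory allocated at startup (a sketch)."""
    return (shared_buffers_bytes
            + FIXED_OVERHEAD_BYTES
            + PER_CONNECTION_BYTES * max_connections)

# 128 MB shared_buffers with max_connections=100 vs. 10000: the
# difference comes entirely from the per-connection term.
small = estimated_shared_memory(128 * 1024**2, 100)
large = estimated_shared_memory(128 * 1024**2, 10_000)
print(f"{small / 1024**2:.0f} MB vs {large / 1024**2:.0f} MB")
```

The point of the sketch is that the per-connection term is added on top of shared_buffers, not carved out of it.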
Instead of stating that higher max_connections results in higher
allocation, It may be better to tell the user that if the value needs
to be set much higher, consider increasing the "shared_buffers"
setting as well.
Only if what you say above is true and my understanding is at fault.
Michael
On Fri, Mar 8, 2024 at 10:47 AM Michael Banck <mbanck@gmx.net> wrote:
Right, and I think it might be useful to log (i.e. at LOG not DEBUG3
level, with a nicer message) the amount of memory we allocate on
startup, that is just one additional line per instance lifetime but
might be quite useful to admins.
If we were going to add these details (and I very much like the idea),
I would advocate that we put it somewhere more permanent than a single
log entry at start-up. Given that database up-times easily run months
and sometimes years, it is hard to imagine we'd always have access to
the log files to figure this out on any actively running systems.
Robert Treat
https://xzilla.net
Hi,
On Sun, Mar 10, 2024 at 09:58:25AM -0400, Robert Treat wrote:
If we were going to add these details (and I very much like the idea),
I would advocate that we put it somewhere more permanent than a single
log entry at start-up. Given that database up-times easily run months
and sometimes years, it is hard to imagine we'd always have access to
the log files to figure this out on any actively running systems.
Well actually, those two numbers are already available at runtime, via
the shared_memory_size and (from 17 on) huge_pages_status GUCs.
So this would be geared toward admins who keep logs in long-term
storage and want to know what the numbers were a while ago. Maybe it is
not that interesting, but I think one or two lines at startup would not hurt.
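As an aside, SHOW shared_memory_size reports a human-readable string such as "143MB"; a small helper to turn that into bytes might look like this (a sketch; the unit list assumes PostgreSQL's standard memory units):

```python
import re

# Convert a PostgreSQL memory-size string such as "143MB" (as shown by
# SHOW shared_memory_size) into bytes. The accepted units are
# PostgreSQL's standard memory units.
UNITS = {"B": 1, "kB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}

def pg_size_to_bytes(value: str) -> int:
    m = re.fullmatch(r"\s*(\d+)\s*(B|kB|MB|GB|TB)\s*", value)
    if not m:
        raise ValueError(f"unrecognized size: {value!r}")
    return int(m.group(1)) * UNITS[m.group(2)]

print(pg_size_to_bytes("143MB"))  # size in bytes
```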
Michael
On Fri, Mar 8, 2024 at 9:52 AM Robert Treat <rob@xzilla.net> wrote:
In any case, I think Roberto's original language is an improvement
over what we have now, so I'd probably recommend just going with that,
along with a similar note to max_prepared_xacts, and optionally a
pointer to the shared mem section of the docs.
I agree with this.
I don't agree with Cary's statement that if you increase
max_connections you should increase shared_buffers as well. That seems
situation-dependent to me, and it's also missing Roberto's point,
which is that JUST increasing max_connections without doing anything
else uses more shared memory.
Similarly, I don't think we need to document a detailed testing
procedure, as proposed by Reid. If users want to know exactly how many
additional resources are used, they can test; either using the DEBUG3
approach, or perhaps more simply via the pg_shmem_allocations view.
But I think it's overkill for us to recommend any specific testing
procedure here.
Rather, I think that it's entirely appropriate to do what Roberto
suggested, which is to say, let users know that they're going to use
some extra resources if they increase the setting, and then let them
figure out what if anything they want to do about that.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Fri, Mar 22, 2024 at 1:57 PM Robert Haas <robertmhaas@gmail.com> wrote:
Rather, I think that it's entirely appropriate to do what Roberto
suggested, which is to say, let users know that they're going to use
some extra resources if they increase the setting, and then let them
figure out what if anything they want to do about that.
Considering that, and the lack of further comment, I propose to commit
the original patch.
Objections?
--
Robert Haas
EDB: http://www.enterprisedb.com
On Wed, May 15, 2024 at 11:14 AM Robert Haas <robertmhaas@gmail.com> wrote:
Considering that, and the lack of further comment, I propose to commit
the original patch.

Objections?
I think the only unresolved question in my mind was if we should add a
similar note to the original patch to max_prepared_xacts as well; do
you intend to do that?
Robert Treat
https://xzilla.net
On Wed, May 15, 2024 at 4:00 PM Robert Treat <rob@xzilla.net> wrote:
I think the only unresolved question in my mind was if we should add a
similar note to the original patch to max_prepared_xacts as well; do
you intend to do that?
I didn't intend to do that. I don't think it would be incorrect to do
so, but then we're kind of getting into a slippery slope of trying to
label every parameter that increases shared memory usage or any
other kind of resource consumption, and there are probably (pulls
number out of the air) twenty of those. It seems more worthwhile to
mention it for max_connections than the other (deducts one from
previous random guess) nineteen because it affects a whole lot more
things, like the size of the fsync queue and the size of the lock
table, and also because it tends to get set to relatively large
values, unlike, for example, autovacuum_max_workers. If you think we
should go further than just doing max_connections, then I think we
either need to (a) add a note to every single bloody parameter that
affects the size of shared memory or (b) prove that the subset where
we add such a note have a significantly larger impact than the others
where we don't. Do you think we should get into all that?
--
Robert Haas
EDB: http://www.enterprisedb.com
On Wed, May 15, 2024 at 4:05 PM Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, May 15, 2024 at 4:00 PM Robert Treat <rob@xzilla.net> wrote:
I think the only unresolved question in my mind was if we should add a
similar note to the original patch to max_prepared_xacts as well; do
you intend to do that?I didn't intend to do that. I don't think it would be incorrect to do
so, but then we're kind of getting into a slippery slope of trying to
label every parameter that has increases shared memory usage or any
other kind of research consumption, and there are probably (pulls
number out of the air) twenty of those. It seems more worthwhile to
mention it for max_connections than the other (deducts one from
previous random guess) nineteen because it affects a whole lot more
things, like the size of the fsync queue and the size of the lock
table, and also because it tends to get set to relatively large
values, unlike, for example, autovacuum_max_workers. If you think we
should go further than just doing max_connections, then I think we
either need to (a) add a note to every single bloody parameter that
affects the size of shared memory or (b) prove that the subset where
we add such a note have a significantly larger impact than the others
where we don't. Do you think we should get into all that?
Nope. Let's do the best bang for the buck improvement and we can see
if we get any feedback that indicates more needs to be done.
Robert Treat
https://xzilla.net
On Wed, May 15, 2024 at 4:22 PM Robert Treat <rob@xzilla.net> wrote:
Nope. Let's do the best bang for the buck improvement and we can see
if we get any feedback that indicates more needs to be done.
Done.
--
Robert Haas
EDB: http://www.enterprisedb.com