make MaxBackends available in _PG_init
Hi Hackers:
I find the value in function _PG_init, the value of MaxBackends is 0.
In source, I find that the postmaster will first load library, and then calculate the value of MaxBackends.
In the old version, the MaxBackends was calculated by:
MaxBackends = MaxConnections + autovacuum_max_workers + 1 +
GetNumShmemAttachedBgworkers();
Because any extension can register workers which will affect the return value of GetNumShmemAttachedBgworkers.
InitializeMaxBackends must be called after shared_preload_libraries. This is also mentioned in comments.
Now, function GetNumShmemAttachedBgworkers was deleted and replaced by guc max_worker_processes,
so if we changed the calling order like:
Step1: calling InitializeMaxBackends.
Step2: calling process_shared_preload_libraries
In this order extension can get the correct value of MaxBackends in _PG_init.
Any thoughts?
Regards
Attachments:
0001-Make-MaxBackends-available-in-_PG_init.patchapplication/octet-stream; name=0001-Make-MaxBackends-available-in-_PG_init.patchDownload+8-14
On Mon, Sep 21, 2020 at 9:14 AM Wang, Shenhao
<wangsh.fnst@cn.fujitsu.com> wrote:
In source, I find that the postmaster will first load library, and then calculate the value of MaxBackends.
In the old version, the MaxBackends was calculated by:
MaxBackends = MaxConnections + autovacuum_max_workers + 1 +
GetNumShmemAttachedBgworkers();
Because any extension can register workers which will affect the return value of GetNumShmemAttachedBgworkers.
InitializeMaxBackends must be called after shared_preload_libraries. This is also mentioned in comments.Now, function GetNumShmemAttachedBgworkers was deleted and replaced by guc max_worker_processes,
so if we changed the calling order like:
Step1: calling InitializeMaxBackends.
Step2: calling process_shared_preload_libraries
Yes, the GetNumShmemAttachedBgworkers() was removed by commit #
dfbba2c86cc8f09cf3ffca3d305b4ce54a7fb49a. ASAICS, changing the order
of InitializeMaxBackends() and process_shared_preload_libraries() has
no problem, as InitializeMaxBackends() doesn't calculate the
MaxBackends based on bgworker infra code, it does calculate based on
GUCs.
Having said that, I'm not quite sure whether any of the bgworker
registration code, for that matter process_shared_preload_libraries()
code path will somewhere use MaxBackends?
In this order extension can get the correct value of MaxBackends in _PG_init.
Is there any specific use case that any of the _PG_init will use MaxBackends?
I think the InitializeMaxBackends() function comments still say as shown below.
* This must be called after modules have had the chance to register background
* workers in shared_preload_libraries, and before shared memory size is
* determined.
What happens with your patch in EXEC_BACKEND cases? (On linux
EXEC_BACKEND (include/pg_config_manual.h) can be enabled for testing
purposes).
With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
On 9/21/20, 4:52 AM, "Bharath Rupireddy" <bharath.rupireddyforpostgres@gmail.com> wrote:
On Mon, Sep 21, 2020 at 9:14 AM Wang, Shenhao
<wangsh.fnst@cn.fujitsu.com> wrote:In source, I find that the postmaster will first load library, and then calculate the value of MaxBackends.
In the old version, the MaxBackends was calculated by:
MaxBackends = MaxConnections + autovacuum_max_workers + 1 +
GetNumShmemAttachedBgworkers();
Because any extension can register workers which will affect the return value of GetNumShmemAttachedBgworkers.
InitializeMaxBackends must be called after shared_preload_libraries. This is also mentioned in comments.Now, function GetNumShmemAttachedBgworkers was deleted and replaced by guc max_worker_processes,
so if we changed the calling order like:
Step1: calling InitializeMaxBackends.
Step2: calling process_shared_preload_librariesYes, the GetNumShmemAttachedBgworkers() was removed by commit #
dfbba2c86cc8f09cf3ffca3d305b4ce54a7fb49a. ASAICS, changing the order
of InitializeMaxBackends() and process_shared_preload_libraries() has
no problem, as InitializeMaxBackends() doesn't calculate the
MaxBackends based on bgworker infra code, it does calculate based on
GUCs.Having said that, I'm not quite sure whether any of the bgworker
registration code, for that matter process_shared_preload_libraries()
code path will somewhere use MaxBackends?In this order extension can get the correct value of MaxBackends in _PG_init.
Is there any specific use case that any of the _PG_init will use MaxBackends?
I just encountered the same thing, so I am bumping this thread. I was
trying to use MaxBackends in a call to RequestAddinShmemSpace() in a
_PG_init() function for a module, but since MaxBackends is not yet
initialized, you essentially need to open-code InitializeMaxBackends()
instead.
I think the comments about needing to register background workers
before initializing MaxBackends have been incorrect since the addition
of max_worker_processes in v9.4 (6bc8ef0b). Furthermore, I think the
suggested reordering is a good idea because it is not obvious that
MaxBackends will be uninitialized in _PG_init(), and use-cases like
the RequestAddinShmemSpace() one are not guaranteed to fail when
MaxBackends is used incorrectly (presumably due to the 100 KB buffer
added in CreateSharedMemoryAndSemaphores()).
I've attached a new version of the proposed patch with some slight
adjustments and an attempt at a commit message.
Nathan
Attachments:
v1-0001-Calculate-MaxBackends-earlier-in-PostmasterMain.patchapplication/octet-stream; name=v1-0001-Calculate-MaxBackends-earlier-in-PostmasterMain.patchDownload+10-14
On Mon, Aug 2, 2021 at 2:18 PM Bossart, Nathan <bossartn@amazon.com> wrote:
I just encountered the same thing, so I am bumping this thread. I was
trying to use MaxBackends in a call to RequestAddinShmemSpace() in a
_PG_init() function for a module, but since MaxBackends is not yet
initialized, you essentially need to open-code InitializeMaxBackends()
instead.I think the comments about needing to register background workers
before initializing MaxBackends have been incorrect since the addition
of max_worker_processes in v9.4 (6bc8ef0b). Furthermore, I think the
suggested reordering is a good idea because it is not obvious that
MaxBackends will be uninitialized in _PG_init(), and use-cases like
the RequestAddinShmemSpace() one are not guaranteed to fail when
MaxBackends is used incorrectly (presumably due to the 100 KB buffer
added in CreateSharedMemoryAndSemaphores()).I've attached a new version of the proposed patch with some slight
adjustments and an attempt at a commit message.
I think this is a good idea. Anyone object?
--
Robert Haas
EDB: http://www.enterprisedb.com
On 2021-Aug-02, Robert Haas wrote:
On Mon, Aug 2, 2021 at 2:18 PM Bossart, Nathan <bossartn@amazon.com> wrote:
I think the comments about needing to register background workers
before initializing MaxBackends have been incorrect since the addition
of max_worker_processes in v9.4 (6bc8ef0b).
I think this is a good idea. Anyone object?
No objection here. AFAICS Nathan is correct.
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"Los dioses no protegen a los insensatos. Éstos reciben protección de
otros insensatos mejor dotados" (Luis Wu, Mundo Anillo)
Hi,
On 2021-08-02 16:06:15 -0400, Robert Haas wrote:
On Mon, Aug 2, 2021 at 2:18 PM Bossart, Nathan <bossartn@amazon.com> wrote:
I just encountered the same thing, so I am bumping this thread. I was
trying to use MaxBackends in a call to RequestAddinShmemSpace() in a
_PG_init() function for a module, but since MaxBackends is not yet
initialized, you essentially need to open-code InitializeMaxBackends()
instead.I think the comments about needing to register background workers
before initializing MaxBackends have been incorrect since the addition
of max_worker_processes in v9.4 (6bc8ef0b). Furthermore, I think the
suggested reordering is a good idea because it is not obvious that
MaxBackends will be uninitialized in _PG_init(), and use-cases like
the RequestAddinShmemSpace() one are not guaranteed to fail when
MaxBackends is used incorrectly (presumably due to the 100 KB buffer
added in CreateSharedMemoryAndSemaphores()).I've attached a new version of the proposed patch with some slight
adjustments and an attempt at a commit message.I think this is a good idea. Anyone object?
I'm not so sure it's a good idea. I've seen several shared_preload_library
using extensions that adjust some GUCs (e.g. max_prepared_transactions)
because they need some more resources internally - that's perhaps not a great
idea, but there's also not an obviously better way.
ISTM this would better be solved with making the hook or config logic for
libraries a bit more elaborate (e.g. having a hook you can register for that's
called once all libraries are loaded).
Greetings,
Andres Freund
On 8/2/21, 1:37 PM, "Andres Freund" <andres@anarazel.de> wrote:
On 2021-08-02 16:06:15 -0400, Robert Haas wrote:
I think this is a good idea. Anyone object?
I'm not so sure it's a good idea. I've seen several shared_preload_library
using extensions that adjust some GUCs (e.g. max_prepared_transactions)
because they need some more resources internally - that's perhaps not a great
idea, but there's also not an obviously better way.
Interesting. I hadn't heard of extensions adjusting GUCs in
_PG_init() before. I think the other way to handle that scenario is
to check the GUCs in _PG_init() and fail startup if necessary.
However, while it's probably good to avoid changing GUCs from what
users specified, failing startup isn't exactly user-friendly, either.
In any case, changing a GUC in _PG_init() seems quite risky. You're
effectively bypassing all of the usual checks.
ISTM this would better be solved with making the hook or config logic for
libraries a bit more elaborate (e.g. having a hook you can register for that's
called once all libraries are loaded).
My worry is that this requires even more specialized knowledge to get
things right. If I need to do anything based on the value of a GUC,
I have to register another hook. In this new hook, we'd likely want
to somehow prevent modules from further adjusting GUCs. We'd also
need modules to check the GUCs they care about again in case another
module's _PG_init() adjusted it in an incompatible way. If you detect
a problem in this new hook, you probably need to fail startup.
Perhaps I am making a mountain out of a molehill and the modules that
are adjusting GUCs in _PG_init() are doing so in a generally safe way,
but IMO it ought to ordinarily be discouraged.
Nathan
Hi,
On 2021-08-02 21:57:18 +0000, Bossart, Nathan wrote:
On 8/2/21, 1:37 PM, "Andres Freund" <andres@anarazel.de> wrote:
On 2021-08-02 16:06:15 -0400, Robert Haas wrote:
I think this is a good idea. Anyone object?
I'm not so sure it's a good idea. I've seen several shared_preload_library
using extensions that adjust some GUCs (e.g. max_prepared_transactions)
because they need some more resources internally - that's perhaps not a great
idea, but there's also not an obviously better way.Interesting. I hadn't heard of extensions adjusting GUCs in
_PG_init() before. I think the other way to handle that scenario is
to check the GUCs in _PG_init() and fail startup if necessary.
The problem is that that makes it hard to configure things for the users own
needs. If e.g. the user's workload needs a certain number of prepared
transactions and the extension internally as well (as e.g. citus does), the
extension can't know if the configured number is "aimed" to be for the
extension, or for the user's own needs. If you instead just increase
max_prepared_transactions in _PG_init(), the story is different.
However, while it's probably good to avoid changing GUCs from what
users specified, failing startup isn't exactly user-friendly, either.
In any case, changing a GUC in _PG_init() seems quite risky. You're
effectively bypassing all of the usual checks.
It doesn't need to bypass all - you can override them using GUC mechanisms at
that point.
ISTM this would better be solved with making the hook or config logic for
libraries a bit more elaborate (e.g. having a hook you can register for that's
called once all libraries are loaded).My worry is that this requires even more specialized knowledge to get
things right. If I need to do anything based on the value of a GUC,
I have to register another hook. In this new hook, we'd likely want
to somehow prevent modules from further adjusting GUCs. We'd also
need modules to check the GUCs they care about again in case another
module's _PG_init() adjusted it in an incompatible way.
I think this is overblown. We already size resources *after*
shared_preload_libraries' _PG_init() runs, because that's the whole point of
shared_preload_libraries. What's proposed in this thread is to *disallow*
increasing resource usage in s_p_l _PG_init(), to make one specific case
simpler - but it'll actually also make things more complicated, because other
resources will still only be sized after all of s_p_l has been processed.
If you detect a problem in this new hook, you probably need to fail startup.
Same is true for _PG_init().
Perhaps I am making a mountain out of a molehill and the modules that
are adjusting GUCs in _PG_init() are doing so in a generally safe way,
but IMO it ought to ordinarily be discouraged.
It's not something I would recommend doing blithely - but I also don't think
we have a better answer for some cases.
Greetings,
Andres Freund
On 8/2/21, 3:12 PM, "Andres Freund" <andres@anarazel.de> wrote:
I think this is overblown. We already size resources *after*
shared_preload_libraries' _PG_init() runs, because that's the whole point of
shared_preload_libraries. What's proposed in this thread is to *disallow*
increasing resource usage in s_p_l _PG_init(), to make one specific case
simpler - but it'll actually also make things more complicated, because other
resources will still only be sized after all of s_p_l has been processed.
True. Perhaps the comments should reference the possibility that a
library will adjust resource usage to explain why
InitializeMaxBackends() is where it is.
Nathan
Hi,
On 2021-08-02 22:35:13 +0000, Bossart, Nathan wrote:
On 8/2/21, 3:12 PM, "Andres Freund" <andres@anarazel.de> wrote:
I think this is overblown. We already size resources *after*
shared_preload_libraries' _PG_init() runs, because that's the whole point of
shared_preload_libraries. What's proposed in this thread is to *disallow*
increasing resource usage in s_p_l _PG_init(), to make one specific case
simpler - but it'll actually also make things more complicated, because other
resources will still only be sized after all of s_p_l has been processed.True. Perhaps the comments should reference the possibility that a
library will adjust resource usage to explain why
InitializeMaxBackends() is where it is.
I've wondered, independent of this thread, about not making MaxBackends
externally visible, and requiring a function call to access it. It should be
easier to find cases of misuse if we errored out when accessed at the wrong
time. And we could use that opportunity to add flags that determine which
types of backends are included (e.g. not including autovac, or additionally
including aux workers or prepared xacts).
Greetings,
Andres Freund
On 8/2/21, 3:42 PM, "Andres Freund" <andres@anarazel.de> wrote:
I've wondered, independent of this thread, about not making MaxBackends
externally visible, and requiring a function call to access it. It should be
easier to find cases of misuse if we errored out when accessed at the wrong
time. And we could use that opportunity to add flags that determine which
types of backends are included (e.g. not including autovac, or additionally
including aux workers or prepared xacts).
I'm not opposed to this. I can work on putting a patch together if no
opposition materializes.
Nathan
Hi,
Bossart, Nathan <bossartn@amazon.com> wrote:
I just encountered the same thing, so I am bumping this thread.
Thank you for bumping this thread.
I've wondered, independent of this thread, about not making MaxBackends
externally visible, and requiring a function call to access it. It should be
easier to find cases of misuse if we errored out when accessed at the wrong
time.
I'm not opposed to this. I can work on putting a patch together if no
opposition materializes.
I think this is a good idea.
Shenhao Wang
Best regards
On Mon, Aug 2, 2021 at 4:37 PM Andres Freund <andres@anarazel.de> wrote:
I'm not so sure it's a good idea. I've seen several shared_preload_library
using extensions that adjust some GUCs (e.g. max_prepared_transactions)
because they need some more resources internally - that's perhaps not a great
idea, but there's also not an obviously better way.
Blech.
I'm fine with solving the problem some other way, but I say again, blech.
--
Robert Haas
EDB: http://www.enterprisedb.com
On 8/2/21, 4:02 PM, "Bossart, Nathan" <bossartn@amazon.com> wrote:
On 8/2/21, 3:42 PM, "Andres Freund" <andres@anarazel.de> wrote:
I've wondered, independent of this thread, about not making MaxBackends
externally visible, and requiring a function call to access it. It should be
easier to find cases of misuse if we errored out when accessed at the wrong
time. And we could use that opportunity to add flags that determine which
types of backends are included (e.g. not including autovac, or additionally
including aux workers or prepared xacts).I'm not opposed to this. I can work on putting a patch together if no
opposition materializes.
Here is a first attempt.
Nathan
Attachments:
v1-0001-Disallow-external-access-to-MaxBackends.patchapplication/octet-stream; name=v1-0001-Disallow-external-access-to-MaxBackends.patchDownload+118-82
On 8/3/21, 4:14 PM, "Bossart, Nathan" <bossartn@amazon.com> wrote:
On 8/2/21, 4:02 PM, "Bossart, Nathan" <bossartn@amazon.com> wrote:
On 8/2/21, 3:42 PM, "Andres Freund" <andres@anarazel.de> wrote:
I've wondered, independent of this thread, about not making MaxBackends
externally visible, and requiring a function call to access it. It should be
easier to find cases of misuse if we errored out when accessed at the wrong
time. And we could use that opportunity to add flags that determine which
types of backends are included (e.g. not including autovac, or additionally
including aux workers or prepared xacts).I'm not opposed to this. I can work on putting a patch together if no
opposition materializes.Here is a first attempt.
Here is a rebased version of the patch.
Nathan
Attachments:
v2-0001-Disallow-external-access-to-MaxBackends.patchapplication/octet-stream; name=v2-0001-Disallow-external-access-to-MaxBackends.patchDownload+118-82
On Sat, Aug 7, 2021 at 2:01 PM Bossart, Nathan <bossartn@amazon.com> wrote:
Here is a rebased version of the patch.
Giving this a review.
Patch applies cleanly and `make check` works as of
e12694523e7e4482a052236f12d3d8b58be9a22c
Overall looks very nice and tucks MaxBackends safely away.
I have a few suggestions:
+ size = add_size(size, mul_size(GetMaxBackends(0),
sizeof(BTOneVacInfo)));
The use of GetMaxBackends(0) looks weird - can we add another constant in
there
for the "default" case? Or just have GetMaxBackends() work?
+ GMB_MAX_PREPARED_XACTS = 1 << 0, /* include max_prepared_xacts */
s/include/add in/
+typedef enum GMBOption +{ + GMB_MAX_PREPARED_XACTS = 1 << 0, /* include max_prepared_xacts */ + GMB_NUM_AUXILIARY_PROCS = 1 << 1 /* include NUM_AUXILIARY_PROCS */ +} GMBOption;
Is a typedef enum really needed? As opposed to something like this style:
#define WL_LATCH_SET (1 << 0)
#define WL_SOCKET_READABLE (1 << 1)
#define WL_SOCKET_WRITEABLE (1 << 2)
- (MaxBackends + max_prepared_xacts + 1)); + (GetMaxBackends(GMB_MAX_PREPARED_XACTS) + 1));
This is a little confusing - there is no indication to the reader that this
is an additive function. Perhaps a little more intuitive name:
+ (GetMaxBackends(GMB_PLUS_MAX_PREPARED_XACTS) + 1));
However, the more I went through this patch, the more the GetMaxBackends(0)
nagged at me. The vast majority of the calls are with "0". I'd argue for
just having no arguments at all, which removes a bit of code and actually
makes things like this easier to read:
Original change:
- uint32 TotalProcs = MaxBackends + NUM_AUXILIARY_PROCS +
max_prepared_xacts;
+ uint32 TotalProcs = GetMaxBackends(GMB_NUM_AUXILIARY_PROCS |
GMB_MAX_PREPARED_XACTS);
Versus:
- uint32 TotalProcs = MaxBackends + NUM_AUXILIARY_PROCS +
max_prepared_xacts;
+ uint32 TotalProcs = GetMaxBackends() + NUM_AUXILIARY_PROCS +
max_prepared_xacts;
+ * This must be called after modules have had the change to alter GUCs in + * shared_preload_libraries, and before shared memory size is determined.
s/change/chance/;
+void +SetMaxBackends(int max_backends) +{ + if (MaxBackendsInitialized) + elog(ERROR, "MaxBackends already initialized");
Is this going to get tripped by a call from restore_backend_variables?
Cheers,
Greg
On 8/9/21, 1:14 PM, "Greg Sabino Mullane" <htamfids@gmail.com> wrote:
Giving this a review.
Thanks for your review.
However, the more I went through this patch, the more the GetMaxBackends(0) nagged at me. The vast majority of the calls are with "0". I'd argue for just having no arguments at all, which removes a bit of code and actually makes things like this easier to read:
I agree. The argument is nonzero in less than half of the calls to
GetMaxBackends(), and I'm not sure it adds a whole lot of value in the
first place. I removed the argument in v3.
+ * This must be called after modules have had the change to alter GUCs in + * shared_preload_libraries, and before shared memory size is determined.s/change/chance/;
I fixed this in v3.
+void +SetMaxBackends(int max_backends) +{ + if (MaxBackendsInitialized) + elog(ERROR, "MaxBackends already initialized");Is this going to get tripped by a call from restore_backend_variables?
I ran 'make check-world' with EXEC_BACKEND with no problems, so I
don't think so.
Nathan
Attachments:
v3-0001-Disallow-external-access-to-MaxBackends.patchapplication/octet-stream; name=v3-0001-Disallow-external-access-to-MaxBackends.patchDownload+99-81
On Mon, Aug 9, 2021 at 8:22 PM Bossart, Nathan <bossartn@amazon.com> wrote:
Is this going to get tripped by a call from restore_backend_variables?
I ran 'make check-world' with EXEC_BACKEND with no problems, so I
don't think so.
v3 looks good, but I'm still not sure how to test the bit mentioned above.
I'm not familiar with this part of the code (SubPostmasterMain etc.), but
running make check-world with EXEC_BACKEND does not seem to execute that
code, as I added exit(1) to restore_backend_variables() and the tests still
ran fine. Further digging shows that even though the #ifdef EXEC_BACKEND
path is triggered, no --fork argument was being passed. Is there something
else one needs to provide to force that --fork (see line 189 of
src/backend/main/main.c) when testing?
Cheers,
Greg
Greg Sabino Mullane <htamfids@gmail.com> writes:
v3 looks good, but I'm still not sure how to test the bit mentioned above.
I'm not familiar with this part of the code (SubPostmasterMain etc.), but
running make check-world with EXEC_BACKEND does not seem to execute that
code, as I added exit(1) to restore_backend_variables() and the tests still
ran fine.
You must not have enabled EXEC_BACKEND properly. It's a compile-time
#define that affects multiple modules, so it's easy to get wrong.
The way I usually turn it on is
make distclean
./configure ... options of choice ...
edit src/include/pg_config.h, add "#define EXEC_BACKEND" line
make, install, test
In this way the setting is persistent till the next distclean/configure
cycle.
regards, tom lane
On Wed, Aug 11, 2021 at 10:08 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
You must not have enabled EXEC_BACKEND properly. It's a compile-time
#define that affects multiple modules, so it's easy to get wrong.
The way I usually turn it on is
Thank you. I was able to get it all working, and withdraw any objections to
that bit of the patch. :)
Cheers,
Greg