Parallelizing startup with many databases
Hello,
I hope you are doing well.
In PostgreSQL 16, startup appears to initialize databases sequentially and
primarily uses a single CPU core. In clusters with a very large number of
databases (around 5,000 in our case), this results in noticeably long
startup times after restarts or crash recovery.
I would like to ask:
- Is the largely single-threaded startup behavior a fundamental architectural
constraint (e.g. catalog dependencies, locking, recovery ordering), or mainly
an unimplemented optimization?
- Are there any existing discussions, patches, versions (18+) to
parallelize parts of startup or otherwise improve startup scalability with
many databases?
- Are there any PostgreSQL configuration settings known to dramatically
reduce startup time, or is startup performance mostly fixed by architecture
in this scenario?
I understand that having many databases in a single cluster is not the most
common or recommended multi-tenant model, but this is an existing system and
I’m trying to better understand the current limits and future direction.
Thank you for your time and insights.
Best regards.
On 1/2/26 8:55 AM, Babak Ghadiri wrote:
In PostgreSQL 16, startup appears to initialize databases sequentially and
primarily uses a single CPU core. In clusters with a very large number of
databases (around 5,000 in our case), this results in noticeably long
startup times after restarts or crash recovery.
Have you measured what is actually causing the slow startup? Without
knowing what is actually slow it is hard to say if threading would even
help.
How slow are we talking about and have you managed to create a minimal
case for reproducing the issue?
- Is the largely single-threaded startup behavior a fundamental architectural
constraint (e.g. catalog dependencies, locking, recovery ordering), or mainly
an unimplemented optimization?
PostgreSQL does not support threading; it uses a multi-process model to
implement, for example, parallel queries. And there is no way threading
would be introduced just to improve startup performance.
- Are there any existing discussions, patches, versions (18+) to
parallelize parts of startup or otherwise improve startup scalability
with many databases?
Not as far as I am aware but you can search our archives.
- Are there any PostgreSQL configuration settings known to dramatically
reduce startup time, or is startup performance mostly fixed by
architecture in this scenario?
I would first start trying to figure out why startup is slow before
doing anything else.
Andreas
Andreas Karlsson <andreas@proxel.se> writes:
On 1/2/26 8:55 AM, Babak Ghadiri wrote:
In PostgreSQL 16, startup appears to initialize databases sequentially and
primarily uses a single CPU core. In clusters with a very large number of
databases (around 5,000 in our case), this results in noticeably long
startup times after restarts or crash recovery.
Have you measured what is actually causing the slow startup? Without
knowing what is actually slow it is hard to say if threading would even
help.
"perf" results would likely be useful.
I tried creating 5000 databases here and didn't notice any particular
increase in server startup time (didn't try crash-recovery case).
So whatever this is, it is likely somewhat configuration- or
platform-dependent.
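The message above does not say how the 5000 test databases were created. As a minimal sketch for anyone trying to build a reproduction, the following Python script emits one CREATE DATABASE statement per database. The function name, the `repro_db_` prefix, and the zero-padded numbering are illustrative assumptions, not anything taken from this thread.

```python
import sys


def create_database_statements(count, prefix="repro_db_"):
    """Return one CREATE DATABASE statement per test database.

    The prefix and numbering scheme are hypothetical; any unique names work.
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    # Zero-pad the counter so the names sort in creation order.
    width = len(str(count))
    return [
        f'CREATE DATABASE "{prefix}{i:0{width}d}";'
        for i in range(1, count + 1)
    ]


if __name__ == "__main__":
    # Default to the database count reported in this thread.
    n = int(sys.argv[1]) if len(sys.argv) > 1 else 5000
    print("\n".join(create_database_statements(n)))
```

The output can be piped into psql against a scratch cluster, after which a restart (or a forced crash and recovery) can be timed and profiled.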
Having said that, 5000 databases sounds like an anti-pattern to
begin with. You're paying for an additional copy of the system
catalogs for each one.
regards, tom lane
On Fri, Jan 2, 2026, 08:55 Babak Ghadiri <bbkghadiri6@gmail.com> wrote:
Hello,
I hope you are doing well.
In PostgreSQL 16, startup appears to initialize databases sequentially and
primarily uses a single CPU core. In clusters with a very large number of
databases (around 5,000 in our case), this results in noticeably long
startup times after restarts or crash recovery.
You probably want to consider setting:
recovery_init_sync_method=syncfs
I'm 99% certain that that will solve your problem.
https://www.postgresql.org/docs/current/runtime-config-error-handling.html
/messages/by-id/11bc2bb7-ecb5-3ad0-b39f-df632734cd81@discourse.org
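For reference, a sketch of how the suggestion above might look in postgresql.conf. The setting name and value come from the message itself; the comments summarize the documented behavior, and the sketch assumes a Linux server running PostgreSQL 14 or later. Whether it helps in this particular case has not been measured in this thread.

```
# postgresql.conf (sketch; assumes Linux and PostgreSQL 14 or later)
# The default, fsync, recursively opens and fsyncs every file in the data
# directory before crash recovery begins. syncfs instead asks the kernel to
# sync the whole file systems holding the data directory, WAL, and tablespaces.
recovery_init_sync_method = syncfs
```

The parameter can only be set in postgresql.conf or on the server command line, and it affects startup after a crash rather than a clean restart.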
PS It took me way too long to find that setting. I think we should move it
from the error handling docs page to the page with all of the other
recovery settings.
https://www.postgresql.org/docs/current/runtime-config-wal.html#RUNTIME-CONFIG-WAL-RECOVERY
On 1/3/26 1:58 AM, Jelte Fennema-Nio wrote:
PS It took me way too long to find that setting. I think we should move
it from the error handling docs page to the page with all of the other
recovery settings.
https://www.postgresql.org/docs/current/runtime-config-wal.html#RUNTIME-CONFIG-WAL-RECOVERY
I agree that it is currently not exactly in a great location, but the
issue is that the "Recovery" section is a subsection of the "WAL"
section, and syncing the data directory is only loosely related to WAL.
One could argue it is related to WAL in that it is something done
before replaying WAL, but it is not an obvious location either. Or is it?
Andreas
On Sat, 3 Jan 2026 at 20:09, Andreas Karlsson <andreas@proxel.se> wrote:
On 1/3/26 1:58 AM, Jelte Fennema-Nio wrote:
PS It took me way too long to find that setting. I think we should move
it from the error handling docs page to the page with all of the other
recovery settings.
https://www.postgresql.org/docs/current/runtime-config-wal.html#RUNTIME-CONFIG-WAL-RECOVERY
I agree that it is currently not exactly in a great location, but the
issue is that the "Recovery" section is a subsection of the "WAL"
section, and syncing the data directory is only loosely related to WAL.
One could argue it is related to WAL in that it is something done
before replaying WAL, but it is not an obvious location either. Or is it?
<Moving this discussion to pgsql-docs, with accompanying patch>
While the setting is not strictly related to WAL it still seems a much
more natural place than the "Error handling" page. Especially because of
the description of this subheading:
This section describes the settings that apply to recovery in general,
affecting crash recovery, streaming replication and archive-based
replication.
The only recovery-related GUC that's not on the WAL page is
recovery_min_apply_delay, which is under the Replication->Standby
section. Since that GUC is only valid on standbys that seems like a
sensible choice.
Attachments:
v1-0001-docs-Move-recovery_init_sync_method-to-WAL-Recove.patch (text/x-patch; charset=utf-8; +42 -40)
On 1/4/26 12:26 AM, Jelte Fennema-Nio wrote:
<Moving this discussion to pgsql-docs, with accompanying patch>
If we move the GUC in the documentation shouldn't we also move it in
postgresql.conf.sample? The sections in the documentation and the
sections in the sample config file seem to be the same.
Andreas
Andreas Karlsson <andreas@proxel.se> writes:
On 1/4/26 12:26 AM, Jelte Fennema-Nio wrote:
<Moving this discussion to pgsql-docs, with accompanying patch>
If we move the GUC in the documentation shouldn't we also move it in
postgresql.conf.sample? The sections in the documentation and the
sections in the sample config file seem to be the same.
You would also need to change the group that the GUC is assigned to
in guc_parameters.dat. So this isn't really a docs-only patch.
(I agree that the GUC seems misclassified as-is.)
regards, tom lane
On Sun Jan 4, 2026 at 1:02 AM CET, Tom Lane wrote:
Andreas Karlsson <andreas@proxel.se> writes:
On 1/4/26 12:26 AM, Jelte Fennema-Nio wrote:
<Moving this discussion to pgsql-docs, with accompanying patch>
If we move the GUC in the documentation shouldn't we also move it in
postgresql.conf.sample? The sections in the documentation and the
sections in the sample config file seem to be the same.
You would also need to change the group that the GUC is assigned to
in guc_parameters.dat. So this isn't really a docs-only patch.
(I agree that the GUC seems misclassified as-is.)
Good points. Attached an updated patch that changes
postgresql.conf.sample and the group too.
I didn't move the thread back to pgsql-hackers though, since changing
the location once more seemed counterproductive. Especially since it's
still a docs change at heart.