BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

Started by PG Bug reporting form · over 3 years ago · 25 messages · bugs

noreply@postgresql.org

over 3 years ago

The following bug has been logged on the website:

Bug reference: 17757
Logged by: David Angel
Email address: david_sisson@dell.com
PostgreSQL version: 14.5
Operating system: Linux
Description:

On an OS where huge pages are enabled, if no huge page resources are assigned
in Kubernetes and the postgres instance is set to huge_pages = off in the
config then one would assume that the DB would not use huge pages.
However, because the initdb process uses postgresql.conf.sample or
postgresql.conf.template instead of the actual specified configuration the
applied setting is actually huge_pages = try during initdb.
In these cases, the initdb phase will attempt to allocate huge pages that
are available in the OS, but it will be denied access by Kubernetes and
fail.

Here is a PR with a possible fix:
https://github.com/postgres/postgres/pull/114/files

Here are some links for further information
Ours: https://github.com/CrunchyData/postgres-operator/issues/3477

Others with the same problem and no way to disable huge pages:
https://github.com/CrunchyData/postgres-operator/issues/3039
https://github.com/CrunchyData/postgres-operator/issues/2258
https://github.com/CrunchyData/postgres-operator/issues/3126
https://github.com/CrunchyData/postgres-operator/issues/3421

Bitnami
https://github.com/bitnami/charts/issues/7901

Tomas Vondra

tomas.vondra@2ndquadrant.com

over 3 years ago

In reply to: PG Bug reporting form (#1)

Re: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

On 1/20/23 23:48, PG Bug reporting form wrote:

The following bug has been logged on the website:

Bug reference: 17757
Logged by: David Angel
Email address: david_sisson@dell.com
PostgreSQL version: 14.5
Operating system: Linux
Description:

On an OS where hugepages are enabled, if no hugepages resources are assigned
in Kubernetes and the postgres instance is set to hugepages = off in the
config then one would assume that the DB would not use hugepages.

There's no config at that point - it's initdb that creates it, by
copying the .sample file, IIRC. So not sure which file you're modifying.

However, because the initdb process uses postgresql.conf.sample or
postgresql.conf.template instead of the actual specified configuration the
applied setting is actually hugepages = try during initdb.

Specified how?

In these cases, the initdb phase will attempt to allocate huge pages that
are available in the OS, but it will be denied access by Kubernetes and
fail.

Well, so how exactly does this fail? Does that mean Kubernetes broke mmap()
with MAP_HUGETLB so that it doesn't return MAP_FAILED when hugepages are
not available, or what? Because that's the only explanation I can see,
looking at the code.

Or it just does not realize there are no hugepages, returns something
and then crashes with SIGBUS later when trying to access it?

Here is a PR with a possible fix:
https://github.com/postgres/postgres/pull/114/files

I doubt we want to just go straight to changing the default value for
everyone. IMHO if the "try" logic is somehow broken, we should fix the
try logic, not mess with the defaults.

In the worst case, the operator can probably tweak the .sample config
before calling initdb.
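
For illustration, a minimal sketch of what "tweaking the .sample config" could look like as a text transformation. The function name set_sample_guc is hypothetical, and locating and writing the sample file is left to the caller, since that depends on the installation and may not be possible at all in a read-only image:

```python
import re


def set_sample_guc(sample_text: str, name: str, value: str) -> str:
    """Return sample_text with parameter `name` set to `value`.

    Hypothetical helper: replaces the first line that sets `name`
    (commented out or not); appends a new line if it is absent.
    Any trailing comment on the replaced line is dropped.
    """
    # [ \t]* rather than \s* so the match never swallows blank lines.
    pattern = re.compile(
        rf"^[ \t]*#?[ \t]*{re.escape(name)}[ \t]*=.*$", re.MULTILINE
    )
    new_line = f"{name} = {value}"
    if pattern.search(sample_text):
        # A callable replacement avoids backslash interpretation in new_line.
        return pattern.sub(lambda _m: new_line, sample_text, count=1)
    sep = "" if (not sample_text or sample_text.endswith("\n")) else "\n"
    return sample_text + sep + new_line + "\n"
```

An operator would run this over the contents of postgresql.conf.sample with name "huge_pages" and value "off" before invoking initdb.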

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Andres Freund

andres@anarazel.de

over 3 years ago

In reply to: Tomas Vondra (#2)

Re: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

Hi,

On 2023-01-22 00:10:29 +0100, Tomas Vondra wrote:

On 1/20/23 23:48, PG Bug reporting form wrote:

In these cases, the initdb phase will attempt to allocate huge pages that
are available in the OS, but it will be denied access by Kubernetes and
fail.

Well, so how exactly this fails? Does that mean Kubernetes broke mmap()
with MAP_HUGETLB so that it doesn't return MAP_FAILED when hugepages are
not available, or what? Because that's the only explanation I can see,
looking at the code.

Yea, that's what I was wondering about as well.

Or it just does not realize there are no hugepages, returns something
and then crashes with SIGBUS later when trying to access it?

I assume that that's the case. There's references to bus errors in a bunch of
the linked issues. E.g.
https://github.com/CrunchyData/postgres-operator/issues/413

selecting default max_connections ... sh: line 1: 60 Bus error (core dumped) "/usr/pgsql-10/bin/postgres" --boot -x0 -F -c max_connections=100 -c shared_buffers=1000 -c dynamic_shared_memory_type=none < "/dev/null" > "/dev/null" 2>&1

It's possible that the problem would go away if we used MAP_POPULATE for the
allocation.

I'd guess that this is annoying cgroups stuff :(

I doubt we want to just go straight to changing the default value for
everyone. IMHO if the "try" logic is somehow broken, we should fix the
try logic, not mess with the defaults.

Agreed. But we could disable huge pages explicitly inside initdb - there's
really no point in using it there...

Greetings,

Andres Freund

Tom Lane

tgl@sss.pgh.pa.us

over 3 years ago

In reply to: Tomas Vondra (#2)

Re: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

Tomas Vondra <tomas.vondra@enterprisedb.com> writes:

On 1/20/23 23:48, PG Bug reporting form wrote:

Here is a PR with a possible fix:
https://github.com/postgres/postgres/pull/114/files

I doubt we want to just go straight to changing the default value for
everyone.

Yeah, that proposal is a non-starter. I could see providing an
initdb option to adjust the value applied during initdb, though.

Ideally, maybe what we want is a generalized switch that could
replace any variable in the sample config, along the lines of
the server's "-c foo=bar". I recall having tried to do that and
having run into quoting hazards, but I did not try very hard.
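
To make the quoting hazard concrete, here is a rough sketch of rendering a "foo=bar" pair as a postgresql.conf line. It assumes the documented rule that an embedded single quote may be written as two single quotes, and it doubles backslashes on the assumption that they act as escape characters inside quoted values. The name render_guc_line is hypothetical and this is not how initdb does it:

```python
def render_guc_line(name: str, value: str) -> str:
    """Format `name = 'value'` for a postgresql.conf-style file.

    Hypothetical helper. Embedded single quotes are doubled and
    backslashes are doubled (assumed to be escape characters).
    """
    # Accept only plain identifiers, optionally dotted (e.g. extension.param).
    if not name.replace("_", "").replace(".", "").isalnum():
        raise ValueError(f"suspicious parameter name: {name!r}")
    # A value spanning lines could inject extra settings into the file.
    if "\n" in value or "\r" in value:
        raise ValueError("value must be a single line")
    escaped = value.replace("\\", "\\\\").replace("'", "''")
    return f"{name} = '{escaped}'"
```

With this, render_guc_line("huge_pages", "off") yields the line huge_pages = 'off', and a value containing a quote or newline is either escaped or rejected rather than written through verbatim.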

regards, tom lane

Tom Lane

tgl@sss.pgh.pa.us

over 3 years ago

In reply to: Andres Freund (#3)

Re: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

Andres Freund <andres@anarazel.de> writes:

On 2023-01-22 00:10:29 +0100, Tomas Vondra wrote:

I doubt we want to just go straight to changing the default value for
everyone. IMHO if the "try" logic is somehow broken, we should fix the
try logic, not mess with the defaults.

Agreed. But we could disable huge pages explicitly inside initdb - there's
really no point in using it there...

One of the things initdb is trying to do is establish a set of values
that is known to allow the server to start. Not using the same settings
that the server is expected to use would break that idea completely.

regards, tom lane

Andres Freund

andres@anarazel.de

over 3 years ago

In reply to: Tom Lane (#5)

Re: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

Hi,

On 2023-01-21 18:33:03 -0500, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

On 2023-01-22 00:10:29 +0100, Tomas Vondra wrote:

I doubt we want to just go straight to changing the default value for
everyone. IMHO if the "try" logic is somehow broken, we should fix the
try logic, not mess with the defaults.

Agreed. But we could disable huge pages explicitly inside initdb - there's
really no point in using it there...

One of the things initdb is trying to do is establish a set of values
that is known to allow the server to start. Not using the same settings
that the server is expected to use would break that idea completely.

Yea, I'm not saying I like the approach. OTOH, we don't provide a proper way to
influence the configuration, which is bad, as this issue shows.

Perhaps we should add an option to force MAP_POPULATE being used? I'm fairly
certain that'd avoid the SIGBUS in this case. And it'd make sense to ensure
that we can actually use the memory in initdb.

Unfortunately it's not unproblematic to use it in general, because with large
shared_buffers values it can be quite slow, because the kernel initializes the
memory in a single thread. I've seen ~3GB/s on multi-socket machines.
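
As a small illustration of the flag being discussed, the sketch below creates an anonymous shared mapping and optionally asks the kernel to pre-fault it with MAP_POPULATE. It is Python rather than the server's C, the name map_anonymous is hypothetical, and it assumes a Unix platform. Python exposes mmap.MAP_POPULATE only on Linux with Python 3.10 or later, so the constant falls back to 0 (no pre-faulting) elsewhere. Whether pre-faulting would turn the failure in this thread into a clean error is the open question above; the sketch only shows how the flag is passed:

```python
import mmap

# Linux-specific; fall back to 0 (no pre-faulting) where it is missing.
MAP_POPULATE = getattr(mmap, "MAP_POPULATE", 0)


def map_anonymous(length: int, populate: bool) -> mmap.mmap:
    """Create a shared anonymous mapping, optionally pre-faulted."""
    flags = mmap.MAP_SHARED | mmap.MAP_ANONYMOUS
    if populate:
        # Ask the kernel to fault the pages in during mmap()
        # instead of lazily on first access.
        flags |= MAP_POPULATE
    return mmap.mmap(-1, length, flags=flags)
```

The trade-off described above shows up directly: with populate=True the call itself takes longer as length grows, because the pages are initialized up front.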

Greetings,

Andres Freund

Tom Lane

tgl@sss.pgh.pa.us

over 3 years ago

In reply to: Andres Freund (#6)

Re: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

Andres Freund <andres@anarazel.de> writes:

Perhaps we should add an option to force MAP_POPULATE being used? I'm fairly
certain that'd avoid the SIGBUS in this case. And it'd make sense to ensure
that we can actually use the memory in initdb.

Unfortunately it's not unproblematic to use it in general, because with large
shared_buffers values it can be quite slow, because the kernel initializes the
memory in a single thread. I've seen ~3GB/s on multi-socket machines.

Hmm ... but if we can't use it by default, we're still back to the
problem of needing a way to tell initdb to do things differently.
I'd just as soon keep that to "set huge_pages = off" rather than
inventing whole new things.

regards, tom lane

Andres Freund

andres@anarazel.de

over 3 years ago

In reply to: Andres Freund (#3)

Re: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

Hi,

On 2023-01-21 15:29:22 -0800, Andres Freund wrote:

On 2023-01-22 00:10:29 +0100, Tomas Vondra wrote:

On 1/20/23 23:48, PG Bug reporting form wrote:

In these cases, the initdb phase will attempt to allocate huge pages that
are available in the OS, but it will be denied access by Kubernetes and
fail.

Well, so how exactly this fails? Does that mean Kubernetes broke mmap()
with MAP_HUGETLB so that it doesn't return MAP_FAILED when hugepages are
not available, or what? Because that's the only explanation I can see,
looking at the code.

Yea, that's what I was wondering about as well.

Or it just does not realize there are no hugepages, returns something
and then crashes with SIGBUS later when trying to access it?

I assume that that's the case. There's references to bus errors in a bunch of
the linked issues. E.g.
https://github.com/CrunchyData/postgres-operator/issues/413

selecting default max_connections ... sh: line 1: 60 Bus error (core dumped) "/usr/pgsql-10/bin/postgres" --boot -x0 -F -c max_connections=100 -c shared_buffers=1000 -c dynamic_shared_memory_type=none < "/dev/null" > "/dev/null" 2>&1

It's possible that the problem would go away if we used MAP_POPULATE for the
allocation.

I'd guess that this is annoying cgroups stuff :(

Ah, the fun:
https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v1/hugetlb.html

The HugeTLB controller allows users to limit the HugeTLB usage (page fault) per
control group and enforces the limit during page fault. Since HugeTLB
doesn't support page reclaim, enforcing the limit at page fault time implies
that, the application will get SIGBUS signal if it tries to fault in HugeTLB
pages beyond its limit. Therefore the application needs to know exactly how many
HugeTLB pages it uses before hand, and the sysadmin needs to make sure that
there are enough available on the machine for all the users to avoid processes
getting SIGBUS.

but there's also

Reservation accounting

hugetlb.<hugepagesize>.rsvd.limit_in_bytes
hugetlb.<hugepagesize>.rsvd.max_usage_in_bytes
hugetlb.<hugepagesize>.rsvd.usage_in_bytes
hugetlb.<hugepagesize>.rsvd.failcnt

The HugeTLB controller allows to limit the HugeTLB reservations per control
group and enforces the controller limit at reservation time and at the fault
of HugeTLB memory for which no reservation exists. Since reservation limits
are enforced at reservation time (on mmap or shget), reservation limits
never causes the application to get SIGBUS signal if the memory was reserved
before hand. For MAP_NORESERVE allocations, the reservation limit behaves
the same as the fault limit, enforcing memory usage at fault time and
causing the application to receive a SIGBUS if it’s crossing its limit.

Reservation limits are superior to page fault limits described above, since
reservation limits are enforced at reservation time (on mmap or shget), and
never causes the application to get SIGBUS signal if the memory was reserved
before hand. This allows for easier fallback to alternatives such as
non-HugeTLB memory for example. In the case of page fault accounting, it’s
very hard to avoid processes getting SIGBUS since the sysadmin needs
precisely know the HugeTLB usage of all the tasks in the system and make
sure there is enough pages to satisfy all requests. Avoiding tasks getting
SIGBUS on overcommited systems is practically impossible with page fault
accounting.

So the problem is that the wrong type of cgroup limits is used. I don't know
if that's a kubernetes or a postgres-operator issue.
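
To check which kind of limit a container actually has, something along these lines could be run against its cgroup v1 hugetlb directory. This is only a sketch: the function name is hypothetical, the directory path is an assumption that varies by system, and cgroup v2 uses different file names that are not handled here:

```python
from pathlib import Path


def classify_hugetlb_limits(cgroup_dir: str) -> dict:
    """Split hugetlb limit files into fault-time and reservation limits.

    Hypothetical helper for cgroup v1 naming:
      hugetlb.<hugepagesize>.limit_in_bytes       -> enforced at page fault
      hugetlb.<hugepagesize>.rsvd.limit_in_bytes  -> enforced at reservation
    """
    result = {"fault": {}, "reservation": {}}
    for path in sorted(Path(cgroup_dir).glob("hugetlb.*.limit_in_bytes")):
        if path.name.endswith(".rsvd.limit_in_bytes"):
            kind = "reservation"
        else:
            kind = "fault"
        result[kind][path.name] = int(path.read_text().strip())
    return result
```

A low value under "fault" together with an effectively unlimited value under "reservation" would match the situation described above, where mmap() succeeds and the process gets SIGBUS on first access.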

Greetings,

Andres Freund

Tomas Vondra

tomas.vondra@2ndquadrant.com

over 3 years ago

In reply to: Tom Lane (#4)

Re: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

On 1/22/23 00:30, Tom Lane wrote:

Tomas Vondra <tomas.vondra@enterprisedb.com> writes:

On 1/20/23 23:48, PG Bug reporting form wrote:

Here is a PR with a possible fix:
https://github.com/postgres/postgres/pull/114/files

I doubt we want to just go straight to changing the default value for
everyone.

Yeah, that proposal is a non-starter. I could see providing an
initdb option to adjust the value applied during initdb, though.

Ideally, maybe what we want is a generalized switch that could
replace any variable in the sample config, along the lines of
the server's "-c foo=bar". I recall having tried to do that and
having run into quoting hazards, but I did not try very hard.

Yeah, I was looking for something like "-c" in initdb, only to realize
there's nothing like that. The main "problem" with adding that is that
we're unlikely to backpatch that (I guess), and thus it does not really
solve the issue for the OP.

I'm not sure we'd be keen to backpatch a change of the default, but
maybe we would ...

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#10

Tom Lane

tgl@sss.pgh.pa.us

over 3 years ago

In reply to: Tomas Vondra (#9)

Re: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

Tomas Vondra <tomas.vondra@enterprisedb.com> writes:

On 1/22/23 00:30, Tom Lane wrote:

Yeah, that proposal is a non-starter. I could see providing an
initdb option to adjust the value applied during initdb, though.
Ideally, maybe what we want is a generalized switch that could
replace any variable in the sample config, along the lines of
the server's "-c foo=bar". I recall having tried to do that and
having run into quoting hazards, but I did not try very hard.

Yeah, I was looking for something like "-c" in initdb, only to realize
there's nothing like that. The main "problem" with adding that is that
we're unlikely to backpatch that (I guess), and thus it does not really
solve the issue for the OP.

I'm not sure we'd be keen to backpatch a change of the default, but
maybe we would ...

Back-patching a change of default seems like REALLY a non-starter.
Perhaps adding a switch (which would break nothing if not used)
could be discussed, though.

regards, tom lane

#11

Andres Freund

andres@anarazel.de

over 3 years ago

In reply to: Tomas Vondra (#9)

Re: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

Hi,

On 2023-01-22 01:55:01 +0100, Tomas Vondra wrote:

I'm not sure we'd be keen to backpatch a change of the default, but
maybe we would ...

After figuring out that it's clearly a configuration issue *somewhere* outside
of postgres's remit, I'm not that sure it's worth doing something concretely
to avoid the SIGBUS issue.

But if we end up doing something, I think a parameter triggering use of
MAP_POPULATE would be a good idea. It's actually useful outside of the SIGBUS
issue, because benchmarks reach a steady state noticeably more quickly when
using it.

OTOH, in a production scenario with large shared_buffers I'd probably not want
to use it, because getting up more quickly and distributing the memory
initialization across cores is more important.

I think it'd be ok to explicitly specify such an option in initdb - after all,
initdb does do work to determine the correct shared buffers size etc, and
MAP_POPULATE will lead to a more reliable determination. Not just with huge
pages, but also with "small" pages and system-level memory overcommit.

Greetings,

Andres Freund

#12

Sisson, David

David.Sisson@dell.com

over 3 years ago

In reply to: Andres Freund (#11)

RE: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

I believe something should be done with PostgreSQL because we are configuring huge_pages = off in the standard "postgresql.conf" file.
huge_pages can be turned on through outside manipulation but it can't be turned off.
Not without altering the sample config file.

Thanks,
David Angel 😊

Internal Use - Confidential

#13

Christophe Pettus

xof@thebuild.com

over 3 years ago

In reply to: Sisson, David (#12)

Re: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

On Jan 23, 2023, at 11:26, Sisson, David <David.Sisson@dell.com> wrote:

I believe something should be done with PostgreSQL because we are configuring huge_pages = off in the standard "postgresql.conf" file.

We are? I believe the default is "huge_pages = try", not off.

#14

Sisson, David

David.Sisson@dell.com

over 3 years ago

In reply to: Christophe Pettus (#13)

RE: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

The default is "huge_pages = try" which is commented out in the "postgresql.conf.sample" file.
When a consumer like myself turns it off in the standard "postgresql.conf" file, it should not be turned on when initdb runs.
There is no way to turn it off without altering the sample config file.

It ranges from quite difficult to nearly impossible to alter the "postgresql.conf.sample" file using a 3rd party controller.
The file is read-only at runtime within Kubernetes.
Only some controllers let you modify the sample file without rebuilding their code.

You guys are awesome with truly outstanding responses.
I certainly didn't expect my initial solution to be used but to help in finding a good solution. 😊

Thanks,
David Angel

#15

Andres Freund

andres@anarazel.de

over 3 years ago

In reply to: Sisson, David (#12)

Re: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

Hi,

On 2023-01-23 19:26:09 +0000, Sisson, David wrote:

I believe something should be done with PostgreSQL because we are configuring huge_pages = off in the standard "postgresql.conf" file.
huge_pages can be turned on through outside manipulation but it can't be
turned off.

It's a fault of the environment if mmap(MAP_HUGETLB) causes a SIGBUS. Normally
huge_pages = try is harmless, because it'll just fall back. That source of
SIGBUSes needs to be fixed regardless of anything else - plenty of allocators try
to use huge pages for example, so you'll run into problems regardless of
postgres' default.
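
The "try, then fall back" behaviour can be sketched as follows. This is an illustration in Python, not the server's C code, and the name map_try_huge is hypothetical. Python's mmap module may not expose MAP_HUGETLB, so the sketch assumes the Linux value 0x40000 used on x86-64 and arm64. It deliberately never touches the mapping: in the misconfigured cgroup case discussed here the mmap() call succeeds and the SIGBUS arrives only on first access, which a fallback like this cannot catch.

```python
import mmap

# Assumption: Linux value on x86-64/arm64 if Python does not expose it.
MAP_HUGETLB = getattr(mmap, "MAP_HUGETLB", 0x40000)


def map_try_huge(length: int):
    """Return (mapping, used_huge_pages).

    Tries a huge-page mapping first and falls back to normal pages
    if the kernel refuses the request at mmap() time.
    """
    base = mmap.MAP_SHARED | mmap.MAP_ANONYMOUS
    try:
        return mmap.mmap(-1, length, flags=base | MAP_HUGETLB), True
    except OSError:
        # No huge pages available (or flag unsupported): use normal pages.
        return mmap.mmap(-1, length, flags=base), False
```

For the huge-page attempt to have a chance, length should be a multiple of the huge page size (commonly 2 MB); otherwise the first call fails and the function simply returns a normal mapping.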

That said, I'm for allowing to specify options to initdb.

Greetings,

Andres Freund

#16

Tom Lane

tgl@sss.pgh.pa.us

over 3 years ago

In reply to: Sisson, David (#14)

Re: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

"Sisson, David" <David.Sisson@dell.com> writes:

The default is "huge_pages = try" which is commented out in the "postgresql.conf.sample" file.
When a consumer like myself turns it off in the standard "postgresql.conf" file, it should not be turned on when initdb runs.

What "standard postgresql.conf file"? There is no such thing until
initdb creates it.

There is no way to turn it off without altering the sample config file.

Yup, that's exactly why we are having this discussion.

regards, tom lane

#17

David G. Johnston

david.g.johnston@gmail.com

over 3 years ago

In reply to: Sisson, David (#14)

Re: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

On Mon, Jan 23, 2023 at 12:51 PM Sisson, David <David.Sisson@dell.com>
wrote:

The default is "huge_pages = try" which is commented out in the
"postgresql.conf.sample" file.
When a consumer like myself turns it off in the standard "postgresql.conf"
file, it should not be turned on when initdb runs.
There is no way to turn it off without altering the sample config file.

Right, the present way to control what is seen by initdb is
postgresql.conf.sample since that is the template that initdb uses to then
produce an actual postgresql.conf for the newly created instance.
postgresql.conf is only ever a per-instance configuration file. It doesn't
make sense to "change postgresql.conf in hopes of influencing some future
initdb run."

David J.

#18

Sisson, David

David.Sisson@dell.com

over 3 years ago

In reply to: Tom Lane (#16)

RE: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

That makes sense, the PostgreSQL controllers are calling initdb to create the "postgresql.conf" file before they apply customizations to it.
To the consumer, it is just yaml to be added to the "postgresql.conf" file.

That makes it much harder to fix and means it is really the controllers at fault.

This probably needs to be explicitly documented when creating a HA cluster or within initdb docs.
https://www.postgresql.org/docs/15/app-initdb.html

Maybe something about how initdb uses sample and what configuration settings must be pre-configured.

Thanks,
David Angel

#19

Sisson, David

David.Sisson@dell.com

over 3 years ago

In reply to: Sisson, David (#18)

RE: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

A quick and dirty solution could be to alter initdb to catch the exception and retry using a copy of the sample with "huge_pages=false".
Would that be acceptable?

Passing in a config setting into initdb would still require a rebuild of all controllers.
That could take months to years at best.

Thanks,
David Angel

#20

Andres Freund

andres@anarazel.de

over 3 years ago

In reply to: Sisson, David (#19)

Re: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

Hi,

On 2023-01-23 20:35:17 +0000, Sisson, David wrote:

A quick and dirty solution could be to alter initdb to catch the exception and retry using a copy of the sample with "huge_pages=false".
Would that be acceptable?

This is a kubernetes or postgres-operator bug (setting up the wrong cgroup
limit, which the docs explicitly warn against doing). I don't think we want to
accumulate workarounds like that in postgres.

Passing in a config setting into initdb would still require a rebuild of all controllers.
That could take months to years at best.

Huh. I don't know anything about the controller, but that seems problematic
independent of this specific issue. And you'd still need to deploy a new
version of postgres to get such changes...