Unexpected behavior after OOM errors

Started by Alexander Lakhin | 2 days ago | 10 messages | hackers
#1 Alexander Lakhin
exclusion@gmail.com

Hello hackers,

I'd like to share my findings related to OOM error handling. I'm not sure
how large the class of such anomalies is (and if all of these can be
detected and fixed), but please look at a few issues I have discovered so
far:

1) An issue in lookup_type_cache()

The following modification:
--- a/src/backend/utils/hash/dynahash.c
+++ b/src/backend/utils/hash/dynahash.c
@@ -104,6 +104,7 @@
  #include "storage/spin.h"
  #include "utils/memutils.h"

+#include "common/pg_prng.h"

  /*
   * Constants
@@ -528,7 +529,7 @@ hash_create(const char *tabname, int64 nelem, const HASHCTL *info, int flags)
       * that this is the first allocation made with the alloc function.  That's
       * a little ugly, but works for now.
       */
-    hashp->hctl = (HASHHDR *) hashp->alloc(sizeof(HASHHDR), hashp->alloc_arg);
+    hashp->hctl = (pg_prng_double(&pg_global_prng_state) < 0.001) ? NULL : (HASHHDR *) hashp->alloc(sizeof(HASHHDR), hashp->alloc_arg);
      if (!hashp->hctl)
          ereport(ERROR,
                  (errcode(ERRCODE_OUT_OF_MEMORY),
@@ -609,7 +610,7 @@ hash_create(const char *tabname, int64 nelem, const HASHCTL *info, int flags)
          {
              int         temp = (i == 0) ? nelem_alloc_first : nelem_alloc;
-            if (!element_alloc(hashp, temp, i))
+            if ((pg_prng_double(&pg_global_prng_state) < 0.001) || !element_alloc(hashp, temp, i))
                  ereport(ERROR,
                          (errcode(ERRCODE_OUT_OF_MEMORY),
                           errmsg("out of memory")));

makes this script:
for i in {1..10000}; do
cat << 'EOS' | psql >>psql.log
SELECT 1 ORDER BY 1;

SELECT 1 ORDER BY 1;
EOS
grep "terminated by signal" server.log && break;
done

trigger an assertion failure:
2026-06-17 07:26:07.837 EEST [87325:3] [unknown] LOG:  connection authorized: user=law database=regression application_name=psql
2026-06-17 07:26:07.837 EEST [87325:4] psql LOG:  statement: SELECT 1 ORDER BY 1;
2026-06-17 07:26:07.837 EEST [87325:5] psql ERROR:  out of memory at character 19
2026-06-17 07:26:07.837 EEST [87325:6] psql LOG:  statement: SELECT 1 ORDER BY 1;
TRAP: failed Assert("TypeCacheHash != NULL && RelIdToTypeIdCacheHash != NULL"), File: "typcache.c", Line: 441, PID: 87325
ExceptionalCondition at assert.c:51:13
lookup_type_cache at typcache.c:444:27
get_sort_group_operators at parse_oper.c:207:13
addTargetToSortList at parse_clause.c:3647:4
transformSortClause at parse_clause.c:2959:14
transformSelectStmt at analyze.c:1806:18
transformStmt at analyze.c:396:15
transformOptionalSelectInto at analyze.c:327:1
transformTopLevelStmt at analyze.c:276:11
parse_analyze_fixedparams at analyze.c:144:10
pg_analyze_and_rewrite_fixedparams at postgres.c:699:10
exec_simple_query at postgres.c:1206:20
PostgresMain at postgres.c:4860:27
BackendInitialize at backend_startup.c:142:1
postmaster_child_launch at launch_backend.c:269:3
BackendStartup at postmaster.c:3627:8
ServerLoop at postmaster.c:1731:10
PostmasterMain at postmaster.c:1415:11
main at main.c:236:2
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x8b)[0x7b3d5c02a28b]
postgres: law regression [local] SELECT(_start+0x25)[0x5944d79e8155]
2026-06-17 07:26:07.914 EEST [85875:6] LOG:  client backend (PID 87325) was terminated by signal 6: Aborted

Without asserts enabled, the server might crash.

2) An issue in GetSnapshotData()

The following modification:
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -71,2 +71,3 @@
  #include "utils/wait_event.h"
+#include "common/pg_prng.h"
@@ -2157,3 +2158,3 @@ GetSnapshotData(Snapshot snapshot)
          Assert(snapshot->subxip == NULL);
-        snapshot->subxip = (TransactionId *)
+        snapshot->subxip = (pg_prng_double(&pg_global_prng_state) < 0.01) ? NULL : (TransactionId *)
              malloc(GetMaxSnapshotSubxidCount() * sizeof(TransactionId));

makes this script (max_prepared_transactions = 2 in postgresql.conf):
for i in {1..1000}; do
cat << 'EOS' | psql >>psql.log
SELECT 1;
BEGIN;
  CREATE TABLE t1(a int);
  SAVEPOINT sp1;
    INSERT INTO t1 VALUES (1);
  ROLLBACK TO sp1;
  INSERT INTO t1 VALUES (2);
PREPARE TRANSACTION 'pt1';
BEGIN;
CREATE TABLE t2(a int);
ROLLBACK;
ROLLBACK PREPARED 'pt1';
EOS
grep "terminated by signal" server.log && break;
done

trigger a segmentation fault:
2026-06-17 07:37:52.619 EEST [108789:3] [unknown] LOG:  connection authorized: user=law database=regression application_name=psql
2026-06-17 07:37:52.620 EEST [108789:4] psql LOG:  statement: SELECT 1;
2026-06-17 07:37:52.620 EEST [108789:5] psql ERROR:  out of memory
2026-06-17 07:37:52.620 EEST [108789:6] psql LOG:  statement: BEGIN;
2026-06-17 07:37:52.620 EEST [108789:7] psql LOG:  statement: CREATE TABLE t1(a int);
2026-06-17 07:37:52.621 EEST [108789:8] psql LOG:  statement: SAVEPOINT sp1;
2026-06-17 07:37:52.621 EEST [108789:9] psql LOG:  statement: INSERT INTO t1 VALUES (1);
2026-06-17 07:37:52.621 EEST [108789:10] psql LOG:  statement: ROLLBACK TO sp1;
2026-06-17 07:37:52.621 EEST [108789:11] psql LOG:  statement: INSERT INTO t1 VALUES (2);
2026-06-17 07:37:52.621 EEST [108789:12] psql LOG:  statement: PREPARE TRANSACTION 'pt1';
2026-06-17 07:37:52.622 EEST [108789:13] psql LOG:  statement: BEGIN;
2026-06-17 07:37:52.622 EEST [108789:14] psql LOG:  statement: CREATE TABLE t2(a int);
2026-06-17 07:37:52.777 EEST [108710:6] LOG:  client backend (PID 108789) was terminated by signal 11: Segmentation fault
2026-06-17 07:37:52.777 EEST [108710:7] DETAIL:  Failed process was running: CREATE TABLE t2(a int);

Program terminated with signal SIGSEGV, Segmentation fault.
#0  __memcpy_avx512_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:289

(gdb) bt
#0  __memcpy_avx512_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:289
#1  0x00005e857d7dc534 in GetSnapshotData (snapshot=0x5e857df07ae0 <CurrentSnapshotData>) at procarray.c:2297
#2  0x00005e857da9e452 in GetTransactionSnapshot () at snapmgr.c:331
#3  0x00005e857d823117 in PortalRunUtility (portal=0x5e85b337f3b0, pstmt=0x5e85b32fcd58, isTopLevel=true,
    setHoldSnapshot=false, dest=0x5e85b32fd118, qc=0x7ffe081d5aa0) at pquery.c:1127
#4  0x00005e857d82343b in PortalRunMulti (portal=0x5e85b337f3b0, isTopLevel=true, setHoldSnapshot=false, dest=0x5e85b32fd118,
    altdest=0x5e85b32fd118, qc=0x7ffe081d5aa0) at pquery.c:1307
#5  0x00005e857d82289a in PortalRun (portal=0x5e85b337f3b0, count=9223372036854775807, isTopLevel=true, dest=0x5e85b32fd118,
    altdest=0x5e85b32fd118, qc=0x7ffe081d5aa0) at pquery.c:784

(gdb) f 1
#1  0x00005e857d7dc534 in GetSnapshotData (snapshot=0x5e857df07ae0 <CurrentSnapshotData>) at procarray.c:2297
2297 memcpy(snapshot->subxip + subcount,
(gdb) p *snapshot
$1 = {snapshot_type = SNAPSHOT_MVCC, xmin = 745, xmax = 747, xip = 0x5e85b3329030, xcnt = 0, subxip = 0x0, subxcnt = 0,
  suboverflowed = false, takenDuringRecovery = false, copied = false, curcid = 3, speculativeToken = 0, vistest = 0x0,
  active_count = 0, regd_count = 0, ph_node = {first_child = 0x0, next_sibling = 0x0, prev_or_parent = 0x0},
  snapXactCompletionCount = 55}

3) An issue in StandbyAcquireAccessExclusiveLock()

No modification needed. Please try the attached TAP test on REL_17_STABLE.
It fails as below:
t/099_out_of_shared_memory.pl .. Bailout called.  Further testing stopped:  pg_ctl start failed
099_out_of_shared_memory_standby.log contains:
2026-06-17 07:53:03.237 EEST [167771] LOG:  database system is ready to accept read-only connections
2026-06-17 07:53:03.240 EEST [167775] LOG:  started streaming WAL from primary at 0/3000000 on timeline 1
2026-06-17 07:53:03.269 EEST [167774] FATAL:  out of shared memory
2026-06-17 07:53:03.269 EEST [167774] HINT:  You might need to increase "max_locks_per_transaction".
2026-06-17 07:53:03.269 EEST [167774] CONTEXT:  WAL redo at 0/32218D8 for Standby/LOCK: xid 738 db 5 rel 17839
2026-06-17 07:53:03.269 EEST [167774] WARNING:  you don't own a lock of type AccessExclusiveLock
2026-06-17 07:53:03.269 EEST [167774] LOG:  RecoveryLockHash contains entry for lock no longer recorded by lock manager: xid 738 database 5 relation 17839
TRAP: failed Assert("false"), File: "standby.c", Line: 1053, PID: 167774
ExceptionalCondition at assert.c:52:13
StandbyReleaseXidEntryLocks at standby.c:1056:8
StandbyReleaseAllLocks at standby.c:1116:3
ShutdownRecoveryTransactionEnvironment at standby.c:178:2
StartupProcExit at startup.c:208:1
shmem_exit at ipc.c:282:9
proc_exit_prepare at ipc.c:201:2
proc_exit at ipc.c:155:2
errfinish at elog.c:593:5
LockAcquireExtended at lock.c:1020:4
LockAcquire at lock.c:763:1
StandbyAcquireAccessExclusiveLock at standby.c:1026:10
standby_redo at standby.c:1175:35
ApplyWalRecord at xlogrecovery.c:2008:13
PerformWalRecovery at xlogrecovery.c:1835:8
StartupXLOG at xlog.c:5803:24
StartupProcessMain at startup.c:264:2
postmaster_child_launch at launch_backend.c:281:9
StartChildProcess at postmaster.c:3918:8
PostmasterMain at postmaster.c:1369:13
startup_hacks at main.c:219:1
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x8b)[0x71abb4e2a28b]
postgres: standby: startup recovering 000000010000000000000003(_start+0x25)[0x64d0286de095]
2026-06-17 07:53:03.279 EEST [167771] LOG:  startup process (PID 167774) was terminated by signal 6: Aborted

Best regards,
Alexander

Attachments:

099_out_of_shared_memory.pl (application/x-perl)

#2 Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Alexander Lakhin (#1)
Re: Unexpected behavior after OOM errors

On Wed, 17 Jun 2026 at 08:00, Alexander Lakhin <exclusion@gmail.com> wrote:

Hello hackers,

I'd like to share my findings related to OOM error handling. I'm not sure
how large the class of such anomalies is (and if all of these can be
detected and fixed), but please look at a few issues I have discovered so
far:

1) An issue in lookup_type_cache()

The following modification:

<snip>

makes this script:

<snip>

trigger an assertion failure:

<snip>

Without asserts enabled, the server might crash.

I believe this is caused by partial subsystem initialization. Attached
patch 0001 should address this failure without causing the server to
restart on OOM.

2) An issue in GetSnapshotData()

The following modification:

<snip>

makes this script (max_prepared_transactions = 2 in postgresql.conf):

<snip>

trigger a segmentation fault:

<snip>

Again, caused by partial initialization, though in this case it's of a
SnapshotData* which is later checked again. Attached patch 0002 should
address this failure.
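
To illustrate the failure mode, here is a standalone sketch (not the attached patch; SnapshotSketch, fallible_alloc() and snapshot_alloc_arrays() are hypothetical names). The gdb output above shows xip set and subxip = 0x0, which is what you get when the second of two allocations fails and a later call only checks the first pointer. One way to keep the state retry-safe is to publish both pointers only once both allocations have succeeded:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, trimmed-down stand-in for SnapshotData. */
typedef struct SnapshotSketch
{
    uint32_t   *xip;
    uint32_t   *subxip;
} SnapshotSketch;

/* Test hook: fail the allocation after this many successes (-1 = never). */
static int  allocs_before_failure = -1;

/* malloc() wrapper that returns NULL on demand, mimicking the injected OOM. */
static void *
fallible_alloc(size_t size)
{
    if (allocs_before_failure == 0)
    {
        allocs_before_failure = -1;
        return NULL;
    }
    if (allocs_before_failure > 0)
        allocs_before_failure--;
    return malloc(size);
}

/*
 * Allocate both arrays on first use.  Returns false on (simulated) OOM,
 * where the real code would ereport(ERROR).  Nothing is stored in *snap
 * until both allocations have succeeded, so a failed call leaves the
 * snapshot looking untouched and the next call retries from scratch.
 */
static bool
snapshot_alloc_arrays(SnapshotSketch *snap, size_t nxids, size_t nsubxids)
{
    assert(snap != NULL);

    if (snap->xip == NULL)
    {
        uint32_t   *xip = fallible_alloc(nxids * sizeof(uint32_t));
        uint32_t   *subxip = fallible_alloc(nsubxids * sizeof(uint32_t));

        if (xip == NULL || subxip == NULL)
        {
            free(xip);          /* free(NULL) is a no-op */
            free(subxip);
            return false;
        }
        snap->xip = xip;
        snap->subxip = subxip;
    }
    return true;
}
```

With allocs_before_failure = 1 the first call fails and leaves both pointers NULL; a second call then succeeds and sets both.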

3) An issue in StandbyAcquireAccessExclusiveLock()

<snip>

I'm not sure how to solve this correctly; I think ideally the
StandbyAcquireAccessExclusiveLock() hash code would be wrapped by a
critical section, but I'm not 100% sure if that will be a sufficient
approach; and it'd definitely need some code to allow the various
hashmaps' memctxs to alloc during critical sections.

Kind regards,

Matthias van de Meent
Databricks (https://www.databricks.com)

Attachments:

v1-0002-Make-GetSnapshotData-more-resilient-against-OOM-e.patch (application/octet-stream, +11 -1)
v1-0001-typcache-Use-new-LAZY_INIT-system.patch (application/octet-stream, +76 -17)

#3 Michael Paquier
michael@paquier.xyz
In reply to: Matthias van de Meent (#2)
Re: Unexpected behavior after OOM errors

On Wed, Jun 17, 2026 at 02:27:25PM +0200, Matthias van de Meent wrote:

On Wed, 17 Jun 2026 at 08:00, Alexander Lakhin <exclusion@gmail.com> wrote:

1) An issue in lookup_type_cache()

I believe this is caused by partial subsystem initialization. Attached
patch 0001 should address this failure without causing the server to
restart on OOM.

Hmm. I think that this is an ordering problem. We could make the
callbacks be registered last, once we are sure that the two hash
tables and the in-progress list have been initialized. I am not sure
that this requires a new facility; it is also an advantage to keep the
initialization sequence in one code path, without an abstraction.

TypeCacheHash and RelIdToTypeIdCacheHash are in the
TopMemoryContext, static to the process, so we could just check them
for NULL-ness to make the initialization repeatable. That gives me
the attached v2. Reusing Alexander's randomness trick, that looks
stable here.
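
As a standalone illustration of that ordering (a sketch only, not the attached v2; the variables and typcache_init_sketch() are hypothetical stand-ins), each piece gets its own NULL check and the callbacks are registered last:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the two process-lifetime hash tables. */
static void *type_cache_hash = NULL;
static void *relid_cache_hash = NULL;
static bool callbacks_registered = false;

/* Test hook: fail the allocation after this many successes (-1 = never). */
static int  allocs_before_failure = -1;

/* malloc() wrapper that returns NULL on demand, mimicking the injected OOM. */
static void *
fallible_alloc(size_t size)
{
    if (allocs_before_failure == 0)
    {
        allocs_before_failure = -1;
        return NULL;
    }
    if (allocs_before_failure > 0)
        allocs_before_failure--;
    return malloc(size);
}

/*
 * Returns true once everything is set up.  Each table is guarded by its own
 * NULL check, so a call that failed halfway can simply be repeated, and the
 * callbacks are only registered once both tables exist.
 */
static bool
typcache_init_sketch(void)
{
    if (type_cache_hash == NULL)
    {
        type_cache_hash = fallible_alloc(64);
        if (type_cache_hash == NULL)
            return false;       /* the real code would ereport(ERROR) */
    }
    if (relid_cache_hash == NULL)
    {
        relid_cache_hash = fallible_alloc(64);
        if (relid_cache_hash == NULL)
            return false;
    }
    if (!callbacks_registered)
    {
        assert(type_cache_hash != NULL && relid_cache_hash != NULL);
        callbacks_registered = true;    /* stands in for the registrations */
    }
    return true;
}
```

If the second allocation fails, the first table stays in place and no callback is registered; the next call picks up where the previous one stopped.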

2) An issue in GetSnapshotData()

Again, caused by partial initialization, though in this case it's of a
SnapshotData* which is later checked again. Attached patch 0002 should
address this failure.

Yeah, that seems right to make repeated calls of GetSnapshotData()
able to work. LGTM.

3) An issue in StandbyAcquireAccessExclusiveLock()

<snip>

I'm not sure how to solve this correctly; I think ideally the
StandbyAcquireAccessExclusiveLock() hash code would be wrapped by a
critical section, but I'm not 100% sure if that will be a sufficient
approach; and it'd definitely need some code to allow the various
hashmaps' memctxs to alloc during critical sections.

Not checked this one yet.

Thoughts about the first part?
--
Michael

Attachments:

v2-0001-typcache-Make-initialization-more-resilient-on-OO.patch (text/plain; charset=us-ascii, +33 -25)

#4 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Michael Paquier (#3)
Re: Unexpected behavior after OOM errors

On 18/06/2026 07:37, Michael Paquier wrote:

On Wed, Jun 17, 2026 at 02:27:25PM +0200, Matthias van de Meent wrote:

On Wed, 17 Jun 2026 at 08:00, Alexander Lakhin <exclusion@gmail.com> wrote:

1) An issue in lookup_type_cache()

I believe this is caused by partial subsystem initialization. Attached
patch 0001 should address this failure without causing the server to
restart on OOM.

Hmm. I think that this is an ordering problem. We could make the
callbacks be registered last, once we are sure that the two hash
tables and the in-progress list have been initialized. I am not sure
that this requires a new facility; it is also an advantage to keep the
initialization sequence in one code path, without an abstraction.

TypeCacheHash and RelIdToTypeIdCacheHash are in the
TopMemoryContext, static to the process, so we could just check them
for NULL-ness to make the initialization repeatable. That gives me
the attached v2. Reusing Alexander's randomness trick, that looks
stable here.

Yeah, this can be solved by ordering. It's a bit fiddly though. I don't
know about Matthias's proposal either, but it'd be nice to have a less
fiddly system for these.

One idea is to have something similar to
START_CRIT_SECTION()/END_CRIT_SECTION(), but instead of promoting the
ERROR to a PANIC, promote it to FATAL. That way, if any of these
one-time allocations fail, the backend exits. If you're so
memory-starved that you cannot even initialize the type cache, you won't
be able to do anything useful with the connection anyway.
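
A rough standalone sketch of what such a facility could look like (entirely hypothetical: no such macros exist today, and the names and the enum are made up for illustration). It models only the level promotion, next to the existing ERROR-to-PANIC promotion inside a critical section:

```c
#include <assert.h>

/* Hypothetical severity levels, ordered like the real ones. */
typedef enum SketchLevel
{
    LVL_ERROR,
    LVL_FATAL,
    LVL_PANIC
} SketchLevel;

static int  crit_section_count = 0;     /* models the existing counter */
static int  fatal_section_count = 0;    /* the hypothetical new counter */

#define START_FATAL_SECTION_SKETCH()    (fatal_section_count++)
#define END_FATAL_SECTION_SKETCH() \
    do { \
        assert(fatal_section_count > 0); \
        fatal_section_count--; \
    } while (0)

/*
 * Decide the level an error is actually reported at.  Inside a critical
 * section anything at ERROR or above becomes PANIC; inside the hypothetical
 * "fatal section" a plain ERROR becomes FATAL, so the backend exits instead
 * of carrying on with half-initialized state.
 */
static SketchLevel
effective_level(SketchLevel requested)
{
    if (requested >= LVL_ERROR && crit_section_count > 0)
        return LVL_PANIC;
    if (requested == LVL_ERROR && fatal_section_count > 0)
        return LVL_FATAL;
    return requested;
}
```

The one-time initialization would then sit between START_FATAL_SECTION_SKETCH() and END_FATAL_SECTION_SKETCH(), and an "out of memory" ERROR raised in between would be reported as FATAL.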

Another idea is that instead of having these be singletons in the type
cache, initialized on first use, move it to a new TypeCacheInitialize()
function that is always called at backend startup, like
RelationCacheInitialize(). If an allocation fails at that stage, the
backend will just exit. I think that's my favorite alternative so far.

BTW, I'm surprised the hash tables are created in
TopMemoryContext rather than CacheMemoryContext...

- Heikki

#5 Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Michael Paquier (#3)
Re: Unexpected behavior after OOM errors

On Thu, 18 Jun 2026 at 06:37, Michael Paquier <michael@paquier.xyz> wrote:

On Wed, Jun 17, 2026 at 02:27:25PM +0200, Matthias van de Meent wrote:

On Wed, 17 Jun 2026 at 08:00, Alexander Lakhin <exclusion@gmail.com> wrote:

1) An issue in lookup_type_cache()

I believe this is caused by partial subsystem initialization. Attached
patch 0001 should address this failure without causing the server to
restart on OOM.

Hmm. I think that this is an ordering problem. We could make the
callbacks be registered last, once we are sure that the two hash
tables and the in-progress list have been initialized.

I don't disagree that there's an ordering problem, but in my view the
main problem is the fallible initialization of N components, gated
behind a single condition. The question of when to register the
callbacks is just one part of many.

I am not sure
that this requires a new facility; it is also an advantage to keep the
initialization sequence in one code path, without an abstraction.

I'm not a huge fan of templating if(!global) { init_global(); } all
around. But, it works.

TypeCacheHash and RelIdToTypeIdCacheHash are in the
TopMemoryContext, static to the process, so we could just check them
for NULL-ness to make the initialization repeatable. That gives me
the attached v2. Reusing Alexander's randomness trick, that looks
stable here.

This un-fixes one of the unlikely issues that was fixed in my patch -
though unrelated to OOM:

Each of the calls to
CacheRegisterSyscacheCallback/CacheRegisterRelcacheCallback can throw
an ERROR when all slots have been used. This would leave the typcache
in an invalid state, so I think that must be wrapped in a critical
section: neither syscache nor relcache has options to release
callbacks, and we can't safely continue without the callbacks
installed, so once an error is thrown here this backend can't ever be
properly initialized. This is unlike OOMs, whose conditions for
failure may (and often do) change as workloads change in other
backends.

I think Heikki's suggestion for a FATAL critical section option would
be a good alternative. It wouldn't always be sufficient, but would fix
issues here.

2) An issue in GetSnapshotData()

Again, caused by partial initialization, though in this case it's of a
SnapshotData* which is later checked again. Attached patch 0002 should
address this failure.

Yeah, that seems right to make repeated calls of GetSnapshotData()
able to work. LGTM.

Thanks for committing!

Kind regards,

Matthias van de Meent
Databricks (https://www.databricks.com)

#6 Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Michael Paquier (#3)
Re: Unexpected behavior after OOM errors

On Thu, 18 Jun 2026 at 06:37, Michael Paquier <michael@paquier.xyz> wrote:

3) An issue in StandbyAcquireAccessExclusiveLock()

<snip>

I'm not sure how to solve this correctly; I think ideally the
StandbyAcquireAccessExclusiveLock() hash code would be wrapped by a
critical section, but I'm not 100% sure if that will be a sufficient
approach; and it'd definitely need some code to allow the various
hashmaps' memctxs to alloc during critical sections.

Not checked this one yet.

I found that the attached patch v3 solves that issue. The assert fires
because we link the lock into the transaction's exclusive locks ahead
of actually having acquired the lock, and when that lock acquisition
fails, as part of the error handling we hit
StartupProcExit->ShutdownRecoveryTransactionEnvironment->StandbyReleaseAllLocks,
which causes this assertion failure because the lock was not taken by
this backend.

By moving StandbyAcquireAccessExclusiveLock's LockAcquire ahead of
when it links the lock to the transaction, the local data structure
doesn't know to clean up the lock until after it's acquired, so
failure in that process won't make error cleanup try to clean up the
lock.
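
To make the ordering concrete, here is a standalone sketch (not the attached v3; the tracking array and all function names are hypothetical stand-ins for the recovery lock bookkeeping). The lock is only recorded locally after the acquisition has succeeded, so exit-time cleanup never sees an entry for a lock that was never taken:

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_TRACKED_SKETCH 8

/* Hypothetical local bookkeeping of locks this process believes it holds. */
static unsigned tracked_locks[MAX_TRACKED_SKETCH];
static int  n_tracked = 0;

/* Hypothetical lock manager state. */
static int  n_held = 0;
static bool lock_table_full = false;    /* simulates "out of shared memory" */

/* Returns false where the real lock manager would raise an error. */
static bool
lock_acquire_sketch(unsigned relid)
{
    (void) relid;
    if (lock_table_full)
        return false;
    n_held++;
    return true;
}

/* Acquire first, and only then link the lock into the local bookkeeping. */
static bool
standby_acquire_sketch(unsigned relid)
{
    if (!lock_acquire_sketch(relid))
        return false;           /* nothing was tracked, nothing to undo */

    assert(n_tracked < MAX_TRACKED_SKETCH);
    tracked_locks[n_tracked++] = relid;
    return true;
}

/*
 * Exit-time cleanup.  Returns the number of tracked entries for which the
 * lock manager holds no lock, i.e. the inconsistency that trips the
 * assertion in the report.
 */
static int
release_all_sketch(void)
{
    int         mismatches = 0;

    while (n_tracked > 0)
    {
        n_tracked--;
        if (n_held > 0)
            n_held--;
        else
            mismatches++;
    }
    return mismatches;
}
```

With lock_table_full set, standby_acquire_sketch() fails without tracking anything and release_all_sketch() reports no mismatch; with the two steps in the opposite order, the same failure would leave one untracked-by-the-manager entry behind.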

Kind regards,

Matthias van de Meent
Databricks (https://www.databricks.com)

Attachments:

v3-0001-IPC-standby-keep-better-track-of-taken-locks.patch (application/octet-stream, +11 -6)

#7 Michael Paquier
michael@paquier.xyz
In reply to: Matthias van de Meent (#6)
Re: Unexpected behavior after OOM errors

On Thu, Jun 18, 2026 at 05:27:57PM +0200, Matthias van de Meent wrote:

By moving StandbyAcquireAccessExclusiveLock's LockAcquire ahead of
when it links the lock to the transaction, the local data structure
doesn't know to clean up the lock until after it's acquired, so
failure in that process won't make error cleanup try to clean up the
lock.

Yep, reordering these two actions would take care of the list
inconsistency where the startup process goes down following the ERROR
promoted to a FATAL.

I toyed with the idea of backpatching this fix for a few minutes,
but discarded it in the end.  The failure does not require a random
pattern; it can be triggered through a combination of GUCs, as
Alexander has shown.  Still, in non-assert builds the only consequence
is an extra LOG entry saying that the lock is not being tracked.
Confusing, OK, but not really critical.

Comments?
--
Michael

#8 Michael Paquier
michael@paquier.xyz
In reply to: Matthias van de Meent (#5)
Re: Unexpected behavior after OOM errors

On Thu, Jun 18, 2026 at 11:27:28AM +0200, Matthias van de Meent wrote:

Each of the calls to
CacheRegisterSyscacheCallback/CacheRegisterRelcacheCallback can throw
an ERROR when all slots have been used. This would leave the typcache
in an invalid state, so I think that must be wrapped in a critical
section: neither syscache nor relcache has options to release
callbacks, and we can't safely continue without the callbacks
installed, so once an error is thrown here this backend can't ever be
properly initialized. This is unlike OOMs, whose conditions for
failure may (and often do) change as workloads change in other
backends.

We don't ERROR when failing to register a syscache/relcache callback,
we FATAL if we reach one of the thresholds. Reaching these thresholds
looks to me like a programming error anyway, so these should not matter
in the field. The OOM is a random pattern that can happen outside the
Postgres realm.

Just in case, I have planted an elog(FATAL) triggering randomly in the
middle of cache registration callback calls, and the typcache
inconsistency does not come into play with the shutdown sequence once
these trigger even if we have the tables set but not the callbacks.
As a whole, I tend to think that reordering the actions is a solution
good enough here.

I think Heikki's suggestion for a FATAL critical section option would
be a good alternative. It wouldn't always be sufficient, but would fix
issues here.

That sounds like an interesting idea, potentially reusable for other
areas, but I'm not really convinced that we need to add this kind of
facility for the case dealt with here. To me, that's also where we
could use a TRY/CATCH block and call it a day. If others feel
differently about this matter, I'm fine to be outvoted.
--
Michael

#9 Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Michael Paquier (#8)
Re: Unexpected behavior after OOM errors

On Fri, 19 Jun 2026 at 01:55, Michael Paquier <michael@paquier.xyz> wrote:

On Thu, Jun 18, 2026 at 11:27:28AM +0200, Matthias van de Meent wrote:

Each of the calls to
CacheRegisterSyscacheCallback/CacheRegisterRelcacheCallback can throw
an ERROR when all slots have been used. This would leave the typcache
in an invalid state, so I think that must be wrapped in a critical
section: neither syscache nor relcache has options to release
callbacks, and we can't safely continue without the callbacks
installed, so once an error is thrown here this backend can't ever be
properly initialized. This is unlike OOMs, whose conditions for
failure may (and often do) change as workloads change in other
backends.

We don't ERROR when failing to register a syscache/relcache callback,
we FATAL if we reach one of the thresholds.

Ah, thanks for correcting me. I'm not sure why I had ERROR in mind,
but you're obviously correct. Your patch v2 LGTM.

Reaching these thresholds
looks to me like a programming error anyway, so these should not matter
in the field.

I don't think that's (necessarily) correct. These callbacks are
accessible to extensions, and if you load sufficiently many of those
you could still run out of slots even if each extension stayed well
within a reasonable threshold.
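
A toy model of that scenario (standalone sketch; the slot count of 8 and all names are made up for illustration and do not reflect the real limits):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CALLBACK_SLOTS_SKETCH 8     /* hypothetical fixed-size table */

typedef void (*callback_fn_sketch) (void);

static callback_fn_sketch callback_slots[MAX_CALLBACK_SLOTS_SKETCH];
static int  callback_slots_used = 0;

static void
noop_callback(void)
{
}

/* Returns false where the real registration would report FATAL. */
static bool
register_callback_sketch(callback_fn_sketch fn)
{
    assert(fn != NULL);
    if (callback_slots_used >= MAX_CALLBACK_SLOTS_SKETCH)
        return false;
    callback_slots[callback_slots_used++] = fn;
    return true;
}

/*
 * Model an extension that wants "wanted" callbacks.  Returns how many it
 * actually managed to register before the shared table filled up.
 */
static int
load_extension_sketch(int wanted)
{
    int         registered = 0;

    while (registered < wanted && register_callback_sketch(noop_callback))
        registered++;
    return registered;
}
```

Three extensions asking for three slots each are all individually modest, yet the third one only gets two of its three callbacks registered before the table is full.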

I think Heikki's suggestion for a FATAL critical section option would
be a good alternative. It wouldn't always be sufficient, but would fix
issues here.

That sounds like an interesting idea, potentially reusable for other
areas, but I'm not really convinced that we need to add this kind of
facility for the case dealt with here. To me, that's also where we
could use a TRY/CATCH block and call it a day. If others feel
differently about this matter, I'm fine to be outvoted.

While TRY/CATCH would work, I'm not so keen on adding those to
system-initializing code; the allocations generally go into contexts
that aren't cleaned up nicely during error handling. E.g. a partial
hash initialization for any of the cache hashmaps due to OOM will leak
its allocations.
Failing the connection for that makes sure we complete the right
cleanup procedures and not leak those resources (and it adds another
item to the multi-threading concerns list).

-Matthias

#10 Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Michael Paquier (#7)
Re: Unexpected behavior after OOM errors

On Fri, 19 Jun 2026 at 01:30, Michael Paquier <michael@paquier.xyz> wrote:

On Thu, Jun 18, 2026 at 05:27:57PM +0200, Matthias van de Meent wrote:

By moving StandbyAcquireAccessExclusiveLock's LockAcquire ahead of
when it links the lock to the transaction, the local data structure
doesn't know to clean up the lock until after it's acquired, so
failure in that process won't make error cleanup try to clean up the
lock.

Yep, reordering these two actions would take care of the list
inconsistency where the startup process goes down following the ERROR
promoted to a FATAL.

I toyed with the idea of backpatching this fix for a few minutes,
but discarded it in the end.  The failure does not require a random
pattern; it can be triggered through a combination of GUCs, as
Alexander has shown.  Still, in non-assert builds the only consequence
is an extra LOG entry saying that the lock is not being tracked.
Confusing, OK, but not really critical.

Comments?

Because it fixes an assertion, I'd vote for backpatching; but because
the case is handled safely without assertions I also wouldn't be upset
if it wasn't backpatched.

-Matthias