dshash_find_or_insert vs. OOM
Hi,
For most memory allocation primitives, we offer a "no OOM" version.
dshash_find_or_insert is an exception. Here's a small patch to fix
that. I have some interest in slipping this into v19 even though I am
submitting it quite late, because it would be useful for
pg_stash_advice[1]As yet uncommitted. See pg_plan_advice thread.. Let me know what you think about that.
--
Robert Haas
EDB: http://www.enterprisedb.com
[1]: As yet uncommitted. See pg_plan_advice thread.
Attachments:
v1-0001-dshash-Make-it-possible-to-suppress-out-of-memory.patchapplication/octet-stream; name=v1-0001-dshash-Make-it-possible-to-suppress-out-of-memory.patchDownload+83-31
On Mar 18, 2026, at 07:34, Robert Haas <robertmhaas@gmail.com> wrote:
Hi,
For most memory allocation primitives, we offer a "no OOM" version.
dshash_find_or_insert is an exception. Here's a small patch to fix
that. I have some interest in slipping this into v19 even though I am
submitting it quite late, because it would be useful for
pg_stash_advice[1]. Let me know what you think about that.--
Robert Haas
EDB: http://www.enterprisedb.com[1] As yet uncommitted. See pg_plan_advice thread.
<v1-0001-dshash-Make-it-possible-to-suppress-out-of-memory.patch>
I think adding a DSHASH_INSERT_NO_OOM flag is the right choice. As you mentioned, we already have other _NO_OOM flags, so this feels consistent with existing patterns.
I don’t see any correctness issue with the patch. I just have a couple of minor nits.
1
```
@@ -477,14 +485,22 @@ restart:
* reacquire all the locks in the right order to avoid deadlocks.
*/
LWLockRelease(PARTITION_LOCK(hash_table, partition_index));
- resize(hash_table, hash_table->size_log2 + 1);
+ if (!resize(hash_table, hash_table->size_log2 + 1, flags))
+ return NULL;
goto restart;
}
/* Finally we can try to insert the new item. */
item = insert_into_bucket(hash_table, key,
- &BUCKET_FOR_HASH(hash_table, hash));
+ &BUCKET_FOR_HASH(hash_table, hash),
+ flags);
+ if (item == NULL)
+ {
+ Assert((flags & DSHASH_INSERT_NO_OOM) != 0);
+ LWLockRelease(PARTITION_LOCK(hash_table, partition_index));
+ return NULL;
+ }
```
When OOM happens, Assert((flags & DSHASH_INSERT_NO_OOM) != 0); makes sense. But for resize(), the assert is inside resize(), while for insert_into_bucket(), the assert is in the caller. That feels a bit inconsistent to me, and I think it hurts readability a little. A reader might wonder why there is no corresponding assert after resize() unless they go read the function body.
I think either style is fine, but using the same style in both places would be better.
2
```
/* Allocate the space for the new table. */
- new_buckets_shared =
- dsa_allocate_extended(hash_table->area,
- sizeof(dsa_pointer) * new_size,
- DSA_ALLOC_HUGE | DSA_ALLOC_ZERO);
- new_buckets = dsa_get_address(hash_table->area, new_buckets_shared);
+ {
+ int dsa_flags = DSA_ALLOC_HUGE | DSA_ALLOC_ZERO;
+
+ if (flags & DSHASH_INSERT_NO_OOM)
+ dsa_flags |= DSA_ALLOC_NO_OOM;
+ new_buckets_shared =
+ dsa_allocate_extended(hash_table->area,
+ sizeof(dsa_pointer) * new_size,
+ dsa_flags);
+ }
```
Making this a nested block does have the benefit of keeping dsa_flags close to where it is used. But from my impression, this style is still fairly uncommon in the codebase. I worry it may implicitly signal to other hackers that this is an acceptable pattern. So unless we intentionally want to encourage that style, I would lean toward avoiding it here.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
Hi,
For most memory allocation primitives, we offer a "no OOM" version.
dshash_find_or_insert is an exception. Here's a small patch to fix
that. I have some interest in slipping this into v19 even though I am
submitting it quite late, because it would be useful for
pg_stash_advice[1]. Let me know what you think about that.
Thanks for the patch!
I agree with this idea, as it could act as a good triggering point for
a caller to perform an eviction of the dshash if they run out of space,
and avoid a hard failure.
Passing this as a flag seems OK with me. Not sure what other
flags could be added in the future, but the flexibility is a good
thing, IMO. I was debating if this should just be a dsh_params
option, but maybe for the same dshash a caller may want OOM
in some code path, and retry in some other. maybe?
+ &BUCKET_FOR_HASH(hash_table, hash), + flags); + if (item == NULL) + { + Assert((flags & DSHASH_INSERT_NO_OOM) != 0); + LWLockRelease(PARTITION_LOCK(hash_table, partition_index)); + return NULL; + } ```When OOM happens, Assert((flags & DSHASH_INSERT_NO_OOM) != 0); makes sense. But for resize(), the assert is inside resize(), while for insert_into_bucket(), the assert is in the caller.
That feels a bit inconsistent to me, and I think it hurts readability a little. A reader might wonder why there is no corresponding assert after resize() unless they go read the function body.I think either style is fine, but using the same style in both places would be better.
I agree with this. The assert should be
if (!resize(hash_table, hash_table->size_log2 + 1, flags))
{
Assert((flags & DSHASH_INSERT_NO_OOM) != 0);
return NULL;
}
instead of inside resize().
I did some testing by triggering an OOM, a no-OOM, and an OOM with
eviction afterwards, as a sanity check. All looks good.
I've attached the tests I created with other basic testing, as dshash
lacks direct testing in general, if you're inclined to add them.
--
Sami Imseih
Amazon Web Services (AWS)