Re: lsyscache: free IndexAmRoutine objects returned by GetIndexAmRoutineByAmId()

Started by Tom Lane2 months ago16 messages
Jump to latest
#1Tom Lane
tgl@sss.pgh.pa.us

Chao Li <li.evan.chao@gmail.com> writes:

I noticed this problem while reviewing Tom’s patch [1]. In lsyscache.c,
there are multiple places that get IndexAmRoutine
by GetIndexAmRoutineByAmId(). Only one function get_opmethod_canorder()
pfree the returned object, while the other 3
functions get_op_index_interpretation(), equality_ops_are_compatible()
and comparison_ops_are_compatible() all call GetIndexAmRoutineByAmId() in
for loops but without freeing the memory.

While these paths are not hot enough to cause leaks that matter in
practice, I think for consistency, we should free the memory.

I wonder if it wouldn't be better to rethink the convention that
IndexAmRoutine structs are palloc'd. Is there any AM for which
the contents aren't constant, so that we could just return a pointer
to a static constant struct and save the palloc/pfree overhead?
We'd want to const-ify the result type of GetIndexAmRoutine, and
maybe that'd result in more notational churn than it's worth,
but it seems like we're missing a bet.

(I would not have proposed this before we started using C99
struct initializers. But now that we can, it doesn't seem
like writing out the struct contents as an initializer would
be too painful.)

regards, tom lane

#2Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Tom Lane (#1)

On Mon, 29 Dec 2025 at 17:05, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Chao Li <li.evan.chao@gmail.com> writes:

I noticed this problem while reviewing Tom’s patch [1]. In lsyscache.c,
there are multiple places that get IndexAmRoutine
by GetIndexAmRoutineByAmId(). Only one function get_opmethod_canorder()
pfree the returned object, while the other 3
functions get_op_index_interpretation(), equality_ops_are_compatible()
and comparison_ops_are_compatible() all call GetIndexAmRoutineByAmId() in
for loops but without freeing the memory.

While these paths are not hot enough to cause leaks that matter in
practice, I think for consistency, we should free the memory.

I wonder if it wouldn't be better to rethink the convention that
IndexAmRoutine structs are palloc'd. Is there any AM for which
the contents aren't constant, so that we could just return a pointer
to a static constant struct and save the palloc/pfree overhead?

I'm not aware of any such AMs, and the only reason I can think of to
change this dynamically is for specialization: the amroutine itself
doesn't get sufficient information to return a specialized
IndexAmRoutine pointer; and assuming that rd_indam itself isn't
`const`-ified the specializing code would just have to change the
pointed-to IndexAmRoutine instead of replacing individual am*
functions in the struct.

Requiring `const static` for IndexAMRoutine would make it more
complicated to do JIT for index AMs correctly and without resource
leaks, but such a feature would probably require more resource
management hooks than are currently available to extensions, so I
don't think we'll lose much there.

We'd want to const-ify the result type of GetIndexAmRoutine, and
maybe that'd result in more notational churn than it's worth,
but it seems like we're missing a bet.
(I would not have proposed this before we started using C99
struct initializers. But now that we can, it doesn't seem
like writing out the struct contents as an initializer would
be too painful.)

Yes, let's do it.
Here's my patch that does this, pulled from [0]/messages/by-id/CAEze2Wi7tDidbDVJhu=Pstb2hbUXDCxx_VAZnKSqbTMf7k8+uQ@mail.gmail.com. It was originally
built to reduce the per-index catcache overhead; as using static const
pointers reduces the data allocated into the "index info" context by
264 bytes.

Kind regards,

Matthias van de Meent
Databricks (https://www.databricks.com)

[0]: /messages/by-id/CAEze2Wi7tDidbDVJhu=Pstb2hbUXDCxx_VAZnKSqbTMf7k8+uQ@mail.gmail.com

Attachments:

v1-0001-Stop-heap-allocating-IndexAmRoutine-for-every-ind.patchapplication/octet-stream; name=v1-0001-Stop-heap-allocating-IndexAmRoutine-for-every-ind.patchDownload+472-487
#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Matthias van de Meent (#2)

Matthias van de Meent <boekewurm+postgres@gmail.com> writes:

On Mon, 29 Dec 2025 at 17:05, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I wonder if it wouldn't be better to rethink the convention that
IndexAmRoutine structs are palloc'd. Is there any AM for which
the contents aren't constant, so that we could just return a pointer
to a static constant struct and save the palloc/pfree overhead?

I'm not aware of any such AMs, and the only reason I can think of to
change this dynamically is for specialization: the amroutine itself
doesn't get sufficient information to return a specialized
IndexAmRoutine pointer; and assuming that rd_indam itself isn't
`const`-ified the specializing code would just have to change the
pointed-to IndexAmRoutine instead of replacing individual am*
functions in the struct.

Yeah, it's hard to credit any need for per-call specialization
given that the amhandler receives no parameters. I can imagine
per-session specialization, for instance as a result of wanting
to JIT-compile the AM's methods. But you could still do that:
on first call, build one copy of the IndexAMRoutine struct in
a long-lived context, and then just keep returning pointers to
that struct. The "const" requirement applies to the core code's
references to the IndexAMRoutine struct, it does not constrain
the code creating such a struct.

Here's my patch that does this, pulled from [0]. It was originally
built to reduce the per-index catcache overhead; as using static const
pointers reduces the data allocated into the "index info" context by
264 bytes.

Cool, I forgot you'd already been poking at this. The patch
is kinda long, but not as bad as I feared, and it all looks
quite mechanical. It lacks documentation updates though.

regards, tom lane

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#3)

... oh, one other thought: instead of what you did in
InitIndexAmRoutine, we should probably do something like

{
MemoryContext oldcontext;

/*
* We formerly specified that the amhandler should return a
* palloc'd struct. That's now deprecated in favor of returning
* a pointer to a static struct, but to avoid completely breaking
* old external AMs, run the amhandler in the relation's rd_indexcxt.
*/
oldcontext = MemoryContextSwitchTo(relation->rd_indexcxt);
relation->rd_indam = GetIndexAmRoutine(relation->rd_amhandler);
MemoryContextSwitchTo(oldcontext);
}

regards, tom lane

#5Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Tom Lane (#3)

On Mon, 29 Dec 2025 at 20:01, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Matthias van de Meent <boekewurm+postgres@gmail.com> writes

Here's my patch that does this, pulled from [0]. It was originally
built to reduce the per-index catcache overhead; as using static const
pointers reduces the data allocated into the "index info" context by
264 bytes.

Cool, I forgot you'd already been poking at this. The patch
is kinda long, but not as bad as I feared, and it all looks
quite mechanical. It lacks documentation updates though.

Attached v2, in which I've updated the one place where I could find
IndexAmRoutine's allocation policy being described, and in which I've
also updated InitIndexAmRoutine with the suggested changes of your
other mail. Thanks!

Kind regards,

Matthias van de Meent
Databricks (https://www.databricks.com)

Attachments:

v2-0001-Stop-allocating-one-IndexAmRoutine-for-every-inde.patchapplication/octet-stream; name=v2-0001-Stop-allocating-one-IndexAmRoutine-for-every-inde.patchDownload+488-493
#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Matthias van de Meent (#5)

Matthias van de Meent <boekewurm+postgres@gmail.com> writes:

Attached v2, in which I've updated the one place where I could find
IndexAmRoutine's allocation policy being described, and in which I've
also updated InitIndexAmRoutine with the suggested changes of your
other mail. Thanks!

I went through this and made some mostly-cosmetic changes.
I think the attached v3 is ready to go, but I'll wait a day
or so to see if anyone has any comments.

regards, tom lane

Attachments:

v3-0001-Change-IndexAmRoutines-to-be-statically-allocated.patchtext/x-diff; charset=us-ascii; name*0=v3-0001-Change-IndexAmRoutines-to-be-statically-allocated.p; name*1=atchDownload+501-517
#7Chao Li
li.evan.chao@gmail.com
In reply to: Matthias van de Meent (#5)

On Dec 30, 2025, at 04:04, Matthias van de Meent <boekewurm+postgres@gmail.com> wrote:

On Mon, 29 Dec 2025 at 20:01, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Matthias van de Meent <boekewurm+postgres@gmail.com> writes

Here's my patch that does this, pulled from [0]. It was originally
built to reduce the per-index catcache overhead; as using static const
pointers reduces the data allocated into the "index info" context by
264 bytes.

Cool, I forgot you'd already been poking at this. The patch
is kinda long, but not as bad as I feared, and it all looks
quite mechanical. It lacks documentation updates though.

Attached v2, in which I've updated the one place where I could find
IndexAmRoutine's allocation policy being described, and in which I've
also updated InitIndexAmRoutine with the suggested changes of your
other mail. Thanks!

Kind regards,

Matthias van de Meent
Databricks (https://www.databricks.com)
<v2-0001-Stop-allocating-one-IndexAmRoutine-for-every-inde.patch>

I’m glad that my finding helped move this patch forward. After reviewing v2, I think my patch can be completely superseded by yours, which is fine with me. I have a couple of comments on v2.

1 - amapi.c
```
/*
* GetIndexAmRoutine - call the specified access method handler routine to get
* its IndexAmRoutine struct, which will be palloc'd in the caller's context.
*
* Note that if the amhandler function is built-in, this will not involve
* any catalog access. It's therefore safe to use this while bootstrapping
* indexes for the system catalogs. relcache.c relies on that.
*/
const IndexAmRoutine *
GetIndexAmRoutine(Oid amhandler)
{
Datum datum;
IndexAmRoutine *routine;

```

* The function header comment needs an update, it still talks about “palloc”, now it should say something like “returned structure should not be free-ed”.
* The local variable “routine” now can be “const” as well.

2 - relcache.c
```
InitIndexAmRoutine(Relation relation)
{
- IndexAmRoutine *cached,
- *tmp;
+ MemoryContext oldctx;

 	/*
-	 * Call the amhandler in current, short-lived memory context, just in case
-	 * it leaks anything (it probably won't, but let's be paranoid).
+	 * We formerly specified that the amhandler should return a
+	 * palloc'd struct.  That's now deprecated in favor of returning
+	 * a pointer to a static struct, but to avoid completely breaking
+	 * old external AMs, run the amhandler in the relation's rd_indexcxt.
 	 */
-	tmp = GetIndexAmRoutine(relation->rd_amhandler);
-
-	/* OK, now transfer the data into relation's rd_indexcxt. */
-	cached = (IndexAmRoutine *) MemoryContextAlloc(relation->rd_indexcxt,
-												   sizeof(IndexAmRoutine));
-	memcpy(cached, tmp, sizeof(IndexAmRoutine));
-	relation->rd_indam = cached;
-
-	pfree(tmp);
+	oldctx = MemoryContextSwitchTo(relation->rd_indexcxt);
+	relation->rd_indam = GetIndexAmRoutine(relation->rd_amhandler);
+	MemoryContextSwitchTo(oldctx);
 }
 ```

I understand that switching to rd_indexcxt is intended to mitigate the risk that external index AMs might still palloc the returned IndexAmRoutine.

However, I don’t see a way to actually enforce external AMs to always return a static pointer. In particular, if an AM switches to a different memory context internally, the MemoryContextSwitchTo() here would not help.

I’m not sure whether we want to go further than the current approach, but it seems that fully eliminating the risk would require detecting dynamically allocated results and copying them into rd_indexcxt, which doesn’t appear easy or robust to implement in practice.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

#8Chao Li
li.evan.chao@gmail.com
In reply to: Tom Lane (#6)

On Tue, Dec 30, 2025 at 6:55 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Matthias van de Meent <boekewurm+postgres@gmail.com> writes:

Attached v2, in which I've updated the one place where I could find
IndexAmRoutine's allocation policy being described, and in which I've
also updated InitIndexAmRoutine with the suggested changes of your
other mail. Thanks!

I went through this and made some mostly-cosmetic changes.
I think the attached v3 is ready to go, but I'll wait a day
or so to see if anyone has any comments.

regards, tom lane

I just saw v3 after sending the review comments for v2. Looks like my
comment 1 has been addressed in v3, and the comment 2 still stands.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

#9Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Chao Li (#7)

On Tue, 30 Dec 2025 at 00:22, Chao Li <li.evan.chao@gmail.com> wrote:

I understand that switching to rd_indexcxt is intended to mitigate the risk that external index AMs might still palloc the returned IndexAmRoutine.

However, I don’t see a way to actually enforce external AMs to always return a static pointer. In particular, if an AM switches to a different memory context internally, the MemoryContextSwitchTo() here would not help.

Yes, sadly it's quite difficult to determine whether something is
statically allocated. Unlike with palloc'd objects, you can't rely on
an always-available header in assert-enabled builds; the best we can
do introspection into which OS memory area this allocation is placed;
which encroaches on ASan and Valgrind's featureset --- and even that
is not necessarily sufficient, as not all compilers may put `static
const` -equivalent objects into knowable memory locations.

I’m not sure whether we want to go further than the current approach, but it seems that fully eliminating the risk would require detecting dynamically allocated results and copying them into rd_indexcxt, which doesn’t appear easy or robust to implement in practice.

Yes, I don't think there is much more we can do here to protect index
AM implementations against this change within Postgres' own tooling
without introducing address space detection features more reasonably
found in ASan or Valgrind. So I think this is sufficient as a
best-effort approach.

Kind regards,

Matthias van de Meent

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Chao Li (#7)

Chao Li <li.evan.chao@gmail.com> writes:

I understand that switching to rd_indexcxt is intended to mitigate the risk that external index AMs might still palloc the returned IndexAmRoutine.

However, I don’t see a way to actually enforce external AMs to always return a static pointer. In particular, if an AM switches to a different memory context internally, the MemoryContextSwitchTo() here would not help.

I do not think we need to "enforce" that, and as you say it'd be quite
hard to do so. The point of this MemoryContextSwitchTo is just to
allow existing AMs that naively do palloc() as they were told will
continue to work (modulo an increased chance of memory-leak issues
since we removed some pfree's). If we don't do the switching, a
non-updated AM will fail in very hard-to-diagnose ways once one
of its rd_indam pointers becomes dangling because the context
that was active at relcache-entry creation time has gone away.
I think it's worth a small number of cycles to save external AM
authors from having that debugging experience.

I did test this, by dint of installing all the changes except the
ones in the amhandler functions. That still passed check-world.
I didn't attempt to analyze whether there were any new leaks of
any significance.

The main objection that could be raised to this is that the old coding
ensures that any memory leaked during GetIndexAmRoutine() will be
leaked in the expected-to-be-short-lived caller's context, but now
it'd be leaked in the cache's rd_indexcxt. I don't believe that
fmgr_info can leak any memory when calling a built-in function, but
for extension functions there will be a syscache lookup and that
could potentially have some incidental leaks. All in all though,
I think this is a good tradeoff. We could perhaps remove those
MemoryContextSwitchTos in a few years when we think everybody's
updated their index AMs.

regards, tom lane

#11Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#10)

On 2025-Dec-29, Tom Lane wrote:

The main objection that could be raised to this is that the old coding
ensures that any memory leaked during GetIndexAmRoutine() will be
leaked in the expected-to-be-short-lived caller's context, but now
it'd be leaked in the cache's rd_indexcxt.

One thing we can perhaps do is (in assert-enabled builds) to detect
whether memory usage for that context has increased during
InitIndexAmRoutine and raise a warning if so. Then extension authors
would realize this and have a chance to fix it promptly.

I tried this out and it doesn't work as well as I thought, due to how
AllocSet works: we don't get a difference in the amount of allocated
memory (thus no WARNING) unless we add some padding bytes to
IndexAmRoutine, as shown below (of course, just POC-quality -- didn't
kry to compile without assertions). Given this, I'm not terribly
optimistic about this idea. I tested this by adding
palloc(sizeof(IndexAmRoutine)) in brinhandler() and verifying that a few
regression tests fail with the added warning message.

diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 88259f7c228..75f2a2f5e37 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -1421,6 +1421,7 @@ static void
 InitIndexAmRoutine(Relation relation)
 {
 	MemoryContext oldctx;
+	Size	allocated	PG_USED_FOR_ASSERTS_ONLY;
 	/*
 	 * We formerly specified that the amhandler should return a palloc'd
@@ -1429,7 +1430,19 @@ InitIndexAmRoutine(Relation relation)
 	 * the amhandler in the relation's rd_indexcxt.
 	 */
 	oldctx = MemoryContextSwitchTo(relation->rd_indexcxt);
+
+#ifdef USE_ASSERT_CHECKING
+	allocated = MemoryContextMemAllocated(CurrentMemoryContext, false);
+#endif
+
 	relation->rd_indam = GetIndexAmRoutine(relation->rd_amhandler);
+
+#ifdef USE_ASSERT_CHECKING
+	if (MemoryContextMemAllocated(CurrentMemoryContext, false) - allocated != 0)
+		elog(WARNING, "memory allocated while initializing access method %u (was %zu, now %zu)",
+			 relation->rd_amhandler, allocated, MemoryContextMemAllocated(CurrentMemoryContext, false));
+#endif
+
 	MemoryContextSwitchTo(oldctx);
 }
diff --git a/src/include/access/amapi.h b/src/include/access/amapi.h
index 278da36bc08..78fcbcffc14 100644
--- a/src/include/access/amapi.h
+++ b/src/include/access/amapi.h
@@ -322,6 +322,11 @@ typedef struct IndexAmRoutine
 	/* interface functions to support planning */
 	amtranslate_strategy_function amtranslatestrategy;	/* can be NULL */
 	amtranslate_cmptype_function amtranslatecmptype;	/* can be NULL */
+
+#ifdef USE_ASSERT_CHECKING
+	char		pad[512];
+#endif
+
 } IndexAmRoutine;

--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/

#12Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Alvaro Herrera (#11)

On Tue, 30 Dec 2025 at 15:15, Álvaro Herrera <alvherre@kurilemu.de> wrote:

On 2025-Dec-29, Tom Lane wrote:

The main objection that could be raised to this is that the old coding
ensures that any memory leaked during GetIndexAmRoutine() will be
leaked in the expected-to-be-short-lived caller's context, but now
it'd be leaked in the cache's rd_indexcxt.

One thing we can perhaps do is (in assert-enabled builds) to detect
whether memory usage for that context has increased during
InitIndexAmRoutine and raise a warning if so. Then extension authors
would realize this and have a chance to fix it promptly.

I tried this out and it doesn't work as well as I thought, due to how
AllocSet works: we don't get a difference in the amount of allocated
memory (thus no WARNING) unless we add some padding bytes to
IndexAmRoutine

Hmm, wouldn't we be able to detect changes in
MemoryContextMemConsumed(ctx, counters) with one before and one after
GetIndexAmRoutine(), such as included below?

It would cause false positives if amroutine() does memory-related work
other than just returning an appropriate IndexAmRoutine pointer (if it
does so without switching to its own MemoryContext), but I don't think
that such false positives here are necessarily bad - the AM shouldn't
be doing much at this point in the code.

Kind regards,

Matthias van de Meent
Databricks (https://www.databricks.com)

diff --git a/src/backend/utils/cache/relcache.c
b/src/backend/utils/cache/relcache.c
index 86b90765433..c17f9c3e53a 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -1421,6 +1421,13 @@ static void
 InitIndexAmRoutine(Relation relation)
 {
     MemoryContext    oldctx;
+#if USE_ASSERT_CHECKING
+    Size            prevusedmem;
+    MemoryContextCounters usage;
+
+    MemoryContextMemConsumed(relation->rd_indexcxt, &usage);
+    prevusedmem = usage.totalspace - usage.freespace;
+#endif
     /*
      * We formerly specified that the amhandler should return a
@@ -1431,6 +1438,12 @@ InitIndexAmRoutine(Relation relation)
     oldctx = MemoryContextSwitchTo(relation->rd_indexcxt);
     relation->rd_indam = GetIndexAmRoutine(relation->rd_amhandler);
     MemoryContextSwitchTo(oldctx);
+
+#if USE_ASSERT_CHECKING
+    /* ensure the index routine was not palloc'd */
+    MemoryContextMemConsumed(relation->rd_indexcxt, &usage);
+    Assert(prevusedmem == (usage.totalspace - usage.freespace));
+#endif
 }

/*
```

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Matthias van de Meent (#12)

Matthias van de Meent <boekewurm+postgres@gmail.com> writes:

On Tue, 30 Dec 2025 at 15:15, Álvaro Herrera <alvherre@kurilemu.de> wrote:

One thing we can perhaps do is (in assert-enabled builds) to detect
whether memory usage for that context has increased during
InitIndexAmRoutine and raise a warning if so. Then extension authors
would realize this and have a chance to fix it promptly.

Hmm, wouldn't we be able to detect changes in
MemoryContextMemConsumed(ctx, counters) with one before and one after
GetIndexAmRoutine(), such as included below?

I don't think we can do this, because there are effects that the
amhandler doesn't have control over. In particular, if we have to
load its pg_proc row into syscache during fmgr_info, I don't think
that is positively guaranteed not to leak anything. (This isn't
a factor for built-in AMs, which will take the fast path in
fmgr_info, but it will be an issue for extensions.)

I am not terribly concerned by one-time leaks of that sort, so
I don't really feel an urge to try to complain about them.

regards, tom lane

#14Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Tom Lane (#13)

On Tue, 30 Dec 2025 at 16:25, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Matthias van de Meent <boekewurm+postgres@gmail.com> writes:

On Tue, 30 Dec 2025 at 15:15, Álvaro Herrera <alvherre@kurilemu.de> wrote:

One thing we can perhaps do is (in assert-enabled builds) to detect
whether memory usage for that context has increased during
InitIndexAmRoutine and raise a warning if so. Then extension authors
would realize this and have a chance to fix it promptly.

Hmm, wouldn't we be able to detect changes in
MemoryContextMemConsumed(ctx, counters) with one before and one after
GetIndexAmRoutine(), such as included below?

I don't think we can do this, because there are effects that the
amhandler doesn't have control over. In particular, if we have to
load its pg_proc row into syscache during fmgr_info, I don't think
that is positively guaranteed not to leak anything. (This isn't
a factor for built-in AMs, which will take the fast path in
fmgr_info, but it will be an issue for extensions.)

I am not terribly concerned by one-time leaks of that sort, so
I don't really feel an urge to try to complain about them.

If it's difficult to filter out one-time leaks into the context caused
by e.g. fmgr infra, then -indeed- it's probably not worth the effort.

In which case, v3 LGTM.

Kind regards,

Matthias van de Meent
Databricks (https://www.databricks.com)

#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: Matthias van de Meent (#14)

Matthias van de Meent <boekewurm+postgres@gmail.com> writes:

On Tue, 30 Dec 2025 at 16:25, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I am not terribly concerned by one-time leaks of that sort, so
I don't really feel an urge to try to complain about them.

If it's difficult to filter out one-time leaks into the context caused
by e.g. fmgr infra, then -indeed- it's probably not worth the effort.
In which case, v3 LGTM.

Pushed then. We can always tweak it if we think of a better answer.

I did spend a bit of time thinking about this sketch:

1. Invent a memory context support method along the lines of
bool ContextContainsPointer(MemoryContext cxt, const void *ptr)
that returns "true" if ptr points anywhere within the context's
owned memory space. The implementation I have in mind is just to
scan the context's list of blocks and see if ptr >= block_start &&
ptr < block_end for any of them. This dodges problems like whether
we can tell if the pointer points at a valid chunk. We aren't
promising that, only that the pointer points somewhere in that
memory area.

2. Remove the MemoryContextSwitchTo that's now in InitIndexAmRoutine,
reverting to running the amhandler in the caller's context.
Instead, in an assert-enabled build, throw error if
ContextContainsPointer(CurrentMemoryContext, relation->rd_indam).

This definitely will throw error if the amhandler just did a palloc(),
and it definitely will not if the amhandler returned a pointer to a
static variable. It won't throw error if the amhandler returned a
pointer to malloc'd space (which is fine now, but would have been an
error before) or space palloc'd from a different context.
In those cases we have to assume the amhandler knows what it's doing.

However, I feel slightly queasy about this idea anyway, because
relcache.c doesn't really have a lot of business assuming what
CurrentMemoryContext it's called in. Consider an extension AM that
has a non-constant IndexAmRoutine (maybe it wants to point to JITted
methods) and palloc's that in some suitably long-lived context.
Is there any way in which we could reach InitIndexAmRoutine while
running in that same long-lived context? This'd likely require a
recursive index_open call while the AM is building its result, which
is not all that hard to credit if the AM is trying to invoke JIT or
the like. However any such index_opens are probably for system
catalogs, which are not likely to be using extension AMs, so maybe the
case can't be reached after all. Nonetheless it doesn't seem quite
impossible to have a false positive, especially if the AM chooses a
common context like CacheMemoryContext or TopMemoryContext for this
purpose.

Another objection is the tedium of writing all those
ContextContainsPointer methods. I suppose that (at least for now)
we could stub out all the ones except aset.c's.

Anyway, I don't find this idea quite attractive enough to pursue,
but maybe somebody else will think differently. It'd be more
attractive if we had additional use-cases for ContextContainsPointer.

regards, tom lane

#16Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#15)

Hi,

On 2025-12-30 18:56:43 -0500, Tom Lane wrote:

Pushed then.

Thanks for this work Matthias and Tom. This waste of memory and cycles has
been bothering me for a long time.

Greetings,

Andres Freund