TupleDescs and refcounts and such, again
I looked into the bug reported by Jean-Pierre Pelletier here:
http://archives.postgresql.org/pgsql-bugs/2006-12/msg00163.php
The cause of the crash is a reference to an already-deallocated
TupleDesc. The problem query has the structure of
SELECT sum(x) FROM (complicated-subselect) GROUP BY ...
which gets planned as HashAggregate atop a SubqueryScan, and the
reason for the crash is this coding in nodeAgg.c:
/* if first time through, initialize hashslot by cloning input slot */
if (hashslot->tts_tupleDescriptor == NULL)
{
ExecSetSlotDescriptor(hashslot, inputslot->tts_tupleDescriptor);
This means the upper query's tupletable contains a reference to the
result tuple descriptor of the subquery, which has been allocated in a
separate memory context (because the subquery has its own ExecutorState
and hence its own es_query_cxt). During plan shutdown, the sub-query's
memory is freed before the upper query's tupletable is deallocated.
In an assert-enabled build this reliably causes a failure like "tupdesc
reference 401901f8 is not owned by resource owner Portal", because the
lower tupdesc has been overwritten by the memory-clobber code, and that
makes it look like it should be reference-counted. (Pre-8.2 it
accidentally failed to malfunction because tupletable shutdown didn't
touch the referenced tupdescs at all.)
I can see a couple of possibilities for fixing this:
1. The most localized fix would be to introduce a CreateTupleDescCopy()
call into the above ExecSetSlotDescriptor() call. But I have zero
confidence in this way because of the likelihood that there are similar
usages elsewhere. Moreover a large part of the point of the tupdesc
refcounting changes was to avoid extra tupdesc-copying steps --- if we
have to copy tupdescs anywhere they might have come from a subplan, that
patch is a failure.
2. Somehow persuade the subplan to allocate its result tupdesc in the
upper query's query context. Could probably be done with localized
copying in nodeSubqueryscan, but seems like a wart.
3. Rejigger CreateExecutorState so that a subquery does not have its own
es_query_cxt but shares the parent's context. Then anything created at
query lifespan in the subquery lives just as long as stuff created in
the upper query. AFAICS this wouldn't make any practical difference in
memory lifespan since subqueries are only destroyed when their parent is
... but it would solve this particular problem as well as any related
ones.
As you can probably guess, I'm leaning to #3, but wanted to see if
anyone had an objection or a better idea.
regards, tom lane
On Tue, 2006-12-26 at 10:47 -0500, Tom Lane wrote:
As you can probably guess, I'm leaning to #3, but wanted to see if
anyone had an objection or a better idea.
Is it possible to allocate the subquery in a child context of the main
query, so that it is technically a different context, yet can be freed
simultaneously?
Hierarchical queries seem like they'd need something like that?
Or should we have the context of a shared context, so that the subquery
can put info somewhere where the main context can read it?
I like the idea of freeing contexts as early as possible. We might not
do that now, but we might like to in the future.
--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
"Simon Riggs" <simon@2ndquadrant.com> writes:
Is it possible to allocate the subquery in a child context of the main
query, so that it is technically a different context, yet can be freed
simultaneously?
That's exactly what the code *was* doing, but the problem is that we'd
free the child context during ExecEndSubqueryScan, at which point the
tupdesc is still needed.
regards, tom lane
On Wed, 2006-12-27 at 18:05 -0500, Tom Lane wrote:
"Simon Riggs" <simon@2ndquadrant.com> writes:
Is it possible to allocate the subquery in a child context of the main
query, so that it is technically a different context, yet can be freed
simultaneously?That's exactly what the code *was* doing, but the problem is that we'd
free the child context during ExecEndSubqueryScan, at which point the
tupdesc is still needed.
OK, I guess my only other question is to do with recursive/hierarchical
queries: How will we handle those? All in same context?
I guess many subqueries are transformable into main joins anyway, so the
distinction is probably moot anyhow.
--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
"Simon Riggs" <simon@2ndquadrant.com> writes:
OK, I guess my only other question is to do with recursive/hierarchical
queries: How will we handle those? All in same context?
Offhand I don't think it matters. Recursive queries are recursive in
the data, not in the plan tree.
regards, tom lane