Assertion failure with a subtransaction and cursor
On all versions starting from 8.0 where subtransactions were introduced,
this causes an assertion failure:
postgres=# BEGIN;
BEGIN
postgres=# DECLARE foocur CURSOR FOR SELECT a FROM
generate_series(1,500000) a;
DECLARE CURSOR
postgres=# SAVEPOINT sp;
SAVEPOINT
postgres=# FETCH foocur;
a
───
1
(1 row)
postgres=# error; -- cause subxact abort
ERROR: syntax error at or near "error"
LINE 1: error;
^
postgres=# ROLLBACK TO SAVEPOINT sp;
ROLLBACK
postgres=# FETCH ALL FROM foocur;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
In server log:
TRAP: FailedAssertion("!(((file) > 0 && (file) < (int) SizeVfdCache &&
VfdCache[file].fileName != ((void *)0)))", File: "fd.c", Line: 1101)
When the first row is fetched from the cursor, the function scan is
materialized in a temporaray file. At the error, the temporary file is
deleted. But the FETCH ALL command expects the cursor to be still open,
and the file to exist, so it fails.
If assertions are disabled, you get an "ERROR: unexpected end of tape"
error instead.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
On all versions starting from 8.0 where subtransactions were introduced,
this causes an assertion failure:
Ugh :-(
This ties into the more general issue that it's not clear what effect a
subtransaction rollback should have on a cursor. You could argue that
ideally the cursor should revert to its pre-savepoint state. We didn't
implement that in 8.0 because it seemed too hard, but this bug shows
that not rolling back the cursor isn't exactly easy either.
Not sure what to do. The only fix that seems bulletproof at the moment
is to declare that any cursor that's been touched at all in a
subtransaction is marked "broken" if the subtransaction rolls back.
That might be too great a loss of functionality. It would block off
the controversial aspects of behavior though ...
regards, tom lane
Tom Lane wrote:
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
On all versions starting from 8.0 where subtransactions were introduced,
this causes an assertion failure:Ugh :-(
This ties into the more general issue that it's not clear what effect a
subtransaction rollback should have on a cursor. You could argue that
ideally the cursor should revert to its pre-savepoint state. We didn't
implement that in 8.0 because it seemed too hard, but this bug shows
that not rolling back the cursor isn't exactly easy either.
Yeah, the current behavior is debatable. But it's quite sane, useful and
well-defined as it is, so I don't feel any urge to change it.
Not sure what to do. The only fix that seems bulletproof at the moment
is to declare that any cursor that's been touched at all in a
subtransaction is marked "broken" if the subtransaction rolls back.
That might be too great a loss of functionality. It would block off
the controversial aspects of behavior though ...
Hmm, I think we should track temporary files using resource owners.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
Tom Lane wrote:
Not sure what to do. The only fix that seems bulletproof at the moment
is to declare that any cursor that's been touched at all in a
subtransaction is marked "broken" if the subtransaction rolls back.
That might be too great a loss of functionality. It would block off
the controversial aspects of behavior though ...
Hmm, I think we should track temporary files using resource owners.
That would probably be a workable solution if temp files are the only
problem. What I'm afraid of is that this type of problem exists
*everywhere* that we track the need for cleanup operations using the
assumption that subtransactions are nested. If that's the case then we
are looking at a very major rewrite to make things bulletproof --- much
larger than I'd feel comfortable back-patching, especially so far back
as 8.0. I'm thinking we might have little choice but to disable the
functionality in back branches.
regards, tom lane
Tom Lane wrote:
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
Tom Lane wrote:
Not sure what to do. The only fix that seems bulletproof at the moment
is to declare that any cursor that's been touched at all in a
subtransaction is marked "broken" if the subtransaction rolls back.
That might be too great a loss of functionality. It would block off
the controversial aspects of behavior though ...Hmm, I think we should track temporary files using resource owners.
That would probably be a workable solution if temp files are the only
problem. What I'm afraid of is that this type of problem exists
*everywhere* that we track the need for cleanup operations using the
assumption that subtransactions are nested. If that's the case then we
are looking at a very major rewrite to make things bulletproof --- much
larger than I'd feel comfortable back-patching, especially so far back
as 8.0. I'm thinking we might have little choice but to disable the
functionality in back branches.
Hmm, the reason we didn't disable it is that people requested it
explicitely. In fact IIRC we had this very same discussion years ago,
except without the crashing test case. I fear that if it gets disabled,
some people will be upset.
--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
I wrote:
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
Hmm, I think we should track temporary files using resource owners.
That would probably be a workable solution if temp files are the only
problem. What I'm afraid of is that this type of problem exists
*everywhere* that we track the need for cleanup operations using the
assumption that subtransactions are nested.
I just spent a tedious hour digging through every function called by
AbortSubTransaction(), and came to the conclusion that this fear may be
overblown, so long as we are willing to stipulate that user-visible
side effects caused by a cursor's query are to be rolled back if they
occur during a subtransaction that is rolled back. From the user's
perspective this may make things a bit unpredictable, since it is not
always clear exactly when a side effect will occur during the execution
of a query. However it doesn't seem like it can actually break the
system.
(At least not for code in core CVS. Outside modules might be doing
unsafe things in RegisterSubXactCallback callbacks. But that's not
our responsibility to fix.)
It might be a good idea to forbid writeable CTEs in cursor queries,
because that would expose the unpredictability a lot more, and maybe
open some internal issues too -- I noted snapshot management as a likely
problem if the executor is allowed to create its own snapshots. But as
far as what's in CVS is concerned that's not an issue yet anyhow. If
we did want to allow it, we could probably make it safe by forcing all
the writable CTEs to run at cursor creation time.
So as far as I can tell at the moment, temp files really are the only
problem, and making them be managed by resource owners instead of a
subxact-based release policy should fix that. I can go work on that,
unless you wanted to?
regards, tom lane
Tom Lane wrote:
So as far as I can tell at the moment, temp files really are the only
problem, and making them be managed by resource owners instead of a
subxact-based release policy should fix that.
Ok, good.
I can go work on that, unless you wanted to?
I started hacking on that when I posted, so I can finish it.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
Tom Lane wrote:
I can go work on that, unless you wanted to?
I started hacking on that when I posted, so I can finish it.
Sounds good. I added a bit to the ROLLBACK TO reference page to remind
us what we think the behavior is supposed to be for cursor rollback.
regards, tom lane
Heikki Linnakangas wrote:
Tom Lane wrote:
So as far as I can tell at the moment, temp files really are the only
problem, and making them be managed by resource owners instead of a
subxact-based release policy should fix that.Ok, good.
I can go work on that, unless you wanted to?
I started hacking on that when I posted, so I can finish it.
So, here's what I got this far, which is pretty straightforward.
Temporary files not opened in the interXact mode are registered with
CurrentResourceOwner, ensuring they're cleaned up at at the end of
(sub)transaction, or when the portal is closed if it's a temporary file
related to a cursor.
The logical next step would be to get rid of the interXact concept
altogether and always associate temporary files with the current
resource owner; the caller should switch to a sufficiently long-lived
one before calling. But I'm hesitant to change the signature of
OpenTemporaryFile, and tuplestore_begin_heap doesn't need the interXact
argument anymore either. Changing the signature of tuplestore_begin_heap
would certainly break user-defined set-returning C functions. So at
least in back-branches, perhaps we should leave those as dummy arguments
and elog(ERROR) if interXact is not passed in as false.
Google turned up one extra module that calls tuplestore_begin_heap with
interXact set to 'true', but frankly that one looks broken to me. It
should be using 'false':
http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/python/be/src/call/pl.c?rev=1.20&content-type=text/x-cvsweb-markup
I left the cleanup of files/dirs opened with AllocateFile/Dir alone,
although it looks a bit inconsistent now that they don't use resource
owners as well.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Attachments:
temp-file-resowner-2.patchtext/x-diff; name=temp-file-resowner-2.patchDownload+187-97
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
The logical next step would be to get rid of the interXact concept
altogether and always associate temporary files with the current
resource owner; the caller should switch to a sufficiently long-lived
one before calling.
That would mean having a session-lifespan resource owner, which doesn't
seem amazingly useful to me: we have no other resource types for which
that would be even a little bit sane. I'm inclined to leave the temp
file API as nearly as possible as-is. Having it attach only xact-local
files to the CurrentResourceOwner sounds fine to me.
One thought here is that my inclination would be to keep
CleanupTemporaryFiles in place and have it just look for temp files that
didn't get closed, as a debugging aid. This is comparable to the
behavior of other modules that are mostly driven by resource managers.
We didn't take out the end-of-xact checks when we added resource manager
support elsewhere, and I'm not inclined to do that here. I agree we
don't need an EOSubXact-time crosscheck, though.
Other than that quibble it looks sane from here.
regards, tom lane
Tom Lane wrote:
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
The logical next step would be to get rid of the interXact concept
altogether and always associate temporary files with the current
resource owner; the caller should switch to a sufficiently long-lived
one before calling.That would mean having a session-lifespan resource owner, which doesn't
seem amazingly useful to me: we have no other resource types for which
that would be even a little bit sane.
The only user of "interXact == true" is when the result set of a
holdable cursor is materialized. I'm envisioning a new resource owner
alongside PortalData.holdContext which is the memory context used to
hold the tuplestore.
I'm inclined to leave the temp
file API as nearly as possible as-is. Having it attach only xact-local
files to the CurrentResourceOwner sounds fine to me.
Ok, fine with me.
One thought here is that my inclination would be to keep
CleanupTemporaryFiles in place and have it just look for temp files that
didn't get closed, as a debugging aid. This is comparable to the
behavior of other modules that are mostly driven by resource managers.
We didn't take out the end-of-xact checks when we added resource manager
support elsewhere, and I'm not inclined to do that here. I agree we
don't need an EOSubXact-time crosscheck, though.
Ok, committed that way. I made the cross-check a WARNING. That seems
like the right level of seriousness, although I see that many of the
other similar checks are Asserts.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
Ok, committed that way. I made the cross-check a WARNING. That seems
like the right level of seriousness, although I see that many of the
other similar checks are Asserts.
Looks good. I'm a bit tempted to rename the interXact argument to
something like noOwner --- no change in API, just name it a bit closer
to what it actually does now. Thoughts?
regards, tom lane
Tom Lane wrote:
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
Ok, committed that way. I made the cross-check a WARNING. That seems
like the right level of seriousness, although I see that many of the
other similar checks are Asserts.Looks good. I'm a bit tempted to rename the interXact argument to
something like noOwner --- no change in API, just name it a bit closer
to what it actually does now. Thoughts?
Well, I did keep the cross-check at end-of-transaction, so it's not a
complete misnomer as it is; a file opened with interXact=false can't be
used across transactions. I won't object if you feel like renaming it,
though.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com