We've broken something in error recovery
In a somewhat misguided attempt to test something else, I did this in
CVS HEAD:
do $$begin
  for i in 1 .. 10000 loop
    execute 'create table t' || i::text || ' (f1 int primary key)';
  end loop;
end$$;
This ran for a while and then ran out of lock table space, which was
not surprising in hindsight:
ERROR: out of shared memory
HINT: You might need to increase max_locks_per_transaction.
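As a rough back-of-the-envelope check on why the lock table fills up, the following sketch uses the sizing formula given in the PostgreSQL documentation for max_locks_per_transaction. The function names are hypothetical, and the example settings (64, 100, 0) and the estimate of two locks per created table (the table plus its primary key index) are assumptions, not figures from this report. The real shared hash table has some slack, so the exact failure point will differ.

```c
#include <assert.h>

/* Approximate number of lockable objects the shared lock table is sized
 * for: max_locks_per_transaction * (max_connections +
 * max_prepared_transactions). Hypothetical helper, not PostgreSQL source. */
long lock_table_slots(long max_locks_per_transaction,
                      long max_connections,
                      long max_prepared_transactions)
{
    assert(max_locks_per_transaction > 0);
    assert(max_connections >= 0 && max_prepared_transactions >= 0);
    return max_locks_per_transaction *
           (max_connections + max_prepared_transactions);
}

/* Returns 1 if one transaction holding locks_needed locks would fit
 * within the estimated capacity, 0 otherwise. */
int fits_in_lock_table(long locks_needed, long slots)
{
    return locks_needed <= slots;
}
```

With the assumed settings, lock_table_slots(64, 100, 0) gives 6400, while a single transaction creating 10000 tables at about two locks each would want on the order of 20000, so fits_in_lock_table(20000, 6400) returns 0.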
But what was surprising was what happened next: the autovac launcher
immediately crashed.
TRAP: FailedAssertion("!(nestLevel > 0 && nestLevel <= GUCNestLevel)", File: "guc.c", Line: 3907)
LOG: autovacuum launcher process (PID 25220) was terminated by signal 6
Stack trace looks like
#4 0x4e85b4 in ExceptionalCondition (
    conditionName=0x1ac4ac "!(nestLevel > 0 && nestLevel <= GUCNestLevel)",
    errorType=0x1abf44 "FailedAssertion", fileName=0x1abee4 "guc.c",
    lineNumber=3907) at assert.c:57
#5 0x501f48 in AtEOXact_GUC (isCommit=-86 'ª', nestLevel=84) at guc.c:3907
#6 0x20618c in AbortTransaction () at xact.c:2194
#7 0x20688c in AbortCurrentTransaction () at xact.c:2568
#8 0x3b0f84 in AutoVacLauncherMain (argc=2063670312, argv=0x7b03b94c)
    at autovacuum.c:491
#9 0x3b0bd8 in StartAutoVacLauncher () at autovacuum.c:371
Haven't dug any deeper yet --- who's touched this code lately?
regards, tom lane
Tom Lane wrote:
> #4 0x4e85b4 in ExceptionalCondition (
>     conditionName=0x1ac4ac "!(nestLevel > 0 && nestLevel <= GUCNestLevel)",
>     errorType=0x1abf44 "FailedAssertion", fileName=0x1abee4 "guc.c",
>     lineNumber=3907) at assert.c:57
> #5 0x501f48 in AtEOXact_GUC (isCommit=-86 'ª', nestLevel=84) at guc.c:3907
> #6 0x20618c in AbortTransaction () at xact.c:2194
This looks like maybe a corrupted stack - the args to AtEOXact_GUC at
that location in xact.c are hardwired.
cheers
andrew
Andrew Dunstan <andrew@dunslane.net> writes:
> Tom Lane wrote:
>> #5 0x501f48 in AtEOXact_GUC (isCommit=-86 'ª', nestLevel=84) at guc.c:3907
> This looks like maybe a corrupted stack - the args to AtEOXact_GUC at
> that location in xact.c are hardwired.
No, that's just a fairly typical behavior of debugging with -O greater
than zero --- the registers holding those parameter values got recycled
for something else. This is a rather old version of gdb and it doesn't
always print <<value optimized away>> when it should.
regards, tom lane
I wrote:
> #4 0x4e85b4 in ExceptionalCondition (
>     conditionName=0x1ac4ac "!(nestLevel > 0 && nestLevel <= GUCNestLevel)",
>     errorType=0x1abf44 "FailedAssertion", fileName=0x1abee4 "guc.c",
>     lineNumber=3907) at assert.c:57
> #5 0x501f48 in AtEOXact_GUC (isCommit=-86 'ª', nestLevel=84) at guc.c:3907
> #6 0x20618c in AbortTransaction () at xact.c:2194
> #7 0x20688c in AbortCurrentTransaction () at xact.c:2568
> #8 0x3b0f84 in AutoVacLauncherMain (argc=2063670312, argv=0x7b03b94c)
>     at autovacuum.c:491
On investigation I think that Assert may just be overenthusiastic.
The problem is that StartTransaction is failing at
VirtualXactLockTableInsert, for lack of any shared memory to acquire
the lock with; and then we try to do AbortTransaction and GUC is
unhappy because it's not been initialized yet. So this isn't a
new bug at all, it's been there a while ...
regards, tom lane
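
The failure sequence described in the last message can be illustrated with a toy model. This is a minimal sketch, not PostgreSQL source: every name below is hypothetical, and it assumes (as a simplification) that abort-time cleanup always checks nest level 1, in line with the remark earlier in the thread that the arguments at that call site are hardwired.

```c
#include <stdbool.h>

/* Toy model only; none of these names exist in PostgreSQL. */
static int guc_nest_level = 0;  /* 0 means GUC transaction state not set up */

/* Simulated transaction start: the lock acquisition step can fail before
 * the GUC nesting level has been initialized. Returns false on failure. */
bool toy_start_transaction(bool lock_table_full)
{
    if (lock_table_full)
        return false;           /* early failure; guc_nest_level stays 0 */
    guc_nest_level = 1;
    return true;
}

/* The condition from the failing Assert, written as a predicate. */
bool nest_level_valid(int nest_level)
{
    return nest_level > 0 && nest_level <= guc_nest_level;
}

/* Simulated abort: checks nest level 1, then resets the state.
 * Returns whether the Assert condition would have held. */
bool toy_abort_transaction(void)
{
    bool ok = nest_level_valid(1);
    guc_nest_level = 0;
    return ok;
}
```

Calling toy_start_transaction(true) followed by toy_abort_transaction() yields false from the abort, which corresponds to the tripped Assert; after a successful toy_start_transaction(false), the same abort returns true.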