"cache reference leak" and "problem in alloc set" warnings
Hi,
I've been trying to implement INOUT/OUT functionality in PL/scheme. When
I return a record-type tuple, the postmaster complains with the warnings below:
WARNING: problem in alloc set ExprContext: detected write past chunk
end in block 0x8462f00, chunk 0x84634c8
WARNING: cache reference leak: cache pg_type (34), tuple 2/7 has
count 1
I found a related thread in the mailing list archives where Joe Conway
fixed a similar problem in one of his patches, but I couldn't figure out
how he did it. Can somebody help me figure out the cause of the above
warnings and how to fix them?
Regards.
P.S. Here are the stack backtraces taken just before each warning is
emitted. They're admittedly not very useful, since there's pretty much
only one way to reach these warnings, but I thought they might give an
overview to hackers who take a quick look.
Breakpoint 2, AllocSetCheck (context=0x845ff58) at aset.c:1155
1155 elog(WARNING, "problem in alloc set %s: detected write past c
(gdb) where
#0 AllocSetCheck (context=0x845ff58) at aset.c:1155
#1 0x0829b728 in AllocSetReset (context=0x845ff58) at aset.c:407
#2 0x0829c958 in MemoryContextReset (context=0x845ff58) at mcxt.c:129
#3 0x0817dce5 in ExecResult (node=0x84a0754) at nodeResult.c:113
#4 0x0816b423 in ExecProcNode (node=0x84a0754) at execProcnode.c:334
#5 0x081698fb in ExecutePlan (estate=0x84a05bc, planstate=0x84a0754, operation=CMD_SELECT,
numberTuples=0, direction=138818820, dest=0x84102ec) at execMain.c:1145
#6 0x0816888b in ExecutorRun (queryDesc=0x842c680, direction=ForwardScanDirection, count=138818820)
at execMain.c:223
#7 0x08204a08 in PortalRunSelect (portal=0x842eae4, forward=1 '\001', count=0, dest=0x84102ec)
at pquery.c:803
#8 0x08204762 in PortalRun (portal=0x842eae4, count=2147483647, dest=0x84102ec, altdest=0x84102ec,
completionTag=0xbfc23cb0 "") at pquery.c:655
#9 0x082001e5 in exec_simple_query (query_string=0x840f91c "SELECT in_out_t_2(13, true);")
at postgres.c:1004
#10 0x08202de5 in PostgresMain (argc=4, argv=0x83bd7fc, username=0x83bd7d4 "vy") at postgres.c:3184
#11 0x081d6b54 in BackendRun (port=0x83d21a8) at postmaster.c:2853
#12 0x081d636f in BackendStartup (port=0x83d21a8) at postmaster.c:2490
#13 0x081d455e in ServerLoop () at postmaster.c:1203
#14 0x081d39ca in PostmasterMain (argc=3, argv=0x83bb888) at postmaster.c:955
#15 0x0818d404 in main (argc=3, argv=0x83bb888) at main.c:187
Breakpoint 1, PrintCatCacheLeakWarning (tuple=0xb5ef7dbc) at catcache.c:1808
1808 Assert(ct->ct_magic == CT_MAGIC);
(gdb) where
#0 PrintCatCacheLeakWarning (tuple=0xb5ef7dbc) at catcache.c:1808
#1 0x0829e927 in ResourceOwnerReleaseInternal (owner=0x83da800,
phase=RESOURCE_RELEASE_AFTER_LOCKS, isCommit=1 '\001', isTopLevel=0 '\0') at resowner.c:273
#2 0x0829e64c in ResourceOwnerRelease (owner=0x83da800, phase=RESOURCE_RELEASE_AFTER_LOCKS,
isCommit=1 '\001', isTopLevel=0 '\0') at resowner.c:165
#3 0x0829dd8e in PortalDrop (portal=0x842eae4, isTopCommit=0 '\0') at portalmem.c:358
#4 0x082001f9 in exec_simple_query (query_string=0x840f91c "SELECT in_out_t_2(13, true);")
at postgres.c:1012
#5 0x08202de5 in PostgresMain (argc=4, argv=0x83bd7fc, username=0x83bd7d4 "vy") at postgres.c:3184
#6 0x081d6b54 in BackendRun (port=0x83d21a8) at postmaster.c:2853
#7 0x081d636f in BackendStartup (port=0x83d21a8) at postmaster.c:2490
#8 0x081d455e in ServerLoop () at postmaster.c:1203
#9 0x081d39ca in PostmasterMain (argc=3, argv=0x83bb888) at postmaster.c:955
#10 0x0818d404 in main (argc=3, argv=0x83bb888) at main.c:187
On Aug 16 03:09, Volkan YAZICI wrote:
> WARNING: problem in alloc set ExprContext: detected write past chunk
> end in block 0x8462f00, chunk 0x84634c8
> WARNING: cache reference leak: cache pg_type (34), tuple 2/7 has
> count 1
Excuse me for bugging the list. I've solved the problem. I should have
made sure there is a ReleaseSysCache() call after every SearchSysCache().
Regards.
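
For readers following the thread, the pairing rule above can be pictured with a toy, self-contained C model. This is not PostgreSQL code: `ToyCacheEntry`, `toy_cache_search`, `toy_cache_release`, and `toy_cache_leak_count` are hypothetical names that only mimic the reference counting one would expect behind SearchSysCache()/ReleaseSysCache(), where a lookup pins an entry and a forgotten release leaves a nonzero count behind (presumably what "has count 1" in the warning reports).

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a catalog cache entry; NOT a PostgreSQL structure. */
typedef struct ToyCacheEntry
{
    int key;        /* lookup key (hypothetical) */
    int refcount;   /* number of outstanding references */
} ToyCacheEntry;

/* Look up an entry and pin it: each successful search bumps the refcount. */
ToyCacheEntry *
toy_cache_search(ToyCacheEntry *entries, size_t n, int key)
{
    for (size_t i = 0; i < n; i++)
    {
        if (entries[i].key == key)
        {
            entries[i].refcount++;
            return &entries[i];
        }
    }
    return NULL;    /* not found: nothing was pinned, nothing to release */
}

/* Unpin an entry previously returned by toy_cache_search(). */
void
toy_cache_release(ToyCacheEntry *entry)
{
    assert(entry != NULL && entry->refcount > 0);
    entry->refcount--;
}

/* Count entries that are still pinned, i.e. searched but never released. */
size_t
toy_cache_leak_count(const ToyCacheEntry *entries, size_t n)
{
    size_t leaks = 0;

    for (size_t i = 0; i < n; i++)
    {
        if (entries[i].refcount != 0)
            leaks++;
    }
    return leaks;
}
```

In this model, calling `toy_cache_search()` without a matching `toy_cache_release()` makes `toy_cache_leak_count()` return a nonzero value, which is the situation the real warning complains about.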
On Aug 16 04:20, Volkan YAZICI wrote:
> On Aug 16 03:09, Volkan YAZICI wrote:
> > WARNING: problem in alloc set ExprContext: detected write past chunk
> > end in block 0x8462f00, chunk 0x84634c8
> > WARNING: cache reference leak: cache pg_type (34), tuple 2/7 has
> > count 1
>
> Excuse me for bugging the list. I've solved the problem. I should have
> made sure there is a ReleaseSysCache() call after every SearchSysCache().
It looks like this only solves the catalog-search-related issue (the
cache reference leak). I'm still bitten by a single "write past chunk"
warning when returning a record in PL/scheme:
WARNING: problem in alloc set ExprContext: detected write past chunk
end in block 0x84a0598, chunk 0x84a0c84
At first I thought it was because I was clobbering a memory chunk that
doesn't belong to me. But when I place a
{ char *tmp = palloc(32); printf("-> %p\n", tmp); pfree(tmp); }
line at the entry and exit of the PL handler, the printed addresses
don't bracket the 0x84a0598 address above. Even the addresses of the
heap tuples I create are far from the address in the warning message.
I have no clue which part of the code is at fault, although I know that
it happens when returning a record. I'd very much appreciate it if
somebody could help me figure out how to debug (or even solve) the
problem.
Regards.
P.S. Here's the related source code, in case anyone wants to take a look:
http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/~checkout~/plscheme/plscheme/plscheme-8.2.c?rev=1.3&content-type=text/plain
Volkan YAZICI <yazicivo@ttnet.net.tr> writes:
> I'm still bitten by a single "write past chunk" warning when returning a
> record in PL/scheme:
> WARNING: problem in alloc set ExprContext: detected write past chunk
> end in block 0x84a0598, chunk 0x84a0c84
The actual bug, almost certainly, is that you're miscomputing the space
needed for a variable-size palloc request. But tracking that down will
be hard until you find out which chunk it is.
Do you have a sequence that will make the problem happen consistently at
the same address? If so, you can use a gdb watchpoint to find out where
the write-past-end is happening. Or use a conditional breakpoint in
AllocSetAlloc to try to identify where the chunk is handed out.
Another possibility is to set a breakpoint where the warning is emitted
and take a look at the contents of the chunk to see if you can identify
it; that wouldn't require knowing the target chunk address in advance.
BTW, if I recall that code correctly, the "chunk address" in the message
is probably the address of the start of the overhead data for the chunk,
not the usable-space start address that is passed back by palloc.
regards, tom lane
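
To make the two points above concrete (a write past the end of a chunk, and the difference between the chunk's overhead address and the pointer handed back to the caller), here is a small self-contained sketch. It is a toy model under stated assumptions, not the aset.c implementation: `ToyChunkHeader`, `toy_alloc`, `toy_chunk_of`, `toy_wrote_past_end`, `toy_free`, and the `TOY_SENTINEL` value are all hypothetical, and the layout (header, payload, one marker byte) is chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_SENTINEL 0x7E   /* arbitrary marker byte chosen for this sketch */

/* Toy chunk header; NOT PostgreSQL's real chunk layout. */
typedef struct ToyChunkHeader
{
    size_t requested;   /* number of bytes the caller asked for */
} ToyChunkHeader;

/* Allocate header + payload + one sentinel byte; return the payload pointer. */
void *
toy_alloc(size_t size)
{
    ToyChunkHeader *hdr = malloc(sizeof(ToyChunkHeader) + size + 1);
    unsigned char *payload;

    if (hdr == NULL)
        return NULL;
    hdr->requested = size;
    payload = (unsigned char *) (hdr + 1);
    payload[size] = TOY_SENTINEL;   /* marker just past the usable space */
    return payload;
}

/* Map a payload pointer back to the start of the chunk's overhead data. */
ToyChunkHeader *
toy_chunk_of(void *payload)
{
    assert(payload != NULL);
    return ((ToyChunkHeader *) payload) - 1;
}

/* True if the byte just past the requested size has been overwritten. */
bool
toy_wrote_past_end(void *payload)
{
    ToyChunkHeader *hdr = toy_chunk_of(payload);
    const unsigned char *bytes = payload;

    return bytes[hdr->requested] != TOY_SENTINEL;
}

/* Free a chunk obtained from toy_alloc(). */
void
toy_free(void *payload)
{
    if (payload != NULL)
        free(toy_chunk_of(payload));
}
```

In this model the "chunk address" (`toy_chunk_of(p)`) sits `sizeof(ToyChunkHeader)` bytes before the pointer the caller received, so an address printed by a checker would not match any pointer the caller ever saw; a watchpoint would have to account for that offset.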
On Aug 17 10:38, Tom Lane wrote:
> Volkan YAZICI <yazicivo@ttnet.net.tr> writes:
> > I'm still bitten by a single "write past chunk" warning when returning a
> > record in PL/scheme:
> > WARNING: problem in alloc set ExprContext: detected write past chunk
> > end in block 0x84a0598, chunk 0x84a0c84
>
> The actual bug, almost certainly, is that you're miscomputing the space
> needed for a variable-size palloc request. But tracking that down will
> be hard until you find out which chunk it is.
It looks like my palloc() math was correct. I had just missed the special
handling of the attnulls array passed to heap_formtuple(). It should have
been
attnulls[i] = (isnull) ? 'n' : ' ';
> Do you have a sequence that will make the problem happen consistently at
> the same address? If so, you can use a gdb watchpoint to find out where
> the write-past-end is happening. Or use a conditional breakpoint in
> AllocSetAlloc to try to identify where the chunk is handed out.
Yeah! That's exactly it. After setting a watchpoint with
"watch *0x84a0c84", the first "where" call put the erroneous line right
in front of me!
> Another possibility is to set a breakpoint where the warning is emitted
> and take a look at the contents of the chunk to see if you can identify
> it; that wouldn't require knowing the target chunk address in advance.
>
> BTW, if I recall that code correctly, the "chunk address" in the message
> is probably the address of the start of the overhead data for the chunk,
> not the usable-space start address that is passed back by palloc.
Thanks so much for your kind help. All of the methods you mentioned are
applicable to software development in general. Thanks again.
Regards.
Volkan YAZICI <yazicivo@ttnet.net.tr> writes:
> It looks like my palloc() math was correct. I had just missed the special
> handling of the attnulls array passed to heap_formtuple(). It should have
> been
> attnulls[i] = (isnull) ? 'n' : ' ';
These days I'd use heap_form_tuple in new code --- then you can work
with plain bool isnull flags instead of that weird 'n'/' ' convention.
regards, tom lane
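
As an illustration of the two null-flag conventions discussed above, the following standalone sketch converts plain bool flags (the form heap_form_tuple() works with) into the 'n'/' ' character array that heap_formtuple() expects. The helper name `fill_legacy_nulls` is hypothetical and not part of PostgreSQL; the sketch deliberately avoids backend headers so it compiles on its own, and it only shows the per-attribute mapping, not a full tuple-building routine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Fill a legacy-style nulls array from bool flags:
 *   'n' marks a NULL attribute, ' ' marks a non-NULL one.
 * The caller must supply a nulls buffer with room for natts chars.
 * (Hypothetical helper for illustration; not a PostgreSQL function.)
 */
void
fill_legacy_nulls(const bool *isnull, size_t natts, char *nulls)
{
    assert(natts == 0 || (isnull != NULL && nulls != NULL));

    for (size_t i = 0; i < natts; i++)
        nulls[i] = isnull[i] ? 'n' : ' ';
}
```

With flags `{false, true, false}` the helper produces `{' ', 'n', ' '}`. In new code it is simpler to skip this conversion entirely and pass the bool array straight to heap_form_tuple().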