Thread: "cache reference leak" and "problem in alloc set" warnings
Hi,

I've been trying to implement INOUT/OUT functionality in PL/scheme. When I return a record-type tuple, the postmaster complains with the warnings below:

WARNING: problem in alloc set ExprContext: detected write past chunk end in block 0x8462f00, chunk 0x84634c8
WARNING: cache reference leak: cache pg_type (34), tuple 2/7 has count 1

I found a related thread in the mailing list archives in which Joe Conway fixed a similar problem in one of his patches, but I couldn't figure out how he did it. Can somebody help me figure out the causes of the above warnings and how I can fix them?

Regards.

P.S. Here are the stack backtraces taken just before each warning is emitted. They're of limited use, because there's nearly only one way to reach these warnings, but I thought they could give an overview to hackers who take a quick look.

Breakpoint 2, AllocSetCheck (context=0x845ff58) at aset.c:1155
1155 elog(WARNING, "problem in alloc set %s: detected write past c
(gdb) where
#0 AllocSetCheck (context=0x845ff58) at aset.c:1155
#1 0x0829b728 in AllocSetReset (context=0x845ff58) at aset.c:407
#2 0x0829c958 in MemoryContextReset (context=0x845ff58) at mcxt.c:129
#3 0x0817dce5 in ExecResult (node=0x84a0754) at nodeResult.c:113
#4 0x0816b423 in ExecProcNode (node=0x84a0754) at execProcnode.c:334
#5 0x081698fb in ExecutePlan (estate=0x84a05bc, planstate=0x84a0754, operation=CMD_SELECT, numberTuples=0, direction=138818820, dest=0x84102ec) at execMain.c:1145
#6 0x0816888b in ExecutorRun (queryDesc=0x842c680, direction=ForwardScanDirection, count=138818820) at execMain.c:223
#7 0x08204a08 in PortalRunSelect (portal=0x842eae4, forward=1 '\001', count=0, dest=0x84102ec) at pquery.c:803
#8 0x08204762 in PortalRun (portal=0x842eae4, count=2147483647, dest=0x84102ec, altdest=0x84102ec, completionTag=0xbfc23cb0 "") at pquery.c:655
#9 0x082001e5 in exec_simple_query (query_string=0x840f91c "SELECT in_out_t_2(13, true);") at postgres.c:1004
#10 0x08202de5 in PostgresMain (argc=4, argv=0x83bd7fc, username=0x83bd7d4 "vy") at postgres.c:3184
#11 0x081d6b54 in BackendRun (port=0x83d21a8) at postmaster.c:2853
#12 0x081d636f in BackendStartup (port=0x83d21a8) at postmaster.c:2490
#13 0x081d455e in ServerLoop () at postmaster.c:1203
#14 0x081d39ca in PostmasterMain (argc=3, argv=0x83bb888) at postmaster.c:955
#15 0x0818d404 in main (argc=3, argv=0x83bb888) at main.c:187

Breakpoint 1, PrintCatCacheLeakWarning (tuple=0xb5ef7dbc) at catcache.c:1808
1808 Assert(ct->ct_magic == CT_MAGIC);
(gdb) where
#0 PrintCatCacheLeakWarning (tuple=0xb5ef7dbc) at catcache.c:1808
#1 0x0829e927 in ResourceOwnerReleaseInternal (owner=0x83da800, phase=RESOURCE_RELEASE_AFTER_LOCKS, isCommit=1 '\001', isTopLevel=0 '\0') at resowner.c:273
#2 0x0829e64c in ResourceOwnerRelease (owner=0x83da800, phase=RESOURCE_RELEASE_AFTER_LOCKS, isCommit=1 '\001', isTopLevel=0 '\0') at resowner.c:165
#3 0x0829dd8e in PortalDrop (portal=0x842eae4, isTopCommit=0 '\0') at portalmem.c:358
#4 0x082001f9 in exec_simple_query (query_string=0x840f91c "SELECT in_out_t_2(13, true);") at postgres.c:1012
#5 0x08202de5 in PostgresMain (argc=4, argv=0x83bd7fc, username=0x83bd7d4 "vy") at postgres.c:3184
#6 0x081d6b54 in BackendRun (port=0x83d21a8) at postmaster.c:2853
#7 0x081d636f in BackendStartup (port=0x83d21a8) at postmaster.c:2490
#8 0x081d455e in ServerLoop () at postmaster.c:1203
#9 0x081d39ca in PostmasterMain (argc=3, argv=0x83bb888) at postmaster.c:955
#10 0x0818d404 in main (argc=3, argv=0x83bb888) at main.c:187

On Aug 16 03:09, Volkan YAZICI wrote:
> WARNING: problem in alloc set ExprContext: detected write past chunk
> end in block 0x8462f00, chunk 0x84634c8
> WARNING: cache reference leak: cache pg_type (34), tuple 2/7 has
> count 1

Excuse me for bugging the list. I've solved the problem: I needed to make sure that every SearchSysCache() call is followed by a matching ReleaseSysCache() call.

Regards.
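
The pairing rule mentioned above can be illustrated with a small toy model. This is only a sketch, not PostgreSQL code: every toy_* name is hypothetical, and it assumes the "count 1" in the warning refers to a reference count that was never dropped. In a real handler, the calls to pair up are the SearchSysCache() and ReleaseSysCache() named in the message.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: these names are hypothetical, not PostgreSQL APIs. */
typedef struct ToyCacheEntry {
    int key;
    int refcount;   /* number of outstanding references (pins) */
} ToyCacheEntry;

static ToyCacheEntry toy_cache[] = { {16, 0}, {23, 0}, {25, 0} };
#define TOY_CACHE_SIZE (sizeof(toy_cache) / sizeof(toy_cache[0]))

/* Look up an entry and pin it; returns NULL when the key is absent. */
ToyCacheEntry *toy_cache_search(int key)
{
    for (size_t i = 0; i < TOY_CACHE_SIZE; i++) {
        if (toy_cache[i].key == key) {
            toy_cache[i].refcount++;
            return &toy_cache[i];
        }
    }
    return NULL;
}

/* Drop one pin previously taken by toy_cache_search(). */
void toy_cache_release(ToyCacheEntry *entry)
{
    assert(entry != NULL && entry->refcount > 0);
    entry->refcount--;
}

/* Count entries that are still pinned; an end-of-query leak check
 * would emit one warning for each of them. */
int toy_cache_count_leaks(void)
{
    int leaks = 0;
    for (size_t i = 0; i < TOY_CACHE_SIZE; i++)
        if (toy_cache[i].refcount != 0)
            leaks++;
    return leaks;
}
```

In this model, a successful toy_cache_search() with no later toy_cache_release() leaves toy_cache_count_leaks() at 1. Releasing the entry on every exit path brings it back to 0.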

On Aug 16 04:20, Volkan YAZICI wrote:
> On Aug 16 03:09, Volkan YAZICI wrote:
> > WARNING: problem in alloc set ExprContext: detected write past chunk
> > end in block 0x8462f00, chunk 0x84634c8
> > WARNING: cache reference leak: cache pg_type (34), tuple 2/7 has
> > count 1
>
> Excuse me for bugging the list. I've solved the problem. I should look
> for ReleaseSysCache() call just after every SearchSysCache().

Looks like this only solves the allocation issues related to catalog searches. I'm still bitten by a single "write past chunk" warning while returning a record in PL/scheme:

WARNING: problem in alloc set ExprContext: detected write past chunk end in block 0x84a0598, chunk 0x84a0c84

At first I thought it was caused by clobbering a memory chunk that doesn't belong to me. But when I place a

{ char *tmp = palloc(32); printf("-> %p\n", tmp); pfree(tmp); }

line at the entrance and at the end of the PL handler, the printed addresses don't bracket the 0x84a0598 address from the warning. Even the addresses of the heap tuples I created are far from the addresses in the message.

I don't have any clue about which section of the code is at fault, although I know that the problem occurs when a record is returned. I'd really appreciate it if somebody could help me figure out how to debug (or even solve) the problem.

Regards.

P.S. Here's the related source code, in case anyone wants to take a look: http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/~checkout~/plscheme/plscheme/plscheme-8.2.c?rev=1.3&content-type=text/plain

Volkan YAZICI <yazicivo@ttnet.net.tr> writes:
> I've still biten by a single "write past chunk" error while returning a
> record in PL/scheme:
> WARNING: problem in alloc set ExprContext: detected write past chunk
> end in block 0x84a0598, chunk 0x84a0c84

The actual bug, almost certainly, is that you're miscomputing the space needed for a variable-size palloc request. But tracking that down will be hard until you find out which chunk it is.

Do you have a sequence that will make the problem happen consistently at the same address? If so, you can use a gdb watchpoint to find out where the write-past-end is happening. Or use a conditional breakpoint in AllocSetAlloc to try to identify where the chunk is handed out.

Another possibility is to set a breakpoint where the warning is emitted and take a look at the contents of the chunk to see if you can identify it; that wouldn't require knowing the target chunk address in advance.

BTW, if I recall that code correctly, the "chunk address" in the message is probably the address of the start of the overhead data for the chunk, not the usable-space start address that is passed back by palloc.

regards, tom lane
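
To make the two addresses in that last remark concrete, here is a rough toy allocator. It is a sketch under stated assumptions, not the actual aset.c code: the header layout, the toy_* names, and the sentinel value are all made up for illustration. It shows how a chunk's overhead data can sit just in front of the pointer handed to the caller, and how a marker byte placed right after the requested size lets a later check notice a write past the end.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy model only: layout, names, and sentinel value are assumptions. */
#define TOY_SENTINEL 0x7E

typedef struct ToyChunkHeader {
    size_t requested;   /* number of bytes the caller asked for */
} ToyChunkHeader;

/* Allocate header + payload + one sentinel byte; return the payload. */
void *toy_alloc(size_t size)
{
    ToyChunkHeader *hdr = malloc(sizeof(ToyChunkHeader) + size + 1);
    unsigned char *payload;

    if (hdr == NULL)
        return NULL;
    hdr->requested = size;
    payload = (unsigned char *) (hdr + 1);
    payload[size] = TOY_SENTINEL;   /* marker just past the usable space */
    return payload;
}

/* Step back from the payload pointer to the chunk's overhead data. */
ToyChunkHeader *toy_chunk_header(void *payload)
{
    return ((ToyChunkHeader *) payload) - 1;
}

/* Return 1 if the sentinel is intact, 0 if something wrote past the end. */
int toy_chunk_ok(void *payload)
{
    ToyChunkHeader *hdr = toy_chunk_header(payload);

    return ((unsigned char *) payload)[hdr->requested] == TOY_SENTINEL;
}

/* Free the whole chunk, starting from its header. */
void toy_free(void *payload)
{
    free(toy_chunk_header(payload));
}
```

In this model the address a checker would naturally report is the header's, which is sizeof(ToyChunkHeader) bytes below the pointer the caller holds. That is one reason a reported chunk address may not match any pointer printed from the calling code.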

On Aug 17 10:38, Tom Lane wrote:
> Volkan YAZICI <yazicivo@ttnet.net.tr> writes:
> > I've still biten by a single "write past chunk" error while returning a
> > record in PL/scheme:
> >
> > WARNING: problem in alloc set ExprContext: detected write past chunk
> > end in block 0x84a0598, chunk 0x84a0c84
>
> The actual bug, almost certainly, is that you're miscomputing the space
> needed for a variable-size palloc request. But tracking that down will
> be hard until you find out which chunk it is.

Looks like my palloc() math was correct. I had just missed the special handling of the attnulls array passed to heap_formtuple(). It should have been

attnulls[i] = (isnull) ? 'n' : ' ';

> Do you have a sequence that will make the problem happen consistently at
> the same address? If so, you can use a gdb watchpoint to find out where
> the write-past-end is happening. Or use a conditional breakpoint in
> AllocSetAlloc to try to identify where the chunk is handed out.

Yeah! That's exactly it. After setting a watchpoint on *0x84a0c84, the erroneous line was right in front of me at the first "where" call!

> Another possibility is to set a breakpoint where the warning is emitted
> and take a look at the contents of the chunk to see if you can identify
> it; that wouldn't require knowing the target chunk address in advance.
>
> BTW, if I recall that code correctly, the "chunk address" in the message
> is probably the address of the start of the overhead data for the chunk,
> not the usable-space start address that is passed back by palloc.

Thanks so much for your kind help. All of the methods you mentioned are applicable to software development in general. Thanks again.

Regards.
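
The one-line fix quoted above can be wrapped in a small helper. This is a minimal sketch: the function name fill_legacy_null_flags is hypothetical, and it assumes the caller tracks NULLs as plain bool values and supplies an attnulls buffer with one char per attribute. The 'n' / ' ' convention itself is the one stated in the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Fill the legacy char-per-attribute null flags from plain booleans:
 * 'n' marks a NULL attribute, ' ' marks a non-NULL one.
 * The function name is hypothetical; attnulls must have room for
 * natts chars and is not NUL-terminated. */
void fill_legacy_null_flags(const bool *isnull, char *attnulls, size_t natts)
{
    assert(isnull != NULL && attnulls != NULL);
    for (size_t i = 0; i < natts; i++)
        attnulls[i] = isnull[i] ? 'n' : ' ';
}
```

Keeping the conversion in one place means the rest of the handler can work with bool flags and only translate at the point where the char array is required.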

Volkan YAZICI <yazicivo@ttnet.net.tr> writes:
> Looks like my palloc() math was correct. Just I had missed special
> handling of attnulls array passed to heap_formtuple(). It had should be
> attnulls[i] = (isnull) ? 'n' : ' ';

These days I'd use heap_form_tuple in new code --- then you can work with plain bool isnull flags instead of that weird 'n'/' ' convention.

regards, tom lane