Thread: "cache reference leak" and "problem in alloc set" warnings

"cache reference leak" and "problem in alloc set" warnings

From
Volkan YAZICI
Date:
Hi,

I've been trying to implement INOUT/OUT parameter support in PL/scheme. When
I return a record-type tuple, the backend emits the warnings below:

WARNING:  problem in alloc set ExprContext: detected write past chunk
end in block 0x8462f00, chunk 0x84634c8
WARNING:  cache reference leak: cache pg_type (34), tuple 2/7 has
count 1

I found a related thread in the mailing list archives where Joe Conway
fixed a similar problem in one of his patches, but I couldn't figure out
how he did it. Can somebody help me figure out what causes the above
warnings and how I can fix them?


Regards.

P.S. Here are the stack backtraces taken just before each warning is
emitted. They're of limited use, since there's nearly only one way to
reach these errors, but I thought they might give an overview to
hackers who take a quick look.

Breakpoint 2, AllocSetCheck (context=0x845ff58) at aset.c:1155
1155                                    elog(WARNING, "problem in alloc set %s: detected write past c
(gdb) where
#0  AllocSetCheck (context=0x845ff58) at aset.c:1155
#1  0x0829b728 in AllocSetReset (context=0x845ff58) at aset.c:407
#2  0x0829c958 in MemoryContextReset (context=0x845ff58) at mcxt.c:129
#3  0x0817dce5 in ExecResult (node=0x84a0754) at nodeResult.c:113
#4  0x0816b423 in ExecProcNode (node=0x84a0754) at execProcnode.c:334
#5  0x081698fb in ExecutePlan (estate=0x84a05bc, planstate=0x84a0754, operation=CMD_SELECT, numberTuples=0, direction=138818820, dest=0x84102ec) at execMain.c:1145
#6  0x0816888b in ExecutorRun (queryDesc=0x842c680, direction=ForwardScanDirection, count=138818820) at execMain.c:223
#7  0x08204a08 in PortalRunSelect (portal=0x842eae4, forward=1 '\001', count=0, dest=0x84102ec) at pquery.c:803
#8  0x08204762 in PortalRun (portal=0x842eae4, count=2147483647, dest=0x84102ec, altdest=0x84102ec, completionTag=0xbfc23cb0 "") at pquery.c:655
#9  0x082001e5 in exec_simple_query (query_string=0x840f91c "SELECT in_out_t_2(13, true);") at postgres.c:1004
#10 0x08202de5 in PostgresMain (argc=4, argv=0x83bd7fc, username=0x83bd7d4 "vy") at postgres.c:3184
#11 0x081d6b54 in BackendRun (port=0x83d21a8) at postmaster.c:2853
#12 0x081d636f in BackendStartup (port=0x83d21a8) at postmaster.c:2490
#13 0x081d455e in ServerLoop () at postmaster.c:1203
#14 0x081d39ca in PostmasterMain (argc=3, argv=0x83bb888) at postmaster.c:955
#15 0x0818d404 in main (argc=3, argv=0x83bb888) at main.c:187

Breakpoint 1, PrintCatCacheLeakWarning (tuple=0xb5ef7dbc) at catcache.c:1808
1808            Assert(ct->ct_magic == CT_MAGIC);
(gdb) where
#0  PrintCatCacheLeakWarning (tuple=0xb5ef7dbc) at catcache.c:1808
#1  0x0829e927 in ResourceOwnerReleaseInternal (owner=0x83da800, phase=RESOURCE_RELEASE_AFTER_LOCKS, isCommit=1 '\001', isTopLevel=0 '\0') at resowner.c:273
#2  0x0829e64c in ResourceOwnerRelease (owner=0x83da800, phase=RESOURCE_RELEASE_AFTER_LOCKS, isCommit=1 '\001', isTopLevel=0 '\0') at resowner.c:165
#3  0x0829dd8e in PortalDrop (portal=0x842eae4, isTopCommit=0 '\0') at portalmem.c:358
#4  0x082001f9 in exec_simple_query (query_string=0x840f91c "SELECT in_out_t_2(13, true);") at postgres.c:1012
#5  0x08202de5 in PostgresMain (argc=4, argv=0x83bd7fc, username=0x83bd7d4 "vy") at postgres.c:3184
#6  0x081d6b54 in BackendRun (port=0x83d21a8) at postmaster.c:2853
#7  0x081d636f in BackendStartup (port=0x83d21a8) at postmaster.c:2490
#8  0x081d455e in ServerLoop () at postmaster.c:1203
#9  0x081d39ca in PostmasterMain (argc=3, argv=0x83bb888) at postmaster.c:955       
#10 0x0818d404 in main (argc=3, argv=0x83bb888) at main.c:187


Re: "cache reference leak" and "problem in alloc set" warnings

From
Volkan YAZICI
Date:
On Aug 16 03:09, Volkan YAZICI wrote:
> WARNING:  problem in alloc set ExprContext: detected write past chunk
> end in block 0x8462f00, chunk 0x84634c8
> WARNING:  cache reference leak: cache pg_type (34), tuple 2/7 has
> count 1

Sorry for bugging the list; I've solved the problem. I needed to make
sure that every SearchSysCache() call is followed by a matching
ReleaseSysCache() call.


Regards.
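
The pairing rule above can be illustrated with a small self-contained
model. This is only a sketch: the ToyCacheEntry type and the toy_*
functions are hypothetical stand-ins invented for illustration, not
PostgreSQL's SearchSysCache()/ReleaseSysCache() API, and the keys are
arbitrary. It shows why a lookup that is never released leaves a nonzero
reference count behind, which is the condition the "cache reference
leak" warning reports.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a cached catalog tuple; not PostgreSQL's real struct. */
typedef struct ToyCacheEntry
{
    int     key;        /* lookup key (arbitrary values in this sketch) */
    int     refcount;   /* number of outstanding references */
} ToyCacheEntry;

#define TOY_CACHE_SIZE 4

static ToyCacheEntry toy_cache[TOY_CACHE_SIZE] = {
    {16, 0}, {23, 0}, {25, 0}, {2249, 0}
};

/* Look up an entry and pin it; the caller must release it later. */
ToyCacheEntry *
toy_search_cache(int key)
{
    for (int i = 0; i < TOY_CACHE_SIZE; i++)
    {
        if (toy_cache[i].key == key)
        {
            toy_cache[i].refcount++;
            return &toy_cache[i];
        }
    }
    return NULL;                /* not found: nothing to release */
}

/* Drop one reference previously taken by toy_search_cache(). */
void
toy_release_cache(ToyCacheEntry *entry)
{
    assert(entry != NULL && entry->refcount > 0);
    entry->refcount--;
}

/* Count entries that are still pinned, as an end-of-query check might. */
int
toy_count_leaks(void)
{
    int     leaks = 0;

    for (int i = 0; i < TOY_CACHE_SIZE; i++)
        if (toy_cache[i].refcount != 0)
            leaks++;
    return leaks;
}

/* Correct pattern: every successful search is paired with a release. */
int
toy_lookup_paired(int key)
{
    ToyCacheEntry *entry = toy_search_cache(key);
    int         found;

    if (entry == NULL)
        return 0;
    found = entry->key;         /* copy out what we need before releasing */
    toy_release_cache(entry);
    return found;
}
```

In a real PL handler the same shape would apply: copy whatever is needed
out of the cached tuple, then release it on every code path, including
early returns.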


Re: "cache reference leak" and "problem in alloc set" warnings

From
Volkan YAZICI
Date:
On Aug 16 04:20, Volkan YAZICI wrote:
> On Aug 16 03:09, Volkan YAZICI wrote:
> > WARNING:  problem in alloc set ExprContext: detected write past chunk
> > end in block 0x8462f00, chunk 0x84634c8
> > WARNING:  cache reference leak: cache pg_type (34), tuple 2/7 has
> > count 1
> 
> Sorry for bugging the list; I've solved the problem. I needed to make
> sure that every SearchSysCache() call is followed by a matching
> ReleaseSysCache() call.

It looks like this only fixes the catalog cache reference leak.
I'm still bitten by a single "write past chunk end" warning when
returning a record in PL/scheme:
 WARNING:  problem in alloc set ExprContext: detected write past chunk end in block 0x84a0598, chunk 0x84a0c84

At first, I thought it was because I was clobbering a memory chunk that
doesn't belong to me. But when I place a line like
 { char *tmp = palloc(32); printf("-> %p\n", tmp); pfree(tmp); }

at the entry and exit of the PL handler, the printed addresses don't
bracket the 0x84a0598 address above. Even the addresses of the heap
tuples I create are far from the address in the warning message.

I don't have any clue which section of the code is at fault, although I
know the problem occurs when returning a record. I'd very much
appreciate it if somebody could help me figure out how to debug (or
even solve) the problem.


Regards.

P.S. Here's the related source code, in case anyone wants to take a look:
http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/~checkout~/plscheme/plscheme/plscheme-8.2.c?rev=1.3&content-type=text/plain


Re: "cache reference leak" and "problem in alloc set" warnings

From
Tom Lane
Date:
Volkan YAZICI <yazicivo@ttnet.net.tr> writes:
> I'm still bitten by a single "write past chunk end" warning when
> returning a record in PL/scheme:

>   WARNING:  problem in alloc set ExprContext: detected write past chunk
>   end in block 0x84a0598, chunk 0x84a0c84

The actual bug, almost certainly, is that you're miscomputing the space
needed for a variable-size palloc request.  But tracking that down will
be hard until you find out which chunk it is. 

Do you have a sequence that will make the problem happen consistently at
the same address?  If so, you can use a gdb watchpoint to find out where
the write-past-end is happening.  Or use a conditional breakpoint in
AllocSetAlloc to try to identify where the chunk is handed out.

Another possibility is to set a breakpoint where the warning is emitted
and take a look at the contents of the chunk to see if you can identify
it; that wouldn't require knowing the target chunk address in advance.

BTW, if I recall that code correctly, the "chunk address" in the message
is probably the address of the start of the overhead data for the chunk,
not the usable-space start address that is passed back by palloc.
        regards, tom lane
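
As a rough illustration of the last two points, the sketch below models
an allocator that keeps a small header in front of the usable space and
a marker byte just past the requested size. All names (ToyChunkHeader,
toy_alloc, toy_chunk_ok, TOY_SENTINEL) are hypothetical and the layout
is deliberately simplified; it is not the aset.c implementation. It
shows how a one-byte overrun can be detected later, and why the header
address differs from the pointer handed back to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define TOY_SENTINEL 0x7E   /* arbitrary marker value chosen for this sketch */

/* Bookkeeping stored immediately before the caller's usable space. */
typedef struct ToyChunkHeader
{
    size_t  requested;  /* number of bytes the caller asked for */
} ToyChunkHeader;

/* Allocate 'size' usable bytes plus a header and one trailing marker byte. */
void *
toy_alloc(size_t size)
{
    ToyChunkHeader *hdr = malloc(sizeof(ToyChunkHeader) + size + 1);
    unsigned char  *user;

    if (hdr == NULL)
        return NULL;
    hdr->requested = size;
    user = (unsigned char *) (hdr + 1);
    memset(user, 0, size);
    user[size] = TOY_SENTINEL;  /* first byte past the requested size */
    return user;
}

/* Map a usable-space pointer back to the address of its header. */
ToyChunkHeader *
toy_chunk_of(void *user)
{
    assert(user != NULL);
    return ((ToyChunkHeader *) user) - 1;
}

/* True if the marker is intact, i.e. nothing wrote past the chunk end. */
bool
toy_chunk_ok(void *user)
{
    ToyChunkHeader      *hdr = toy_chunk_of(user);
    const unsigned char *bytes = user;

    return bytes[hdr->requested] == TOY_SENTINEL;
}

/* Release a chunk obtained from toy_alloc(). */
void
toy_free(void *user)
{
    free(toy_chunk_of(user));
}
```

In this model, an address printed by a checker that walks the headers
would be sizeof(ToyChunkHeader) bytes below the pointer the caller saw,
so comparing it directly against values returned by the allocator would
not match.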


Re: "cache reference leak" and "problem in alloc set" warnings

From
Volkan YAZICI
Date:
On Aug 17 10:38, Tom Lane wrote:
> Volkan YAZICI <yazicivo@ttnet.net.tr> writes:
> > I'm still bitten by a single "write past chunk end" warning when
> > returning a record in PL/scheme:
> 
> >   WARNING:  problem in alloc set ExprContext: detected write past chunk
> >   end in block 0x84a0598, chunk 0x84a0c84
> 
> The actual bug, almost certainly, is that you're miscomputing the space
> needed for a variable-size palloc request.  But tracking that down will
> be hard until you find out which chunk it is. 

It looks like my palloc() math was correct. I had just missed the special
handling of the attnulls array passed to heap_formtuple(). It should have been
 attnulls[i] = (isnull) ? 'n' : ' ';

> Do you have a sequence that will make the problem happen consistently at
> the same address?  If so, you can use a gdb watchpoint to find out where
> the write-past-end is happening.  Or use a conditional breakpoint in
> AllocSetAlloc to try to identify where the chunk is handed out.

Yeah! That's exactly it. After setting a watchpoint on *0x84a0c84, the
first "where" put the erroneous line right in front of me!

> Another possibility is to set a breakpoint where the warning is emitted
> and take a look at the contents of the chunk to see if you can identify
> it; that wouldn't require knowing the target chunk address in advance.
> 
> BTW, if I recall that code correctly, the "chunk address" in the message
> is probably the address of the start of the overhead data for the chunk,
> not the usable-space start address that is passed back by palloc.

Thanks so much for your kind help. All of the methods mentioned are
applicable to software development in general. Thanks again.


Regards.


Re: "cache reference leak" and "problem in alloc set" warnings

From
Tom Lane
Date:
Volkan YAZICI <yazicivo@ttnet.net.tr> writes:
> It looks like my palloc() math was correct. I had just missed the special
> handling of the attnulls array passed to heap_formtuple(). It should have been

>   attnulls[i] = (isnull) ? 'n' : ' ';

These days I'd use heap_form_tuple in new code --- then you can work
with plain bool isnull flags instead of that weird 'n'/' ' convention.
        regards, tom lane
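
For code that still has to call heap_formtuple(), the convention
mentioned above can be wrapped in a small helper. The sketch below is
self-contained and assumes nothing beyond the 'n'/' ' encoding quoted in
this thread; toy_fill_legacy_nulls is a hypothetical name, not a
PostgreSQL function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Fill a legacy-style null-flag array from plain bool flags:
 * 'n' marks a NULL attribute, ' ' marks a non-NULL one.
 * The caller must provide room for natts chars in nulls[].
 */
void
toy_fill_legacy_nulls(const bool *isnull, char *nulls, int natts)
{
    assert(isnull != NULL && nulls != NULL && natts >= 0);
    for (int i = 0; i < natts; i++)
        nulls[i] = isnull[i] ? 'n' : ' ';
}
```

Keeping the bool array as the source of truth means only this one
helper knows about the character encoding, which makes a later switch to
heap_form_tuple a matter of dropping the conversion.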