Thread: BUG #14174: Expanded-datum bug accessing freed memory
VGhlIGZvbGxvd2luZyBidWcgaGFzIGJlZW4gbG9nZ2VkIG9uIHRoZSB3ZWJz aXRlOgoKQnVnIHJlZmVyZW5jZTogICAgICAxNDE3NApMb2dnZWQgYnk6ICAg ICAgICAgIEFuZHJldyBHaWVydGgKRW1haWwgYWRkcmVzczogICAgICBhbmRy ZXdAdGFvMTEucmlkZGxlcy5vcmcudWsKUG9zdGdyZVNRTCB2ZXJzaW9uOiA5 LjUuMwpPcGVyYXRpbmcgc3lzdGVtOiAgIGFueQpEZXNjcmlwdGlvbjogICAg ICAgIAoKVGhpcyBpcyBhIHRlc3RjYXNlIGRpc3RpbGxlZCBmcm9tIGEgcHJv YmxlbSByZXBvcnRlZCBvbiBJUkM6DQoNCmNyZWF0ZSBvciByZXBsYWNlIGZ1 bmN0aW9uIGFmbjEoaW50ZWdlcikgcmV0dXJucyBpbnRbXSBsYW5ndWFnZSBw bHBnc3FsDQogYXMgJGYkIGRlY2xhcmUgciBpbnRbXTsgYmVnaW4gc2VsZWN0 IGludG8gciBhcnJheVskMSwkMV07IHJldHVybiByOyBlbmQ7CiRmJDsNCmNy ZWF0ZSBvciByZXBsYWNlIGZ1bmN0aW9uIGFmbjIoaW50W10pIHJldHVybnMg aW50ZWdlciBsYW5ndWFnZSBwbHBnc3FsDQogYXMgJGYkIGJlZ2luIHJldHVy biAkMVsxXTsgZW5kOyAkZiQ7DQpjcmVhdGUgb3IgcmVwbGFjZSB2aWV3IHZf dGVzdCBhcyBzZWxlY3QgaSwgYSBmcm9tIChzZWxlY3QgYWZuMSgxKSBhcyBh Cm9mZnNldCAwKSBzLCBsYXRlcmFsIGFmbjIoYSkgaTsNCmNyZWF0ZSBmdW5j dGlvbiB0c3RmbihpbnRlZ2VyKSByZXR1cm5zIHNldG9mIHZfdGVzdCBsYW5n dWFnZSBwbHBnc3FsDQogYXMgJGYkIGJlZ2luIHJldHVybiBxdWVyeSBzZWxl Y3QgKiBmcm9tIHZfdGVzdCB3aGVyZSBpPSQxOyBlbmQ7ICRmJDsNCnNlbGVj dCAqIGZyb20gdHN0Zm4oMSk7DQoNCkVSUk9SOiAgY2FjaGUgbG9va3VwIGZh aWxlZCBmb3IgdHlwZSAyMTM5MDYyMTQzDQoNCigyMTM5MDYyMTQzID0gMHg3 ZjdmN2Y3ZiwgYW5kIEknbSB0ZXN0aW5nIHRoaXMgb24gYSBjYXNzZXJ0IGJ1 aWxkLCBzbyB0aGlzCmlzIGFjY2Vzc2luZyBmcmVlZCBtZW1vcnkgc29tZXdo ZXJlIC0gdGhlIG9yaWdpbmFsIHJlcG9ydGVyIHdhcyBnZXR0aW5nCiJjYWNo ZSBsb29rdXAgZmFpbGVkIGZvciB0eXBlIDAiIG9yIGEgc2VnZmF1bHQpDQoN CklmIHRoZSAicmV0dXJuIHI7IiBpbiBhZm4xKCkgaXMgY2hhbmdlZCB0byAi cmV0dXJuIHIgfHwgJ3t9JzsiLCBvciBpZiB0aGUKbGF0ZXJhbCBjYWxsIHRv IGFmbjIgaXMgb21pdHRlZCwgdGhlIHByb2JsZW0gaXMgbm90IGFwcGFyZW50 LgoK
andrew@tao11.riddles.org.uk writes: > This is a testcase distilled from a problem reported on IRC: > create or replace function afn1(integer) returns int[] language plpgsql > as $f$ declare r int[]; begin select into r array[$1,$1]; return r; end; > $f$; > create or replace function afn2(int[]) returns integer language plpgsql > as $f$ begin return $1[1]; end; $f$; > create or replace view v_test as select i, a from (select afn1(1) as a > offset 0) s, lateral afn2(a) i; > create function tstfn(integer) returns setof v_test language plpgsql > as $f$ begin return query select * from v_test where i=$1; end; $f$; > select * from tstfn(1); > ERROR: cache lookup failed for type 2139062143 Hm, you don't need tstfn(), nor even the view: regression=# select i, a from (select afn1(1) as a offset 0) s, lateral afn2(a) i; ERROR: cache lookup failed for type 2139062143 The core of the problem here is that afn1() is returning a read-write pointer to the expanded object holding "r", and then when that's passed to afn2(), it supposes that it can take ownership of it as a read-write local variable; which means the value gets destroyed when afn2() exits. That'd be all right except there's another reference to "a" still to be evaluated :-( We should have forestalled this by applying MakeExpandedObjectReadOnly to the output of the first sub-select. But we didn't because that's currently only done by SubqueryNext, and the SubqueryScan node that would have fixed this has been optimized away: Nested Loop (cost=0.25..0.54 rows=1 width=36) Output: i.i, (afn1(1)) -> Result (cost=0.00..0.26 rows=1 width=32) Output: afn1(1) -> Function Scan on public.afn2 i (cost=0.25..0.26 rows=1 width=4) Output: i.i Function Call: afn2((afn1(1))) I am thinking maybe we need to have ExecProject do MakeExpandedObjectReadOnly on each result, rather than assuming that SubqueryScan is the place for that. This would slightly increase the general overhead attributable to the expanded-object feature, which is unfortunate, but right now it's not clear that anything less is safe. Making Result nodes do that would fix this particular instance but there are plenty of other node types that might appear at the top of a sub-select. A possible future improvement is to teach the planner to detect which variables are actually multiply referenced, and force read-only-ness for only those values. But that's clearly not something that would be reasonable to back-patch into 9.5, or even 9.6 at this point. regards, tom lane
[CC'ing in the original reporter, edward.greve at gmail.com] >>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: Tom> The core of the problem here is that afn1() is returning a Tom> read-write pointer to the expanded object holding "r", and then Tom> when that's passed to afn2(), it supposes that it can take Tom> ownership of it as a read-write local variable; which means the Tom> value gets destroyed when afn2() exits. That'd be all right Tom> except there's another reference to "a" still to be evaluated :-( Yeah, I figured it was something like that, but this is the first time I've had to look at the expanded object stuff. Tom> I am thinking maybe we need to have ExecProject do Tom> MakeExpandedObjectReadOnly on each result, rather than assuming Tom> that SubqueryScan is the place for that. This would slightly Tom> increase the general overhead attributable to the expanded-object Tom> feature, which is unfortunate, but right now it's not clear that Tom> anything less is safe. I concur. Tom> Making Result nodes do that would fix this particular instance but Tom> there are plenty of other node types that might appear at the top Tom> of a sub-select. The Result node here is certainly an artifact of the testcase construction; the original report (which featured about 300 lines of view and function definitions, some of them with additional subselects, nested in various ways) would probably have had an Agg node at the relevant spot, and could conceivably have had any projecting node type AFAIK. Tom> A possible future improvement is to teach the planner to detect Tom> which variables are actually multiply referenced, and force Tom> read-only-ness for only those values. But that's clearly not Tom> something that would be reasonable to back-patch into 9.5, or even Tom> 9.6 at this point. Clearly. -- Andrew (irc:RhodiumToad)