Thread: BUG #14174: Expanded-datum bug accessing freed memory

BUG #14174: Expanded-datum bug accessing freed memory

From
andrew@tao11.riddles.org.uk
Date:
VGhlIGZvbGxvd2luZyBidWcgaGFzIGJlZW4gbG9nZ2VkIG9uIHRoZSB3ZWJz
aXRlOgoKQnVnIHJlZmVyZW5jZTogICAgICAxNDE3NApMb2dnZWQgYnk6ICAg
ICAgICAgIEFuZHJldyBHaWVydGgKRW1haWwgYWRkcmVzczogICAgICBhbmRy
ZXdAdGFvMTEucmlkZGxlcy5vcmcudWsKUG9zdGdyZVNRTCB2ZXJzaW9uOiA5
LjUuMwpPcGVyYXRpbmcgc3lzdGVtOiAgIGFueQpEZXNjcmlwdGlvbjogICAg
ICAgIAoKVGhpcyBpcyBhIHRlc3RjYXNlIGRpc3RpbGxlZCBmcm9tIGEgcHJv
YmxlbSByZXBvcnRlZCBvbiBJUkM6DQoNCmNyZWF0ZSBvciByZXBsYWNlIGZ1
bmN0aW9uIGFmbjEoaW50ZWdlcikgcmV0dXJucyBpbnRbXSBsYW5ndWFnZSBw
bHBnc3FsDQogYXMgJGYkIGRlY2xhcmUgciBpbnRbXTsgYmVnaW4gc2VsZWN0
IGludG8gciBhcnJheVskMSwkMV07IHJldHVybiByOyBlbmQ7CiRmJDsNCmNy
ZWF0ZSBvciByZXBsYWNlIGZ1bmN0aW9uIGFmbjIoaW50W10pIHJldHVybnMg
aW50ZWdlciBsYW5ndWFnZSBwbHBnc3FsDQogYXMgJGYkIGJlZ2luIHJldHVy
biAkMVsxXTsgZW5kOyAkZiQ7DQpjcmVhdGUgb3IgcmVwbGFjZSB2aWV3IHZf
dGVzdCBhcyBzZWxlY3QgaSwgYSBmcm9tIChzZWxlY3QgYWZuMSgxKSBhcyBh
Cm9mZnNldCAwKSBzLCBsYXRlcmFsIGFmbjIoYSkgaTsNCmNyZWF0ZSBmdW5j
dGlvbiB0c3RmbihpbnRlZ2VyKSByZXR1cm5zIHNldG9mIHZfdGVzdCBsYW5n
dWFnZSBwbHBnc3FsDQogYXMgJGYkIGJlZ2luIHJldHVybiBxdWVyeSBzZWxl
Y3QgKiBmcm9tIHZfdGVzdCB3aGVyZSBpPSQxOyBlbmQ7ICRmJDsNCnNlbGVj
dCAqIGZyb20gdHN0Zm4oMSk7DQoNCkVSUk9SOiAgY2FjaGUgbG9va3VwIGZh
aWxlZCBmb3IgdHlwZSAyMTM5MDYyMTQzDQoNCigyMTM5MDYyMTQzID0gMHg3
ZjdmN2Y3ZiwgYW5kIEknbSB0ZXN0aW5nIHRoaXMgb24gYSBjYXNzZXJ0IGJ1
aWxkLCBzbyB0aGlzCmlzIGFjY2Vzc2luZyBmcmVlZCBtZW1vcnkgc29tZXdo
ZXJlIC0gdGhlIG9yaWdpbmFsIHJlcG9ydGVyIHdhcyBnZXR0aW5nCiJjYWNo
ZSBsb29rdXAgZmFpbGVkIGZvciB0eXBlIDAiIG9yIGEgc2VnZmF1bHQpDQoN
CklmIHRoZSAicmV0dXJuIHI7IiBpbiBhZm4xKCkgaXMgY2hhbmdlZCB0byAi
cmV0dXJuIHIgfHwgJ3t9JzsiLCBvciBpZiB0aGUKbGF0ZXJhbCBjYWxsIHRv
IGFmbjIgaXMgb21pdHRlZCwgdGhlIHByb2JsZW0gaXMgbm90IGFwcGFyZW50
LgoK

Re: BUG #14174: Expanded-datum bug accessing freed memory

From
Tom Lane
Date:
andrew@tao11.riddles.org.uk writes:
> This is a testcase distilled from a problem reported on IRC:

> create or replace function afn1(integer) returns int[] language plpgsql
>  as $f$ declare r int[]; begin select into r array[$1,$1]; return r; end;
> $f$;
> create or replace function afn2(int[]) returns integer language plpgsql
>  as $f$ begin return $1[1]; end; $f$;
> create or replace view v_test as select i, a from (select afn1(1) as a
> offset 0) s, lateral afn2(a) i;
> create function tstfn(integer) returns setof v_test language plpgsql
>  as $f$ begin return query select * from v_test where i=$1; end; $f$;
> select * from tstfn(1);

> ERROR:  cache lookup failed for type 2139062143

Hm, you don't need tstfn(), nor even the view:

regression=# select i, a from (select afn1(1) as a offset 0) s, lateral afn2(a) i;
ERROR:  cache lookup failed for type 2139062143

The core of the problem here is that afn1() is returning a read-write
pointer to the expanded object holding "r", and then when that's passed
to afn2(), it supposes that it can take ownership of it as a read-write
local variable; which means the value gets destroyed when afn2() exits.
That'd be all right except there's another reference to "a" still to be
evaluated :-(

We should have forestalled this by applying MakeExpandedObjectReadOnly
to the output of the first sub-select.  But we didn't because that's
currently only done by SubqueryNext, and the SubqueryScan node that
would have fixed this has been optimized away:

 Nested Loop  (cost=0.25..0.54 rows=1 width=36)
   Output: i.i, (afn1(1))
   ->  Result  (cost=0.00..0.26 rows=1 width=32)
         Output: afn1(1)
   ->  Function Scan on public.afn2 i  (cost=0.25..0.26 rows=1 width=4)
         Output: i.i
         Function Call: afn2((afn1(1)))

I am thinking maybe we need to have ExecProject do
MakeExpandedObjectReadOnly on each result, rather than assuming that
SubqueryScan is the place for that.  This would slightly increase the
general overhead attributable to the expanded-object feature, which is
unfortunate, but right now it's not clear that anything less is safe.
Making Result nodes do that would fix this particular instance but
there are plenty of other node types that might appear at the top of
a sub-select.

A possible future improvement is to teach the planner to detect which
variables are actually multiply referenced, and force read-only-ness
for only those values.  But that's clearly not something that would be
reasonable to back-patch into 9.5, or even 9.6 at this point.

            regards, tom lane

Re: BUG #14174: Expanded-datum bug accessing freed memory

From
Andrew Gierth
Date:
[CC'ing in the original reporter, edward.greve at gmail.com]

>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

 Tom> The core of the problem here is that afn1() is returning a
 Tom> read-write pointer to the expanded object holding "r", and then
 Tom> when that's passed to afn2(), it supposes that it can take
 Tom> ownership of it as a read-write local variable; which means the
 Tom> value gets destroyed when afn2() exits.  That'd be all right
 Tom> except there's another reference to "a" still to be evaluated :-(

Yeah, I figured it was something like that, but this is the first time
I've had to look at the expanded object stuff.

 Tom> I am thinking maybe we need to have ExecProject do
 Tom> MakeExpandedObjectReadOnly on each result, rather than assuming
 Tom> that SubqueryScan is the place for that.  This would slightly
 Tom> increase the general overhead attributable to the expanded-object
 Tom> feature, which is unfortunate, but right now it's not clear that
 Tom> anything less is safe.

I concur.

 Tom> Making Result nodes do that would fix this particular instance but
 Tom> there are plenty of other node types that might appear at the top
 Tom> of a sub-select.

The Result node here is certainly an artifact of the testcase
construction; the original report (which featured about 300 lines of
view and function definitions, some of them with additional subselects,
nested in various ways) would probably have had an Agg node at the
relevant spot, and could conceivably have had any projecting node type
AFAIK.

 Tom> A possible future improvement is to teach the planner to detect
 Tom> which variables are actually multiply referenced, and force
 Tom> read-only-ness for only those values.  But that's clearly not
 Tom> something that would be reasonable to back-patch into 9.5, or even
 Tom> 9.6 at this point.

Clearly.

--
Andrew (irc:RhodiumToad)