Richard Guo <guofenglinux@gmail.com> writes: > Rebase the patch on HEAD as cfbot reminds.
This fix seems like a mess. The function that is in charge of filling RelOptInfo.lateral_vars is extract_lateral_references; or at least that was how it was done up to now. Can't we compute these additional references there? If not, maybe we ought to just merge extract_lateral_references into create_lateral_join_info, rather than having the responsibility split. I also wonder whether this change isn't creating hidden dependencies on RTE order (which would likely be bugs), since create_lateral_join_info itself examines the lateral_vars lists as it walks the rtable.
Yeah, you're right. Currently all RelOptInfo.lateral_vars are filled in extract_lateral_references. And then in create_lateral_join_info these lateral_vars, together with all PlaceHolderInfos, are examined to compute the per-base-relation lateral refs relids. However, in the process of extract_lateral_references, it's possible that we create new PlaceHolderInfos. So I think it may not be a good idea to extract the lateral references within PHVs there. But I agree with you that it's also not a good idea to compute these additional lateral Vars within PHVs in create_lateral_join_info as the patch does. Actually with the patch I find that with PHVs that are due to be evaluated at a join we may get a problematic plan. For instance
explain (costs off) select * from t t1 left join lateral (select t1.a as t1a, t2.a as t2a from t t2 join t t3 on true) s on true where s.t1a = s.t2a; QUERY PLAN ------------------------------------ Nested Loop -> Seq Scan on t t1 -> Nested Loop Join Filter: (t1.a = t2.a) -> Seq Scan on t t2 -> Memoize Cache Key: t1.a Cache Mode: binary -> Seq Scan on t t3 (9 rows)
There are no lateral refs in the subtree of the Memoize node, so it should be a Materialize node rather than a Memoize node. This is caused by that for a PHV that is due to be evaluated at a join, we fill its lateral refs in each baserel in the join, which is wrong.
So I'm wondering if it'd be better that we move all this logic of computing additional lateral references within PHVs to get_memoize_path, where we can examine only PHVs that are evaluated at innerrel. And considering that these lateral refs are only used by Memoize, it seems more sensible to compute them there. But I'm a little worried that doing this would make get_memoize_path too expensive.