Re: Bug in planner - Mailing list pgsql-hackers

From David Rowley
Subject Re: Bug in planner
Date
Msg-id CAApHDvpyO-wOp2RP+AghLDfwMo7XAsxgMm3GRTJ-tW3yG1gOvQ@mail.gmail.com
Whole thread Raw
In response to Bug in planner  (Teodor Sigaev <teodor@sigaev.ru>)
Responses Re: Bug in planner
List pgsql-hackers
On 24 April 2015 at 21:43, Teodor Sigaev <teodor@sigaev.ru> wrote:
Hi!

I faced with planner error:
ERROR:  could not find RelOptInfo for given relids


Good find!

I've simplified your query a bit, the following still shows the issue (using your schema):

SELECT *
FROM t1
WHERE NOT EXISTS (SELECT t2.c2 AS c1
 FROM t2 
 LEFT OUTER JOIN t3 ON t2.c3 = t3.c1
 LEFT OUTER JOIN (SELECT t5.c1 AS c1
  FROM t4
  LEFT OUTER JOIN t5 ON t4.c2 = t5.c1 --offset 0
 ) a1 ON a1.c1 = t3.c2
 WHERE t1.c1 = t2.c2
);

I've done a little debugging on this too and I get the idea that in eqjoinsel() that min_righthand incorrectly does not have a bit set for "t3"

When the failing call is made to find_join_rel() with the above query relids being searched for has a decimal value of 388 (binary 110000100 i.e t5, t4, t2)

find_join_rel makes a pass over join_rel_list to search for the 388 valued relids.

 join_rel_list contains the following:

1 -> 396      (110001100) t5, t4, t3, t2
2 -> 384      (110000000) t5, t4
3 -> 392      (110001000) t5, t4, t3
4 -> 396      (110001100) t5, t4, t3, t2

I looked up simple_rte_array to determine which bits are for which relation.

simple_rte_array:
1 -> t1
2 -> t2
3 -> t3
4 -> join
5 -> a1
6 -> join
7 -> t4
8 -> t5


I'd imagine that the find_join_input_rel() search should actually be for relids 396. I need to spend more time in this area to get a better grasp of what's actually meant to be happening, but I think the problem lies in make_outerjoininfo() when it determines what min_righthand should be set to with the following:

/*
* Similarly for required RHS.  But here, we must also include any lower
* inner joins, to ensure we don't try to commute with any of them.
*/
min_righthand = bms_int_members(bms_union(clause_relids, inner_join_rels),
right_rels);

I think the problem seems to be down to the fact that inner_join_rels and clause_relids are built from deconstruct_jointree() which I'd imagine does not get modified when the subquery for t4 and t5 is pulled up, therefore is out-of-date. </theory>

I've attached a patch which appears to fix the problem, but this is more for the purposes of a demonstration of where the problem lies. I don't have enough knowledge of what's meant to be happening here, I'll need to spend more time reading code and debugging.

On a side note, I just discovered another join removal opportunity:

explain select * from t1 where not exists(select 1 from t2 left join t3 on t2.c1 = t3.c1 where t1.c1=t2.c1);

The join to t3 here is useless, as since it's a left join, the join could only cause duplication of t2 rows, and this does not matter as we're performing an anti join anyway (same applies for semi join).

Regards

David Rowley
Attachment

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: INSERT ... ON CONFLICT syntax issues
Next
From: Simon Riggs
Date:
Subject: Re: Feedback on getting rid of VACUUM FULL