Thread: BUG #3012: Wrong JOIN order when a JOIN depends on result from a LEFT JOIN.
BUG #3012: Wrong JOIN order when a JOIN depends on result from a LEFT JOIN.
From
"Pelle Johansson"
Date:
The following bug has been logged online: Bug reference: 3012 Logged by: Pelle Johansson Email address: morth@morth.org PostgreSQL version: 8.2.3 Operating system: RedHat CentOS 4.4 Description: Wrong JOIN order when a JOIN depends on result from a LEFT JOIN. Details: We have a join where we select which row to join on a subquery with a coalesce on a column from a left join, which is not working as expected. Instead of getting one result row, we get one for each row matching the other conditions in the join (as if the subselect didn't exist), after that the left join is applied. Steps to reproduce: CREATE TABLE ta (ta_id integer PRIMARY KEY); CREATE TABLE tb (ta_id integer NOT NULL, tb_id integer NOT NULL, type varchar NOT NULL, ts timestamp NOT NULL); INSERT INTO ta VALUES (1); INSERT INTO tb VALUES (1, 1, 'appear', '2007-02-15 00:00:00'); INSERT INTO tb VALUES (1, 2, 'delete', '2007-02-15 00:00:01'); INSERT INTO tb VALUES (1, 3, 'appear', '2007-02-15 00:00:02'); SELECT * FROM ta LEFT JOIN ( SELECT ta_id, max(ts) AS ts, max(tb_id) AS tb_id FROM tb WHERE type = 'delete' GROUP BY ta_id ) AS last_delete USING (ta_id) JOIN tb AS appear ON appear.ta_id = ta.ta_id AND appear.tb_id = ( SELECT min(tb_id) FROM tb WHERE ta_id = ta.ta_id AND type != 'delete' AND coalesce(ts > last_delete.ts, true) ); ta_id | ts | tb_id | ta_id | tb_id | type | ts -------+---------------------+-------+-------+-------+--------+------------- -------- 1 | | | 1 | 1 | appear | 2007-02-15 00:00:00 1 | | | 1 | 2 | delete | 2007-02-15 00:00:01 1 | 2007-02-15 00:00:01 | 2 | 1 | 3 | appear | 2007-02-15 00:00:02 (3 rows) EXPLAIN SELECT .... QUERY PLAN ---------------------------------------------------------------------------- --------------------------------- Hash Left Join (cost=80.64..146.46 rows=990 width=64) Hash Cond: (ta.ta_id = last_delete.ta_id) Join Filter: (appear.tb_id = (subplan)) -> Hash Join (cost=58.15..92.90 rows=990 width=52) Hash Cond: (appear.ta_id = ta.ta_id) -> Seq Scan on tb appear (cost=0.00..19.90 rows=990 width=48) -> Hash (cost=31.40..31.40 rows=2140 width=4) -> Seq Scan on ta (cost=0.00..31.40 rows=2140 width=4) -> Hash (cost=22.46..22.46 rows=2 width=16) -> Subquery Scan last_delete (cost=22.41..22.46 rows=2 width=16) -> HashAggregate (cost=22.41..22.44 rows=2 width=16) -> Seq Scan on tb (cost=0.00..22.38 rows=5 width=16) Filter: (("type")::text = 'delete'::text) SubPlan -> Aggregate (cost=27.34..27.35 rows=1 width=4) -> Seq Scan on tb (cost=0.00..27.32 rows=5 width=4) Filter: ((ta_id = $0) AND (("type")::text <> 'delete'::text) AND COALESCE((ts > $1), true)) (17 rows) Workaround: The expected result of one result row can be obtained either by changing the JOIN to a LEFT JOIN or by setting join_collapse_limit = 1: SET join_collapse_limit = 1; SELECT .... ta_id | ts | tb_id | ta_id | tb_id | type | ts -------+---------------------+-------+-------+-------+--------+------------- -------- 1 | 2007-02-15 00:00:01 | 2 | 1 | 3 | appear | 2007-02-15 00:00:02 (1 row) QUERY PLAN ---------------------------------------------------------------------------- --------------------------------- Merge Join (cost=91.63..27264.54 rows=5 width=64) Merge Cond: (ta.ta_id = appear.ta_id) Join Filter: (appear.tb_id = (subplan)) -> Merge Left Join (cost=22.47..96.20 rows=2140 width=16) Merge Cond: (ta.ta_id = last_delete.ta_id) -> Index Scan using ta_pkey on ta (cost=0.00..68.35 rows=2140 width=4) -> Sort (cost=22.47..22.48 rows=2 width=16) Sort Key: last_delete.ta_id -> Subquery Scan last_delete (cost=22.41..22.46 rows=2 width=16) -> HashAggregate (cost=22.41..22.44 rows=2 width=16) -> Seq Scan on tb (cost=0.00..22.38 rows=5 width=16) Filter: (("type")::text = 'delete'::text) -> Sort (cost=69.16..71.63 rows=990 width=48) Sort Key: appear.ta_id -> Seq Scan on tb appear (cost=0.00..19.90 rows=990 width=48) SubPlan -> Aggregate (cost=27.34..27.35 rows=1 width=4) -> Seq Scan on tb (cost=0.00..27.32 rows=5 width=4) Filter: ((ta_id = $0) AND (("type")::text <> 'delete'::text) AND COALESCE((ts > $1), true)) (19 rows) Postgres 8.1.4 also provides the correct result, without any tweaking. TIA, Pelle Johansson
"Pelle Johansson" <morth@morth.org> writes: > We have a join where we select which row to join on a subquery with a > coalesce on a column from a left join, which is not working as expected. Hm, this is a fun one. The problem basically is that (1) The join qual "appear.tb_id = (SELECT ..." gets marked as is_pushed_down = false, because it uses all three relations of the outer query (ta, last_delete, tb) and so it's not possible to evaluate it at any lower syntactic level. (2) For whatever reason, the planner decides it can swap the order of the two joins and do the innerjoin first. (This wasn't possible before 8.2, hence no bug before.) (3) The join qual is correctly placed at the left join, since it's now the top join ... but because the qual's not marked is_pushed_down, the createplan.c code thinks it's a join qual of the outer join, and hence stores it as a "Join Filter" instead of just "Filter". This makes the wrong things happen --- the executor emits null-extended rows wherever the qual fails, instead of emitting no row as it should. I think we can band-aid this by forcing is_pushed_down = true for all innerjoin quals, but that suggests that the whole concept may need a bit of a rethink... regards, tom lane
"Pelle Johansson" <morth@morth.org> writes: > We have a join where we select which row to join on a subquery with a > coalesce on a column from a left join, which is not working as expected. The attached patch should fix this. regards, tom lane
Attachment
Re: BUG #3012: Wrong JOIN order when a JOIN depends on result from a LEFT JOIN.
From
Pelle Johansson
Date:
16 feb 2007 kl. 22.00 skrev Tom Lane: > "Pelle Johansson" <morth@morth.org> writes: >> We have a join where we select which row to join on a subquery with a >> coalesce on a column from a left join, which is not working as >> expected. > > The attached patch should fix this. Hi Tom, Thanks for the quick fix. We don't really have any good way to apply the patch, so we'll just use our workaround in the mean time, but it's good to know the issue is resolved. -- Pelle Johansson