Re: COALESCE() query yield different result with MJ vs. NLJ/HJ - Mailing list pgsql-hackers

From Tom Lane
Subject Re: COALESCE() query yield different result with MJ vs. NLJ/HJ
Date
Msg-id 19822.1428187389@sss.pgh.pa.us
Whole thread Raw
In response to COALESCE() query yield different result with MJ vs. NLJ/HJ  (Qingqing Zhou <zhouqq.postgres@gmail.com>)
List pgsql-hackers
Qingqing Zhou <zhouqq.postgres@gmail.com> writes:
> [ this fails: ]
> set enable_mergejoin=on; set enable_nestloop=off; set enable_hashjoin=off;
> explain analyze select a, b from t1 left join  t2 on coalesce(a, 1) =
> coalesce(b,1)  where (coalesce(b,1))>0

Ugh.  The core of the problem is a mistaken assumption that "b" below the
outer join means the same thing as "b" above it.  I've suspected for years
that the planner might someday have to explicitly distinguish the two
meanings, but at least up to now we've not really gotten burnt by failing
to make the distinction.

> A possible explanation is that in fix_join_expr_mutator(), we optimize
> with the case that if child node already compute an expression then
> upper node shall reuse it. In MJ, as coalesce() already computed in
> sort node, thus the NULL is directly used for ExecQual(>0) for join
> filter.
> If we take out this optimization the problem is solved but may looks
> like an overkill. What's a better fix?

Indeed, removing that optimization altogether seems likely to break
things, not to mention being pretty inefficient.  Maybe pending a
proper fix (which I'm afraid will entail major planner redesign)
we could refuse to match anything more complex than a Var or
PlaceHolderVar if it's bubbling up from the nullable side of an
outer join.
        regards, tom lane



pgsql-hackers by date:

Previous
From: Jim Nasby
Date:
Subject: Re: Freeze avoidance of very large table.
Next
From: Tomas Vondra
Date:
Subject: PATCH: nbklno in _hash_splitbucket unused (after ed9cc2b)