In
http://www.postgresql.org/message-id/CAK_s-G3-fWVeeR1c0idVTZ0745-7rYiFi8WHBZCNmSn+HWCbYw@mail.gmail.com
it's pointed out that commit 2ffa740b was a few bricks shy of a load,
because it failed to cope with the possibility of a joinaliasvars item
containing an implicit coercion. That's not too hard to fix by adding a
strip_implicit_coercions() call, but while testing it I realized that
there's a second bug in the same place: the code also fails to cope with
a Const arising from a dropped input-relation column. (The comment
claiming that this can't happen is flat wrong, because we *have* passed
the view query through AcquireRewriteLocks().) I fixed that too, and
improved the comment in parsenodes.h that led me to overlook implicit
coercions in the first place, as per the attached WIP patch.
I then went looking for other places that might be assuming too much
about what is in joinaliasvars lists, and found several pre-existing
bugs :-(. The nastiest one is in flatten_join_alias_vars_mutator's code
to expand a whole-row reference, which supposes that if it finds a Const
item then that must represent a dropped column. That would be true in
the parser or rewriter, but at this point in planning we have already
done subquery pullup, which means we could find anything including a
Const there. The code would thus mistake a pulled-up constant subquery
output for a dropped column. An example in the regression database is
explain verbose select j from (int4_tbl join (select q1 as f1, q2 as z, 42 as c from int8_tbl) ss using(f1)) j;
QUERY PLAN
---------------------------------------------------------------------------
Hash Join (cost=1.11..2.23 rows=5 width=16)
Output: ROW(int8_tbl.q1, int8_tbl.q2)
Hash Cond: (int8_tbl.q1 = int4_tbl.f1)
-> Seq Scan on public.int8_tbl (cost=0.00..1.05 rows=5 width=16)
Output: int8_tbl.q1, int8_tbl.q2
-> Hash (cost=1.05..1.05 rows=5 width=4)
Output: int4_tbl.f1
-> Seq Scan on public.int4_tbl (cost=0.00..1.05 rows=5 width=4)
Output: int4_tbl.f1
(9 rows)
The 42 has disappeared entirely from the ROW() construct :-(. This can
be reproduced in all active branches.
After some reflection I think that the best fix is to redefine
AcquireRewriteLocks' processing of dropped columns so that it puts an
actual null pointer, not a consed-up Const, into the joinaliasvars list
item. This is reliably distinguishable from anything we might pull up
from a subquery output, and it doesn't really lose information since the
Const was completely phony anyway. (I think the original design
envisioned having the Const carrying column datatype info, but we
abandoned that idea upon realizing that the dropped column might be of a
dropped type. The phony Const is currently always of type INT4.)
This definitional change will not affect any rules-on-disk since the
dropped-column substitution is only done on rule trees when they are
loaded back into memory.
A risk we'd be taking with this change is that any extension code that
looks at post-rewriter joinaliasvars lists might be confused. However,
I'm not sure why extension code would be looking at those. In any case,
I can't see a different back-patchable change that would reduce such a
risk; we have to change the representation *somehow* if we're going to
distinguish these cases correctly.
Thoughts?
regards, tom lane