Thread: pg_dump and join aliases (was Re: [BUGS] ERROR: cannot handle unplanned sub-select)

Merlin sent me a test case off-list for the problem mentioned here:
http://archives.postgresql.org/pgsql-bugs/2010-07/msg00025.php

After some investigation I was able to simplify it to the following
example using the regression database:

select (select sq1) as qq1
from (select exists(select 1 from int4_tbl where f1 = q2) as sq1, 42 as dummy  from int8_tbl) sq0 join int4_tbl i4 on
dummy= i4.f1;
 

The problem is that flatten_join_alias_vars() can push SubLink
expressions down into sub-selects, as in this example when it replaces
the "sq1" reference with the EXISTS() subexpression that was previously
pulled up by flattening sq0.  But it fails to set the hasSubLinks flag
in the sub-Query, so subsequent processing doesn't think it needs to do
SS_process_sublinks within the sub-Query, and eventually we fail when we
come across the unprocessed SubLink.  This bug goes clear back to 7.4.
Fortunately it's simple to fix.

What seems more interesting is that I initially had a hard time
reproducing the bug under different conditions, and didn't figure out
what was going on until I realized that I had used pg_dump to
consolidate the multiple files Merlin sent ... and *reloading pg_dump's
version of the views didn't exhibit the bug*.  This is because pg_dump,
or more accurately ruleutils.c, has a habit of qualifying variable
references whether or not they were qualified in the original query.
If you turn the above example into a view and then dump it, you'll get
... (select sq0.sq1) as qq1 ...

and that doesn't tickle this bug.  (That's because "sq0.sq1" isn't a
join alias Var, whereas unqualified "sq1" is.)

So the question that seems worth discussing is whether this difference
ought to be considered a bug in ruleutils.  In theory it shouldn't
matter if pg_dump adds an "unnecessary" qualification, but here's an
example where it did matter.  Do we care?  People tend to assume that
dumping and reloading will make no change in the behavior of their
views, so this seems kind of scary to me.  On the other hand, the
"extra" qualifications make view definitions a bit more robust in the
face of column additions, renamings, etc.  So there's certainly a case
to be made that the dump behavior is preferable as-is.

Thoughts?
        regards, tom lane


On Wed, Jul 7, 2010 at 6:55 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> After some investigation I was able to simplify it to the following
> example using the regression database:
>
> select
>  (select sq1) as qq1
> from
>  (select exists(select 1 from int4_tbl where f1 = q2) as sq1, 42 as dummy
>   from int8_tbl) sq0
>  join
>  int4_tbl i4 on dummy = i4.f1;
>
> [discussion of bug]
>
> What seems more interesting is that I initially had a hard time
> reproducing the bug under different conditions, and didn't figure out
> what was going on until I realized that I had used pg_dump to
> consolidate the multiple files Merlin sent ... and *reloading pg_dump's
> version of the views didn't exhibit the bug*.  This is because pg_dump,
> or more accurately ruleutils.c, has a habit of qualifying variable
> references whether or not they were qualified in the original query.
> If you turn the above example into a view and then dump it, you'll get
>
>        ... (select sq0.sq1) as qq1 ...
>
> and that doesn't tickle this bug.  (That's because "sq0.sq1" isn't a
> join alias Var, whereas unqualified "sq1" is.)

I'm lost.  What's a join alias var?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Robert Haas <robertmhaas@gmail.com> writes:
> I'm lost.  What's a join alias var?

Suppose we have t1 with columns a,b,c and t2 with columns d,e,f, then
consider
select a from t1 join t2 on (whatever)select t1.a from t1 join t2 on (whatever)

In the first case the parser generates a Var that references a column of
the unnamed join's RTE; in the second case you get a Var that references
t1 directly.  These particular cases are semantically equivalent, but
there are lots of other cases where it's important to draw the
distinction.  One interesting example is
select x from (t1 join t2 on (whatever)) as j(x,y,z,xx,yy,zz)

where per SQL spec it'd actually be illegal to write a (or t1.a) because
the named join hides its components.  But I think what forced us to have
different representations is FULL JOIN USING.  If you have
select id from taba full join tabb using (id)

then taba.id and tabb.id and the join's output variable id are all
semantically different and *must* be given different representations at
the Var level.

Anyway, the way it works is that the parser generates "alias Vars" that
refer to the join RTE, mainly because this makes life simpler for
ruleutils.  But the planner prefers to work with the "real" underlying
columns whenever those are semantically equivalent, so it has a pass
that does the replacement, and that's what's broken ...
        regards, tom lane


On Wed, Jul 7, 2010 at 7:43 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Anyway, the way it works is that the parser generates "alias Vars" that
> refer to the join RTE, mainly because this makes life simpler for
> ruleutils.  But the planner prefers to work with the "real" underlying
> columns whenever those are semantically equivalent, so it has a pass
> that does the replacement, and that's what's broken ...

Well, +1 from me for leaving the ruleutils as-is.  I don't think we
should go out of our way to generate join alias vars just on the off
chance that there's a bug in the translation from join alias vars to
plain ol' vars, and I agree with your statement upthread that
qualification makes things more robust.

I like robust.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company