pg_dump and join aliases (was Re: [BUGS] ERROR: cannot handle unplanned sub-select) - Mailing list pgsql-hackers

From Tom Lane
Subject pg_dump and join aliases (was Re: [BUGS] ERROR: cannot handle unplanned sub-select)
Date
Msg-id 2804.1278543350@sss.pgh.pa.us
Whole thread Raw
Responses Re: pg_dump and join aliases (was Re: [BUGS] ERROR: cannot handle unplanned sub-select)
List pgsql-hackers
Merlin sent me a test case off-list for the problem mentioned here:
http://archives.postgresql.org/pgsql-bugs/2010-07/msg00025.php

After some investigation I was able to simplify it to the following
example using the regression database:

select (select sq1) as qq1
from (select exists(select 1 from int4_tbl where f1 = q2) as sq1, 42 as dummy  from int8_tbl) sq0 join int4_tbl i4 on
dummy= i4.f1;
 

The problem is that flatten_join_alias_vars() can push SubLink
expressions down into sub-selects, as in this example when it replaces
the "sq1" reference with the EXISTS() subexpression that was previously
pulled up by flattening sq0.  But it fails to set the hasSubLinks flag
in the sub-Query, so subsequent processing doesn't think it needs to do
SS_process_sublinks within the sub-Query, and eventually we fail when we
come across the unprocessed SubLink.  This bug goes clear back to 7.4.
Fortunately it's simple to fix.

What seems more interesting is that I initially had a hard time
reproducing the bug under different conditions, and didn't figure out
what was going on until I realized that I had used pg_dump to
consolidate the multiple files Merlin sent ... and *reloading pg_dump's
version of the views didn't exhibit the bug*.  This is because pg_dump,
or more accurately ruleutils.c, has a habit of qualifying variable
references whether or not they were qualified in the original query.
If you turn the above example into a view and then dump it, you'll get
... (select sq0.sq1) as qq1 ...

and that doesn't tickle this bug.  (That's because "sq0.sq1" isn't a
join alias Var, whereas unqualified "sq1" is.)

So the question that seems worth discussing is whether this difference
ought to be considered a bug in ruleutils.  In theory it shouldn't
matter if pg_dump adds an "unnecessary" qualification, but here's an
example where it did matter.  Do we care?  People tend to assume that
dumping and reloading will make no change in the behavior of their
views, so this seems kind of scary to me.  On the other hand, the
"extra" qualifications make view definitions a bit more robust in the
face of column additions, renamings, etc.  So there's certainly a case
to be made that the dump behavior is preferable as-is.

Thoughts?
        regards, tom lane


pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: Proposal for 9.1: WAL streaming from WAL buffers
Next
From: Robert Haas
Date:
Subject: Re: pg_dump and join aliases (was Re: [BUGS] ERROR: cannot handle unplanned sub-select)