Thread: Apparent anomaly with views and unions
I using 8.0.1. I create 3 tables with these definitions: create table t1 (serv_id varchar(50) not null); create table t2 (serv_id varchar(50) not null); create table t3 (serv_id varchar(50) not null); Now I create a view like this: create or replace view v1 as select * from t1; Next, I attempt to update this view like this: create or replace view v1 as select * from t1 union select * from t2; I receive: ERROR: cannot change data type of view column "serv_id" Next, I drop view v1 and recreate it using the union above. I am now able to do the following without error: create or replace view v1 as select * from t1 union select * from t2 union select * from t3; So it looks like changing a view from a non-union to a union is producing an error, but changing a view that already contains a 2-way union to one that contains a 3-way union is allowed (and I'm assuming any n to n+1 union transition is also allowed.) Is this intentional? It was certainly not predictable. -- Guy Rouillier
On Fri, 11 Feb 2005, Guy Rouillier wrote: > I using 8.0.1. I create 3 tables with these definitions: > > create table t1 (serv_id varchar(50) not null); > create table t2 (serv_id varchar(50) not null); > create table t3 (serv_id varchar(50) not null); > > Now I create a view like this: > > create or replace view v1 as select * from t1; > > Next, I attempt to update this view like this: > create or replace view v1 as select * from t1 union select * from t2; > > I receive: ERROR: cannot change data type of view column "serv_id" I'm pretty sure PostgreSQL treats the type of serv_id in the new view as varchar with no limit rather than varchar(50). I think that's also not entirely up to spec. In the two to three case it already was a varchar with no limit and so the type was the same.
Stephan Szabo <sszabo@megazone.bigpanda.com> writes: > On Fri, 11 Feb 2005, Guy Rouillier wrote: >> Now I create a view like this: >> create or replace view v1 as select * from t1; >> >> Next, I attempt to update this view like this: >> create or replace view v1 as select * from t1 union select * from t2; >> >> I receive: ERROR: cannot change data type of view column "serv_id" > I'm pretty sure PostgreSQL treats the type of serv_id in the new view as > varchar with no limit rather than varchar(50). This is the same issue reported here: http://archives.postgresql.org/pgsql-general/2004-12/msg00508.php http://archives.postgresql.org/pgsql-hackers/2004-12/msg00408.php and is pretty closely related to this: http://archives.postgresql.org/pgsql-bugs/2005-02/msg00008.php I was looking at fixing it yesterday. The obvious path to a fix is to do select_common_type across all the members of a set-operation tree at once, rather than pairwise as we do it now. That bothers me a bit because it could change the semantics. Consider for example SELECT float8col UNION ALL (SELECT numericcol UNION SELECT numericcol) The final result will be float8 in any case, but I think that a strict reading of the spec requires the righthand UNION to be done in numeric type, before we promote to float8. If we promote to float8 and then union, roundoff could make formerly distinct values not distinct, resulting in a different number of output rows. The above is a pretty contrived case though --- in particular, I had to write UNION ALL for the upper level, else it would do a duplicate-elimination pass that would mask the issue anyway. In practice I'm not sure anyone would ever notice the difference if we took the simple path of selecting one type across the whole tree. Maybe it would work to resolve types only across nested nodes of exactly the same set operator. I haven't worked through the details of this however. regards, tom lane
Tom and Stephan, thank you both for taking the time to reply. Further comments inline. Tom Lane wrote: > Stephan Szabo <sszabo@megazone.bigpanda.com> writes: >> On Fri, 11 Feb 2005, Guy Rouillier wrote: >>> Now I create a view like this: >>> create or replace view v1 as select * from t1; >>> >>> Next, I attempt to update this view like this: >>> create or replace view v1 as select * from t1 union select * from >>> t2; >>> >>> I receive: ERROR: cannot change data type of view column "serv_id" > >> I'm pretty sure PostgreSQL treats the type of serv_id in the new view >> as varchar with no limit rather than varchar(50). > > This is the same issue reported here: > http://archives.postgresql.org/pgsql-general/2004-12/msg00508.php > http://archives.postgresql.org/pgsql-hackers/2004-12/msg00408.php > and is pretty closely related to this: > http://archives.postgresql.org/pgsql-bugs/2005-02/msg00008.php We appear to have two issues here: (1) What is the meaning of "replace" as in replace view? (2) What are the semantics for multiple set operations? My original issue deals with (1). I'm unfortunately not well versed in the SQL spec, but from a layman's point of view, if I'm replacing a view definition, I don't expect that to be rejected because it is incompatible with the original view definition. My new definition may use entirely different set of tables than the original view. "create or replace view" should accomplish the same thing as "drop view; create view". > > I was looking at fixing it yesterday. The obvious path to a fix is > to do select_common_type across all the members of a set-operation > tree at once, rather than pairwise as we do it now. That bothers me > a bit because it could change the semantics. Consider for example > > SELECT float8col UNION ALL (SELECT numericcol UNION SELECT > numericcol) This seems to be addressing point (2). In this particular case, I would expect the query engine to follow the submitter's explicit directions, since he or she provided an explicit grouping. If no parentheses are included, then the optimizer or query rewriter is free to do as it wishes. -- Guy Rouillier
"Guy Rouillier" <guyr@masergy.com> writes: > We appear to have two issues here: > (1) What is the meaning of "replace" as in replace view? > (2) What are the semantics for multiple set operations? > My original issue deals with (1). I'm unfortunately not well versed in > the SQL spec, but from a layman's point of view, if I'm replacing a view > definition, I don't expect that to be rejected because it is > incompatible with the original view definition. Well, we don't have the ability to cope with redefining a view in a way that changes its result column list --- we don't know how to propagate that into dependent views in any clean way. (Now, ALTER TABLE deals with this by allowing the change only when there are no dependent views; maybe it'd be sensible to do the same for REPLACE VIEW.) The immediate issue here is that the redefinition shouldn't be seen as changing the result column type. The underlying cause of that hasn't got anything to do with views per se, it has to do with the way that we impute a result datatype to a UNION structure. > My new definition may > use entirely different set of tables than the original view. "create or > replace view" should accomplish the same thing as "drop view; create > view". In the cases I'm concerned about, DROP VIEW would fail unless you used CASCADE, and the subsequent CREATE VIEW would not magically bring back the dependent objects the CASCADE had killed. The entire *point* of REPLACE VIEW is that it does not have the side effect of destroying dependent objects; and therefore it has to act in a way that ensures the dependent objects still make sense. So, for the moment anyway, that means disallowing changes in the output column names and datatypes. >> Consider for example >> >> SELECT float8col UNION ALL (SELECT numericcol UNION SELECT >> numericcol) > This seems to be addressing point (2). In this particular case, I would > expect the query engine to follow the submitter's explicit directions, > since he or she provided an explicit grouping. If no parentheses are > included, then the optimizer or query rewriter is free to do as it > wishes. I don't think we get to do it that way. The way the SQL spec is written, there is not a difference between A UNION B UNION C and (A UNION B) UNION C. The result is *always* built up from pairwise UNIONs, and the semantic definition of UNION talks about combining only two tables at a time. This consideration doesn't stop us from fixing your problem (since in the case you care about, all the datatypes are the same anyway) ... but it puts rather a large crimp in Woehling's problem, which I was hoping to fix at the same time. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes: > In the cases I'm concerned about, DROP VIEW would fail unless you used > CASCADE, and the subsequent CREATE VIEW would not magically bring back > the dependent objects the CASCADE had killed. The entire *point* of > REPLACE VIEW is that it does not have the side effect of destroying > dependent objects; and therefore it has to act in a way that ensures > the dependent objects still make sense. So, for the moment anyway, > that means disallowing changes in the output column names and datatypes. There have been a few times when people were confused by the way views adapted to changes in their base tables. A typical example is when someone wants to rename a column and then add a new column with the old name. They usually expect the view to refer to the new column and not automagically change definition to refer to the renamed column. Based on the user reactions, it seems like the least-surprise behaviour would actually be the naive implementation of keeping the raw text defining the view and when anything changes in the base tables (or views) reinterpreting the view based on the new context. That would make both the ALTER TABLE and CREATE OR REPLACE VIEW behaviour simple to implement. If you rename a column then any query using the view gets an error until you put back a column with the name used. It seems like this would be a huge behaviour change though. I don't suppose the spec says anything about how views are supposed to respond to changes in base context does it? -- greg
Greg Stark <gsstark@mit.edu> writes: > It seems like this would be a huge behaviour change though. I don't suppose > the spec says anything about how views are supposed to respond to changes in > base context does it? SQL99 doesn't have RENAME COLUMN nor ALTER COLUMN TYPE, so the exact cases don't arise, but I think this note under ALTER TABLE ADD COLUMN is relevant: NOTE 189 - The addition of a column to a table has no effect on any existing <query expression> included in a view descriptor, <triggered action> included in a trigger descriptor, or <search condition> included in a constraint descriptor because any implicit column references in these descriptor elements are syntactically substituted by explicit column references under the Syntax Rules of Subclause 7.11, "<query specification>". Furthermore, by implication (from the lack of any General Rules to the contrary), the meaning of a column reference is never retroactively changed by the addition of a column subsequent to the invocation of the <SQL schema statement> containing that column reference. The last sentence in particular seems to justify the way we do things now. The spec is also quite clear that DROP COLUMN (with CASCADE) destroys any views that mention the column. I see nothing at all in the spec that justifies the idea of "keep around the source text and reinterpret it". They don't think that way; they think in terms of thoroughly-processed "descriptors" stored in the system catalogs. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes: > I see nothing at all in the spec that justifies the idea of "keep around > the source text and reinterpret it". They don't think that way; they > think in terms of thoroughly-processed "descriptors" stored in the > system catalogs. Huh. The thing I find most jarring about this way of thinking is that it means I can have objects in my database that don't correspond to any source code I have saved. Say I want to add an expression to a view, I can't safely take the source as I created it in the past, add the column, and recreate it. The actual view in the database may no longer correspond to the create statement I have saved. I suppose you could say that's no different than the create table statements which could be obsoleted by a series of alter tables. -- greg
Greg Stark <gsstark@mit.edu> writes: > Huh. The thing I find most jarring about this way of thinking is that it means > I can have objects in my database that don't correspond to any source code I > have saved. [ shrug... ] Almost any form of ALTER command gives rise to that situation; RENAME TABLE being just the most basic. I can't find anything compelling in that gripe. If you want your saved source code to always describe the database, you mustn't ALTER at all. There are certainly lots of people who do things just that way, but they aren't the ones pushing us to add more and more ALTER functionality ;-) regards, tom lane