Thread: Apparent anomaly with views and unions

Apparent anomaly with views and unions

From
"Guy Rouillier"
Date:
I using 8.0.1.  I create 3 tables with these definitions:

create table t1 (serv_id varchar(50) not null);
create table t2 (serv_id varchar(50) not null);
create table t3 (serv_id varchar(50) not null);

Now I create a view like this:

create or replace view v1 as select * from t1;

Next, I attempt to update this view like this:
create or replace view v1 as select * from t1 union select * from t2;

I receive: ERROR:  cannot change data type of view column "serv_id"

Next, I drop view v1 and recreate it using the union above.  I am now
able to do the following without error:

create or replace view v1 as select * from t1 union select * from t2
union select * from t3;

So it looks like changing a view from a non-union to a union is
producing an error, but changing a view that  already contains a 2-way
union to one that contains a 3-way union is allowed (and I'm assuming
any n to n+1 union transition is also allowed.)

Is this intentional?  It was certainly not predictable.

--
Guy Rouillier


Re: Apparent anomaly with views and unions

From
Stephan Szabo
Date:
On Fri, 11 Feb 2005, Guy Rouillier wrote:

> I using 8.0.1.  I create 3 tables with these definitions:
>
> create table t1 (serv_id varchar(50) not null);
> create table t2 (serv_id varchar(50) not null);
> create table t3 (serv_id varchar(50) not null);
>
> Now I create a view like this:
>
> create or replace view v1 as select * from t1;
>
> Next, I attempt to update this view like this:
> create or replace view v1 as select * from t1 union select * from t2;
>
> I receive: ERROR:  cannot change data type of view column "serv_id"

I'm pretty sure PostgreSQL treats the type of serv_id in the new view as
varchar with no limit rather than varchar(50). I think that's also not
entirely up to spec.

In the two to three case it already was a varchar with no limit and so the
type was the same.

Re: Apparent anomaly with views and unions

From
Tom Lane
Date:
Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> On Fri, 11 Feb 2005, Guy Rouillier wrote:
>> Now I create a view like this:
>> create or replace view v1 as select * from t1;
>>
>> Next, I attempt to update this view like this:
>> create or replace view v1 as select * from t1 union select * from t2;
>>
>> I receive: ERROR:  cannot change data type of view column "serv_id"

> I'm pretty sure PostgreSQL treats the type of serv_id in the new view as
> varchar with no limit rather than varchar(50).

This is the same issue reported here:
http://archives.postgresql.org/pgsql-general/2004-12/msg00508.php
http://archives.postgresql.org/pgsql-hackers/2004-12/msg00408.php
and is pretty closely related to this:
http://archives.postgresql.org/pgsql-bugs/2005-02/msg00008.php

I was looking at fixing it yesterday.  The obvious path to a fix is to
do select_common_type across all the members of a set-operation tree
at once, rather than pairwise as we do it now.  That bothers me a bit
because it could change the semantics.  Consider for example

  SELECT float8col UNION ALL (SELECT numericcol UNION SELECT numericcol)

The final result will be float8 in any case, but I think that a strict
reading of the spec requires the righthand UNION to be done in numeric
type, before we promote to float8.  If we promote to float8 and then
union, roundoff could make formerly distinct values not distinct,
resulting in a different number of output rows.

The above is a pretty contrived case though --- in particular,
I had to write UNION ALL for the upper level, else it would do a
duplicate-elimination pass that would mask the issue anyway.  In
practice I'm not sure anyone would ever notice the difference if we
took the simple path of selecting one type across the whole tree.

Maybe it would work to resolve types only across nested nodes of exactly
the same set operator.  I haven't worked through the details of this
however.

            regards, tom lane

Re: Apparent anomaly with views and unions

From
"Guy Rouillier"
Date:
Tom and Stephan, thank you both for taking the time to reply.  Further
comments inline.

Tom Lane wrote:
> Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
>> On Fri, 11 Feb 2005, Guy Rouillier wrote:
>>> Now I create a view like this:
>>> create or replace view v1 as select * from t1;
>>>
>>> Next, I attempt to update this view like this:
>>> create or replace view v1 as select * from t1 union select * from
>>> t2;
>>>
>>> I receive: ERROR:  cannot change data type of view column "serv_id"
>
>> I'm pretty sure PostgreSQL treats the type of serv_id in the new view
>> as varchar with no limit rather than varchar(50).
>
> This is the same issue reported here:
> http://archives.postgresql.org/pgsql-general/2004-12/msg00508.php
> http://archives.postgresql.org/pgsql-hackers/2004-12/msg00408.php
> and is pretty closely related to this:
> http://archives.postgresql.org/pgsql-bugs/2005-02/msg00008.php

We appear to have two issues here:

(1) What is the meaning of "replace" as in replace view?
(2) What are the semantics for multiple set operations?

My original issue deals with (1).  I'm unfortunately not well versed in
the SQL spec, but from a layman's point of view, if I'm replacing a view
definition, I don't expect that to be rejected because it is
incompatible with the original view definition.  My new definition may
use entirely different set of tables than the original view.  "create or
replace view" should accomplish the same thing as "drop view; create
view".

>
> I was looking at fixing it yesterday.  The obvious path to a fix is
> to do select_common_type across all the members of a set-operation
> tree at once, rather than pairwise as we do it now.  That bothers me
> a bit because it could change the semantics.  Consider for example
>
>   SELECT float8col UNION ALL (SELECT numericcol UNION SELECT
> numericcol)

This seems to be addressing point (2).  In this particular case, I would
expect the query engine to follow the submitter's explicit directions,
since he or she provided an explicit grouping.  If no parentheses are
included, then the optimizer or query rewriter is free to do as it
wishes.

--
Guy Rouillier


Re: Apparent anomaly with views and unions

From
Tom Lane
Date:
"Guy Rouillier" <guyr@masergy.com> writes:
> We appear to have two issues here:

> (1) What is the meaning of "replace" as in replace view?
> (2) What are the semantics for multiple set operations?

> My original issue deals with (1).  I'm unfortunately not well versed in
> the SQL spec, but from a layman's point of view, if I'm replacing a view
> definition, I don't expect that to be rejected because it is
> incompatible with the original view definition.

Well, we don't have the ability to cope with redefining a view in a way
that changes its result column list --- we don't know how to propagate
that into dependent views in any clean way.  (Now, ALTER TABLE deals
with this by allowing the change only when there are no dependent views;
maybe it'd be sensible to do the same for REPLACE VIEW.)  The immediate
issue here is that the redefinition shouldn't be seen as changing the
result column type.  The underlying cause of that hasn't got anything
to do with views per se, it has to do with the way that we impute a
result datatype to a UNION structure.

> My new definition may
> use entirely different set of tables than the original view.  "create or
> replace view" should accomplish the same thing as "drop view; create
> view".

In the cases I'm concerned about, DROP VIEW would fail unless you used
CASCADE, and the subsequent CREATE VIEW would not magically bring back
the dependent objects the CASCADE had killed.  The entire *point* of
REPLACE VIEW is that it does not have the side effect of destroying
dependent objects; and therefore it has to act in a way that ensures
the dependent objects still make sense.  So, for the moment anyway,
that means disallowing changes in the output column names and datatypes.

>> Consider for example
>>
>> SELECT float8col UNION ALL (SELECT numericcol UNION SELECT
>> numericcol)

> This seems to be addressing point (2).  In this particular case, I would
> expect the query engine to follow the submitter's explicit directions,
> since he or she provided an explicit grouping.  If no parentheses are
> included, then the optimizer or query rewriter is free to do as it
> wishes.

I don't think we get to do it that way.  The way the SQL spec is
written, there is not a difference between A UNION B UNION C and
(A UNION B) UNION C.  The result is *always* built up from pairwise
UNIONs, and the semantic definition of UNION talks about combining
only two tables at a time.

This consideration doesn't stop us from fixing your problem (since in
the case you care about, all the datatypes are the same anyway) ...
but it puts rather a large crimp in Woehling's problem, which I was
hoping to fix at the same time.

            regards, tom lane

Re: Apparent anomaly with views and unions

From
Greg Stark
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> In the cases I'm concerned about, DROP VIEW would fail unless you used
> CASCADE, and the subsequent CREATE VIEW would not magically bring back
> the dependent objects the CASCADE had killed.  The entire *point* of
> REPLACE VIEW is that it does not have the side effect of destroying
> dependent objects; and therefore it has to act in a way that ensures
> the dependent objects still make sense.  So, for the moment anyway,
> that means disallowing changes in the output column names and datatypes.

There have been a few times when people were confused by the way views adapted
to changes in their base tables. A typical example is when someone wants to
rename a column and then add a new column with the old name. They usually
expect the view to refer to the new column and not automagically change
definition to refer to the renamed column.

Based on the user reactions, it seems like the least-surprise behaviour would
actually be the naive implementation of keeping the raw text defining the view
and when anything changes in the base tables (or views) reinterpreting the
view based on the new context.

That would make both the ALTER TABLE and CREATE OR REPLACE VIEW behaviour
simple to implement. If you rename a column then any query using the view gets
an error until you put back a column with the name used.

It seems like this would be a huge behaviour change though. I don't suppose
the spec says anything about how views are supposed to respond to changes in
base context does it?

--
greg

Re: Apparent anomaly with views and unions

From
Tom Lane
Date:
Greg Stark <gsstark@mit.edu> writes:
> It seems like this would be a huge behaviour change though. I don't suppose
> the spec says anything about how views are supposed to respond to changes in
> base context does it?

SQL99 doesn't have RENAME COLUMN nor ALTER COLUMN TYPE, so the exact
cases don't arise, but I think this note under ALTER TABLE ADD COLUMN
is relevant:

            NOTE 189 - The addition of a column to a table has no effect on
            any existing <query expression> included in a view descriptor,
            <triggered action> included in a trigger descriptor, or <search
            condition> included in a constraint descriptor because any
            implicit column references in these descriptor elements are
            syntactically substituted by explicit column references under
            the Syntax Rules of Subclause 7.11, "<query specification>".
            Furthermore, by implication (from the lack of any General Rules
            to the contrary), the meaning of a column reference is never
            retroactively changed by the addition of a column subsequent
            to the invocation of the <SQL schema statement> containing that
            column reference.

The last sentence in particular seems to justify the way we do things
now.

The spec is also quite clear that DROP COLUMN (with CASCADE) destroys
any views that mention the column.

I see nothing at all in the spec that justifies the idea of "keep around
the source text and reinterpret it".  They don't think that way; they
think in terms of thoroughly-processed "descriptors" stored in the
system catalogs.

            regards, tom lane

Re: Apparent anomaly with views and unions

From
Greg Stark
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> I see nothing at all in the spec that justifies the idea of "keep around
> the source text and reinterpret it".  They don't think that way; they
> think in terms of thoroughly-processed "descriptors" stored in the
> system catalogs.

Huh. The thing I find most jarring about this way of thinking is that it means
I can have objects in my database that don't correspond to any source code I
have saved.

Say I want to add an expression to a view, I can't safely take the source as I
created it in the past, add the column, and recreate it. The actual view in
the database may no longer correspond to the create statement I have saved.

I suppose you could say that's no different than the create table statements
which could be obsoleted by a series of alter tables.

--
greg

Re: Apparent anomaly with views and unions

From
Tom Lane
Date:
Greg Stark <gsstark@mit.edu> writes:
> Huh. The thing I find most jarring about this way of thinking is that it means
> I can have objects in my database that don't correspond to any source code I
> have saved.

[ shrug... ]  Almost any form of ALTER command gives rise to that
situation; RENAME TABLE being just the most basic.  I can't find
anything compelling in that gripe.  If you want your saved source
code to always describe the database, you mustn't ALTER at all.

There are certainly lots of people who do things just that way,
but they aren't the ones pushing us to add more and more ALTER
functionality ;-)

            regards, tom lane