Thread: patch: Add columns via CREATE OR REPLACE VIEW

patch: Add columns via CREATE OR REPLACE VIEW

From
"Robert Haas"
Date:
Here's a patch that allows CREATE OR REPLACE VIEW to add new columns
to an existing view.

Any feedback would be appreciated, especially if it meant that I could
fix any problems before the next commitfest.

Thanks,

...Robert

Attachment

Re: patch: Add columns via CREATE OR REPLACE VIEW

From
Alvaro Herrera
Date:
Robert Haas escribió:
> Here's a patch that allows CREATE OR REPLACE VIEW to add new columns
> to an existing view.
> 
> Any feedback would be appreciated, especially if it meant that I could
> fix any problems before the next commitfest.

What happens with the columns previously defined?  What happens if I
specify a different column definition for them; does it raise an error?

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: patch: Add columns via CREATE OR REPLACE VIEW

From
"Robert Haas"
Date:
On Thu, Aug 7, 2008 at 11:17 AM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:
> Robert Haas escribió:
>> Here's a patch that allows CREATE OR REPLACE VIEW to add new columns
>> to an existing view.
>>
>> Any feedback would be appreciated, especially if it meant that I could
>> fix any problems before the next commitfest.
>
> What happens with the columns previously defined?  What happens if I
> specify a different column definition for them; does it raise an error?

Yes.  CheckViewTupleDesc() errors out, same as before.

Basically, the old algorithm was:

1. Check [new # of columns] = [old # of columns]
2. For each column: check that [old definition] = [new definition]

The new algorithm is:

1. Check [new # of columns] >= [old # of columns]
2. For each column <= [old # of columns]: check that [old definition]
= [new definition]
3. For each column > [old # of columns]: add the new column to the
relation (as ALTER TABLE ADD COLUMN, except bypassing the usual
prohibition on adding columns to views)

...Robert


Re: patch: Add columns via CREATE OR REPLACE VIEW

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Robert Haas escribi�:
>> Here's a patch that allows CREATE OR REPLACE VIEW to add new columns
>> to an existing view.

> What happens with the columns previously defined?  What happens if I
> specify a different column definition for them; does it raise an error?

The original idea here was to give REPLACE VIEW as much flexibility
as we've recently added for tables via ALTER TABLE, which would ideally
include

1. adding columns
2. renaming columns
3. dropping columns that are not referenced elsewhere
4. changing type of columns that are not referenced elsewhere

But it seems hard to tell the difference between a "rename" and a
"drop".  I think that we aren't going to get far on this until we
decide what we will consider to be the identity of a view column.
With regular tables the attnum is a persistent identifier, but that
doesn't seem to play nicely for REPLACE VIEW, at least not if you're
wanting to allow people to remove columns from their view definitions.

Maybe the right way is to *not* use CREATE OR REPLACE VIEW, but
rather ALTER VIEW ADD COLUMN and so on.  Then column identity seems
a lot easier to keep track of.

Thoughts?
        regards, tom lane


Re: patch: Add columns via CREATE OR REPLACE VIEW

From
Alvaro Herrera
Date:
Tom Lane escribió:

> But it seems hard to tell the difference between a "rename" and a
> "drop".  I think that we aren't going to get far on this until we
> decide what we will consider to be the identity of a view column.
> With regular tables the attnum is a persistent identifier, but that
> doesn't seem to play nicely for REPLACE VIEW, at least not if you're
> wanting to allow people to remove columns from their view definitions.

Hmm, maybe we need to pull off the project to separate logical attribute
number from physical and position.  It sounds like it could make it
easier for view modification.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: patch: Add columns via CREATE OR REPLACE VIEW

From
"David E. Wheeler"
Date:
On Aug 7, 2008, at 08:43, Tom Lane wrote:

> Maybe the right way is to *not* use CREATE OR REPLACE VIEW, but
> rather ALTER VIEW ADD COLUMN and so on.  Then column identity seems
> a lot easier to keep track of.

+1, although what does the standard say?

Best,

David


Re: patch: Add columns via CREATE OR REPLACE VIEW

From
"Robert Haas"
Date:
> But it seems hard to tell the difference between a "rename" and a
> "drop".  I think that we aren't going to get far on this until we
> decide what we will consider to be the identity of a view column.
> With regular tables the attnum is a persistent identifier, but that
> doesn't seem to play nicely for REPLACE VIEW, at least not if you're
> wanting to allow people to remove columns from their view definitions.
>
> Maybe the right way is to *not* use CREATE OR REPLACE VIEW, but
> rather ALTER VIEW ADD COLUMN and so on.  Then column identity seems
> a lot easier to keep track of.
>
> Thoughts?

ISTM that column identity should logically follow column name.  If a
user types "CREATE OR REPLACE VIEW sandwhich (bacon, lettuce, bread)
..." and sometime later types "CREATE OR REPLACE VIEW sandwich (bacon,
lettuce, tomato, bread) ..." it seems overwhelmingly likely that their
intention was to insert tomato between lettuce and bread rather than
to rename bread to tomato and add a new column that happens to also be
called bread.  If they want the other interpretation, they should use
do "ALTER VIEW sandwhich RENAME COLUMN tomato TO bread" before
executing "CREATE OR REPLACE VIEW".

The problem with "ALTER VIEW ADD COLUMN" is that the new column won't
be of any use until someone does "CREATE OR REPLACE VIEW" to update
the underlying query.  And if they're already doing "CREATE OR REPLACE
VIEW", then we might as well let "CREATE OR REPLACE VIEW" generate the
column definitions automatically rather than forcing them to do the
same thing by hand (just as we allowed the user to create the view in
the first place without insisting on explicit column definitions).
The problem is even worse for "ALTER VIEW ALTER COLUMN TYPE".  What
exactly will the semantics of the view be after this operation but
before a subsequent update of the query via "CREATE OR REPLACE VIEW"?
There are various options but none of them make much sense.

If you accept the idea that column identity should be based on column
name, then the only two operations that are really necessary are
"CREATE OR REPLACE VIEW" and "ALTER VIEW RENAME COLUMN", and it is
100% clear what the semantics of those operations should be.  We may
not choose to fully support all possible alterations that can be made
through this framework for some time to come, but it's extremely easy
to understand where we're trying to get to.  If you want to
additionally have "ALTER VIEW ADD/DROP COLUMN" available for those
that may wish to use them, presumably returning NULL for any column
that isn't generated by the query, that's also easy to understand and
well-defined.

...Robert


Re: patch: Add columns via CREATE OR REPLACE VIEW

From
Andrew Dunstan
Date:

Tom Lane wrote:
> Maybe the right way is to *not* use CREATE OR REPLACE VIEW, but
> rather ALTER VIEW ADD COLUMN and so on.  Then column identity seems
> a lot easier to keep track of.
>
>
>   

How would that look? Where would we put the new query?

cheers

andrew


Re: patch: Add columns via CREATE OR REPLACE VIEW

From
"Robert Haas"
Date:
> Hmm, maybe we need to pull off the project to separate logical attribute
> number from physical and position.  It sounds like it could make it
> easier for view modification.

Much easier!  It would be a nice feature to have for table as well.
Right now, if you have a table with columns (foo1, foo2, foo3, bar1,
bar2, bar3) and you decide to add column foo4, there's no way to put
it where you intuitively want to put it.  Not a big deal, but I'd
definitely use it if we had it.

However, it's not necessary to implement this in order to make
meaningful improvements to CREATE OR REPLACE VIEW.  I think the only
thing we need to agree on is that no future implementation of CREATE
OR REPLACE VIEW will ever implicitly rename a column.  If we agree on
column name as a measure of column identity, then the change I'm
proposing is forward-compatible with any other enhancements we may
want to make later.

...Robert


Re: patch: Add columns via CREATE OR REPLACE VIEW

From
Alvaro Herrera
Date:
Andrew Dunstan escribió:
>
>
> Tom Lane wrote:
>> Maybe the right way is to *not* use CREATE OR REPLACE VIEW, but
>> rather ALTER VIEW ADD COLUMN and so on.  Then column identity seems
>> a lot easier to keep track of.
>
> How would that look? Where would we put the new query?

I was thinking that the ADD COLUMN should specify the new result list
entry.  Of course, this doesn't allow changing the query in more complex
ways (e.g. you can't add a GROUP BY clause that wasn't already there).

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: patch: Add columns via CREATE OR REPLACE VIEW

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Andrew Dunstan escribi�:
>> Tom Lane wrote:
>>> Maybe the right way is to *not* use CREATE OR REPLACE VIEW, but
>>> rather ALTER VIEW ADD COLUMN and so on.  Then column identity seems
>>> a lot easier to keep track of.
>> 
>> How would that look? Where would we put the new query?

> I was thinking that the ADD COLUMN should specify the new result list
> entry.

Yeah, that's what I was thinking too.  If you needed to change more
than just the topmost SELECT list, you'd need two steps: an ADD COLUMN
and then CREATE OR REPLACE VIEW to change the query in some way that
doesn't result in changing the output column set.  Maybe that's going
to be too awkward to use though.
        regards, tom lane


Re: patch: Add columns via CREATE OR REPLACE VIEW

From
"Robert Haas"
Date:
>> I was thinking that the ADD COLUMN should specify the new result list
>> entry.
>
> Yeah, that's what I was thinking too.  If you needed to change more
> than just the topmost SELECT list, you'd need two steps: an ADD COLUMN
> and then CREATE OR REPLACE VIEW to change the query in some way that
> doesn't result in changing the output column set.  Maybe that's going
> to be too awkward to use though.

To me, that sounds less useful than simply being able to make changes
via CREATE OR REPLACE VIEW, but it's not an either/or proposition, and
I suspect that it's significantly more complicated to implement than
the patch I submitted.

...Robert


Re: patch: Add columns via CREATE OR REPLACE VIEW

From
Gregory Stark
Date:
"Robert Haas" <robertmhaas@gmail.com> writes:

> I think the only thing we need to agree on is that no future implementation
> of CREATE OR REPLACE VIEW will ever implicitly rename a column. If we agree
> on column name as a measure of column identity, then the change I'm
> proposing is forward-compatible with any other enhancements we may want to
> make later.

hm... so what would this output?

CREATE VIEW a AS (select 1 as a, 2 as b);
CREATE VIEW b AS (select x,y FROM a AS a(x,y))
CREATE OR REPLACE VIEW a AS (select 1 as b, 'two' as c, 3 as a);
SELECT * FROM b;

What about this?

CREATE OR REPLACE VIEW a AS (select 1 as b, 2 as c, 'three as a);
SELECT * FROM b;

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about
EnterpriseDB'sPostgreSQL training!
 


Re: patch: Add columns via CREATE OR REPLACE VIEW

From
"Pavel Stehule"
Date:
> Maybe the right way is to *not* use CREATE OR REPLACE VIEW, but
> rather ALTER VIEW ADD COLUMN and so on.  Then column identity seems
> a lot easier to keep track of.
>

I prefere ALTER VIEW too

regards
Pavel Stehule


Re: patch: Add columns via CREATE OR REPLACE VIEW

From
Tom Lane
Date:
"David E. Wheeler" <david@kineticode.com> writes:
> On Aug 7, 2008, at 08:43, Tom Lane wrote:
>> Maybe the right way is to *not* use CREATE OR REPLACE VIEW, but
>> rather ALTER VIEW ADD COLUMN and so on.  Then column identity seems
>> a lot easier to keep track of.

> +1, although what does the standard say?

AFAICT the standard doesn't have any way to alter the definition of an
existing view at all.  It might be worth asking what other systems do,
though --- can you alter a view in Oracle or DB2 or mysql?
        regards, tom lane


Re: patch: Add columns via CREATE OR REPLACE VIEW

From
"David E. Wheeler"
Date:
On Aug 7, 2008, at 13:01, Tom Lane wrote:

>> +1, although what does the standard say?
>
> AFAICT the standard doesn't have any way to alter the definition of an
> existing view at all.  It might be worth asking what other systems do,
> though --- can you alter a view in Oracle or DB2 or mysql?

Looks like MySQL 6.0 just does a CREATE OR REPLACE when you do ALTER
VIEW:
  http://dev.mysql.com/doc/refman/6.0/en/create-view.html  http://dev.mysql.com/doc/refman/6.0/en/alter-view.html

Oracle doesn't seem to do much with it, either, just recompiles a view:
  http://download.oracle.com/docs/cd/B10500_01/server.920/a96540/statements_45a.htm

Note that it says, "This statement does not change the definition of
an existing view. To redefine a view, you must use CREATE VIEW with
the OR REPLACE keywords."

DB2's ALTER VIEW is a bit more promising, though there doesn't seem to
be a way to add columns, just to redefine them:
  http://publib.boulder.ibm.com/infocenter/db2luw/v8/index.jsp?topic=/com.ibm.db2.udb.doc/admin/r0000894.htm

Best,

David

Re: patch: Add columns via CREATE OR REPLACE VIEW

From
"Robert Haas"
Date:
>> I think the only thing we need to agree on is that no future implementation
>> of CREATE OR REPLACE VIEW will ever implicitly rename a column. If we agree
>> on column name as a measure of column identity, then the change I'm
>> proposing is forward-compatible with any other enhancements we may want to
>> make later.
>
> hm... so what would this output?
>
> CREATE VIEW a AS (select 1 as a, 2 as b);
> CREATE VIEW b AS (select x,y FROM a AS a(x,y))
> CREATE OR REPLACE VIEW a AS (select 1 as b, 'two' as c, 3 as a);
> SELECT * FROM b;

I'm not too familiar with the syntax "a AS a(x, y)" but I think it's
asking that the first two columns of a (whatever they are at the
moment) be aliased to x and y.  If we allow reordering columns, it's
going to change the meaning of a number of different expressions.  In
addition to the above, you have to worry about at least:

(1) INSERT INTO foo VALUES (a, b, c)
(2) COPY foo FROM wherever
(3) SELECT * FROM foo

Were I implementing the ability to reorder columns (which I have no
current plans to do), I think the logical thing to do would be to
decree that all of this is the user's problem.  If you sensibly refer
to columns by name, then you are guaranteed to always be referencing
the same set of columns.  If you assume that you know the position of
the columns, you assume the risk of getting burned if that ordering
changes.  This is already true: even without the ability to reorder
columns, a table or view can still be dropped and recreated with a
different column ordering, and any client code that stupidly relies on
the columns being in the same order that they were before will blow
up.  If people are writing code this way, they should either (1) fix
it to include explicit target lists or (2) not ever change the order
of their table columns.  This is true whether or not reordering
columns requires dropping the object in question and all of its
dependencies, or whether it can be done in place, and is not an
argument for refusing to make it easier to do in-place.

With respect to your particular example, I think CREATE OR REPLACE
VIEW should fail with an error, just as this case does:

CREATE VIEW a AS (select 1 as a, 2 as b);
CREATE VIEW b AS (select x,y FROM a AS a(x,y));
ALTER TABLE a RENAME TO a_old;  -- move a out of the way so we can
change the definition
CREATE OR REPLACE VIEW a AS (select 1 as b, 'two' as c, 3 as a);
CREATE OR REPLACE VIEW b AS (select x,y FROM a AS a(x,y)); -- recreate
b based on new definition of a
ERROR:  cannot change data type of view column "y"

As Tom said upthread, columns should only be allowed to be dropped or
retyped if they have no dependencies.  This case is a little weird
because normally the dependency would follow the name, but here
because of the a(x, y) syntax it has to follow the column position,
but it seems clear to me that the type change can't silently propagate
down into view b, so making it error out is the only logical
alternative.

> What about this?
>
> CREATE OR REPLACE VIEW a AS (select 1 as b, 2 as c, 'three as a);
> SELECT * FROM b;

This seems well defined.  a now has columns b-c-a with types
int-int-varchar.  b gets a.b as x and a.c as y, so I expect to get (1,
2).  If he had written CREATE VIEW b AS (select a as x,b as y FROM a)
then the statement above would fail due to a dependency on the type of
a.

Of course, accepting the patch I submitted doesn't require us to ever
support any of this, since it makes no attempt to reorder or retype
any existing columns.  It only allows adding new columns at the end.
If we want to stick with the approach of "don't ever reorder columns",
we could decree that the goal is for CREATE OR REPLACE VIEW to allow
adding new columns and retyping of columns without dependencies, but
that renaming, dropping, and potentially reordering would have to be
done using an ALTER VIEW command.

I'm afraid we're getting off into a discussion of how to support
reordering of columns which I find fascinating but not necessarily
relevant to the patch at hand.  I can't think of any imaginable
scenario in which the ability to add new columns at the end of an
existing view via CREATE OR REPLACE VIEW causes any problem for any
feature we might want to implement in the future, and it's clearly
useful.  It takes me about half an hour to add a column to a
particular view on one of the systems I work with because of the need
to update all the dependent objects, and this would reduce it to about
10 seconds.

...Robert


Re: patch: Add columns via CREATE OR REPLACE VIEW

From
"Robert Haas"
Date:
> I'm not too familiar with the syntax "a AS a(x, y)" but I think it's
> asking that the first two columns of a (whatever they are at the
> moment) be aliased to x and y.

Another possible option would be to introduce a syntax along the lines of

table AS table_alias (column AS column_alias, column AS column_alias)

and decree that a(x, y) is in essence expanded to a(a as x, b as y) at
the time you originally execute the statement, much the same way we
handle *.

This has a major implementation advantage in that it frees us from
worrying about whether columns were originally referenced by position
or by name.  It does change the answers I gave in my previous email,
but I guess that doesn't matter very much since (1) we're not
proposing to implement this now and (2) either decision is
upward-compatible with the proposed patch.

Although several people have said that they prefer the idea of using
ALTER VIEW to make changes to views, no one has really expanded on the
reasons for their preference.  Also, no one has offered any argument
at all as to why any future ALTER VIEW functionality that might be
added would conflict with the semantics of the patch I proposed.

...Robert


Re: patch: Add columns via CREATE OR REPLACE VIEW

From
Tom Lane
Date:
"Robert Haas" <robertmhaas@gmail.com> writes:
> Although several people have said that they prefer the idea of using
> ALTER VIEW to make changes to views, no one has really expanded on the
> reasons for their preference.

Because it sidesteps the problem of tracking which column is supposed to
be which.  If you try to do it through CREATE OR REPLACE VIEW, you have
to either be extremely restrictive (like probably not allow renaming
of columns at all), or write some AI-complete algorithm to guess at what
the user intended.
        regards, tom lane


Re: patch: Add columns via CREATE OR REPLACE VIEW

From
"Robert Haas"
Date:
Forgot to copy my response to this to the list.

On Thu, Aug 7, 2008 at 5:26 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Robert Haas" <robertmhaas@gmail.com> writes:
>> Although several people have said that they prefer the idea of using
>> ALTER VIEW to make changes to views, no one has really expanded on the
>> reasons for their preference.
>
> Because it sidesteps the problem of tracking which column is supposed to
> be which.  If you try to do it through CREATE OR REPLACE VIEW, you have
> to either be extremely restrictive (like probably not allow renaming
> of columns at all), or write some AI-complete algorithm to guess at what
> the user intended.

The current code takes the approach of being extremely restrictive -
it doesn't let you change anything at all.  The code I'm proposing
manages to relax that restriction without creating any ambiguity that
anyone has been able to point out.  All of the ambiguities that have
been mentioned are problems that might be created by some other,
entirely hypothetical patch.

...Robert


Re: patch: Add columns via CREATE OR REPLACE VIEW

From
Tom Lane
Date:
"Robert Haas" <robertmhaas@gmail.com> writes:
> On Thu, Aug 7, 2008 at 5:26 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Because it sidesteps the problem of tracking which column is supposed to
>> be which.  If you try to do it through CREATE OR REPLACE VIEW, you have
>> to either be extremely restrictive (like probably not allow renaming
>> of columns at all), or write some AI-complete algorithm to guess at what
>> the user intended.

> The current code takes the approach of being extremely restrictive -
> it doesn't let you change anything at all.  The code I'm proposing
> manages to relax that restriction without creating any ambiguity that
> anyone has been able to point out.  All of the ambiguities that have
> been mentioned are problems that might be created by some other,
> entirely hypothetical patch.

Well, my feeling is that if we are inventing a new feature we ought not
paint ourselves into a corner by failing to consider what will happen
when obvious extensions to the feature are attempted.  Whether the
present patch is self-consistent is not the question --- the question
is do we have a self-consistent vision of how we will later do the
other stuff like renaming, changing column type, etc.
        regards, tom lane


Re: patch: Add columns via CREATE OR REPLACE VIEW

From
"Robert Haas"
Date:
> Well, my feeling is that if we are inventing a new feature we ought not
> paint ourselves into a corner by failing to consider what will happen
> when obvious extensions to the feature are attempted.  Whether the
> present patch is self-consistent is not the question --- the question
> is do we have a self-consistent vision of how we will later do the
> other stuff like renaming, changing column type, etc.

If we can work out that design, I think that's great.  However, it
doesn't actually 100% matter whether we know the one true way that we
will definitely implement those features - it only matters that none
of the things we might choose are inconsistent with what we're doing
now.

In order to avoid being AI-complete, REPLACE VIEW needs some kind of
straightforward algorithm for matching up the old and new target
lists.  AFAICS, the only thing to decide here is what you want to use
as the key.  There are three possibilities that I can think of: [1]
name, [2] position, [3] both name and position.

It's axiomatic that REPLACE VIEW can't be given the capability to make
any modification that involves changing the key field, so in [1] you
can't rename columns, in [2] you can't reorder columns, and in [3] you
can't do either.  Furthermore, in [2], you also can't support dropping
columns, because a drop is indistinguishable from renaming and
retyping every column from the point of the drop onwards.  Therefore,
the maximum set of operations REPLACE VIEW can potentially support in
each scenario are:

[1] add column, change type, drop column, reorder columns
[2] add column, change type, rename
[3] add column, change type, drop column

The actual set of operations supported may be less either because of
implementation limitations or because you don't want to provide users
with a foot-gun.  ISTM that allowing REPLACE VIEW to do renames in
scenario [2] can be pretty much rejected outright as a violation of
the principle of least surprise - there is an enormous danger of
someone simultaneously renaming and retyping a whole series of columns
when they instead intended to drop a column.  Similarly, in scenario
[1] or [3], ISTM that allowing someone to drop columns using REPLACE
VIEW is something of a foot-gun unless we are in scenario [1] and
reordering columns is also implemented, because users who don't RTFM
will try to reorder columns and it will succeed and fail erratically
according to whether there are dependencies that prevent dropping and
re-adding whatever subset of columns need to be shuffled to create the
same effect as would be produced by reordering.  However, in any
scenario, I can't see how adding columns or changing column types is
likely to produce any confusion or user-unexpected behavior.  Perhaps
I'm missing something?

Personally, I favor scenario [1].  I hardly ever rename database
columns, and I don't mind needing to ALTER VIEW RENAME COLUMN on those
rare occasions when I do, but I add new columns to my tables (which
then also need to be added to my views) on a regular basis.  If I
could keep groups of related columns together in the table and view
definitions without having to drop and recreate the objects, that
would be awesome.  But I'm not sure it's worth the amount of
implementation that would be required to get there, especially if all
of that implementation would need to be done by me (and
double-especially if none of it would likely be included in -core).

Of course, as I said before, nothing we do in REPLACE VIEW precludes
having a powerful implementation of ALTER VIEW.  But I think the
coding to make ALTER VIEW do these operations is a lot trickier,
because you have to deal with modifying the query that's already in
place piecemeal as you make your changes to the view.  It's not that
it can't be done, but I doubt it can be done in an 8K patch, and as
mentioned upthread, it certainly can't be done in a fully general
way... you will still frequently need to CREATE OR REPLACE VIEW
afterwards.  To put that another way, ALTER TABLE is a complicated
beast because you have to worry about how you're going to handle the
existing data, and ALTER VIEW will be a complicated beast for the
analogous reason that you need to worry about handing the existing
rewrite rule.  But at the moment when a REPLACE VIEW command is
executed, that problem goes away, because now you have the query in
your hand and just need to make the relation match it without breaking
any of the dependencies.

...Robert


Re: patch: Add columns via CREATE OR REPLACE VIEW

From
"Asko Oja"
Date:
ALTER VIEW does not sound useful for me.
CREATE OR REPLACE VIEW should create or replace view and only thing that should be same is the name of the view. It's up to Postgres to invalidate all plans and up to developer to make sure that all places where his view is used will stay still working. All this discussion about matching up columns and AI seems totally useless to me :)

On Fri, Aug 8, 2008 at 4:41 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> Well, my feeling is that if we are inventing a new feature we ought not
> paint ourselves into a corner by failing to consider what will happen
> when obvious extensions to the feature are attempted.  Whether the
> present patch is self-consistent is not the question --- the question
> is do we have a self-consistent vision of how we will later do the
> other stuff like renaming, changing column type, etc.

If we can work out that design, I think that's great.  However, it
doesn't actually 100% matter whether we know the one true way that we
will definitely implement those features - it only matters that none
of the things we might choose are inconsistent with what we're doing
now.

In order to avoid being AI-complete, REPLACE VIEW needs some kind of
straightforward algorithm for matching up the old and new target
lists.  AFAICS, the only thing to decide here is what you want to use
as the key.  There are three possibilities that I can think of: [1]
name, [2] position, [3] both name and position.

It's axiomatic that REPLACE VIEW can't be given the capability to make
any modification that involves changing the key field, so in [1] you
can't rename columns, in [2] you can't reorder columns, and in [3] you
can't do either.  Furthermore, in [2], you also can't support dropping
columns, because a drop is indistinguishable from renaming and
retyping every column from the point of the drop onwards.  Therefore,
the maximum set of operations REPLACE VIEW can potentially support in
each scenario are:

[1] add column, change type, drop column, reorder columns
[2] add column, change type, rename
[3] add column, change type, drop column

The actual set of operations supported may be less either because of
implementation limitations or because you don't want to provide users
with a foot-gun.  ISTM that allowing REPLACE VIEW to do renames in
scenario [2] can be pretty much rejected outright as a violation of
the principle of least surprise - there is an enormous danger of
someone simultaneously renaming and retyping a whole series of columns
when they instead intended to drop a column.  Similarly, in scenario
[1] or [3], ISTM that allowing someone to drop columns using REPLACE
VIEW is something of a foot-gun unless we are in scenario [1] and
reordering columns is also implemented, because users who don't RTFM
will try to reorder columns and it will succeed and fail erratically
according to whether there are dependencies that prevent dropping and
re-adding whatever subset of columns need to be shuffled to create the
same effect as would be produced by reordering.  However, in any
scenario, I can't see how adding columns or changing column types is
likely to produce any confusion or user-unexpected behavior.  Perhaps
I'm missing something?

Personally, I favor scenario [1].  I hardly ever rename database
columns, and I don't mind needing to ALTER VIEW RENAME COLUMN on those
rare occasions when I do, but I add new columns to my tables (which
then also need to be added to my views) on a regular basis.  If I
could keep groups of related columns together in the table and view
definitions without having to drop and recreate the objects, that
would be awesome.  But I'm not sure it's worth the amount of
implementation that would be required to get there, especially if all
of that implementation would need to be done by me (and
double-especially if none of it would likely be included in -core).

Of course, as I said before, nothing we do in REPLACE VIEW precludes
having a powerful implementation of ALTER VIEW.  But I think the
coding to make ALTER VIEW do these operations is a lot trickier,
because you have to deal with modifying the query that's already in
place piecemeal as you make your changes to the view.  It's not that
it can't be done, but I doubt it can be done in an 8K patch, and as
mentioned upthread, it certainly can't be done in a fully general
way... you will still frequently need to CREATE OR REPLACE VIEW
afterwards.  To put that another way, ALTER TABLE is a complicated
beast because you have to worry about how you're going to handle the
existing data, and ALTER VIEW will be a complicated beast for the
analogous reason that you need to worry about handing the existing
rewrite rule.  But at the moment when a REPLACE VIEW command is
executed, that problem goes away, because now you have the query in
your hand and just need to make the relation match it without breaking
any of the dependencies.

...Robert

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: patch: Add columns via CREATE OR REPLACE VIEW

From
Zeugswetter Andreas OSB sIT
Date:
> If you accept the idea that column identity should be based on column
> name, then the only two operations that are really necessary are
> "CREATE OR REPLACE VIEW" and "ALTER VIEW RENAME COLUMN", and it is
> 100% clear what the semantics of those operations should be.

+1

I think this would be an easily useable and understandable concept.
I also fully support Robert's reasoning in his next reply to Tom,
detailing why his patch's provided functionality is acceptable.

Andreas

PS: "alter view" in O does not change the base definition,
it only allows modifying view constraints.


Re: patch: Add columns via CREATE OR REPLACE VIEW

From
Tom Lane
Date:
Zeugswetter Andreas OSB sIT <Andreas.Zeugswetter@s-itsolutions.at> writes:
>> If you accept the idea that column identity should be based on column
>> name, then the only two operations that are really necessary are
>> "CREATE OR REPLACE VIEW" and "ALTER VIEW RENAME COLUMN", and it is
>> 100% clear what the semantics of those operations should be.

> +1

It's nice, it's simple, and it's unimplementable.  At least not without
huge changes in the representation of stored views, which would likely
lead to failure to follow spec-required behavior in other ways.  Other
views are going to refer to the columns of this one by *number*, not
name, and it's not clear to me how you're going to preserve column
number identity with this approach.
        regards, tom lane


Re: patch: Add columns via CREATE OR REPLACE VIEW

From
"Robert Haas"
Date:
I'm just getting back around to this now.  I guess I'm wondering if
someone could advise me on the best way of getting closer to a
committable patch.

The original patch just allows additional columns to be appended to
the previous column list (while disallowing all other sorts of
changes, including reordering/renaming/dropping columns).  Although I
understand that reordering, renaming, and dropping columns are
potential sources of ambiguity, I don't think there is any possible
ambiguity in adding columns.  If so, perhaps this patch or a similar
one might committable.

But if not, then I'd like some suggestions on what could be done to
pave the way for a future patch that would allow a more powerful
version of CREATE OR REPLACE VIEW. Do I need to implement the
three-tiered structure that Tom first proposed here?

http://archives.postgresql.org/pgsql-hackers/2006-12/msg00977.php

The major problem I have with this idea is that it seems to require us
to do everything to do anything.  Separating storage location from
permanent ID is a performance enhancement which will probably require
quite a bit of testing and discussion to figure out how it ought to
work and how much control users ought to have over the behavior,
questions which were extensively discussed here:

http://archives.postgresql.org/pgsql-hackers/2007-02/msg01164.php

On the other hand, separating permanent ID from view position requires
working out the semantics of operations that rely on the column
ordering, such as SELECT *, COPY, INSERT without column list, CREATE
OR REPLACE VIEW view_name (column_alias, ...), and SELECT ... FROM
table (column_alias, ...) as well as figuring out what sorts of DDL
commands should be added to manipulate the view position.  Trying to
implement both of these things as a single patch sounds like it might
be trying to do too much, but I'm open to suggestions.

I think there would be a lot of benefit in being able to make changes
to views without needing to drop and recreate all the dependent
objects - it would certainly simplify things for me, and I imagine for
others as well.  I need some guidance though on where to go with it.

I'd also be interested in knowing if anyone else is working on
anything along these lines.

Thanks,

...Robert


Re: patch: Add columns via CREATE OR REPLACE VIEW

From
Bernd Helmle
Date:
--On Donnerstag, August 07, 2008 08:03:52 -0400 Robert Haas 
<robertmhaas@gmail.com> wrote:

> Here's a patch that allows CREATE OR REPLACE VIEW to add new columns
> to an existing view.
>
> Any feedback would be appreciated, especially if it meant that I could
> fix any problems before the next commitfest.

I had a deeper look at this now. The patch looks clean and applies without 
any problems, regression tests passes. However, ATRewriteTables() has a 
problem when adding columns with domains and constraints. Consider this 
small test case:

CREATE TABLE bar (id INTEGER);
CREATE OR REPLACE VIEW vbar AS SELECT * FROM bar;
CREATE DOMAIN person AS TEXT CHECK(value IN ('haas', 'helmle'));
ALTER TABLE bar ADD COLUMN name person;
CREATE OR REPLACE VIEW vbar AS SELECT * FROM bar;

The last command confuses ATRewriteTable(), which wants to scan the 
relation leading to this error:
ERROR:  could not open relation base/16384/16476:

I see that ATRewriteTable() errors out on heap_beginscan(), since needscan 
is set to TRUE. One solution would be to teach ATRewriteTable(s) to handle 
view alteration differently in this case.

Opinions?

--  Thanks
                   Bernd


Re: patch: Add columns via CREATE OR REPLACE VIEW

From
"Robert Haas"
Date:
> I had a deeper look at this now. The patch looks clean and applies without
> any problems, regression tests passes. However, ATRewriteTables() has a
> problem when adding columns with domains and constraints. Consider this
> small test case:
>
> CREATE TABLE bar (id INTEGER);
> CREATE OR REPLACE VIEW vbar AS SELECT * FROM bar;
> CREATE DOMAIN person AS TEXT CHECK(value IN ('haas', 'helmle'));
> ALTER TABLE bar ADD COLUMN name person;
> CREATE OR REPLACE VIEW vbar AS SELECT * FROM bar;
>
> The last command confuses ATRewriteTable(), which wants to scan the relation
> leading to this error:
> ERROR:  could not open relation base/16384/16476:
>
> I see that ATRewriteTable() errors out on heap_beginscan(), since needscan
> is set to TRUE. One solution would be to teach ATRewriteTable(s) to handle
> view alteration differently in this case.

Thanks for the review - I'll take a look.

...Robert


Re: patch: Add columns via CREATE OR REPLACE VIEW

From
"Robert Haas"
Date:
> I had a deeper look at this now. The patch looks clean and applies without
> any problems, regression tests passes. However, ATRewriteTables() has a
> problem when adding columns with domains and constraints. Consider this
> small test case:
>
> CREATE TABLE bar (id INTEGER);
> CREATE OR REPLACE VIEW vbar AS SELECT * FROM bar;
> CREATE DOMAIN person AS TEXT CHECK(value IN ('haas', 'helmle'));
> ALTER TABLE bar ADD COLUMN name person;
> CREATE OR REPLACE VIEW vbar AS SELECT * FROM bar;
>
> The last command confuses ATRewriteTable(), which wants to scan the relation
> leading to this error:
> ERROR:  could not open relation base/16384/16476:
>
> I see that ATRewriteTable() errors out on heap_beginscan(), since needscan
> is set to TRUE. One solution would be to teach ATRewriteTable(s) to handle
> view alteration differently in this case.

After looking at this, I think the root cause of this problem is that
ATPrepAddColumn isn't smart enough to know that when the underlying
relation is a view, there's no point in asking for a table rewrite.
Please find an updated patch that addresses this problem.

Thanks again for the review - let me know what you think of this version!

...Robert

Attachment

Re: patch: Add columns via CREATE OR REPLACE VIEW

From
Bruce Momjian
Date:
Patch applied, thanks.

---------------------------------------------------------------------------

Robert Haas wrote:
> > I had a deeper look at this now. The patch looks clean and applies without
> > any problems, regression tests passes. However, ATRewriteTables() has a
> > problem when adding columns with domains and constraints. Consider this
> > small test case:
> >
> > CREATE TABLE bar (id INTEGER);
> > CREATE OR REPLACE VIEW vbar AS SELECT * FROM bar;
> > CREATE DOMAIN person AS TEXT CHECK(value IN ('haas', 'helmle'));
> > ALTER TABLE bar ADD COLUMN name person;
> > CREATE OR REPLACE VIEW vbar AS SELECT * FROM bar;
> >
> > The last command confuses ATRewriteTable(), which wants to scan the relation
> > leading to this error:
> > ERROR:  could not open relation base/16384/16476:
> >
> > I see that ATRewriteTable() errors out on heap_beginscan(), since needscan
> > is set to TRUE. One solution would be to teach ATRewriteTable(s) to handle
> > view alteration differently in this case.
> 
> After looking at this, I think the root cause of this problem is that
> ATPrepAddColumn isn't smart enough to know that when the underlying
> relation is a view, there's no point in asking for a table rewrite.
> Please find an updated patch that addresses this problem.
> 
> Thanks again for the review - let me know what you think of this version!
> 
> ...Robert

[ Attachment, skipping... ]

> 
> -- 
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +