Thread: BUG #6050: Dump and restore of view after a schema change: can't restore the view

BUG #6050: Dump and restore of view after a schema change: can't restore the view

From
"Daniel Cristian Cruz"
Date:
The following bug has been logged online:

Bug reference:      6050
Logged by:          Daniel Cristian Cruz
Email address:      danielcristian@gmail.com
PostgreSQL version: 9.0.4
Operating system:   Linux
Description:        Dump and restore of view after a schema change: can't
restore the view
Details:

Example:

DROP VIEW IF EXISTS cba;
DROP TABLE IF EXISTS c;
DROP TABLE IF EXISTS b;
DROP TABLE IF EXISTS a;
CREATE TABLE a (
 id_a serial primary key,
 v text
);
CREATE TABLE b (
 id_b serial primary key,
 id_a integer REFERENCES a (id_a),
 v text
);
CREATE TABLE c (
 id_c serial primary key,
 id_b integer references b (id_b),
 v text
);
INSERT INTO a (id_a, v) VALUES (DEFAULT, 'A');
INSERT INTO b (id_a, v) VALUES (CURRVAL('a_id_a_seq'), 'B');
INSERT INTO c (id_b, v) VALUES (CURRVAL('b_id_b_seq'), 'C');
CREATE VIEW cba AS
 SELECT c.v AS vc, b.v AS vb, a.v AS va
 FROM c
 JOIN b USING (id_b)
 JOIN a USING (id_a);
SELECT * FROM cba;
-- RELATION a -> b-> c became a -> b and a -> c because b is optional
-- SET the value of a -> c where a -> b is defined:
ALTER TABLE c ADD id_a integer;
UPDATE c
 SET id_a = b.id_a
 FROM b
 WHERE b.id_b = c.id_b;
-- VIEW still works!?!?!?
-- Obvious that was a modeling mistake
SELECT * FROM cba;
-- But a pg_dump and a pg_restore of this database generates an error when
restoring the view
"Daniel Cristian Cruz" <danielcristian@gmail.com> writes:
> CREATE TABLE a (
>  id_a serial primary key,
>  v text
> );
> CREATE TABLE b (
>  id_b serial primary key,
>  id_a integer REFERENCES a (id_a),
>  v text
> );
> CREATE TABLE c (
>  id_c serial primary key,
>  id_b integer references b (id_b),
>  v text
> );

> CREATE VIEW cba AS
>  SELECT c.v AS vc, b.v AS vb, a.v AS va
>  FROM c
>  JOIN b USING (id_b)
>  JOIN a USING (id_a);

> ALTER TABLE c ADD id_a integer;

> [ view definition now fails due to multiple "id_a" columns ]

I'm inclined to write this off as "so don't do that".  There's nothing
that pg_dump can do to make this work: it has to use the USING syntax
for the join, and that doesn't offer any way to qualify the column name
on just one side.  The only possible fix would be to try to make ALTER
TABLE reject the addition of the conflicting column name to "c" in the
first place.  That doesn't seem very practical; it would require ALTER
TABLE to do a tremendous amount of analysis, and exclusively lock all
the dependent views, and then lock all the other tables used in the
views, and so on.

Personally my advice is to avoid USING: it wasn't one of the SQL
committee's better ideas.

            regards, tom lane
On Fri, Jun 3, 2011 at 10:59 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Daniel Cristian Cruz" <danielcristian@gmail.com> writes:
>> CREATE TABLE a (
>> =A0id_a serial primary key,
>> =A0v text
>> );
>> CREATE TABLE b (
>> =A0id_b serial primary key,
>> =A0id_a integer REFERENCES a (id_a),
>> =A0v text
>> );
>> CREATE TABLE c (
>> =A0id_c serial primary key,
>> =A0id_b integer references b (id_b),
>> =A0v text
>> );
>
>> CREATE VIEW cba AS
>> =A0SELECT c.v AS vc, b.v AS vb, a.v AS va
>> =A0FROM c
>> =A0JOIN b USING (id_b)
>> =A0JOIN a USING (id_a);
>
>> ALTER TABLE c ADD id_a integer;
>
>> [ view definition now fails due to multiple "id_a" columns ]
>
> I'm inclined to write this off as "so don't do that". =A0There's nothing
> that pg_dump can do to make this work: it has to use the USING syntax
> for the join, and that doesn't offer any way to qualify the column name
> on just one side. =A0The only possible fix would be to try to make ALTER
> TABLE reject the addition of the conflicting column name to "c" in the
> first place. =A0That doesn't seem very practical; it would require ALTER
> TABLE to do a tremendous amount of analysis, and exclusively lock all
> the dependent views, and then lock all the other tables used in the
> views, and so on.
>
> Personally my advice is to avoid USING: it wasn't one of the SQL
> committee's better ideas.

I don't understand why we can't just translate the USING into some
equivalent construct that doesn't involve USING.  I proposed that a
while ago and you shot it down, but I didn't find the reasoning very
compelling.

--=20
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Jun 3, 2011 at 10:59 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Personally my advice is to avoid USING: it wasn't one of the SQL
>> committee's better ideas.

> I don't understand why we can't just translate the USING into some
> equivalent construct that doesn't involve USING.

There is no directly equivalent construct, because USING changes the
shape of the output table (the join has one fewer output column than
would exist without USING).

Now, if the query doesn't involve any explicit reference to "joinalias.*",
we could probably fake it with some ugly thing involving
COALESCE(leftcol, rightcol) ... but I don't think people will want to
read that, and anyway the idea falls apart as soon as you do have a
whole-row reference.

            regards, tom lane
Robert Haas <robertmhaas@gmail.com> wrote:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:

>> Personally my advice is to avoid USING: it wasn't one of the SQL
>> committee's better ideas.
>
> I don't understand why we can't just translate the USING into some
> equivalent construct that doesn't involve USING.  I proposed that
> a while ago and you shot it down, but I didn't find the reasoning
> very compelling.

There's no query you can write with USING that you can't write in a
longer form with ON; but a query of moderate complexity can become
significantly shorter with USING, and thus easier to understand and
less prone to bugs than the ON form.  I think it's a mistake to
discourage USING or under-support it compared to the more verbose
constructs.

-Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
>> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Personally my advice is to avoid USING: it wasn't one of the SQL
>>> committee's better ideas.

> There's no query you can write with USING that you can't write in a
> longer form with ON; but a query of moderate complexity can become
> significantly shorter with USING, and thus easier to understand and
> less prone to bugs than the ON form.

If USING (x) actually were exactly equivalent to ON (a.x = b.x),
I'd be less annoyed with it.  But it isn't; the committee just
couldn't resist the urge to decorate it with some dubious additional
behaviors.

            regards, tom lane

Re: BUG #6050: Dump and restore of view after a schema change: can't restore the view

From
Daniel Cristian Cruz
Date:
I understood that USING is not a cool thing to use inside views, will avoid
it.

Still gonna use it in reports and applications, since there is no risk to
affect backups and raises an error.

It would be nice to had an NOTICE or WARNING about the USING clause inside
VIEWs, saying that it's not recommended since a schema change would crash
backups.

2011/6/3 Tom Lane <tgl@sss.pgh.pa.us>

> "Daniel Cristian Cruz" <danielcristian@gmail.com> writes:
> > CREATE TABLE a (
> >  id_a serial primary key,
> >  v text
> > );
> > CREATE TABLE b (
> >  id_b serial primary key,
> >  id_a integer REFERENCES a (id_a),
> >  v text
> > );
> > CREATE TABLE c (
> >  id_c serial primary key,
> >  id_b integer references b (id_b),
> >  v text
> > );
>
> > CREATE VIEW cba AS
> >  SELECT c.v AS vc, b.v AS vb, a.v AS va
> >  FROM c
> >  JOIN b USING (id_b)
> >  JOIN a USING (id_a);
>
> > ALTER TABLE c ADD id_a integer;
>
> > [ view definition now fails due to multiple "id_a" columns ]
>
> I'm inclined to write this off as "so don't do that".  There's nothing
> that pg_dump can do to make this work: it has to use the USING syntax
> for the join, and that doesn't offer any way to qualify the column name
> on just one side.  The only possible fix would be to try to make ALTER
> TABLE reject the addition of the conflicting column name to "c" in the
> first place.  That doesn't seem very practical; it would require ALTER
> TABLE to do a tremendous amount of analysis, and exclusively lock all
> the dependent views, and then lock all the other tables used in the
> views, and so on.
>
> Personally my advice is to avoid USING: it wasn't one of the SQL
> committee's better ideas.
>
>                        regards, tom lane
>



--=20
Daniel Cristian Cruz
=E3=82=AF=E3=83=AB=E3=82=BA =E3=82=AF=E3=83=AA=E3=82=B9=E3=83=81=E3=82=A2=
=E3=83=B3 =E3=83=80=E3=83=8B=E3=82=A8=E3=83=AB
On Fri, Jun 3, 2011 at 1:19 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Fri, Jun 3, 2011 at 10:59 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Personally my advice is to avoid USING: it wasn't one of the SQL
>>> committee's better ideas.
>
>> I don't understand why we can't just translate the USING into some
>> equivalent construct that doesn't involve USING.
>
> There is no directly equivalent construct, because USING changes the
> shape of the output table (the join has one fewer output column than
> would exist without USING).
>
> Now, if the query doesn't involve any explicit reference to "joinalias.*",
> we could probably fake it with some ugly thing involving
> COALESCE(leftcol, rightcol) ... but I don't think people will want to
> read that, and anyway the idea falls apart as soon as you do have a
> whole-row reference.

Well, it gets internally translated to COALESCE(leftcol, rightcol)
anyway, so I hardly think it's too ugly to print it out that way.
It's what we're doing; more, it'll work instead of erroring out.  This
is not the first complaint we've gotten about this problem.

I'm not seeing the problem with whole-row references; can you elaborate on that?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Jun 3, 2011 at 1:19 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Now, if the query doesn't involve any explicit reference to "joinalias.*",
>> we could probably fake it with some ugly thing involving
>> COALESCE(leftcol, rightcol) ... but I don't think people will want to
>> read that, and anyway the idea falls apart as soon as you do have a
>> whole-row reference.

> Well, it gets internally translated to COALESCE(leftcol, rightcol)

We do that during planning; it's not the form that gets stored in views
or dumped by pg_dump.  I don't really want pg_dump dumping this kind of
thing, because that locks us down to supporting it that way forever.

> I'm not seeing the problem with whole-row references; can you elaborate on that?

SELECT somefunc(j.*) FROM (tab1 JOIN tab2 USING (id)) j;

The shape of the record passed to somefunc() depends on removal of the
second id column.

Now you might claim that we could expand the j.* to a ROW() construct
with an explicit list of columns, which indeed is what happens
internally.  But again, that happens at plan time, it's not what gets
stored in rules.  And that matters, because locking down the column
expansion too early would break the response to ADD/DROP COLUMN on
one of the input tables.

            regards, tom lane
Excerpts from Tom Lane's message of vie jun 03 10:59:00 -0400 2011:

> I'm inclined to write this off as "so don't do that".  There's nothing
> that pg_dump can do to make this work: it has to use the USING syntax
> for the join, and that doesn't offer any way to qualify the column name
> on just one side.  The only possible fix would be to try to make ALTER
> TABLE reject the addition of the conflicting column name to "c" in the
> first place.

Would it be possible to dump USING as an equivalent ON condition on
output (nodeToString), and then examine simple ON conditions to turn
them into USING conditions on input (stringToNode)?

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> wrote:

> Would it be possible to dump USING as an equivalent ON condition
> on output (nodeToString), and then examine simple ON conditions to
> turn them into USING conditions on input (stringToNode)?

You know, I'm not necessarily against some warning that a USING
clause in a view definition makes that view more fragile in the face
of schema changes.  If we could invalidate the view at the time of
the schema change, that would not offend me at all.  The two things
which would offend me here are generating a dump which can't be
restored in a single transaction because of errors, or essentially
deprecating the USING clause in general.

-Kevin
On Jun 3, 2011 4:20 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
>
> > [ view definition now fails due to multiple "id_a" columns ]
>
> I'm inclined to write this off as "so don't do that".  There's nothing
> that pg_dump can do to make this work: it has to use the USING syntax
> for the join, and that doesn't offer any way to qualify the column name
> on just one side.

There's nothing stopping us from adding a nonstandard syntax to cover
precisely the information needed to resolve this case when dumping.

For example we could support USING (a.a=b.a) or ON (a.a=b.a as a)

We could use it only in this case where there's ambiguity too so it wouldn't
clutter people's dumps.

That said it isn't very appetizing a change to backport so it doesn't do
much for the OP ...
Greg Stark <gsstark@gmail.com> writes:
> On Jun 3, 2011 4:20 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
>> I'm inclined to write this off as "so don't do that".  There's nothing
>> that pg_dump can do to make this work: it has to use the USING syntax
>> for the join, and that doesn't offer any way to qualify the column name
>> on just one side.

> There's nothing stopping us from adding a nonstandard syntax to cover
> precisely the information needed to resolve this case when dumping.

> For example we could support USING (a.a=b.a) or ON (a.a=b.a as a)

1. "Nonstandard syntax" is widely seen as "vendor lock-in".  I don't
think that people would appreciate such a fix, especially for an issue
so obscure that we've never seen it before.

2. I don't believe your proposal covers all cases.  For instance, there
are cases where there is no valid qualified name for a column, ie, it's
a merged column from an alias-less JOIN.  (The existence of such cases
is another reason why USING sucks, but I digress.)

> We could use it only in this case where there's ambiguity too so it wouldn't
> clutter people's dumps.

No, because the problem case is where ambiguity gets added after the
fact.

            regards, tom lane
On Fri, Jun 3, 2011 at 2:21 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Fri, Jun 3, 2011 at 1:19 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Now, if the query doesn't involve any explicit reference to "joinalias.=
*",
>>> we could probably fake it with some ugly thing involving
>>> COALESCE(leftcol, rightcol) ... but I don't think people will want to
>>> read that, and anyway the idea falls apart as soon as you do have a
>>> whole-row reference.
>
>> Well, it gets internally translated to COALESCE(leftcol, rightcol)
>
> We do that during planning; it's not the form that gets stored in views
> or dumped by pg_dump. =A0I don't really want pg_dump dumping this kind of
> thing, because that locks us down to supporting it that way forever.

Hmm.

>> I'm not seeing the problem with whole-row references; can you elaborate =
on that?
>
> SELECT somefunc(j.*) FROM (tab1 JOIN tab2 USING (id)) j;
>
> The shape of the record passed to somefunc() depends on removal of the
> second id column.

Ah, yes.

> Now you might claim that we could expand the j.* to a ROW() construct
> with an explicit list of columns, which indeed is what happens
> internally. =A0But again, that happens at plan time, it's not what gets
> stored in rules. =A0And that matters, because locking down the column
> expansion too early would break the response to ADD/DROP COLUMN on
> one of the input tables.

Fair enough, but the current implementation with respect to ADD
COLUMN.  And RENAME COLUMN.

If your point here is that you don't want to spend time hacking on
this because it's a fairly marginal feature and therefore not terribly
high on your priority list, I can understand that.  But if you're
actually defending the current implementation, I'm going to have to
respectfully disagree.  It's broken, and it sucks, and this is not the
first complaint we've had about it.

--=20
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> If your point here is that you don't want to spend time hacking on
> this because it's a fairly marginal feature and therefore not terribly
> high on your priority list, I can understand that.  But if you're
> actually defending the current implementation, I'm going to have to
> respectfully disagree.  It's broken, and it sucks, and this is not the
> first complaint we've had about it.

The spec's definition of USING is broken and sucky, and we're
implementing it as best we can.  I don't feel a need to invent
strange nonstandard behavior to work around the fact that USING
is fragile *by definition*.  Complain to the standards committee
about that.

(Question: would you also have us try to work around the fact that
USING stops working if you rename one of the join columns?)

            regards, tom lane

Re: BUG #6050: Dump and restore of view after a schema change: can't restore the view

From
Daniel Cristian Cruz
Date:
But as a PostgreSQL user, I would like to had a warning when creating a view
with USING. It solves my problem. Maybe many others too.

2011/6/7 Robert Haas <robertmhaas@gmail.com>

> On Fri, Jun 3, 2011 at 2:21 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Robert Haas <robertmhaas@gmail.com> writes:
> >> On Fri, Jun 3, 2011 at 1:19 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >>> Now, if the query doesn't involve any explicit reference to
> "joinalias.*",
> >>> we could probably fake it with some ugly thing involving
> >>> COALESCE(leftcol, rightcol) ... but I don't think people will want to
> >>> read that, and anyway the idea falls apart as soon as you do have a
> >>> whole-row reference.
> >
> >> Well, it gets internally translated to COALESCE(leftcol, rightcol)
> >
> > We do that during planning; it's not the form that gets stored in views
> > or dumped by pg_dump.  I don't really want pg_dump dumping this kind of
> > thing, because that locks us down to supporting it that way forever.
>
> Hmm.
>
> >> I'm not seeing the problem with whole-row references; can you elaborate
> on that?
> >
> > SELECT somefunc(j.*) FROM (tab1 JOIN tab2 USING (id)) j;
> >
> > The shape of the record passed to somefunc() depends on removal of the
> > second id column.
>
> Ah, yes.
>
> > Now you might claim that we could expand the j.* to a ROW() construct
> > with an explicit list of columns, which indeed is what happens
> > internally.  But again, that happens at plan time, it's not what gets
> > stored in rules.  And that matters, because locking down the column
> > expansion too early would break the response to ADD/DROP COLUMN on
> > one of the input tables.
>
> Fair enough, but the current implementation with respect to ADD
> COLUMN.  And RENAME COLUMN.
>
> If your point here is that you don't want to spend time hacking on
> this because it's a fairly marginal feature and therefore not terribly
> high on your priority list, I can understand that.  But if you're
> actually defending the current implementation, I'm going to have to
> respectfully disagree.  It's broken, and it sucks, and this is not the
> first complaint we've had about it.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>



--=20
Daniel Cristian Cruz
=E3=82=AF=E3=83=AB=E3=82=BA =E3=82=AF=E3=83=AA=E3=82=B9=E3=83=81=E3=82=A2=
=E3=83=B3 =E3=83=80=E3=83=8B=E3=82=A8=E3=83=AB
On Tue, Jun 7, 2011 at 11:41 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> If your point here is that you don't want to spend time hacking on
>> this because it's a fairly marginal feature and therefore not terribly
>> high on your priority list, I can understand that. =A0But if you're
>> actually defending the current implementation, I'm going to have to
>> respectfully disagree. =A0It's broken, and it sucks, and this is not the
>> first complaint we've had about it.
>
> The spec's definition of USING is broken and sucky, and we're
> implementing it as best we can. =A0I don't feel a need to invent
> strange nonstandard behavior to work around the fact that USING
> is fragile *by definition*. =A0Complain to the standards committee
> about that.

It's not the standard's committee's fault that our dump won't restore.
 They may not have made life any easier for us, but if we're going to
have the feature, then pg_dump ought to work.  Otherwise, we're
telling people "you can use this feature, but don't expect to be able
to restore from backup".  Not a way to win friends.

> (Question: would you also have us try to work around the fact that
> USING stops working if you rename one of the join columns?)

Yeah.  In fact, I proposed a patch to do just that in response to bug
#5234, which you shot down.  I still don't agree with that.  We can
either disallow the original DDL (adding or renaming a column in a way
that causes the dumped representation to become invalid) or we can
change what we dump so that it can be reloaded.  Letting the user
change the view definition and then producing an unrestorable dump
seems truly awful to me, regardless of how little help we're getting
from the SQL standards committee.

--=20
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Greg Stark <gsstark@gmail.com> writes:
> A lot of work has gone into making pg_dump/pg_restore guarantee that
> they'll always produce a copy of the database, even if you've done odd
> things like change the lower bounds of your arrays. A lot of this was
> from before the days of PITR when pg_dump/pg_restore was the *only*
> backup option and it was considered absolutely essential that they
> always work. But even today I think it's still a goal that pg_dump
> always dump a loadable database.

Well, pg_upgrade still depends on pg_dump being a 100% solution for DDL,
so I don't think the requirements have gone down any ...

> I had in mind for pg_dump to decide to use the non-standard syntax iff
> it was necessary at dump time.

Maybe.  I'm concerned about the cost of determining whether it's
necessary ("cost" meaning both "runtime" and "code complexity").

            regards, tom lane
On Tue, Jun 7, 2011 at 3:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Greg Stark <gsstark@gmail.com> writes:
>> On Jun 3, 2011 4:20 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
>>> I'm inclined to write this off as "so don't do that". =A0There's nothing
>>> that pg_dump can do to make this work: it has to use the USING syntax
>>> for the join, and that doesn't offer any way to qualify the column name
>>> on just one side.
>
>> There's nothing stopping us from adding a nonstandard syntax to cover
>> precisely the information needed to resolve this case when dumping.
>
>> For example we could support USING (a.a=3Db.a) or ON (a.a=3Db.a as a)
>
> 1. "Nonstandard syntax" is widely seen as "vendor lock-in". =A0I don't
> think that people would appreciate such a fix, especially for an issue
> so obscure that we've never seen it before.

Well our dumps are already not going to be loadable on other SQL
implementations. But yes, it's non-ideal which is why I was tempted to
do it only when needed.

> 2. I don't believe your proposal covers all cases. =A0For instance, there
> are cases where there is no valid qualified name for a column, ie, it's
> a merged column from an alias-less JOIN. =A0(The existence of such cases
> is another reason why USING sucks, but I digress.)

Hm, will have to think about that.


>> We could use it only in this case where there's ambiguity too so it woul=
dn't
>> clutter people's dumps.
>
> No, because the problem case is where ambiguity gets added after the
> fact.

I think this is an interesting point. It seems there are two cases,
one of which I think is much worse than the other.

It sounds like you're concerned about someone dumping the view
definition, then doing an alter table on one of the underlying tables,
then trying to reload their old view definition. As you pointed out
later there are lots of ways the alter table could cause the view to
no longer work. Many of them involve USING which justifies your
complaint that it's fragile but of course it could be as simple as the
alter table dropping a needed column...

The other case seems worse to me: someone creates the view, does the
alter table, then dumps the database. They don't make any further
database modifications, the dump is unloadable as it was dumped. That
is, pg_dump produced an unloadable dump right off the bat.

A lot of work has gone into making pg_dump/pg_restore guarantee that
they'll always produce a copy of the database, even if you've done odd
things like change the lower bounds of your arrays. A lot of this was
from before the days of PITR when pg_dump/pg_restore was the *only*
backup option and it was considered absolutely essential that they
always work. But even today I think it's still a goal that pg_dump
always dump a loadable database. Of course it won't always load in a
different context but if you restore it in the right context or
restore the whole database it ought to work.

I had in mind for pg_dump to decide to use the non-standard syntax iff
it was necessary at dump time. That doesn't protect against someone
changing the table referenced after the dump but that's fine by me. At
least when it was dumped the sql would have loaded to produce the same
view as was dumped.

--=20
greg
Robert Haas wrote:
> On Tue, Jun 7, 2011 at 11:41 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Robert Haas <robertmhaas@gmail.com> writes:
> >> If your point here is that you don't want to spend time hacking on
> >> this because it's a fairly marginal feature and therefore not terribly
> >> high on your priority list, I can understand that. ?But if you're
> >> actually defending the current implementation, I'm going to have to
> >> respectfully disagree. ?It's broken, and it sucks, and this is not the
> >> first complaint we've had about it.
> >
> > The spec's definition of USING is broken and sucky, and we're
> > implementing it as best we can. ?I don't feel a need to invent
> > strange nonstandard behavior to work around the fact that USING
> > is fragile *by definition*. ?Complain to the standards committee
> > about that.
>
> It's not the standard's committee's fault that our dump won't restore.
>  They may not have made life any easier for us, but if we're going to
> have the feature, then pg_dump ought to work.  Otherwise, we're
> telling people "you can use this feature, but don't expect to be able
> to restore from backup".  Not a way to win friends.
>
> > (Question: would you also have us try to work around the fact that
> > USING stops working if you rename one of the join columns?)
>
> Yeah.  In fact, I proposed a patch to do just that in response to bug
> #5234, which you shot down.  I still don't agree with that.  We can
> either disallow the original DDL (adding or renaming a column in a way
> that causes the dumped representation to become invalid) or we can
> change what we dump so that it can be reloaded.  Letting the user
> change the view definition and then producing an unrestorable dump
> seems truly awful to me, regardless of how little help we're getting
> from the SQL standards committee.

Reminder, pg_upgrade is also going to be unusuable if pg_dump generates
an error, even on a view.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +
I assume this should _not_ be added as a TODO.

---------------------------------------------------------------------------

Greg Stark wrote:
> On Tue, Jun 7, 2011 at 3:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Greg Stark <gsstark@gmail.com> writes:
> >> On Jun 3, 2011 4:20 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
> >>> I'm inclined to write this off as "so don't do that". ?There's nothing
> >>> that pg_dump can do to make this work: it has to use the USING syntax
> >>> for the join, and that doesn't offer any way to qualify the column name
> >>> on just one side.
> >
> >> There's nothing stopping us from adding a nonstandard syntax to cover
> >> precisely the information needed to resolve this case when dumping.
> >
> >> For example we could support USING (a.a=b.a) or ON (a.a=b.a as a)
> >
> > 1. "Nonstandard syntax" is widely seen as "vendor lock-in". ?I don't
> > think that people would appreciate such a fix, especially for an issue
> > so obscure that we've never seen it before.
>
> Well our dumps are already not going to be loadable on other SQL
> implementations. But yes, it's non-ideal which is why I was tempted to
> do it only when needed.
>
> > 2. I don't believe your proposal covers all cases. ?For instance, there
> > are cases where there is no valid qualified name for a column, ie, it's
> > a merged column from an alias-less JOIN. ?(The existence of such cases
> > is another reason why USING sucks, but I digress.)
>
> Hm, will have to think about that.
>
>
> >> We could use it only in this case where there's ambiguity too so it wouldn't
> >> clutter people's dumps.
> >
> > No, because the problem case is where ambiguity gets added after the
> > fact.
>
> I think this is an interesting point. It seems there are two cases,
> one of which I think is much worse than the other.
>
> It sounds like you're concerned about someone dumping the view
> definition, then doing an alter table on one of the underlying tables,
> then trying to reload their old view definition. As you pointed out
> later there are lots of ways the alter table could cause the view to
> no longer work. Many of them involve USING which justifies your
> complaint that it's fragile but of course it could be as simple as the
> alter table dropping a needed column...
>
> The other case seems worse to me: someone creates the view, does the
> alter table, then dumps the database. They don't make any further
> database modifications, the dump is unloadable as it was dumped. That
> is, pg_dump produced an unloadable dump right off the bat.
>
> A lot of work has gone into making pg_dump/pg_restore guarantee that
> they'll always produce a copy of the database, even if you've done odd
> things like change the lower bounds of your arrays. A lot of this was
> from before the days of PITR when pg_dump/pg_restore was the *only*
> backup option and it was considered absolutely essential that they
> always work. But even today I think it's still a goal that pg_dump
> always dump a loadable database. Of course it won't always load in a
> different context but if you restore it in the right context or
> restore the whole database it ought to work.
>
> I had in mind for pg_dump to decide to use the non-standard syntax iff
> it was necessary at dump time. That doesn't protect against someone
> changing the table referenced after the dump but that's fine by me. At
> least when it was dumped the sql would have loaded to produce the same
> view as was dumped.
>
> --
> greg
>
> --
> Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-bugs

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +

Re: Re: BUG #6050: Dump and restore of view after a schema change: can't restore the view

From
Daniel Cristian Cruz
Date:
SG93IGFib3V0IGEgd2FybmluZyBpbiBkb2NzIGZvciB1bmNhdXRpb3VzIERC
QSdzPwoKMjAxMS8xMC8xMyBCcnVjZSBNb21qaWFuIDxicnVjZUBtb21qaWFu
LnVzPgoKPgo+IEkgYXNzdW1lIHRoaXMgc2hvdWxkIF9ub3RfIGJlIGFkZGVk
IGFzIGEgVE9ETy4KCgotLSAKRGFuaWVsIENyaXN0aWFuIENydXoK44Kv44Or
44K6IOOCr+ODquOCueODgeOCouODsyDjg4Djg4vjgqjjg6sK