Thread: Add column if not exists (CINE)

Add column if not exists (CINE)

From
Kjell Rune Skaaraas
Date:
Hello,

I've been reading the earlier threads at:
http://archives.postgresql.org/pgsql-hackers/2009-05/thrd7.php#00252
http://archives.postgresql.org/pgsql-hackers/2005-10/thrd4.php#00632
and I'm not sure I have anything that substantially new to add but:

1. I can't see there's an unambiguity about what the syntax would do. It is IF NOT EXISTS, not IF NOT LIKE. Anyone who
shootsthemselves in the foot by calling a CINE and thinking that a preexisting differently defined column is magically
converteddeserves it. Either it should act exactly like the non-CINE command, or do nothing at all as if the statement
wasn'tthere. 

2. The use case is pretty clear to me - flexible scripts that'll bring all earlier database versions to the latest
schema.I've been experimenting in 9.0 alpha with calling DROP CONSTRAINT IF EXISTS then ADD CONSTRAINT with named
constantsfor a CINE effect. which as a side effect will correct any updated constraints too - and it works great.
UnfortunatelyDROP COLUMN IF EXISTS then ADD COLUMN has the side effect of deleting all the data, so that's hardly
usable.

I saw some indications that this might be a minority opinion, well I would like to cast a vote FOR this functionality.
Theworkarounds are ugly, the solution simple and while I agree it's possible to misuse it, my opinion is that you
shouldn'tbecome a surgeon if you can't handle a scalpel. In this case I get the feeling I'm reading instructions on how
todo surgery with a butter knife because we don't dare hand out anything sharper. 

Regards,
Kjell Rune Skaaraas




Re: Add column if not exists (CINE)

From
Robert Haas
Date:
On Tue, Apr 27, 2010 at 6:45 PM, Kjell Rune Skaaraas <kjella79@yahoo.no> wrote:
> Hello,
>
> I've been reading the earlier threads at:
> http://archives.postgresql.org/pgsql-hackers/2009-05/thrd7.php#00252
> http://archives.postgresql.org/pgsql-hackers/2005-10/thrd4.php#00632
> and I'm not sure I have anything that substantially new to add but:
>
> 1. I can't see there's an unambiguity about what the syntax would do. It is IF NOT EXISTS, not IF NOT LIKE. Anyone
whoshoots themselves in the foot by calling a CINE and thinking that a preexisting differently defined column is
magicallyconverted deserves it. Either it should act exactly like the non-CINE command, or do nothing at all as if the
statementwasn't there. 
>
> 2. The use case is pretty clear to me - flexible scripts that'll bring all earlier database versions to the latest
schema.I've been experimenting in 9.0 alpha with calling DROP CONSTRAINT IF EXISTS then ADD CONSTRAINT with named
constantsfor a CINE effect. which as a side effect will correct any updated constraints too - and it works great.
UnfortunatelyDROP COLUMN IF EXISTS then ADD COLUMN has the side effect of deleting all the data, so that's hardly
usable.
>
> I saw some indications that this might be a minority opinion, well I would like to cast a vote FOR this
functionality.The workarounds are ugly, the solution simple and while I agree it's possible to misuse it, my opinion is
thatyou shouldn't become a surgeon if you can't handle a scalpel. In this case I get the feeling I'm reading
instructionson how to do surgery with a butter knife because we don't dare hand out anything sharper. 

I've already said my piece on this, but I couldn't agree more.  Well
said, and your use case is exactly the one I want it for.

...Robert


Re: Add column if not exists (CINE)

From
Takahiro Itagaki
Date:
Kjell Rune Skaaraas <kjella79@yahoo.no> wrote:

> I've been reading the earlier threads at:
> http://archives.postgresql.org/pgsql-hackers/2009-05/thrd7.php#00252
> http://archives.postgresql.org/pgsql-hackers/2005-10/thrd4.php#00632
> and I'm not sure I have anything that substantially new to add but:
> 
> I saw some indications that this might be a minority opinion,
> well I would like to cast a vote FOR this functionality.

+1 for CINE, just because MySQL supports it.

But before developing, we need to decide how to handle an added object
that has the same name but has different definitions. 

Also, developers should consider not only ADD COLUMN but also other
CREATE or ADD commands. The patch will be large, including documentation
adjustments in many places -- it would be hard work.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center




Re: Add column if not exists (CINE)

From
Andrew Dunstan
Date:

Takahiro Itagaki wrote:
> Kjell Rune Skaaraas <kjella79@yahoo.no> wrote:
>
>   
>> I've been reading the earlier threads at:
>> http://archives.postgresql.org/pgsql-hackers/2009-05/thrd7.php#00252
>> http://archives.postgresql.org/pgsql-hackers/2005-10/thrd4.php#00632
>> and I'm not sure I have anything that substantially new to add but:
>>
>> I saw some indications that this might be a minority opinion,
>> well I would like to cast a vote FOR this functionality.
>>     
>
> +1 for CINE, just because MySQL supports it.
>   

MySQL compatibility has never been our aim. We should adopt ideas from 
other projects because they are good, not just because they are there.

That doesn't mean I don't think this is a good idea.

> But before developing, we need to decide how to handle an added object
> that has the same name but has different definitions. 
>   

The OP explicitly stated that in his opinion nothing should be done in 
such cases. That's a defensible position, in the case of objects such as 
tables that must be unique by name (e.g. tables).  But what would we do 
about objects where the name could be overloaded? Since we would 
presumably want to do this for all (or almost all) of our CREATE/ADD 
commands, we'd need a policy on those.
> Also, developers should consider not only ADD COLUMN but also other
> CREATE or ADD commands. The patch will be large, including documentation
> adjustments in many places -- it would be hard work.
>
>
>   

I can speak with some experience on this at least. :-) I don't see that 
it would be a heck of a lot bigger than the DROP IF EXISTS cases, which 
after the first few had been done were not hard, merely tedious to do :-)

cheers

andrew



Re: Add column if not exists (CINE)

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> Takahiro Itagaki wrote:
>> But before developing, we need to decide how to handle an added object
>> that has the same name but has different definitions. 

> The OP explicitly stated that in his opinion nothing should be done in 
> such cases. That's a defensible position, in the case of objects such as 
> tables that must be unique by name (e.g. tables).  But what would we do 
> about objects where the name could be overloaded?

Even if it's defensible, the consensus position so far has been that
it's a bad design.  Every time we've looked at this, we have concluded
that CREATE OR REPLACE semantics are considerably safer to use, because
there is no question what the state of the object is afterwards.  That
argument is just as valid for a column as for anything larger.

AFAICS, the only excuse CINE has for living is that (people think)
it would take less work to implement.
        regards, tom lane


Re: Add column if not exists (CINE)

From
"Ross J. Reedstrom"
Date:
On Tue, Apr 27, 2010 at 08:18:13PM -0400, Robert Haas wrote:
> On Tue, Apr 27, 2010 at 6:45 PM, Kjell Rune Skaaraas <kjella79@yahoo.no> wrote:
[snip]
> > I saw some indications that this might be a minority opinion, well I would like to cast a vote FOR this
functionality.The workarounds are ugly, the solution simple and while I agree it's possible to misuse it, my opinion is
thatyou shouldn't become a surgeon if you can't handle a scalpel. In this case I get the feeling I'm reading
instructionson how to do surgery with a butter knife because we don't dare hand out anything sharper.
 
> 
> I've already said my piece on this, but I couldn't agree more.  Well
> said, and your use case is exactly the one I want it for.
> 

+1 (Scribbles down the phrase "instructions on how to do surgery with a
butter knife because we don't dare hand out anything sharper" for future
repurposing)

Ross
-- 
Ross Reedstrom, Ph.D.                                 reedstrm@rice.edu
Systems Engineer & Admin, Research Scientist        phone: 713-348-6166
The Connexions Project      http://cnx.org            fax: 713-348-3665
Rice University MS-375, Houston, TX 77005
GPG Key fingerprint = F023 82C8 9B0E 2CC6 0D8E  F888 D3AE 810E 88F0 BEDE


Re: Add column if not exists (CINE)

From
Robert Haas
Date:
On Wed, Apr 28, 2010 at 9:58 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> Takahiro Itagaki wrote:
>>> But before developing, we need to decide how to handle an added object
>>> that has the same name but has different definitions.
>
>> The OP explicitly stated that in his opinion nothing should be done in
>> such cases. That's a defensible position, in the case of objects such as
>> tables that must be unique by name (e.g. tables).  But what would we do
>> about objects where the name could be overloaded?
>
> Even if it's defensible, the consensus position so far has been that
> it's a bad design.  Every time we've looked at this, we have concluded
> that CREATE OR REPLACE semantics are considerably safer to use, because
> there is no question what the state of the object is afterwards.  That
> argument is just as valid for a column as for anything larger.
>
> AFAICS, the only excuse CINE has for living is that (people think)
> it would take less work to implement.

I don't believe you are fairly stating the consensus from previous
discussion and I believe that you are actually in the minority on this
one.  I agree that we probably don't need to support this for object
types for which CREATE OR REPLACE is available or can be made
available, but that isn't feasible for all object types - tables and
columns being the obvious examples.

...Robert


Re: Add column if not exists (CINE)

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> I don't believe you are fairly stating the consensus from previous
> discussion and I believe that you are actually in the minority on this
> one.  I agree that we probably don't need to support this for object
> types for which CREATE OR REPLACE is available or can be made
> available, but that isn't feasible for all object types - tables and
> columns being the obvious examples.

What's obvious about it?  In particular, I should think that ADD OR
REPLACE COLUMN would usefully be defined as "ADD if no such column,
else ALTER COLUMN as necessary to match this spec".  Dropping the
ALTER part of that has no benefit except to lazy implementors; it
certainly is not more useful to users if they can't be sure of the
column properties after issuing the command.
        regards, tom lane


Re: Add column if not exists (CINE)

From
Heikki Linnakangas
Date:
Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> I don't believe you are fairly stating the consensus from previous
>> discussion and I believe that you are actually in the minority on this
>> one.  I agree that we probably don't need to support this for object
>> types for which CREATE OR REPLACE is available or can be made
>> available, but that isn't feasible for all object types - tables and
>> columns being the obvious examples.
> 
> What's obvious about it?  In particular, I should think that ADD OR
> REPLACE COLUMN would usefully be defined as "ADD if no such column,
> else ALTER COLUMN as necessary to match this spec".  Dropping the
> ALTER part of that has no benefit except to lazy implementors; it
> certainly is not more useful to users if they can't be sure of the
> column properties after issuing the command.

Right. A trivial implementation of CREATE OR REPLACE is to create the
object if it doesn't exist, do nothing if it exists already and is
identical to the new definition, and throw an error if it's not
identical. That covers the same use case as CREATE IF NOT EXISTS, but
you know what the state is after a successful execution, is easy to
implement, and is in line with the existing CREATE OR REPLACE commands.
And can be extended in the future to alter the existing object instead
of throwing an error.

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com


Re: Add column if not exists (CINE)

From
Robert Haas
Date:
On Wed, Apr 28, 2010 at 11:20 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> I don't believe you are fairly stating the consensus from previous
>> discussion and I believe that you are actually in the minority on this
>> one.  I agree that we probably don't need to support this for object
>> types for which CREATE OR REPLACE is available or can be made
>> available, but that isn't feasible for all object types - tables and
>> columns being the obvious examples.
>
> What's obvious about it?  In particular, I should think that ADD OR
> REPLACE COLUMN would usefully be defined as "ADD if no such column,
> else ALTER COLUMN as necessary to match this spec".  Dropping the
> ALTER part of that has no benefit except to lazy implementors; it
> certainly is not more useful to users if they can't be sure of the
> column properties after issuing the command.

Actually, that's a good idea.  But how will you handle tables?

...Robert


Re: Add column if not exists (CINE)

From
Heikki Linnakangas
Date:
Robert Haas wrote:
> On Wed, Apr 28, 2010 at 11:20 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Robert Haas <robertmhaas@gmail.com> writes:
>>> I don't believe you are fairly stating the consensus from previous
>>> discussion and I believe that you are actually in the minority on this
>>> one.  I agree that we probably don't need to support this for object
>>> types for which CREATE OR REPLACE is available or can be made
>>> available, but that isn't feasible for all object types - tables and
>>> columns being the obvious examples.
>> What's obvious about it?  In particular, I should think that ADD OR
>> REPLACE COLUMN would usefully be defined as "ADD if no such column,
>> else ALTER COLUMN as necessary to match this spec".  Dropping the
>> ALTER part of that has no benefit except to lazy implementors; it
>> certainly is not more useful to users if they can't be sure of the
>> column properties after issuing the command.
> 
> Actually, that's a good idea.  But how will you handle tables?

What do you mean?

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com


Re: Add column if not exists (CINE)

From
Andrew Dunstan
Date:

Robert Haas wrote:
> On Wed, Apr 28, 2010 at 11:20 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>   
>> Robert Haas <robertmhaas@gmail.com> writes:
>>     
>>> I don't believe you are fairly stating the consensus from previous
>>> discussion and I believe that you are actually in the minority on this
>>> one.  I agree that we probably don't need to support this for object
>>> types for which CREATE OR REPLACE is available or can be made
>>> available, but that isn't feasible for all object types - tables and
>>> columns being the obvious examples.
>>>       
>> What's obvious about it?  In particular, I should think that ADD OR
>> REPLACE COLUMN would usefully be defined as "ADD if no such column,
>> else ALTER COLUMN as necessary to match this spec".  Dropping the
>> ALTER part of that has no benefit except to lazy implementors; it
>> certainly is not more useful to users if they can't be sure of the
>> column properties after issuing the command.
>>     
>
> Actually, that's a good idea.  But how will you handle tables?
>
>
>   

I think I Iike Heikki's suggestion better, to error out if the object 
exists but the properties differ. At least I'd like an option for that.

cheers

andrew


Re: Add column if not exists (CINE)

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> Actually, that's a good idea.  But how will you handle tables?

Well, tables are a special case, mainly because it's not clear how to
avoid accidentally throwing away data.  (In particular if some column in
the existing table isn't there in the new definition.  It's a bit scary
to just drop the column, IMO.)  I don't see that that argument applies
to doing an automatic ALTER COLUMN, though, especially since the only
column type alterations that will go through without a USING clause are
reasonably straightforward.
        regards, tom lane


Re: Add column if not exists (CINE)

From
Robert Haas
Date:
On Wed, Apr 28, 2010 at 12:07 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> Robert Haas wrote:
>> On Wed, Apr 28, 2010 at 11:20 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Robert Haas <robertmhaas@gmail.com> writes:
>>>> I don't believe you are fairly stating the consensus from previous
>>>> discussion and I believe that you are actually in the minority on this
>>>> one.  I agree that we probably don't need to support this for object
>>>> types for which CREATE OR REPLACE is available or can be made
>>>> available, but that isn't feasible for all object types - tables and
>>>> columns being the obvious examples.
>>> What's obvious about it?  In particular, I should think that ADD OR
>>> REPLACE COLUMN would usefully be defined as "ADD if no such column,
>>> else ALTER COLUMN as necessary to match this spec".  Dropping the
>>> ALTER part of that has no benefit except to lazy implementors; it
>>> certainly is not more useful to users if they can't be sure of the
>>> column properties after issuing the command.
>>
>> Actually, that's a good idea.  But how will you handle tables?
>
> What do you mean?

Well, how would you define CREATE OR REPLACE TABLE?  I think that
doesn't make much sense, which is why I think CREATE IF NOT EXISTS is
a reasonable approach.

...Robert


Re: Add column if not exists (CINE)

From
Robert Haas
Date:
On Wed, Apr 28, 2010 at 1:40 PM, Dimitri Fontaine
<dfontaine@hi-media.com> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> Well, how would you define CREATE OR REPLACE TABLE?  I think that
>> doesn't make much sense, which is why I think CREATE IF NOT EXISTS is
>> a reasonable approach.
>
> <hand waving time>
>
> The behavior I'd like to have would be to allow me to give a SELECT
> query to run for replacing what is there if there's something. If the
> query can not be run on the existing data set, error out of course.
>
> So you know the state for sure after the command, but it depends on your
> query being correct. And you can (de)normalize existing data using joins.
>
> The REPLACE keyword would here mean that there's a CTAS going under the
> hood, then we add the constraints and indexes and triggers etc. That
> would mean being able to express those entities changes too, but it
> seems important.
>
> Well, that may be not precise enough as a spec, but at least that's food
> for though I hope.

This type of hand-waving convinces me more than ever that we should
just implement CINE, and it should just C if it doesn't already E.
This is what has been requested multiple times, by multiple people,
including various people who don't normally poke their head into
-hackers.  I think the resistance to a straightforward implementation
with easy-to-understand behavior is completely unjustifiable.  It's
completely unobvious to me that all of the above will work at all and,
if it did, whether it would actually solve the problems that I care
about, like being able to write schema-upgrade scripts that would work
in a simple and predictable fashion.

...Robert


Re: Add column if not exists (CINE)

From
Dimitri Fontaine
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> Well, how would you define CREATE OR REPLACE TABLE?  I think that
> doesn't make much sense, which is why I think CREATE IF NOT EXISTS is
> a reasonable approach.

<hand waving time>

The behavior I'd like to have would be to allow me to give a SELECT
query to run for replacing what is there if there's something. If the
query can not be run on the existing data set, error out of course.

So you know the state for sure after the command, but it depends on your
query being correct. And you can (de)normalize existing data using joins.

The REPLACE keyword would here mean that there's a CTAS going under the
hood, then we add the constraints and indexes and triggers etc. That
would mean being able to express those entities changes too, but it
seems important.

Well, that may be not precise enough as a spec, but at least that's food
for though I hope.

Regards,
-- 
dim


Re: Add column if not exists (CINE)

From
Heikki Linnakangas
Date:
Robert Haas wrote:
> On Wed, Apr 28, 2010 at 12:07 PM, Heikki Linnakangas
> <heikki.linnakangas@enterprisedb.com> wrote:
>> Robert Haas wrote:
>>> On Wed, Apr 28, 2010 at 11:20 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>> Robert Haas <robertmhaas@gmail.com> writes:
>>>>> I don't believe you are fairly stating the consensus from previous
>>>>> discussion and I believe that you are actually in the minority on this
>>>>> one.  I agree that we probably don't need to support this for object
>>>>> types for which CREATE OR REPLACE is available or can be made
>>>>> available, but that isn't feasible for all object types - tables and
>>>>> columns being the obvious examples.
>>>> What's obvious about it?  In particular, I should think that ADD OR
>>>> REPLACE COLUMN would usefully be defined as "ADD if no such column,
>>>> else ALTER COLUMN as necessary to match this spec".  Dropping the
>>>> ALTER part of that has no benefit except to lazy implementors; it
>>>> certainly is not more useful to users if they can't be sure of the
>>>> column properties after issuing the command.
>>> Actually, that's a good idea.  But how will you handle tables?
>> What do you mean?
> 
> Well, how would you define CREATE OR REPLACE TABLE?

It the table doesn't exist, create it. If it exists with the same name
and same columns and constraints and all, do nothing. Otherwise throw an
error.

Maybe it should also check that the existing table is empty.

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com


Re: Add column if not exists (CINE)

From
Kjell Rune Skaaraas
Date:

--- Den ons 2010-04-28 skrev Tom Lane <tgl@sss.pgh.pa.us>:

> Fra: Tom Lane <tgl@sss.pgh.pa.us>
> Emne: Re: [HACKERS] Add column if not exists (CINE)
> Til: "Robert Haas" <robertmhaas@gmail.com>
> Kopi: "Andrew Dunstan" <andrew@dunslane.net>, "Takahiro Itagaki" <itagaki.takahiro@oss.ntt.co.jp>, "Kjell Rune
Skaaraas"<kjella79@yahoo.no>, pgsql-hackers@postgresql.org 
> Dato: Onsdag 28. april 2010 17.20
> Robert Haas <robertmhaas@gmail.com>
> writes:
> > I don't believe you are fairly stating the consensus
> from previous
> > discussion and I believe that you are actually in the
> minority on this
> > one.  I agree that we probably don't need to
> support this for object
> > types for which CREATE OR REPLACE is available or can
> be made
> > available, but that isn't feasible for all object
> types - tables and
> > columns being the obvious examples.
>
> What's obvious about it?  In particular, I should
> think that ADD OR
> REPLACE COLUMN would usefully be defined as "ADD if no such
> column,
> else ALTER COLUMN as necessary to match this spec". 
> Dropping the
> ALTER part of that has no benefit except to lazy
> implementors; it
> certainly is not more useful to users if they can't be sure
> of the
> column properties after issuing the command.

To me this construct seems horribly ambigious. Imagine I did a ALTER TABLE foo ADD COLUMN bar INTEGER NOT NULL UNIQUE
DEFAULT10, then an ALTER TABLE foo ADD OR REPLACE COLUMN bar BIGINT. Would I get a BIGINT NOT NULL UNIQUE DEFAULT 10 or
aplain BIGINT? Either way I think one group will be disappointed because it either trashes all your other setup *or*
forcesyou to call DROP NOT NULL, DROP DEFAULT etc. when you don't want it. 

There's a reason why there's no ALTER TABLE foo SET COLUMN bar [definition]" and instead many statements. Remember it
hasto deal with all these possible column constraints in ADD COLUMN: 

where column_constraint is:

[ CONSTRAINT constraint_name ]
{ NOT NULL |  NULL |  UNIQUE index_parameters | PRIMARY KEY index_parameters | CHECK ( expression ) | REFERENCES
reftable[ ( refcolumn ) ] [ MATCH FULL | MATCH PARTIAL | MATCH SIMPLE ]   [ ON DELETE action ] [ ON UPDATE action ] } 
[ DEFERRABLE | NOT DEFERRABLE ] [ INITIALLY DEFERRED | INITIALLY IMMEDIATE ]

What about something like CHECK? Would you assume it's a complete set of CHECKs and drop the rest? Or just keep
creatingnew CHECKs every time it is run? Dupe checking? 

CINE has none of these problems, either the column didn't exist before so there's nothing to worry about or the command
doesnothing. True, you may have a borked column but not if you follow a simple design pattern of never recasting a
columntype but rather add a new, migrate your data and update your queries. And for the exceptions to that rule, you
canadd a ALTER COLUMN SET DATA TYPE (or any of the other ALTERs) after the CINE in your scipt. If the CINE triggered
allis the latest version, if not the detailed ALTERs will change any column that needs changing. Clean and simple. 

Regards,
Kjell Rune




Re: Add column if not exists (CINE)

From
Tom Lane
Date:
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> Robert Haas wrote:
>> Well, how would you define CREATE OR REPLACE TABLE?

> It the table doesn't exist, create it. If it exists with the same name
> and same columns and constraints and all, do nothing. Otherwise throw an
> error.

> Maybe it should also check that the existing table is empty.

The last bit doesn't seem to make sense.  If you want an empty table,
you can do DROP IF EXISTS and then CREATE.  ISTM that the use-cases
where you don't want to do that are cases where you don't want to lose
existing data.

For either CINE or COR, there are a number of issues that are being
hand-waved away here: is it OK to change ownership and/or permissions?
What about foreign key constraints relating this table to others?
For that matter it's not real clear that indexes, check constraints,
etc should be allowed to survive.  If they are allowed to survive then
CINE TABLE is just the tip of the iceberg: to do anything useful you'd
also need CINE for ADD CONSTRAINT, CREATE INDEX, ADD FOREIGN KEY, etc.
And the more of those you add, the more problematic it gets to allow
existing objects that don't quite match what the command says.

Any of these commands are headache-y for something as complicated
as a table.  I'm not at all impressed by the argument that mysql
does it, because they are *notorious* for being willing to ship
half-baked solutions.
        regards, tom lane


Re: Add column if not exists (CINE)

From
Robert Haas
Date:
On Wed, Apr 28, 2010 at 2:57 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
>> Robert Haas wrote:
>>> Well, how would you define CREATE OR REPLACE TABLE?
>
>> It the table doesn't exist, create it. If it exists with the same name
>> and same columns and constraints and all, do nothing. Otherwise throw an
>> error.
>
>> Maybe it should also check that the existing table is empty.
>
> The last bit doesn't seem to make sense.  If you want an empty table,
> you can do DROP IF EXISTS and then CREATE.  ISTM that the use-cases
> where you don't want to do that are cases where you don't want to lose
> existing data.

Right.

> For either CINE or COR, there are a number of issues that are being
> hand-waved away here: is it OK to change ownership and/or permissions?
> What about foreign key constraints relating this table to others?
> For that matter it's not real clear that indexes, check constraints,
> etc should be allowed to survive.  If they are allowed to survive then
> CINE TABLE is just the tip of the iceberg: to do anything useful you'd
> also need CINE for ADD CONSTRAINT, CREATE INDEX, ADD FOREIGN KEY, etc.
> And the more of those you add, the more problematic it gets to allow
> existing objects that don't quite match what the command says.
>
> Any of these commands are headache-y for something as complicated
> as a table.  I'm not at all impressed by the argument that mysql
> does it, because they are *notorious* for being willing to ship
> half-baked solutions.

We can artificially make this problem as complicated as we wish, but
the people who are asking for this feature (including me) will, I
believe, be quite happy with a solution that throws, say, a NOTICE
instead of an ERROR when the object already exists, and then returns
without doing anything further.  There are very few, if any,
definitional issues here, except by people who are brainstorming crazy
alternative behaviors whose actual usefulness I very much doubt.

CREATE OR REPLACE is indeed much more complicated.  In fact, for
tables, I maintain that you'll need to link with -ldwim to make it
work properly.

...Robert


Re: Add column if not exists (CINE)

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> We can artificially make this problem as complicated as we wish, but
> the people who are asking for this feature (including me) will, I
> believe, be quite happy with a solution that throws, say, a NOTICE
> instead of an ERROR when the object already exists, and then returns
> without doing anything further.  There are very few, if any,
> definitional issues here, except by people who are brainstorming crazy
> alternative behaviors whose actual usefulness I very much doubt.

> CREATE OR REPLACE is indeed much more complicated.  In fact, for
> tables, I maintain that you'll need to link with -ldwim to make it
> work properly.

This may in fact be an appropriate way to handle the case for tables,
given the complexity of their definitions.  However, the original
point of the thread was about what to do for columns.  I still say
that COR rather than CINE semantics would be appropriate for columns.
        regards, tom lane


Re: Add column if not exists (CINE)

From
Chris Browne
Date:
robertmhaas@gmail.com (Robert Haas) writes:
> On Wed, Apr 28, 2010 at 1:40 PM, Dimitri Fontaine
> <dfontaine@hi-media.com> wrote:
>> Robert Haas <robertmhaas@gmail.com> writes:
>>> Well, how would you define CREATE OR REPLACE TABLE?  I think that
>>> doesn't make much sense, which is why I think CREATE IF NOT EXISTS is
>>> a reasonable approach.
>>
>> <hand waving time>
>>
>> The behavior I'd like to have would be to allow me to give a SELECT
>> query to run for replacing what is there if there's something. If the
>> query can not be run on the existing data set, error out of course.
>>
>> So you know the state for sure after the command, but it depends on your
>> query being correct. And you can (de)normalize existing data using joins.
>>
>> The REPLACE keyword would here mean that there's a CTAS going under the
>> hood, then we add the constraints and indexes and triggers etc. That
>> would mean being able to express those entities changes too, but it
>> seems important.
>>
>> Well, that may be not precise enough as a spec, but at least that's food
>> for though I hope.
>
> This type of hand-waving convinces me more than ever that we should
> just implement CINE, and it should just C if it doesn't already E.
> This is what has been requested multiple times, by multiple people,
> including various people who don't normally poke their head into
> -hackers.  I think the resistance to a straightforward implementation
> with easy-to-understand behavior is completely unjustifiable.  It's
> completely unobvious to me that all of the above will work at all and,
> if it did, whether it would actually solve the problems that I care
> about, like being able to write schema-upgrade scripts that would work
> in a simple and predictable fashion.

I tend to agree with you here.

While yes, "CINE is a simplification of COR (CREATE OR REPLACE)", I'm
not at all sure that it's reasonable to hope for the latter, in that it
elides potentially grave problems that aren't reasonable to expect
solved.

Notably, the "and what if a substantial data transformation is needed to
accomplish this?"

CINE doesn't propose to try to do that transformation, which seems like
the right choice to me.

When I put my "we've got things replicating using Slony-I" hat on, CINE
looks pretty preferable to me.  It's unambitious - but it is certainly
NOT doing a bunch of magic behind your back so as to make it tougher to
predict what might happen in a trigger-replicated environment.

In any case, CINE seems pretty useful to me.  I'm prepared to listen to
persuasion, but thus far, it looks like a "+1 from me."

An alternative that seems likable is "COR, raising an exception if
there's a type mismatch." Where there's certainly room to debate how
much of a difference represents a "mismatch."
-- 
"cbbrowne","@","gmail.com"
http://linuxfinances.info/info/wp.html
Predestination was doomed from the start. 


Re: Add column if not exists (CINE)

From
Bruce Momjian
Date:
Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > We can artificially make this problem as complicated as we wish, but
> > the people who are asking for this feature (including me) will, I
> > believe, be quite happy with a solution that throws, say, a NOTICE
> > instead of an ERROR when the object already exists, and then returns
> > without doing anything further.  There are very few, if any,
> > definitional issues here, except by people who are brainstorming crazy
> > alternative behaviors whose actual usefulness I very much doubt.
> 
> > CREATE OR REPLACE is indeed much more complicated.  In fact, for
> > tables, I maintain that you'll need to link with -ldwim to make it
> > work properly.
> 
> This may in fact be an appropriate way to handle the case for tables,
> given the complexity of their definitions.  However, the original
> point of the thread was about what to do for columns.  I still say
> that COR rather than CINE semantics would be appropriate for columns.

I have added this TODO item:
Allow CREATE TABLE to optionally create a table if it does not alreadyexist, without throwing an error    The fact that
tablescontain data makes this more complex than otherCREATE OR REPLACE operations.        *
http://archives.postgresql.org/pgsql-hackers/2010-04/msg01300.php
 

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com


Re: Add column if not exists (CINE)

From
Robert Haas
Date:
On Sat, May 1, 2010 at 2:26 PM, Kjell Rune Skaaraas <kjella79@yahoo.no> wrote:
> In other words, pretty much all the hard bits I seem to hear people agree
> on exist still apply to the single column. COR for columns was suggested
> already back in the same thread in 2005:
>
> http://archives.postgresql.org/pgsql-hackers/2005-10/msg00644.php
>
> We're already talking 2010, what I fear is that COR is that shiny goal
> far, far out there but if I come back in 2015 there'll still be neither
> COR or CINE. I really can't understand how CINE can be equally hard to
> implement as COR, since CINE is simply to NOT throw an error.

It's not hard to implement at all; this whole discussion - and all the
ones that have preceded it - are based on the theory that people who
are asking for CINE are too stupid to see that CINE isn't really going
to fix their problem.  That comes in several flavors, including:

- 90% of the things you would want to do in real life are way more
complicated anyway, so CINE by itself is no use,
- COR is really what you want but since you're too lazy to implement
that you want CINE instead,
- MySQL has it so it must be a bad idea (even though that was part of
our justification for adopting DIE), and/or
- most people agree that we shouldn't implement CINE.

I am usually very supportive of our community's decision-making
process, but in this case I think we're just being extraordinarily
stubborn, and frankly insulting the intelligence of our end-users,
among whom I number myself.

Q. I ran CREATE IF NOT EXISTS and it didn't magically make my table
match the column list I gave!
A1. Duh.
A2. Did you notice the part where it said:
NOTICE: table "dwiw" already exists, skipping
...and if yes, what did you think the word "skipping" (which also
appears in the similar notice we give for DROP IF EXISTS) meant?

...Robert


Re: Add column if not exists (CINE)

From
Robert Haas
Date:
On Wed, Apr 28, 2010 at 9:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> CREATE OR REPLACE is indeed much more complicated.  In fact, for
>> tables, I maintain that you'll need to link with -ldwim to make it
>> work properly.
>
> This may in fact be an appropriate way to handle the case for tables,
> given the complexity of their definitions.

Patch attached.

...Robert

Attachment

Re: Add column if not exists (CINE)

From
Kjell Rune Skaaraas
Date:
--- Den fre 2010-04-30 skrev Bruce Momjian <bruce@momjian.us>:
> Tom Lane wrote:
> > Robert Haas <robertmhaas@gmail.com>
> writes:
> > > We can artificially make this problem as
> complicated as we wish, but
> > > the people who are asking for this feature
> (including me) will, I
> > > believe, be quite happy with a solution that
> throws, say, a NOTICE
> > > instead of an ERROR when the object already
> exists, and then returns
> > > without doing anything further.  There are
> very few, if any,
> > > definitional issues here, except by people who
> are brainstorming crazy
> > > alternative behaviors whose actual usefulness I
> very much doubt.
> >
> > > CREATE OR REPLACE is indeed much more
> complicated.  In fact, for
> > > tables, I maintain that you'll need to link with
> -ldwim to make it
> > > work properly.
> >
> > This may in fact be an appropriate way to handle the
> case for tables,
> > given the complexity of their definitions. 
> However, the original
> > point of the thread was about what to do for
> columns.  I still say
> > that COR rather than CINE semantics would be
> appropriate for columns.
>
> I have added this TODO item:
>
>     Allow CREATE TABLE to optionally create
> a table if it does not already
>     exist, without throwing an error
>    
>         The fact that tables
> contain data makes this more complex than other
>     CREATE OR REPLACE operations.

If you could write a COR for columns, then a COR for tables would be
90% done by calling COR on each column and drop/keep the rest (COR
WITH/WITHOUT DROP?). You'd have to deal with table constraints but you
also have to handle column constraints to do COR on columns.

In other words, pretty much all the hard bits I seem to hear people agree
on exist still apply to the single column. COR for columns was suggested
already back in the same thread in 2005:

http://archives.postgresql.org/pgsql-hackers/2005-10/msg00644.php

We're already talking 2010, what I fear is that COR is that shiny goal
far, far out there but if I come back in 2015 there'll still be neither
COR or CINE. I really can't understand how CINE can be equally hard to
implement as COR, since CINE is simply to NOT throw an error.

Regards,
Kjell Rune




Re: Add column if not exists (CINE)

From
Bernd Helmle
Date:

--On 1. Mai 2010 23:09:23 -0400 Robert Haas <robertmhaas@gmail.com> wrote:

> On Wed, Apr 28, 2010 at 9:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> CREATE OR REPLACE is indeed much more complicated.  In fact, for
>>> tables, I maintain that you'll need to link with -ldwim to make it
>>> work properly.
>>
>> This may in fact be an appropriate way to handle the case for tables,
>> given the complexity of their definitions.
>
> Patch attached.
>


I had an initial look at Robert's patch. Patch applies cleanly,
documentation and regression tests included, everything works as expected.
When looking at the functionality there's one thing that strikes me a
little:

bernd@localhost:bernd #*= CREATE TABLE IF NOT EXISTS foo(id int);
ERROR:  duplicate key value violates unique constraint
"pg_type_typname_nsp_index"
DETAIL:  Key (typname, typnamespace)=(foo, 2200) already exists.

This is what you get from concurrent CINE commands. The typname thingie
might be confusing by unexperienced users, but i think its hard to do
anything about it ?




--
Thanks
Bernd


Re: Add column if not exists (CINE)

From
Robert Haas
Date:
On Wed, Jul 21, 2010 at 2:53 PM, Bernd Helmle <mailings@oopsware.de> wrote:
>
>
> --On 1. Mai 2010 23:09:23 -0400 Robert Haas <robertmhaas@gmail.com> wrote:
>
>> On Wed, Apr 28, 2010 at 9:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>>
>>>> CREATE OR REPLACE is indeed much more complicated.  In fact, for
>>>> tables, I maintain that you'll need to link with -ldwim to make it
>>>> work properly.
>>>
>>> This may in fact be an appropriate way to handle the case for tables,
>>> given the complexity of their definitions.
>>
>> Patch attached.
>
> I had an initial look at Robert's patch. Patch applies cleanly,
> documentation and regression tests included, everything works as expected.
> When looking at the functionality there's one thing that strikes me a
> little:
>
> bernd@localhost:bernd #*= CREATE TABLE IF NOT EXISTS foo(id int);
> ERROR:  duplicate key value violates unique constraint
> "pg_type_typname_nsp_index"
> DETAIL:  Key (typname, typnamespace)=(foo, 2200) already exists.
>
> This is what you get from concurrent CINE commands. The typname thingie
> might be confusing by unexperienced users, but i think its hard to do
> anything about it ?

I get the same error message from concurrent CREATE TABLE commands
even without CINE...

S1:
rhaas=# begin;
BEGIN
rhaas=# create table foo (id int);
CREATE TABLE

S2:
rhaas=# begin;
BEGIN
rhaas=# create table foo (id int);
<blocks>

S1:
rhaas=# commit;
COMMIT

S2:
ERROR:  duplicate key value violates unique constraint
"pg_type_typname_nsp_index"
DETAIL:  Key (typname, typnamespace)=(foo, 2200) already exists.

I agree it would be nice to fix this.  I'm not sure how hard it is.  I
don't think it's the job of this patch.  :-)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: Add column if not exists (CINE)

From
Simon Riggs
Date:
On Wed, 2010-04-28 at 21:15 -0400, Tom Lane wrote:

> I still say
> that COR rather than CINE semantics would be appropriate for columns.

Viewed from a locking perspective, I would disagree.

COR semantics force a table rewrite, in certain cases. That makes it
hard to predict externally how long the command will run for.

As a DBA, I would want a command that executes without rewrite (if
appropriate) or does nothing.

Predictable behaviour is the most important concern.

That isn't necessarily an argument in favour of CINE, which seems
slightly less clear about what we might expect from that, in my reading
at least.

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Development, 24x7 Support, Training and Services



Re: Add column if not exists (CINE)

From
Kjell Rune Skaaraas
Date:
Hello,

At least from a performance point of view CINE should never cause a table rewrite, it should either execute as a plain
CREATEor as "nothing". I don't mind if the CINE fails if the column already exists but with a different definition, so
maybeit could be worded differently to make it clearer what you get? How about something like: 

"ALTER TABLE foo ADD OR MATCH COLUMN bar INTEGER"
a) doesn't exist => create
b) exists and matches => nothing
c) exists and doesn't match => error

if COR semantics should ever be implmented they could be

"ALTER TABLE foo ADD OR REPLACE COLUMN bar INTEGER"
a) doesn't exist => create
b) exists and matches => nothing
c) exists and doesn't match => replace

However, I don't want it to fail unless there's an explicit conflict, because I tend to modify the columns later:
"ALTER TABLE foo ADD COLUMN bar INTEGER"
"ALTER TABLE foo ALTER COLUMN bar SET DEFAULT 0"
"ALTER TABLE foo ALTER COLUMN bar SET NOT NULL"
"ALTER TABLE foo ADD OR MATCH COLUMN bar INTEGER" <-- succeed or fail?

Personally, I'm only interested to match on TYPE so possibly:
"ALTER TABLE foo ADD OR MATCH TYPE COLUMN bar INTEGER" <-- succeed
"ALTER TABLE foo ADD OR MATCH [ALL] COLUMN bar INTEGER" <-- fail

To be honest, I think this becomes more complicated than a CINE, but as I felt that got a rather lukewarm reception
maybethis sounds better. The syntax leaves it open for COR later, and the matching code should be useful to determine
ifthe COR actually needs to do a REPLACE. Opinions? 

Regards,
Kjell Rune

--- Den tor 2010-07-22 skrev Simon Riggs <simon@2ndQuadrant.com>:

> Fra: Simon Riggs <simon@2ndQuadrant.com>
> Emne: Re: [HACKERS] Add column if not exists (CINE)
> Til: "Tom Lane" <tgl@sss.pgh.pa.us>
> Kopi: "Robert Haas" <robertmhaas@gmail.com>, "Heikki Linnakangas" <heikki.linnakangas@enterprisedb.com>, "Andrew
Dunstan"<andrew@dunslane.net>, "Takahiro Itagaki" <itagaki.takahiro@oss.ntt.co.jp>, "Kjell Rune Skaaraas"
<kjella79@yahoo.no>,pgsql-hackers@postgresql.org 
> Dato: Torsdag 22. juli 2010 02.43
> On Wed, 2010-04-28 at 21:15 -0400,
> Tom Lane wrote:
>
> > I still say
> > that COR rather than CINE semantics would be
> appropriate for columns.
>
> Viewed from a locking perspective, I would disagree.
>
> COR semantics force a table rewrite, in certain cases. That
> makes it
> hard to predict externally how long the command will run
> for.
>
> As a DBA, I would want a command that executes without
> rewrite (if
> appropriate) or does nothing.
>
> Predictable behaviour is the most important concern.
>
> That isn't necessarily an argument in favour of CINE, which
> seems
> slightly less clear about what we might expect from that,
> in my reading
> at least.
>
> --
>  Simon Riggs       
>    www.2ndQuadrant.com
>  PostgreSQL Development, 24x7 Support, Training and
> Services
>
>




Re: Add column if not exists (CINE)

From
Bernd Helmle
Date:

--On 21. Juli 2010 17:16:13 -0400 Robert Haas <robertmhaas@gmail.com> wrote:

> I get the same error message from concurrent CREATE TABLE commands
> even without CINE...
>
> S1:
> rhaas=# begin;
> BEGIN
> rhaas=# create table foo (id int);
> CREATE TABLE
>
> S2:
> rhaas=# begin;
> BEGIN
> rhaas=# create table foo (id int);
> <blocks>
>
> S1:
> rhaas=# commit;
> COMMIT
>
> S2:
> ERROR:  duplicate key value violates unique constraint
> "pg_type_typname_nsp_index"
> DETAIL:  Key (typname, typnamespace)=(foo, 2200) already exists.
>

Funny, never realized that before, but you're right.

> I agree it would be nice to fix this.  I'm not sure how hard it is.  I
> don't think it's the job of this patch.  :-)

Yes, i agree. I would like to mark this patch "Ready for Committer", if 
that's okay for you (since you are a committer you might want to commit it 
yourself). Given that there's still some discussion in progress, i'm not 
sure about it, however. The patch itself looks fine to me and I'm traveling 
this weekend, so i don't want to hold it off as long as necessary.

-- 
Thanks
Bernd


Re: Add column if not exists (CINE)

From
Robert Haas
Date:
On Fri, Jul 23, 2010 at 2:46 AM, Bernd Helmle <mailings@oopsware.de> wrote:
>
>
> --On 21. Juli 2010 17:16:13 -0400 Robert Haas <robertmhaas@gmail.com> wrote:
>
>> I get the same error message from concurrent CREATE TABLE commands
>> even without CINE...
>>
>> S1:
>> rhaas=# begin;
>> BEGIN
>> rhaas=# create table foo (id int);
>> CREATE TABLE
>>
>> S2:
>> rhaas=# begin;
>> BEGIN
>> rhaas=# create table foo (id int);
>> <blocks>
>>
>> S1:
>> rhaas=# commit;
>> COMMIT
>>
>> S2:
>> ERROR:  duplicate key value violates unique constraint
>> "pg_type_typname_nsp_index"
>> DETAIL:  Key (typname, typnamespace)=(foo, 2200) already exists.
>>
>
> Funny, never realized that before, but you're right.
>
>> I agree it would be nice to fix this.  I'm not sure how hard it is.  I
>> don't think it's the job of this patch.  :-)
>
> Yes, i agree. I would like to mark this patch "Ready for Committer", if
> that's okay for you (since you are a committer you might want to commit it
> yourself). Given that there's still some discussion in progress, i'm not
> sure about it, however. The patch itself looks fine to me and I'm traveling
> this weekend, so i don't want to hold it off as long as necessary.

As far as I can see, the other emails were regarding adding columns,
whereas this patch is about creating tables.  So I think it's OK...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: Add column if not exists (CINE)

From
Robert Haas
Date:
On Fri, Jul 23, 2010 at 2:46 AM, Bernd Helmle <mailings@oopsware.de> wrote:
> Yes, i agree. I would like to mark this patch "Ready for Committer", if
> that's okay for you (since you are a committer you might want to commit it
> yourself).

I see that it is so marked, so, committed, with a minor correction to
my original docs.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: Add column if not exists (CINE)

From
Robert Haas
Date:
On Thu, Jul 22, 2010 at 4:34 PM, Kjell Rune Skaaraas <kjella79@yahoo.no> wrote:
> At least from a performance point of view CINE should never cause a table rewrite, it should either execute as a
plainCREATE or as "nothing". I don't mind if the CINE fails if the column already exists but with a different
definition,so maybe it could be worded differently to make it clearer what you get? 

That's what I want, too.  The people saying we should implement COR
for columns seem to be, by and large, people who have never wished for
this feature and have no particular use case for either one.  I have
stated my use case in the past, but it has been dismissed as stupid or
contrived.  I can live with the possibility that I'm dumb, but, for
the record, I'm not making this up.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company