Thread: Add column if not exists (CINE)
Hello, I've been reading the earlier threads at: http://archives.postgresql.org/pgsql-hackers/2009-05/thrd7.php#00252 http://archives.postgresql.org/pgsql-hackers/2005-10/thrd4.php#00632 and I'm not sure I have anything that substantially new to add but: 1. I can't see there's an unambiguity about what the syntax would do. It is IF NOT EXISTS, not IF NOT LIKE. Anyone who shootsthemselves in the foot by calling a CINE and thinking that a preexisting differently defined column is magically converteddeserves it. Either it should act exactly like the non-CINE command, or do nothing at all as if the statement wasn'tthere. 2. The use case is pretty clear to me - flexible scripts that'll bring all earlier database versions to the latest schema.I've been experimenting in 9.0 alpha with calling DROP CONSTRAINT IF EXISTS then ADD CONSTRAINT with named constantsfor a CINE effect. which as a side effect will correct any updated constraints too - and it works great. UnfortunatelyDROP COLUMN IF EXISTS then ADD COLUMN has the side effect of deleting all the data, so that's hardly usable. I saw some indications that this might be a minority opinion, well I would like to cast a vote FOR this functionality. Theworkarounds are ugly, the solution simple and while I agree it's possible to misuse it, my opinion is that you shouldn'tbecome a surgeon if you can't handle a scalpel. In this case I get the feeling I'm reading instructions on how todo surgery with a butter knife because we don't dare hand out anything sharper. Regards, Kjell Rune Skaaraas
On Tue, Apr 27, 2010 at 6:45 PM, Kjell Rune Skaaraas <kjella79@yahoo.no> wrote: > Hello, > > I've been reading the earlier threads at: > http://archives.postgresql.org/pgsql-hackers/2009-05/thrd7.php#00252 > http://archives.postgresql.org/pgsql-hackers/2005-10/thrd4.php#00632 > and I'm not sure I have anything that substantially new to add but: > > 1. I can't see there's an unambiguity about what the syntax would do. It is IF NOT EXISTS, not IF NOT LIKE. Anyone whoshoots themselves in the foot by calling a CINE and thinking that a preexisting differently defined column is magicallyconverted deserves it. Either it should act exactly like the non-CINE command, or do nothing at all as if the statementwasn't there. > > 2. The use case is pretty clear to me - flexible scripts that'll bring all earlier database versions to the latest schema.I've been experimenting in 9.0 alpha with calling DROP CONSTRAINT IF EXISTS then ADD CONSTRAINT with named constantsfor a CINE effect. which as a side effect will correct any updated constraints too - and it works great. UnfortunatelyDROP COLUMN IF EXISTS then ADD COLUMN has the side effect of deleting all the data, so that's hardly usable. > > I saw some indications that this might be a minority opinion, well I would like to cast a vote FOR this functionality.The workarounds are ugly, the solution simple and while I agree it's possible to misuse it, my opinion is thatyou shouldn't become a surgeon if you can't handle a scalpel. In this case I get the feeling I'm reading instructionson how to do surgery with a butter knife because we don't dare hand out anything sharper. I've already said my piece on this, but I couldn't agree more. Well said, and your use case is exactly the one I want it for. ...Robert
Kjell Rune Skaaraas <kjella79@yahoo.no> wrote: > I've been reading the earlier threads at: > http://archives.postgresql.org/pgsql-hackers/2009-05/thrd7.php#00252 > http://archives.postgresql.org/pgsql-hackers/2005-10/thrd4.php#00632 > and I'm not sure I have anything that substantially new to add but: > > I saw some indications that this might be a minority opinion, > well I would like to cast a vote FOR this functionality. +1 for CINE, just because MySQL supports it. But before developing, we need to decide how to handle an added object that has the same name but has different definitions. Also, developers should consider not only ADD COLUMN but also other CREATE or ADD commands. The patch will be large, including documentation adjustments in many places -- it would be hard work. Regards, --- Takahiro Itagaki NTT Open Source Software Center
Takahiro Itagaki wrote: > Kjell Rune Skaaraas <kjella79@yahoo.no> wrote: > > >> I've been reading the earlier threads at: >> http://archives.postgresql.org/pgsql-hackers/2009-05/thrd7.php#00252 >> http://archives.postgresql.org/pgsql-hackers/2005-10/thrd4.php#00632 >> and I'm not sure I have anything that substantially new to add but: >> >> I saw some indications that this might be a minority opinion, >> well I would like to cast a vote FOR this functionality. >> > > +1 for CINE, just because MySQL supports it. > MySQL compatibility has never been our aim. We should adopt ideas from other projects because they are good, not just because they are there. That doesn't mean I don't think this is a good idea. > But before developing, we need to decide how to handle an added object > that has the same name but has different definitions. > The OP explicitly stated that in his opinion nothing should be done in such cases. That's a defensible position, in the case of objects such as tables that must be unique by name (e.g. tables). But what would we do about objects where the name could be overloaded? Since we would presumably want to do this for all (or almost all) of our CREATE/ADD commands, we'd need a policy on those. > Also, developers should consider not only ADD COLUMN but also other > CREATE or ADD commands. The patch will be large, including documentation > adjustments in many places -- it would be hard work. > > > I can speak with some experience on this at least. :-) I don't see that it would be a heck of a lot bigger than the DROP IF EXISTS cases, which after the first few had been done were not hard, merely tedious to do :-) cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > Takahiro Itagaki wrote: >> But before developing, we need to decide how to handle an added object >> that has the same name but has different definitions. > The OP explicitly stated that in his opinion nothing should be done in > such cases. That's a defensible position, in the case of objects such as > tables that must be unique by name (e.g. tables). But what would we do > about objects where the name could be overloaded? Even if it's defensible, the consensus position so far has been that it's a bad design. Every time we've looked at this, we have concluded that CREATE OR REPLACE semantics are considerably safer to use, because there is no question what the state of the object is afterwards. That argument is just as valid for a column as for anything larger. AFAICS, the only excuse CINE has for living is that (people think) it would take less work to implement. regards, tom lane
On Tue, Apr 27, 2010 at 08:18:13PM -0400, Robert Haas wrote: > On Tue, Apr 27, 2010 at 6:45 PM, Kjell Rune Skaaraas <kjella79@yahoo.no> wrote: [snip] > > I saw some indications that this might be a minority opinion, well I would like to cast a vote FOR this functionality.The workarounds are ugly, the solution simple and while I agree it's possible to misuse it, my opinion is thatyou shouldn't become a surgeon if you can't handle a scalpel. In this case I get the feeling I'm reading instructionson how to do surgery with a butter knife because we don't dare hand out anything sharper. > > I've already said my piece on this, but I couldn't agree more. Well > said, and your use case is exactly the one I want it for. > +1 (Scribbles down the phrase "instructions on how to do surgery with a butter knife because we don't dare hand out anything sharper" for future repurposing) Ross -- Ross Reedstrom, Ph.D. reedstrm@rice.edu Systems Engineer & Admin, Research Scientist phone: 713-348-6166 The Connexions Project http://cnx.org fax: 713-348-3665 Rice University MS-375, Houston, TX 77005 GPG Key fingerprint = F023 82C8 9B0E 2CC6 0D8E F888 D3AE 810E 88F0 BEDE
On Wed, Apr 28, 2010 at 9:58 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Andrew Dunstan <andrew@dunslane.net> writes: >> Takahiro Itagaki wrote: >>> But before developing, we need to decide how to handle an added object >>> that has the same name but has different definitions. > >> The OP explicitly stated that in his opinion nothing should be done in >> such cases. That's a defensible position, in the case of objects such as >> tables that must be unique by name (e.g. tables). But what would we do >> about objects where the name could be overloaded? > > Even if it's defensible, the consensus position so far has been that > it's a bad design. Every time we've looked at this, we have concluded > that CREATE OR REPLACE semantics are considerably safer to use, because > there is no question what the state of the object is afterwards. That > argument is just as valid for a column as for anything larger. > > AFAICS, the only excuse CINE has for living is that (people think) > it would take less work to implement. I don't believe you are fairly stating the consensus from previous discussion and I believe that you are actually in the minority on this one. I agree that we probably don't need to support this for object types for which CREATE OR REPLACE is available or can be made available, but that isn't feasible for all object types - tables and columns being the obvious examples. ...Robert
Robert Haas <robertmhaas@gmail.com> writes: > I don't believe you are fairly stating the consensus from previous > discussion and I believe that you are actually in the minority on this > one. I agree that we probably don't need to support this for object > types for which CREATE OR REPLACE is available or can be made > available, but that isn't feasible for all object types - tables and > columns being the obvious examples. What's obvious about it? In particular, I should think that ADD OR REPLACE COLUMN would usefully be defined as "ADD if no such column, else ALTER COLUMN as necessary to match this spec". Dropping the ALTER part of that has no benefit except to lazy implementors; it certainly is not more useful to users if they can't be sure of the column properties after issuing the command. regards, tom lane
Tom Lane wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> I don't believe you are fairly stating the consensus from previous >> discussion and I believe that you are actually in the minority on this >> one. I agree that we probably don't need to support this for object >> types for which CREATE OR REPLACE is available or can be made >> available, but that isn't feasible for all object types - tables and >> columns being the obvious examples. > > What's obvious about it? In particular, I should think that ADD OR > REPLACE COLUMN would usefully be defined as "ADD if no such column, > else ALTER COLUMN as necessary to match this spec". Dropping the > ALTER part of that has no benefit except to lazy implementors; it > certainly is not more useful to users if they can't be sure of the > column properties after issuing the command. Right. A trivial implementation of CREATE OR REPLACE is to create the object if it doesn't exist, do nothing if it exists already and is identical to the new definition, and throw an error if it's not identical. That covers the same use case as CREATE IF NOT EXISTS, but you know what the state is after a successful execution, is easy to implement, and is in line with the existing CREATE OR REPLACE commands. And can be extended in the future to alter the existing object instead of throwing an error. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
On Wed, Apr 28, 2010 at 11:20 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> I don't believe you are fairly stating the consensus from previous >> discussion and I believe that you are actually in the minority on this >> one. I agree that we probably don't need to support this for object >> types for which CREATE OR REPLACE is available or can be made >> available, but that isn't feasible for all object types - tables and >> columns being the obvious examples. > > What's obvious about it? In particular, I should think that ADD OR > REPLACE COLUMN would usefully be defined as "ADD if no such column, > else ALTER COLUMN as necessary to match this spec". Dropping the > ALTER part of that has no benefit except to lazy implementors; it > certainly is not more useful to users if they can't be sure of the > column properties after issuing the command. Actually, that's a good idea. But how will you handle tables? ...Robert
Robert Haas wrote: > On Wed, Apr 28, 2010 at 11:20 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Robert Haas <robertmhaas@gmail.com> writes: >>> I don't believe you are fairly stating the consensus from previous >>> discussion and I believe that you are actually in the minority on this >>> one. I agree that we probably don't need to support this for object >>> types for which CREATE OR REPLACE is available or can be made >>> available, but that isn't feasible for all object types - tables and >>> columns being the obvious examples. >> What's obvious about it? In particular, I should think that ADD OR >> REPLACE COLUMN would usefully be defined as "ADD if no such column, >> else ALTER COLUMN as necessary to match this spec". Dropping the >> ALTER part of that has no benefit except to lazy implementors; it >> certainly is not more useful to users if they can't be sure of the >> column properties after issuing the command. > > Actually, that's a good idea. But how will you handle tables? What do you mean? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Robert Haas wrote: > On Wed, Apr 28, 2010 at 11:20 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> Robert Haas <robertmhaas@gmail.com> writes: >> >>> I don't believe you are fairly stating the consensus from previous >>> discussion and I believe that you are actually in the minority on this >>> one. I agree that we probably don't need to support this for object >>> types for which CREATE OR REPLACE is available or can be made >>> available, but that isn't feasible for all object types - tables and >>> columns being the obvious examples. >>> >> What's obvious about it? In particular, I should think that ADD OR >> REPLACE COLUMN would usefully be defined as "ADD if no such column, >> else ALTER COLUMN as necessary to match this spec". Dropping the >> ALTER part of that has no benefit except to lazy implementors; it >> certainly is not more useful to users if they can't be sure of the >> column properties after issuing the command. >> > > Actually, that's a good idea. But how will you handle tables? > > > I think I Iike Heikki's suggestion better, to error out if the object exists but the properties differ. At least I'd like an option for that. cheers andrew
Robert Haas <robertmhaas@gmail.com> writes: > Actually, that's a good idea. But how will you handle tables? Well, tables are a special case, mainly because it's not clear how to avoid accidentally throwing away data. (In particular if some column in the existing table isn't there in the new definition. It's a bit scary to just drop the column, IMO.) I don't see that that argument applies to doing an automatic ALTER COLUMN, though, especially since the only column type alterations that will go through without a USING clause are reasonably straightforward. regards, tom lane
On Wed, Apr 28, 2010 at 12:07 PM, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote: > Robert Haas wrote: >> On Wed, Apr 28, 2010 at 11:20 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Robert Haas <robertmhaas@gmail.com> writes: >>>> I don't believe you are fairly stating the consensus from previous >>>> discussion and I believe that you are actually in the minority on this >>>> one. I agree that we probably don't need to support this for object >>>> types for which CREATE OR REPLACE is available or can be made >>>> available, but that isn't feasible for all object types - tables and >>>> columns being the obvious examples. >>> What's obvious about it? In particular, I should think that ADD OR >>> REPLACE COLUMN would usefully be defined as "ADD if no such column, >>> else ALTER COLUMN as necessary to match this spec". Dropping the >>> ALTER part of that has no benefit except to lazy implementors; it >>> certainly is not more useful to users if they can't be sure of the >>> column properties after issuing the command. >> >> Actually, that's a good idea. But how will you handle tables? > > What do you mean? Well, how would you define CREATE OR REPLACE TABLE? I think that doesn't make much sense, which is why I think CREATE IF NOT EXISTS is a reasonable approach. ...Robert
On Wed, Apr 28, 2010 at 1:40 PM, Dimitri Fontaine <dfontaine@hi-media.com> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> Well, how would you define CREATE OR REPLACE TABLE? I think that >> doesn't make much sense, which is why I think CREATE IF NOT EXISTS is >> a reasonable approach. > > <hand waving time> > > The behavior I'd like to have would be to allow me to give a SELECT > query to run for replacing what is there if there's something. If the > query can not be run on the existing data set, error out of course. > > So you know the state for sure after the command, but it depends on your > query being correct. And you can (de)normalize existing data using joins. > > The REPLACE keyword would here mean that there's a CTAS going under the > hood, then we add the constraints and indexes and triggers etc. That > would mean being able to express those entities changes too, but it > seems important. > > Well, that may be not precise enough as a spec, but at least that's food > for though I hope. This type of hand-waving convinces me more than ever that we should just implement CINE, and it should just C if it doesn't already E. This is what has been requested multiple times, by multiple people, including various people who don't normally poke their head into -hackers. I think the resistance to a straightforward implementation with easy-to-understand behavior is completely unjustifiable. It's completely unobvious to me that all of the above will work at all and, if it did, whether it would actually solve the problems that I care about, like being able to write schema-upgrade scripts that would work in a simple and predictable fashion. ...Robert
Robert Haas <robertmhaas@gmail.com> writes: > Well, how would you define CREATE OR REPLACE TABLE? I think that > doesn't make much sense, which is why I think CREATE IF NOT EXISTS is > a reasonable approach. <hand waving time> The behavior I'd like to have would be to allow me to give a SELECT query to run for replacing what is there if there's something. If the query can not be run on the existing data set, error out of course. So you know the state for sure after the command, but it depends on your query being correct. And you can (de)normalize existing data using joins. The REPLACE keyword would here mean that there's a CTAS going under the hood, then we add the constraints and indexes and triggers etc. That would mean being able to express those entities changes too, but it seems important. Well, that may be not precise enough as a spec, but at least that's food for though I hope. Regards, -- dim
Robert Haas wrote: > On Wed, Apr 28, 2010 at 12:07 PM, Heikki Linnakangas > <heikki.linnakangas@enterprisedb.com> wrote: >> Robert Haas wrote: >>> On Wed, Apr 28, 2010 at 11:20 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>>> Robert Haas <robertmhaas@gmail.com> writes: >>>>> I don't believe you are fairly stating the consensus from previous >>>>> discussion and I believe that you are actually in the minority on this >>>>> one. I agree that we probably don't need to support this for object >>>>> types for which CREATE OR REPLACE is available or can be made >>>>> available, but that isn't feasible for all object types - tables and >>>>> columns being the obvious examples. >>>> What's obvious about it? In particular, I should think that ADD OR >>>> REPLACE COLUMN would usefully be defined as "ADD if no such column, >>>> else ALTER COLUMN as necessary to match this spec". Dropping the >>>> ALTER part of that has no benefit except to lazy implementors; it >>>> certainly is not more useful to users if they can't be sure of the >>>> column properties after issuing the command. >>> Actually, that's a good idea. But how will you handle tables? >> What do you mean? > > Well, how would you define CREATE OR REPLACE TABLE? It the table doesn't exist, create it. If it exists with the same name and same columns and constraints and all, do nothing. Otherwise throw an error. Maybe it should also check that the existing table is empty. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
--- Den ons 2010-04-28 skrev Tom Lane <tgl@sss.pgh.pa.us>: > Fra: Tom Lane <tgl@sss.pgh.pa.us> > Emne: Re: [HACKERS] Add column if not exists (CINE) > Til: "Robert Haas" <robertmhaas@gmail.com> > Kopi: "Andrew Dunstan" <andrew@dunslane.net>, "Takahiro Itagaki" <itagaki.takahiro@oss.ntt.co.jp>, "Kjell Rune Skaaraas"<kjella79@yahoo.no>, pgsql-hackers@postgresql.org > Dato: Onsdag 28. april 2010 17.20 > Robert Haas <robertmhaas@gmail.com> > writes: > > I don't believe you are fairly stating the consensus > from previous > > discussion and I believe that you are actually in the > minority on this > > one. I agree that we probably don't need to > support this for object > > types for which CREATE OR REPLACE is available or can > be made > > available, but that isn't feasible for all object > types - tables and > > columns being the obvious examples. > > What's obvious about it? In particular, I should > think that ADD OR > REPLACE COLUMN would usefully be defined as "ADD if no such > column, > else ALTER COLUMN as necessary to match this spec". > Dropping the > ALTER part of that has no benefit except to lazy > implementors; it > certainly is not more useful to users if they can't be sure > of the > column properties after issuing the command. To me this construct seems horribly ambigious. Imagine I did a ALTER TABLE foo ADD COLUMN bar INTEGER NOT NULL UNIQUE DEFAULT10, then an ALTER TABLE foo ADD OR REPLACE COLUMN bar BIGINT. Would I get a BIGINT NOT NULL UNIQUE DEFAULT 10 or aplain BIGINT? Either way I think one group will be disappointed because it either trashes all your other setup *or* forcesyou to call DROP NOT NULL, DROP DEFAULT etc. when you don't want it. There's a reason why there's no ALTER TABLE foo SET COLUMN bar [definition]" and instead many statements. Remember it hasto deal with all these possible column constraints in ADD COLUMN: where column_constraint is: [ CONSTRAINT constraint_name ] { NOT NULL | NULL | UNIQUE index_parameters | PRIMARY KEY index_parameters | CHECK ( expression ) | REFERENCES reftable[ ( refcolumn ) ] [ MATCH FULL | MATCH PARTIAL | MATCH SIMPLE ] [ ON DELETE action ] [ ON UPDATE action ] } [ DEFERRABLE | NOT DEFERRABLE ] [ INITIALLY DEFERRED | INITIALLY IMMEDIATE ] What about something like CHECK? Would you assume it's a complete set of CHECKs and drop the rest? Or just keep creatingnew CHECKs every time it is run? Dupe checking? CINE has none of these problems, either the column didn't exist before so there's nothing to worry about or the command doesnothing. True, you may have a borked column but not if you follow a simple design pattern of never recasting a columntype but rather add a new, migrate your data and update your queries. And for the exceptions to that rule, you canadd a ALTER COLUMN SET DATA TYPE (or any of the other ALTERs) after the CINE in your scipt. If the CINE triggered allis the latest version, if not the detailed ALTERs will change any column that needs changing. Clean and simple. Regards, Kjell Rune
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes: > Robert Haas wrote: >> Well, how would you define CREATE OR REPLACE TABLE? > It the table doesn't exist, create it. If it exists with the same name > and same columns and constraints and all, do nothing. Otherwise throw an > error. > Maybe it should also check that the existing table is empty. The last bit doesn't seem to make sense. If you want an empty table, you can do DROP IF EXISTS and then CREATE. ISTM that the use-cases where you don't want to do that are cases where you don't want to lose existing data. For either CINE or COR, there are a number of issues that are being hand-waved away here: is it OK to change ownership and/or permissions? What about foreign key constraints relating this table to others? For that matter it's not real clear that indexes, check constraints, etc should be allowed to survive. If they are allowed to survive then CINE TABLE is just the tip of the iceberg: to do anything useful you'd also need CINE for ADD CONSTRAINT, CREATE INDEX, ADD FOREIGN KEY, etc. And the more of those you add, the more problematic it gets to allow existing objects that don't quite match what the command says. Any of these commands are headache-y for something as complicated as a table. I'm not at all impressed by the argument that mysql does it, because they are *notorious* for being willing to ship half-baked solutions. regards, tom lane
On Wed, Apr 28, 2010 at 2:57 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes: >> Robert Haas wrote: >>> Well, how would you define CREATE OR REPLACE TABLE? > >> It the table doesn't exist, create it. If it exists with the same name >> and same columns and constraints and all, do nothing. Otherwise throw an >> error. > >> Maybe it should also check that the existing table is empty. > > The last bit doesn't seem to make sense. If you want an empty table, > you can do DROP IF EXISTS and then CREATE. ISTM that the use-cases > where you don't want to do that are cases where you don't want to lose > existing data. Right. > For either CINE or COR, there are a number of issues that are being > hand-waved away here: is it OK to change ownership and/or permissions? > What about foreign key constraints relating this table to others? > For that matter it's not real clear that indexes, check constraints, > etc should be allowed to survive. If they are allowed to survive then > CINE TABLE is just the tip of the iceberg: to do anything useful you'd > also need CINE for ADD CONSTRAINT, CREATE INDEX, ADD FOREIGN KEY, etc. > And the more of those you add, the more problematic it gets to allow > existing objects that don't quite match what the command says. > > Any of these commands are headache-y for something as complicated > as a table. I'm not at all impressed by the argument that mysql > does it, because they are *notorious* for being willing to ship > half-baked solutions. We can artificially make this problem as complicated as we wish, but the people who are asking for this feature (including me) will, I believe, be quite happy with a solution that throws, say, a NOTICE instead of an ERROR when the object already exists, and then returns without doing anything further. There are very few, if any, definitional issues here, except by people who are brainstorming crazy alternative behaviors whose actual usefulness I very much doubt. CREATE OR REPLACE is indeed much more complicated. In fact, for tables, I maintain that you'll need to link with -ldwim to make it work properly. ...Robert
Robert Haas <robertmhaas@gmail.com> writes: > We can artificially make this problem as complicated as we wish, but > the people who are asking for this feature (including me) will, I > believe, be quite happy with a solution that throws, say, a NOTICE > instead of an ERROR when the object already exists, and then returns > without doing anything further. There are very few, if any, > definitional issues here, except by people who are brainstorming crazy > alternative behaviors whose actual usefulness I very much doubt. > CREATE OR REPLACE is indeed much more complicated. In fact, for > tables, I maintain that you'll need to link with -ldwim to make it > work properly. This may in fact be an appropriate way to handle the case for tables, given the complexity of their definitions. However, the original point of the thread was about what to do for columns. I still say that COR rather than CINE semantics would be appropriate for columns. regards, tom lane
robertmhaas@gmail.com (Robert Haas) writes: > On Wed, Apr 28, 2010 at 1:40 PM, Dimitri Fontaine > <dfontaine@hi-media.com> wrote: >> Robert Haas <robertmhaas@gmail.com> writes: >>> Well, how would you define CREATE OR REPLACE TABLE? I think that >>> doesn't make much sense, which is why I think CREATE IF NOT EXISTS is >>> a reasonable approach. >> >> <hand waving time> >> >> The behavior I'd like to have would be to allow me to give a SELECT >> query to run for replacing what is there if there's something. If the >> query can not be run on the existing data set, error out of course. >> >> So you know the state for sure after the command, but it depends on your >> query being correct. And you can (de)normalize existing data using joins. >> >> The REPLACE keyword would here mean that there's a CTAS going under the >> hood, then we add the constraints and indexes and triggers etc. That >> would mean being able to express those entities changes too, but it >> seems important. >> >> Well, that may be not precise enough as a spec, but at least that's food >> for though I hope. > > This type of hand-waving convinces me more than ever that we should > just implement CINE, and it should just C if it doesn't already E. > This is what has been requested multiple times, by multiple people, > including various people who don't normally poke their head into > -hackers. I think the resistance to a straightforward implementation > with easy-to-understand behavior is completely unjustifiable. It's > completely unobvious to me that all of the above will work at all and, > if it did, whether it would actually solve the problems that I care > about, like being able to write schema-upgrade scripts that would work > in a simple and predictable fashion. I tend to agree with you here. While yes, "CINE is a simplification of COR (CREATE OR REPLACE)", I'm not at all sure that it's reasonable to hope for the latter, in that it elides potentially grave problems that aren't reasonable to expect solved. Notably, the "and what if a substantial data transformation is needed to accomplish this?" CINE doesn't propose to try to do that transformation, which seems like the right choice to me. When I put my "we've got things replicating using Slony-I" hat on, CINE looks pretty preferable to me. It's unambitious - but it is certainly NOT doing a bunch of magic behind your back so as to make it tougher to predict what might happen in a trigger-replicated environment. In any case, CINE seems pretty useful to me. I'm prepared to listen to persuasion, but thus far, it looks like a "+1 from me." An alternative that seems likable is "COR, raising an exception if there's a type mismatch." Where there's certainly room to debate how much of a difference represents a "mismatch." -- "cbbrowne","@","gmail.com" http://linuxfinances.info/info/wp.html Predestination was doomed from the start.
Tom Lane wrote: > Robert Haas <robertmhaas@gmail.com> writes: > > We can artificially make this problem as complicated as we wish, but > > the people who are asking for this feature (including me) will, I > > believe, be quite happy with a solution that throws, say, a NOTICE > > instead of an ERROR when the object already exists, and then returns > > without doing anything further. There are very few, if any, > > definitional issues here, except by people who are brainstorming crazy > > alternative behaviors whose actual usefulness I very much doubt. > > > CREATE OR REPLACE is indeed much more complicated. In fact, for > > tables, I maintain that you'll need to link with -ldwim to make it > > work properly. > > This may in fact be an appropriate way to handle the case for tables, > given the complexity of their definitions. However, the original > point of the thread was about what to do for columns. I still say > that COR rather than CINE semantics would be appropriate for columns. I have added this TODO item: Allow CREATE TABLE to optionally create a table if it does not alreadyexist, without throwing an error The fact that tablescontain data makes this more complex than otherCREATE OR REPLACE operations. * http://archives.postgresql.org/pgsql-hackers/2010-04/msg01300.php -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com
On Sat, May 1, 2010 at 2:26 PM, Kjell Rune Skaaraas <kjella79@yahoo.no> wrote: > In other words, pretty much all the hard bits I seem to hear people agree > on exist still apply to the single column. COR for columns was suggested > already back in the same thread in 2005: > > http://archives.postgresql.org/pgsql-hackers/2005-10/msg00644.php > > We're already talking 2010, what I fear is that COR is that shiny goal > far, far out there but if I come back in 2015 there'll still be neither > COR or CINE. I really can't understand how CINE can be equally hard to > implement as COR, since CINE is simply to NOT throw an error. It's not hard to implement at all; this whole discussion - and all the ones that have preceded it - are based on the theory that people who are asking for CINE are too stupid to see that CINE isn't really going to fix their problem. That comes in several flavors, including: - 90% of the things you would want to do in real life are way more complicated anyway, so CINE by itself is no use, - COR is really what you want but since you're too lazy to implement that you want CINE instead, - MySQL has it so it must be a bad idea (even though that was part of our justification for adopting DIE), and/or - most people agree that we shouldn't implement CINE. I am usually very supportive of our community's decision-making process, but in this case I think we're just being extraordinarily stubborn, and frankly insulting the intelligence of our end-users, among whom I number myself. Q. I ran CREATE IF NOT EXISTS and it didn't magically make my table match the column list I gave! A1. Duh. A2. Did you notice the part where it said: NOTICE: table "dwiw" already exists, skipping ...and if yes, what did you think the word "skipping" (which also appears in the similar notice we give for DROP IF EXISTS) meant? ...Robert
On Wed, Apr 28, 2010 at 9:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> CREATE OR REPLACE is indeed much more complicated. In fact, for >> tables, I maintain that you'll need to link with -ldwim to make it >> work properly. > > This may in fact be an appropriate way to handle the case for tables, > given the complexity of their definitions. Patch attached. ...Robert
Attachment
--- Den fre 2010-04-30 skrev Bruce Momjian <bruce@momjian.us>: > Tom Lane wrote: > > Robert Haas <robertmhaas@gmail.com> > writes: > > > We can artificially make this problem as > complicated as we wish, but > > > the people who are asking for this feature > (including me) will, I > > > believe, be quite happy with a solution that > throws, say, a NOTICE > > > instead of an ERROR when the object already > exists, and then returns > > > without doing anything further. There are > very few, if any, > > > definitional issues here, except by people who > are brainstorming crazy > > > alternative behaviors whose actual usefulness I > very much doubt. > > > > > CREATE OR REPLACE is indeed much more > complicated. In fact, for > > > tables, I maintain that you'll need to link with > -ldwim to make it > > > work properly. > > > > This may in fact be an appropriate way to handle the > case for tables, > > given the complexity of their definitions. > However, the original > > point of the thread was about what to do for > columns. I still say > > that COR rather than CINE semantics would be > appropriate for columns. > > I have added this TODO item: > > Allow CREATE TABLE to optionally create > a table if it does not already > exist, without throwing an error > > The fact that tables > contain data makes this more complex than other > CREATE OR REPLACE operations. If you could write a COR for columns, then a COR for tables would be 90% done by calling COR on each column and drop/keep the rest (COR WITH/WITHOUT DROP?). You'd have to deal with table constraints but you also have to handle column constraints to do COR on columns. In other words, pretty much all the hard bits I seem to hear people agree on exist still apply to the single column. COR for columns was suggested already back in the same thread in 2005: http://archives.postgresql.org/pgsql-hackers/2005-10/msg00644.php We're already talking 2010, what I fear is that COR is that shiny goal far, far out there but if I come back in 2015 there'll still be neither COR or CINE. I really can't understand how CINE can be equally hard to implement as COR, since CINE is simply to NOT throw an error. Regards, Kjell Rune
--On 1. Mai 2010 23:09:23 -0400 Robert Haas <robertmhaas@gmail.com> wrote: > On Wed, Apr 28, 2010 at 9:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> CREATE OR REPLACE is indeed much more complicated. In fact, for >>> tables, I maintain that you'll need to link with -ldwim to make it >>> work properly. >> >> This may in fact be an appropriate way to handle the case for tables, >> given the complexity of their definitions. > > Patch attached. > I had an initial look at Robert's patch. Patch applies cleanly, documentation and regression tests included, everything works as expected. When looking at the functionality there's one thing that strikes me a little: bernd@localhost:bernd #*= CREATE TABLE IF NOT EXISTS foo(id int); ERROR: duplicate key value violates unique constraint "pg_type_typname_nsp_index" DETAIL: Key (typname, typnamespace)=(foo, 2200) already exists. This is what you get from concurrent CINE commands. The typname thingie might be confusing by unexperienced users, but i think its hard to do anything about it ? -- Thanks Bernd
On Wed, Jul 21, 2010 at 2:53 PM, Bernd Helmle <mailings@oopsware.de> wrote: > > > --On 1. Mai 2010 23:09:23 -0400 Robert Haas <robertmhaas@gmail.com> wrote: > >> On Wed, Apr 28, 2010 at 9:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>>> >>>> CREATE OR REPLACE is indeed much more complicated. In fact, for >>>> tables, I maintain that you'll need to link with -ldwim to make it >>>> work properly. >>> >>> This may in fact be an appropriate way to handle the case for tables, >>> given the complexity of their definitions. >> >> Patch attached. > > I had an initial look at Robert's patch. Patch applies cleanly, > documentation and regression tests included, everything works as expected. > When looking at the functionality there's one thing that strikes me a > little: > > bernd@localhost:bernd #*= CREATE TABLE IF NOT EXISTS foo(id int); > ERROR: duplicate key value violates unique constraint > "pg_type_typname_nsp_index" > DETAIL: Key (typname, typnamespace)=(foo, 2200) already exists. > > This is what you get from concurrent CINE commands. The typname thingie > might be confusing by unexperienced users, but i think its hard to do > anything about it ? I get the same error message from concurrent CREATE TABLE commands even without CINE... S1: rhaas=# begin; BEGIN rhaas=# create table foo (id int); CREATE TABLE S2: rhaas=# begin; BEGIN rhaas=# create table foo (id int); <blocks> S1: rhaas=# commit; COMMIT S2: ERROR: duplicate key value violates unique constraint "pg_type_typname_nsp_index" DETAIL: Key (typname, typnamespace)=(foo, 2200) already exists. I agree it would be nice to fix this. I'm not sure how hard it is. I don't think it's the job of this patch. :-) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
On Wed, 2010-04-28 at 21:15 -0400, Tom Lane wrote: > I still say > that COR rather than CINE semantics would be appropriate for columns. Viewed from a locking perspective, I would disagree. COR semantics force a table rewrite, in certain cases. That makes it hard to predict externally how long the command will run for. As a DBA, I would want a command that executes without rewrite (if appropriate) or does nothing. Predictable behaviour is the most important concern. That isn't necessarily an argument in favour of CINE, which seems slightly less clear about what we might expect from that, in my reading at least. -- Simon Riggs www.2ndQuadrant.comPostgreSQL Development, 24x7 Support, Training and Services
Hello, At least from a performance point of view CINE should never cause a table rewrite, it should either execute as a plain CREATEor as "nothing". I don't mind if the CINE fails if the column already exists but with a different definition, so maybeit could be worded differently to make it clearer what you get? How about something like: "ALTER TABLE foo ADD OR MATCH COLUMN bar INTEGER" a) doesn't exist => create b) exists and matches => nothing c) exists and doesn't match => error if COR semantics should ever be implmented they could be "ALTER TABLE foo ADD OR REPLACE COLUMN bar INTEGER" a) doesn't exist => create b) exists and matches => nothing c) exists and doesn't match => replace However, I don't want it to fail unless there's an explicit conflict, because I tend to modify the columns later: "ALTER TABLE foo ADD COLUMN bar INTEGER" "ALTER TABLE foo ALTER COLUMN bar SET DEFAULT 0" "ALTER TABLE foo ALTER COLUMN bar SET NOT NULL" "ALTER TABLE foo ADD OR MATCH COLUMN bar INTEGER" <-- succeed or fail? Personally, I'm only interested to match on TYPE so possibly: "ALTER TABLE foo ADD OR MATCH TYPE COLUMN bar INTEGER" <-- succeed "ALTER TABLE foo ADD OR MATCH [ALL] COLUMN bar INTEGER" <-- fail To be honest, I think this becomes more complicated than a CINE, but as I felt that got a rather lukewarm reception maybethis sounds better. The syntax leaves it open for COR later, and the matching code should be useful to determine ifthe COR actually needs to do a REPLACE. Opinions? Regards, Kjell Rune --- Den tor 2010-07-22 skrev Simon Riggs <simon@2ndQuadrant.com>: > Fra: Simon Riggs <simon@2ndQuadrant.com> > Emne: Re: [HACKERS] Add column if not exists (CINE) > Til: "Tom Lane" <tgl@sss.pgh.pa.us> > Kopi: "Robert Haas" <robertmhaas@gmail.com>, "Heikki Linnakangas" <heikki.linnakangas@enterprisedb.com>, "Andrew Dunstan"<andrew@dunslane.net>, "Takahiro Itagaki" <itagaki.takahiro@oss.ntt.co.jp>, "Kjell Rune Skaaraas" <kjella79@yahoo.no>,pgsql-hackers@postgresql.org > Dato: Torsdag 22. juli 2010 02.43 > On Wed, 2010-04-28 at 21:15 -0400, > Tom Lane wrote: > > > I still say > > that COR rather than CINE semantics would be > appropriate for columns. > > Viewed from a locking perspective, I would disagree. > > COR semantics force a table rewrite, in certain cases. That > makes it > hard to predict externally how long the command will run > for. > > As a DBA, I would want a command that executes without > rewrite (if > appropriate) or does nothing. > > Predictable behaviour is the most important concern. > > That isn't necessarily an argument in favour of CINE, which > seems > slightly less clear about what we might expect from that, > in my reading > at least. > > -- > Simon Riggs > www.2ndQuadrant.com > PostgreSQL Development, 24x7 Support, Training and > Services > >
--On 21. Juli 2010 17:16:13 -0400 Robert Haas <robertmhaas@gmail.com> wrote: > I get the same error message from concurrent CREATE TABLE commands > even without CINE... > > S1: > rhaas=# begin; > BEGIN > rhaas=# create table foo (id int); > CREATE TABLE > > S2: > rhaas=# begin; > BEGIN > rhaas=# create table foo (id int); > <blocks> > > S1: > rhaas=# commit; > COMMIT > > S2: > ERROR: duplicate key value violates unique constraint > "pg_type_typname_nsp_index" > DETAIL: Key (typname, typnamespace)=(foo, 2200) already exists. > Funny, never realized that before, but you're right. > I agree it would be nice to fix this. I'm not sure how hard it is. I > don't think it's the job of this patch. :-) Yes, i agree. I would like to mark this patch "Ready for Committer", if that's okay for you (since you are a committer you might want to commit it yourself). Given that there's still some discussion in progress, i'm not sure about it, however. The patch itself looks fine to me and I'm traveling this weekend, so i don't want to hold it off as long as necessary. -- Thanks Bernd
On Fri, Jul 23, 2010 at 2:46 AM, Bernd Helmle <mailings@oopsware.de> wrote: > > > --On 21. Juli 2010 17:16:13 -0400 Robert Haas <robertmhaas@gmail.com> wrote: > >> I get the same error message from concurrent CREATE TABLE commands >> even without CINE... >> >> S1: >> rhaas=# begin; >> BEGIN >> rhaas=# create table foo (id int); >> CREATE TABLE >> >> S2: >> rhaas=# begin; >> BEGIN >> rhaas=# create table foo (id int); >> <blocks> >> >> S1: >> rhaas=# commit; >> COMMIT >> >> S2: >> ERROR: duplicate key value violates unique constraint >> "pg_type_typname_nsp_index" >> DETAIL: Key (typname, typnamespace)=(foo, 2200) already exists. >> > > Funny, never realized that before, but you're right. > >> I agree it would be nice to fix this. I'm not sure how hard it is. I >> don't think it's the job of this patch. :-) > > Yes, i agree. I would like to mark this patch "Ready for Committer", if > that's okay for you (since you are a committer you might want to commit it > yourself). Given that there's still some discussion in progress, i'm not > sure about it, however. The patch itself looks fine to me and I'm traveling > this weekend, so i don't want to hold it off as long as necessary. As far as I can see, the other emails were regarding adding columns, whereas this patch is about creating tables. So I think it's OK... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
On Fri, Jul 23, 2010 at 2:46 AM, Bernd Helmle <mailings@oopsware.de> wrote: > Yes, i agree. I would like to mark this patch "Ready for Committer", if > that's okay for you (since you are a committer you might want to commit it > yourself). I see that it is so marked, so, committed, with a minor correction to my original docs. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
On Thu, Jul 22, 2010 at 4:34 PM, Kjell Rune Skaaraas <kjella79@yahoo.no> wrote: > At least from a performance point of view CINE should never cause a table rewrite, it should either execute as a plainCREATE or as "nothing". I don't mind if the CINE fails if the column already exists but with a different definition,so maybe it could be worded differently to make it clearer what you get? That's what I want, too. The people saying we should implement COR for columns seem to be, by and large, people who have never wished for this feature and have no particular use case for either one. I have stated my use case in the past, but it has been dismissed as stupid or contrived. I can live with the possibility that I'm dumb, but, for the record, I'm not making this up. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company