Thread: using composite types in insert/update
IMO, composite types on insert/update should work as they do on select: SELECT foo FROM foo; -- works, returning type foo INSERT INTO foo VALUES '(something)'::foo -- fails, but we have a workaround: INSERT INTO foo SELECT ('(something)'::foo).* -- expands foo into foo columns however no such workaround exists for update. ideally, UPDATE foo SET foo = foo; would be valid. Aside from fixing a surprising behavior, it would greatly aid in writing triggers that do things like ship updates over dblink _much_ easier (in fact...the dblink_build_xxx family would become obsolete). e.g. perform dblink.dblink('UPDATE foo SET foo = \'' || new || '\'::foo); I call the existing behavior of insert/update of composite types broken to the point of almost being a bug. Fixing the above to work would close the loop on a broad new set of things you can do with composite types. merlin
Merlin Moncure <mmoncure@gmail.com> wrote: > however no such workaround exists for update. ideally, > UPDATE foo SET foo = foo; +1. "UPDATE foo SET (*) = (foo.*)" would be another candidate of syntax. I want to use this kind of queries to apply a changeset log to another table. It is just like log application in Slony-I. Regards, --- ITAGAKI Takahiro NTT Open Source Software Center
On Wed, Jan 28, 2009 at 12:03:56PM -0500, Merlin Moncure wrote: > IMO, composite types on insert/update should work as they do on select: > INSERT INTO foo VALUES '(something)'::foo -- fails, The VALUES command is just a convenient way of getting lots of tuples into PG isn't it? If the above was valid, PG would have to support similar syntax elsewhere, which seems independent of the feature you're really asking for. > but we have a workaround: > INSERT INTO foo SELECT ('(something)'::foo).* -- expands foo into foo columns Or if you wanted to insert multiple rows: INSERT INTO foo SELECT (txt::foo).* FROM (VALUES ('(something)'), ('(something else)')) x(txt); > however no such workaround exists for update. ideally, > UPDATE foo SET foo = foo; > > would be valid. Sounds useful, but seems to break existing syntax (imagine if the table "foo" had a column called "foo"). Takahiro suggests using a * to indicate what you're asking for and this seems to have nicer semantics to me. There seem to be two different improvements needed; the first would be in allowing composite values on the RHS, the second in allowing the column list on the LHS to be replaced with a *. E.g. we start with the following code: CREATE TEMP TABLE foo ( a INT, b TEXT ); INSERT INTO foo ( 1, 'a' ); the following is currently valid: UPDATE foo SET (a,b) = (x.a,x.b) FROM (SELECT ('(2,c)'::foo).*) x; The first step would allow you to do: UPDATE foo SET (a,b) = x FROM (SELECT ('(2,c)'::foo).*) x; and the second step allow you to do: UPDATE foo SET (*) = x FROM (SELECT ('(2,c)'::foo).*) x; > Aside from fixing a surprising behavior Or have I missed the point and you mean the "surprising behavior" is that you expect PG to generate WHERE clauses for you automatically. This seems impossible in the general case. > , it would > greatly aid in writing triggers that do things like ship updates over > dblink _much_ easier (in fact...the dblink_build_xxx family would > become obsolete). > > e.g. > perform dblink.dblink('UPDATE foo SET foo = \'' || new || '\'::foo); > > I call the existing behavior of insert/update of composite types > broken to the point of almost being a bug. Fixing the above to work > would close the loop on a broad new set of things you can do with > composite types. How well would something like this work in practice? If for some reason "foo" had been created with the columns in a different order in the two databases then you'd end up with things breaking pretty quickly. One naive way out seems to be to include the column names in serialized tuples. This has advantages (i.e. we're always told not to rely on column order and this would be one less place we implicitly had to) as well as disadvantages (i.e. the size of the resulting serialized value would go up and well as the complexity of the serialization routine). -- Sam http://samason.me.uk/
On 1/30/09, Sam Mason <sam@samason.me.uk> wrote: > On Wed, Jan 28, 2009 at 12:03:56PM -0500, Merlin Moncure wrote: > > IMO, composite types on insert/update should work as they do on select: > > The VALUES command is just a convenient way of getting lots of tuples > into PG isn't it? If the above was valid, PG would have to support > similar syntax elsewhere, which seems independent of the feature you're > really asking for. You are missing the point, using the composite type allows you to build the insert without knowing the specific layout of the table...only the table itself and the fields that comprise the key for update statements. > > but we have a workaround: > > INSERT INTO foo SELECT ('(something)'::foo).* -- expands foo into foo columns > > > Or if you wanted to insert multiple rows: > > INSERT INTO foo SELECT (txt::foo).* FROM > (VALUES ('(something)'), ('(something else)')) x(txt); > > > > however no such workaround exists for update. ideally, > > UPDATE foo SET foo = foo; > > > > would be valid. > > Sounds useful, but seems to break existing syntax (imagine if the table > "foo" had a column called "foo"). Takahiro suggests using a * to > indicate what you're asking for and this seems to have nicer semantics > to me. I don't think it would...right now select statements work the way I want. If there is table and column with the same name, the column name is assumed. It's an issue of symmetry...why can't you insert the same way you select? By the way, record types are virtually first class objects starting with 8.4: create index foo_idx on foo(foo); select (1,2)::foo = (3,4)::foo; select foo from foo order by foo; select foo::text::foo; -- got this in 8.3 are all valid. > There seem to be two different improvements needed; the first would be > in allowing composite values on the RHS, the second in allowing the > column list on the LHS to be replaced with a *. E.g. we start with the > following code: > > CREATE TEMP TABLE foo ( a INT, b TEXT ); > INSERT INTO foo ( 1, 'a' ); > > the following is currently valid: > > UPDATE foo SET (a,b) = (x.a,x.b) FROM (SELECT ('(2,c)'::foo).*) x; > > The first step would allow you to do: > > UPDATE foo SET (a,b) = x FROM (SELECT ('(2,c)'::foo).*) x; > > and the second step allow you to do: > > UPDATE foo SET (*) = x FROM (SELECT ('(2,c)'::foo).*) x; Hm. IMO, set (*) is a completely new invention of what '*' means. I guess it's ok though, but I think the composite type is more natural. I think if you went this route you should think about other places that this syntax might be valid. I'm not arguing against what you're saying, but the composite type should work too. > > Aside from fixing a surprising behavior > > Or have I missed the point and you mean the "surprising behavior" is > that you expect PG to generate WHERE clauses for you automatically. > This seems impossible in the general case. The surprising behavior is that 'select foo from foo' works, but 'update foo set foo = x::foo' does not. > > , it would > > greatly aid in writing triggers that do things like ship updates over > > dblink _much_ easier (in fact...the dblink_build_xxx family would > > become obsolete). > > > > e.g. > > perform dblink.dblink('UPDATE foo SET foo = \'' || new || '\'::foo); > > > > I call the existing behavior of insert/update of composite types > > broken to the point of almost being a bug. Fixing the above to work > > would close the loop on a broad new set of things you can do with > > composite types. > > How well would something like this work in practice? If for some reason > "foo" had been created with the columns in a different order in the two > databases then you'd end up with things breaking pretty quickly. One That's a separate application specific issue that applies only to dblink style replication (and I don't think keeping to schemas similar is really all that difficult). Composite type insertion has usefulness far beyond dblink triggers. Imagine a global trigger that captures record to text and logs to text table. Now it's trivial if you want to render the text back into the table, since you don't have to look up the field list to generate the statement. > naive way out seems to be to include the column names in serialized > tuples. This has advantages (i.e. we're always told not to rely on > column order and this would be one less place we implicitly had to) as > well as disadvantages (i.e. the size of the resulting serialized value > would go up and well as the complexity of the serialization routine). > That's nice as well, but should be considered separately from powering composite types. merlin
Merlin Moncure <mmoncure@gmail.com> writes: > You are missing the point, using the composite type allows you to > build the insert without knowing the specific layout of the > table... Surely at *some* level you have to know that. regards, tom lane
On Fri, Jan 30, 2009 at 11:01:53AM -0500, Merlin Moncure wrote: > On 1/30/09, Sam Mason <sam@samason.me.uk> wrote: > > The VALUES command is just a convenient way of getting lots of tuples > > into PG isn't it? If the above was valid, PG would have to support > > similar syntax elsewhere, which seems independent of the feature you're > > really asking for. > > You are missing the point, using the composite type allows you to > build the insert without knowing the specific layout of the > table...only the table itself and the fields that comprise the key for > update statements. But this has nothing to do with the VALUES command! Going back to what I interpret as your original point, I find myself wanting to write: SELECT * FROM foo f, (VALUES 1, 5, 7, 23, 47) v WHERE f.id = v; but end up having to write the following: SELECT * FROM foo f, (VALUES (1), (5), (7), (23), (47)) x(v) WHERE f.id = x.v; quite often (i.e. a VALUES command with many singletons). This seems a bit annoying and appears to be what you were suggesting you wanted before (although you killed the relevant bit of context, making me think we may be talking about different things). > > > ideally, > > > UPDATE foo SET foo = foo; > > > would be valid. > > > > Sounds useful, but seems to break existing syntax (imagine if the table > > "foo" had a column called "foo"). Takahiro suggests using a * to > > indicate what you're asking for and this seems to have nicer semantics > > to me. > > I don't think it would...right now select statements work the way I > want. If there is table and column with the same name, the column > name is assumed. It's an issue of symmetry...why can't you insert the > same way you select? For several reasons; mainly because SQL is an abortion of a language, it's got no regularity and attempts to justify requirements because of "symmetry" will end up causing more headaches. Another way of saying what you seem to be saying above is: I want things to work correctly, unless I happen to have a column name that happens to be the same as the table at which point I want everything to break. > By the way, record types are virtually first class objects starting with 8.4: > create index foo_idx on foo(foo); > select (1,2)::foo = (3,4)::foo; > select foo from foo order by foo; > select foo::text::foo; -- got this in 8.3 > > are all valid. Record *types* are most definitely not first class objects; record/composite *values* on the other hand have been gaining support for a while. There are a few weirdo's left, like VALUES commands only working with records, but the dichotomy between record and "non-record" types is slowly vanishing. > > UPDATE foo SET (*) = x FROM (SELECT ('(2,c)'::foo).*) x; > > Hm. IMO, set (*) is a completely new invention of what '*' means. In my head, * has always meant all the columns associated with some record. This is just putting it in a new place in the grammar. It's nice because it doesn't introduce any ambiguities, whereas using the table name does. I'm not sure if the brackets are needed, but I thought it safer to leave them in. > > > Aside from fixing a surprising behavior > > > > Or have I missed the point and you mean the "surprising behavior" is > > that you expect PG to generate WHERE clauses for you automatically. > > This seems impossible in the general case. > > The surprising behavior is that 'select foo from foo' works, but > 'update foo set foo = x::foo' does not. Then blame the original designers of SQL; they optimized the syntax for a different set of use cases! Is the symmetry more obvious when you compare: SELECT * FROM foo; with UPDATE foo SET * = x; ? -- Sam http://samason.me.uk/
On 1/30/09, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Merlin Moncure <mmoncure@gmail.com> writes: > > You are missing the point, using the composite type allows you to > > build the insert without knowing the specific layout of the > > table... > > Surely at *some* level you have to know that. You don't (if I understand your meaning) ...you just have to make sure the destination of the insert is the same as the source. With 'tables as composite types', this is trivially easy as long as you make sure the destination schema matches (basically, the whole point of ad-hoc dblink based replication). Fix up the composite types, and you can now make context free triggers that ship records around without exposing any detail of the record except a candidate key, which can be solved by convention (foo->foo_id). merlin
On 1/30/09, Sam Mason <sam@samason.me.uk> wrote: > quite often (i.e. a VALUES command with many singletons). This seems > a bit annoying and appears to be what you were suggesting you wanted > before (although you killed the relevant bit of context, making me think > we may be talking about different things). we are. See the title of the thread: 'using composite types in insert/update'. that's what I'm talking about. I especially am not talking about the 'values' statement. > For several reasons; mainly because SQL is an abortion of a language, > it's got no regularity and attempts to justify requirements because of > "symmetry" will end up causing more headaches. > > Another way of saying what you seem to be saying above is: I want things > to work correctly, unless I happen to have a column name that happens to > be the same as the table at which point I want everything to break. Upthread, I noted the usefulness in writing triggers. There are many other uses. btw, symmetry (making insert work more similarly to select) is tangential but surely a good thing. > Record *types* are most definitely not first class objects; > record/composite *values* on the other hand have been gaining support well, I used the terms record types and composite types interchangeably in this discussion. Sorry for the confusion. I don't know if you are arguing for or against the idea of 'update foo set foo = foo' working. (if against, why?) merlin
On Fri, Jan 30, 2009 at 02:47:49PM -0500, Merlin Moncure wrote: > On 1/30/09, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Merlin Moncure <mmoncure@gmail.com> writes: > > > You are missing the point, using the composite type allows you to > > > build the insert without knowing the specific layout of the > > > table... > > > > Surely at *some* level you have to know that. > > You don't (if I understand your meaning) ...you just have to make sure > the destination of the insert is the same as the source. Sounds as though there are at least two levels that know the specific layout of the tables involved then. 1) PG has to know the structure of the tables, and 2) you application relies on the fact that tables of the same name have the same structure. Sounds like a very simple ah-hoc nominal type system to me. -- Sam http://samason.me.uk/
Sam Mason wrote: > On Fri, Jan 30, 2009 at 02:47:49PM -0500, Merlin Moncure wrote: >> On 1/30/09, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Merlin Moncure <mmoncure@gmail.com> writes: >>> > You are missing the point, using the composite type allows you to >>> > build the insert without knowing the specific layout of the >>> > table... >>> >>> Surely at *some* level you have to know that. >> You don't (if I understand your meaning) ...you just have to make sure >> the destination of the insert is the same as the source. > > Sounds as though there are at least two levels that know the specific > layout of the tables involved then. 1) PG has to know the structure of > the tables, and 2) you application relies on the fact that tables of the What merlin is trying to solve is home-grown replication. By definition, the master and slave must have the same table(s). So I think he is looking for a more elegant method of performing slave updates; rather than mirror.field_a=master.field_a, mirror.field_b=master.field_b, etc... until you are blue in the face. What makes single field updating even worse is the maintained overhead involved if the table structure changes; can't just alter the two tables, you also have to modify the UPDATE statement. > same name have the same structure. Sounds like a very simple ah-hoc> nominal type system to me. No. Its an ad-hoc replication system. A change to UPDATE is needed for it to work, not a type system. -- Andrew Chernow eSilo, LLC every bit counts http://www.esilo.com/
On Fri, Jan 30, 2009 at 03:12:27PM -0500, Merlin Moncure wrote: > On 1/30/09, Sam Mason <sam@samason.me.uk> wrote: > > quite often (i.e. a VALUES command with many singletons). This seems > > a bit annoying and appears to be what you were suggesting you wanted > > before (although you killed the relevant bit of context, making me think > > we may be talking about different things). > > we are. See the title of the thread: 'using composite types in > insert/update'. that's what I'm talking about. I especially am not > talking about the 'values' statement. Humm, I was talking about your example code: INSERT INTO foo VALUES '(something)'::foo; This isn't currently valid, but it sounds as though it needs to be. > > For several reasons; mainly because SQL is an abortion of a language, > > it's got no regularity and attempts to justify requirements because of > > "symmetry" will end up causing more headaches. > > > > Another way of saying what you seem to be saying above is: I want things > > to work correctly, unless I happen to have a column name that happens to > > be the same as the table at which point I want everything to break. > > Upthread, I noted the usefulness in writing triggers. There are many > other uses. btw, symmetry (making insert work more similarly to > select) is tangential but surely a good thing. > I don't know if you are arguing for or against the idea of 'update foo > set foo = foo' working. (if against, why?) I agree that the mechanism is good, it's just that the syntax you proposed comes with it's own built in footgun. Symmetry is also muddied by the fact that SELECT and INSERT/UPDATE are built on fundamentally different premises. It's only ever possible to modify a set of rows from one table at a time, whereas a SELECT is designed to work with multiple tables. -- Sam http://samason.me.uk/
On 1/30/09, Sam Mason <sam@samason.me.uk> wrote: > On Fri, Jan 30, 2009 at 03:12:27PM -0500, Merlin Moncure wrote: > > On 1/30/09, Sam Mason <sam@samason.me.uk> wrote: > > > quite often (i.e. a VALUES command with many singletons). This seems > > > a bit annoying and appears to be what you were suggesting you wanted > > > before (although you killed the relevant bit of context, making me think > > > we may be talking about different things). > > > > we are. See the title of the thread: 'using composite types in > > insert/update'. that's what I'm talking about. I especially am not > > talking about the 'values' statement. > > > Humm, I was talking about your example code: > > INSERT INTO foo VALUES '(something)'::foo; > > This isn't currently valid, but it sounds as though it needs to be. hm. i don't think so...at least not quite (my thinko in orig example). I think per spec that would attempt to insert the constructed record into the first column. instead, we would want: INSERT INTO foo(foo) VALUES ... or INSERT INTO foo(foo) SELECT ... Assuming we didn't have a foo column, that would tell pg we are pushing in composite type: 'UPDATE' works simillar: SET foo = is the key that we are pushing composite type, not specific fields. > I agree that the mechanism is good, it's just that the syntax you > proposed comes with it's own built in footgun. Symmetry is also muddied > by the fact that SELECT and INSERT/UPDATE are built on fundamentally > different premises. It's only ever possible to modify a set of rows > from one table at a time, whereas a SELECT is designed to work with > multiple tables. double-check that statement vs. example above. I just don't see the problem. Only small gripe I can think of is that since you can't alias the table in the insert statement, if you have a column named 'foo', you're stuck...oh well. I don't think INSERT INTO foo f(f) VALUES ... or INSERT INTO foo(f) f VALUES ... are worth exploring. merlin
On Fri, Jan 30, 2009 at 03:29:29PM -0500, Andrew Chernow wrote: > Sam Mason wrote: > >On Fri, Jan 30, 2009 at 02:47:49PM -0500, Merlin Moncure wrote: > >>On 1/30/09, Tom Lane <tgl@sss.pgh.pa.us> wrote: > >>>Merlin Moncure <mmoncure@gmail.com> writes: > >>> > You are missing the point, using the composite type allows you to > >>> > build the insert without knowing the specific layout of the > >>> > table... > >>> > >>>Surely at *some* level you have to know that. > >>You don't (if I understand your meaning) ...you just have to make sure > >>the destination of the insert is the same as the source. > > > >Sounds as though there are at least two levels that know the specific > >layout of the tables involved then. 1) PG has to know the structure of > >the tables, and 2) you application relies on the fact that tables of the > > What merlin is trying to solve is home-grown replication. By > definition, the master and slave must have the same table(s). Yes, we know that, but the code doesn't. I was just being pedantic and pointing out where the assumptions of this replication rest. > > same name have the same structure. Sounds like a very simple ah-hoc > > nominal type system to me. > > No. Its an ad-hoc replication system. A change to UPDATE is needed for > it to work, not a type system. It seems convenient to think about the resulting assumptions as a type system. It did to me anyway, but apparently this is causing much confusion and it was a bad analogy to have drawn. -- Sam http://samason.me.uk/
On Fri, Jan 30, 2009 at 03:45:54PM -0500, Merlin Moncure wrote: > On 1/30/09, Sam Mason <sam@samason.me.uk> wrote: > > I was talking about your example code: > > > > INSERT INTO foo VALUES '(something)'::foo; > > > > This isn't currently valid, but it sounds as though it needs to be. > > hm. i don't think so...at least not quite (my thinko in orig example). > I think per spec that would attempt to insert the constructed record > into the first column. bah, it would wouldn't it! why the hell was it designed like that, it's just inviting bugs! I'll use the following table definition below: create temp table foo ( a int, b text ); I was expecting: insert into foo values (1); to fail in the same way as: insert into foo (a,b) values (1); I've never realized before that if you leave off the column list it guesses what you want to do. > instead, we would want: > > INSERT INTO foo(foo) VALUES ... I'm still not sure why anyone would want to do this! > or > INSERT INTO foo(foo) SELECT ... But why not just leave INSERT as it is, it works and is unambiguous! > Assuming we didn't have a foo column, that would tell pg we are > pushing in composite type: I still don't see why you want to encourage people to think "have I got a similarly named column" the whole time. It's fine when you're after some specific column because then you know what it's called and you're asking for it, so when it's the same as the table it's obvious. When it's the table you're asking for you don't want to worry about it breaking when a column gets added. > 'UPDATE' works simillar: SET foo = > is the key that we are pushing composite type, not specific fields. But why is this better than using a *? > > I agree that the mechanism is good, it's just that the syntax you > > proposed comes with it's own built in footgun. Symmetry is also muddied > > by the fact that SELECT and INSERT/UPDATE are built on fundamentally > > different premises. It's only ever possible to modify a set of rows > > from one table at a time, whereas a SELECT is designed to work with > > multiple tables. > > double-check that statement vs. example above. I'm not sure what I'm supposed to be checking, more verbosity please! > I just don't see the > problem. Only small gripe I can think of is that since you can't > alias the table in the insert statement, if you have a column named > 'foo', you're stuck...oh well. I'm not sure what you are saying; is this a nail in the coffin for using the table name as "the key" or have you just deliberately introduced the (fuzzy, un-checked) rule that you're not allowed to have a column the same name as the table. -- Sam http://samason.me.uk/
On 1/30/09, Sam Mason <sam@samason.me.uk> wrote: > But why not just leave INSERT as it is, it works and is unambiguous! Because *there is no way to insert a composite type!!!* (you can expand the type via INSERT ... SELECT, but not for UPDATE). SELECT foo FROM foo; pulls the foo composite from the table, not the fields. I still can't understand why you want to not be able to do this via insert. You are looking for more flexible way to imput fields, I am looking for a way to input type directly. > But why is this better than using a *? because we are not updating specific fields...'*' denotes 'all columns'. we are setting the type to something else. I want to update the type directly, not it's fields, because I don't want to construct the update statement. (*) is better than *, because at least we are suggesting a composite. However, let's try and keep the syntax a little regular? select foo from foo; -- this is how it works now update foo set foo=somefoo; --why would you want update to work any way but this way? likewise, with aliases select foo f from foo; -- this is how it works now update foo f set f=somefoo; -- again, this is how it should work my only point was that there is no aliases in inserts, so there is a minute probability of case where you can't insert the composite type directly. your idea (i think): update foo f set (*) = somefoo; is a huge departure in syntax and semantics from the way things work in other places. merlin
On 1/30/09, Merlin Moncure <mmoncure@gmail.com> wrote: > likewise, with aliases > select foo f from foo; -- this is how it works now > update foo f set f=somefoo; -- again, this is how it should work thinko: select f from foo f; -- this is how it really works now merlin