Thread: COPY (query) TO file
Tom had posted a question about file compression with copy. I thought
about it, and I want to throw this out and see if anyone thinks it is a
good idea.

Currently, the COPY command only copies a table; what if it could operate
with a query, as:

COPY (select * from mytable where foo='bar') as BAR TO stdout

I have no idea if it is doable, but I can see uses for replication:

psql -h source mydb -c "COPY (select * from mytable where ID > xxxxx) as mytable TO STDOUT" | psql -h target mydb -c "COPY mytable FROM stdin"
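A rough sketch of that replication use, assuming the proposed COPY (query)
syntax were accepted; the id column and the 41000 cutoff are hypothetical
bookkeeping:

    # find the highest id already present on the target
    psql -h target mydb -At -c "SELECT max(id) FROM mytable"    # say it prints 41000

    # ship only the newer rows
    psql -h source mydb -c "COPY (SELECT * FROM mytable WHERE id > 41000) TO STDOUT" \
      | psql -h target mydb -c "COPY mytable FROM STDIN"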
Mark Woodward wrote:
> Tom had posted a question about file compression with copy. I thought
> about it, and I want to throw this out and see if anyone thinks it is a
> good idea.
>
> Currently, the COPY command only copies a table; what if it could operate
> with a query, as:
>
> COPY (select * from mytable where foo='bar') as BAR TO stdout
>
> I have no idea if it is doable, but I can see uses for replication

I doubt it would be really useful (apart from maybe saving some work
coding a client app), but did you actually test it with

create table as select ...; followed by a copy of that table

to see if it really is faster than the usual select & fetch?

Regards
Tino
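Tino's suggestion, made concrete as a rough sketch (the sync_batch table
name and the id cutoff are hypothetical; a regular table is used because
each psql invocation opens its own session, so a temp table would vanish
between commands):

    # stage the subset on the source, ship it, then clean up
    psql -h source mydb -c "CREATE TABLE sync_batch AS SELECT * FROM mytable WHERE id > 41000"
    psql -h source mydb -c "COPY sync_batch TO STDOUT" | psql -h target mydb -c "COPY mytable FROM STDIN"
    psql -h source mydb -c "DROP TABLE sync_batch"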
> Mark Woodward wrote:
>> Tom had posted a question about file compression with copy. I thought
>> about it, and I want to throw this out and see if anyone thinks it is a
>> good idea.
>>
>> Currently, the COPY command only copies a table; what if it could operate
>> with a query, as:
>>
>> COPY (select * from mytable where foo='bar') as BAR TO stdout
>>
>> I have no idea if it is doable, but I can see uses for replication
>
> I doubt it would be really useful (apart from maybe saving some work
> coding a client app), but did you actually test it with
>
> create table as select ...; followed by a copy of that table
> to see if it really is faster than the usual select & fetch?

Why "create table"? The idea is that you would have one or more redundant
databases and use COPY TO/FROM to keep them up to date.
Mark Woodward wrote:
> Tom had posted a question about file compression with copy. I thought
> about it, and I want to throw this out and see if anyone thinks it is a
> good idea.
>
> Currently, the COPY command only copies a table; what if it could operate
> with a query, as:
>
> COPY (select * from mytable where foo='bar') as BAR TO stdout

Isn't this already being worked on? The TODO list says:

    Allow COPY to output from views

Another idea would be to allow actual SELECT statements in a COPY.

Personally I strongly favor the second option as being more flexible
than the first.

cheers
andrew
Mark Woodward wrote:
>> Mark Woodward wrote:
>>> Tom had posted a question about file compression with copy. I thought
>>> about it, and I want to throw this out and see if anyone thinks it is a
>>> good idea.
>>>
>>> Currently, the COPY command only copies a table; what if it could operate
>>> with a query, as:
>>>
>>> COPY (select * from mytable where foo='bar') as BAR TO stdout
>>>
>>> I have no idea if it is doable, but I can see uses for replication
>> I doubt it would be really useful (apart from maybe saving some work
>> coding a client app), but did you actually test it with
>>
>> create table as select ...; followed by a copy of that table
>> to see if it really is faster than the usual select & fetch?
>
> Why "create table"?

Just to simulate and time the proposal.
SELECT ... already works over the network, and if COPY from a
select (which would basically work like yet another wire
protocol) isn't significantly faster, why bother?

> The idea is that you would have one or more redundant databases and use
> COPY TO/FROM to keep them up to date.

Well, if you have databases you would have regular tables - and
can use copy as it is now :-)

Regards
Tino
> Mark Woodward wrote:
>>> Mark Woodward wrote:
>>>> Tom had posted a question about file compression with copy. I thought
>>>> about it, and I want to throw this out and see if anyone thinks it is a
>>>> good idea.
>>>>
>>>> Currently, the COPY command only copies a table; what if it could operate
>>>> with a query, as:
>>>>
>>>> COPY (select * from mytable where foo='bar') as BAR TO stdout
>>>>
>>>> I have no idea if it is doable, but I can see uses for replication
>>> I doubt it would be really useful (apart from maybe saving some work
>>> coding a client app), but did you actually test it with
>>>
>>> create table as select ...; followed by a copy of that table
>>> to see if it really is faster than the usual select & fetch?
>>
>> Why "create table"?
>
> Just to simulate and time the proposal.
> SELECT ... already works over the network, and if COPY from a
> select (which would basically work like yet another wire
> protocol) isn't significantly faster, why bother?

Because the format of COPY is a common transmitter/receiver for
PostgreSQL, like this:

pg_dump -t mytable | psql -h target -c "COPY mytable FROM STDIN"

With a more selective copy, you can use pretty much this mechanism to
limit a copy to a subset of the records in a table.

>> The idea is that you would have one or more redundant databases and use
>> COPY TO/FROM to keep them up to date.
>
> Well, if you have databases you would have regular tables - and
> can use copy as it is now :-)

But COPY copies all the records, not some of the records.

> Regards
> Tino
Andrew Dunstan wrote:
> Mark Woodward wrote:
>
>> Tom had posted a question about file compression with copy. I thought
>> about it, and I want to throw this out and see if anyone thinks it is a
>> good idea.
>>
>> Currently, the COPY command only copies a table; what if it could operate
>> with a query, as:
>>
>> COPY (select * from mytable where foo='bar') as BAR TO stdout
>
> Isn't this already being worked on? The TODO list says:
>
>     Allow COPY to output from views
>
> Another idea would be to allow actual SELECT statements in a COPY.
>
> Personally I strongly favor the second option as being more flexible
> than the first.

I second that - allowing arbitrary SELECT statements as a COPY source
seems much more powerful and flexible than just supporting COPY FROM VIEW.

Stefan
On Fri, 2006-06-02 at 09:56 -0400, Andrew Dunstan wrote:
> Allow COPY to output from views

FYI, there is a patch for this floating around -- I believe it was
posted to -patches a few months back.

> Another idea would be to allow actual SELECT statements in a COPY.
>
> Personally I strongly favor the second option as being more flexible
> than the first.

+1.

-Neil
Neil Conway wrote:
> On Fri, 2006-06-02 at 09:56 -0400, Andrew Dunstan wrote:
> > Allow COPY to output from views
>
> FYI, there is a patch for this floating around -- I believe it was
> posted to -patches a few months back.

I have it. The pieces of it that I can use to implement the idea below,
I'll try to rescue. I don't think it's that much though.

> > Another idea would be to allow actual SELECT statements in a COPY.
> >
> > Personally I strongly favor the second option as being more flexible
> > than the first.
>
> +1.

Sounds like a plan.

--
Alvaro Herrera    http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Mark Woodward wrote:
...
>>>> create table as select ...; followed by a copy of that table
>>>> to see if it really is faster than the usual select & fetch?
>>> Why "create table"?
>> Just to simulate and time the proposal.
>> SELECT ... already works over the network, and if COPY from a
>> select (which would basically work like yet another wire
>> protocol) isn't significantly faster, why bother?
>
> Because the format of COPY is a common transmitter/receiver for PostgreSQL,
> like this:
>
> pg_dump -t mytable | psql -h target -c "COPY mytable FROM STDIN"
>
> With a more selective copy, you can use pretty much this mechanism to
> limit a copy to a subset of the records in a table.

Ok, but why not just implement this into pg_dump or psql? Why bother the
backend with that functionality? For example, if you copy numbers, an
int4 (4 bytes) gets expanded to up to 10 bytes. Of course you can get the
same wire load if you use to_char() with a regular select.

>>> The idea is that you would have one or more redundant databases and use
>>> COPY TO/FROM to keep them up to date.
>> Well, if you have databases you would have regular tables - and
>> can use copy as it is now :-)
>
> But COPY copies all the records, not some of the records.

Yes, that would be the point in having them "up to date" and not
"partially maybe something up to date" ;)

COPY is fine for import of data, but for export I think it should be
implemented in the frontend.

Regards
Tino
Tino Wildenhain <tino@wildenhain.de> writes:
> Ok, but why not just implement this into pg_dump or psql?
> Why bother the backend with that functionality?

You're not seriously suggesting we reimplement evaluation of WHERE
clauses on the client side, are you?

			regards, tom lane
> Mark Woodward wrote:
> ...
>>>>> create table as select ...; followed by a copy of that table
>>>>> to see if it really is faster than the usual select & fetch?
>>>> Why "create table"?
>>> Just to simulate and time the proposal.
>>> SELECT ... already works over the network, and if COPY from a
>>> select (which would basically work like yet another wire
>>> protocol) isn't significantly faster, why bother?
>>
>> Because the format of COPY is a common transmitter/receiver for
>> PostgreSQL, like this:
>>
>> pg_dump -t mytable | psql -h target -c "COPY mytable FROM STDIN"
>>
>> With a more selective copy, you can use pretty much this mechanism to
>> limit a copy to a subset of the records in a table.
>
> Ok, but why not just implement this into pg_dump or psql?
> Why bother the backend with that functionality?

Because "COPY" runs on the back end, not the front end, and the front end
may not even be in the same city as the backend. When you issue a "COPY",
the file it reads or writes is local to the backend. True, the examples I
gave may not show how that is important, but consider this:

psql -h remote masterdb -c "COPY (select * from mytable where ID < xxlastxx) as mytable TO '/replicate_backup/mytable-060602.pgc'"

This runs completely in the background and can serve as a running backup.
Tom Lane wrote:
> Tino Wildenhain <tino@wildenhain.de> writes:
>> Ok, but why not just implement this into pg_dump or psql?
>> Why bother the backend with that functionality?
>
> You're not seriously suggesting we reimplement evaluation of WHERE clauses
> on the client side, are you?

No, did I? But what is wrong with something like:

\COPY 'SELECT foo,bar,baz FROM footable WHERE baz=5 ORDER BY foo' TO file|stdout

which would just run the query (in the backend, of course) and format the
output just like copy would...

I mean, ok, when it's in the backend, it's fine too (beside the data
expansion if you don't implement compression...), but I thought
implementing it in the frontend would be easier...

Regards
Tino
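For comparison, psql's existing output switches can already approximate
such client-side formatting; a rough sketch (note it does not reproduce
COPY's escaping of embedded tabs, newlines, and NULLs, which is part of
the duplication Tom objects to below):

    # -A = unaligned, -t = tuples only, -F = field separator ($'\t' is bash for a tab)
    psql -h somehost -At -F $'\t' mydb \
      -c "SELECT foo,bar,baz FROM footable WHERE baz=5 ORDER BY foo" > file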
Mark Woodward wrote:
...
>>> pg_dump -t mytable | psql -h target -c "COPY mytable FROM STDIN"
>>>
>>> With a more selective copy, you can use pretty much this mechanism to
>>> limit a copy to a subset of the records in a table.
>> Ok, but why not just implement this into pg_dump or psql?
>> Why bother the backend with that functionality?
>
> Because "COPY" runs on the back end, not the front end, and the front end
> may not even be in the same city as the backend. When you issue a "COPY",
> the file it reads or writes is local to the backend. True, the examples I
> gave may not show how that is important, but consider this:

We were talking about COPY to stdout :-) Copy to file is another
issue :-) Copy to a (server fs) file has so many limitations I don't see
wide use for it. (Of course there are use cases.)

> psql -h remote masterdb -c "COPY (select * from mytable where ID <
> xxlastxx) as mytable TO '/replicate_backup/mytable-060602.pgc'"
>
> This runs completely in the background and can serve as a running backup.

And you are sure it would be much faster than a server-local psql just
dumping the result of a query? (And you could more easily avoid race
conditions, in contrast to several remote clients trying to trigger your
above backup.)

But what do I know... I was just asking :-)

Regards
Tino
On Fri, Jun 02, 2006 at 09:56:07AM -0400, Andrew Dunstan wrote:
> Mark Woodward wrote:
> >Tom had posted a question about file compression with copy. I thought
> >about it, and I want to throw this out and see if anyone thinks it is a
> >good idea.
> >
> >Currently, the COPY command only copies a table; what if it could operate
> >with a query, as:
> >
> >COPY (select * from mytable where foo='bar') as BAR TO stdout
>
> Isn't this already being worked on? The TODO list says:
>
>     Allow COPY to output from views
>
> Another idea would be to allow actual SELECT statements in a COPY.
>
> Personally I strongly favor the second option as being more flexible
> than the first.

+1 :)

Cheers,
D
--
David Fetter <david@fetter.org> http://fetter.org/
phone: +1 415 235 3778        AIM: dfetter666
Skype: davidfetter

Remember to vote!
Tino Wildenhain <tino@wildenhain.de> writes:
> Tom Lane wrote:
>> Tino Wildenhain <tino@wildenhain.de> writes:
>>> Ok, but why not just implement this into pg_dump or psql?
>>> Why bother the backend with that functionality?
>>
>> You're not seriously suggesting we reimplement evaluation of WHERE clauses
>> on the client side, are you?

No, he's suggesting the client implement COPY formatting after fetching a
regular result set. Of course this runs into the same problem other
clients have dealing with large result sets: libpq doesn't want to let
the client deal with partial results, so you have to buffer up the entire
result set in memory.

I was also vaguely pondering whether all the DML commands could be
generalized to receive or send COPY-formatted data for repeated
execution. It would be neat to be able to prepare an UPDATE with
placeholders and stream data in COPY format as parameters to the UPDATE,
to execute it thousands or millions of times without any protocol
overhead or network pipeline stalls.

--
greg
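Not Greg's proposed protocol extension, but a sketch of how a similar
effect can be had today by streaming the parameters through COPY into a
temp table and applying them in one statement (the table and column names
are hypothetical):

    -- stage the per-row parameters in bulk, with no per-statement round trips
    CREATE TEMP TABLE upd (id integer, val text);
    COPY upd FROM STDIN;   -- stream the data in COPY format, terminated by \.
    -- apply every update in a single statement
    UPDATE target SET val = upd.val FROM upd WHERE target.id = upd.id;
    DROP TABLE upd;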
> Mark Woodward wrote:
> ...
>>>> pg_dump -t mytable | psql -h target -c "COPY mytable FROM STDIN"
>>>>
>>>> With a more selective copy, you can use pretty much this mechanism to
>>>> limit a copy to a subset of the records in a table.
>>> Ok, but why not just implement this into pg_dump or psql?
>>> Why bother the backend with that functionality?
>>
>> Because "COPY" runs on the back end, not the front end, and the front end
>> may not even be in the same city as the backend. When you issue a "COPY",
>> the file it reads or writes is local to the backend. True, the examples I
>> gave may not show how that is important, but consider this:
>
> We were talking about COPY to stdout :-) Copy to file is another
> issue :-) Copy to a (server fs) file has so many limitations I don't see
> wide use for it. (Of course there are use cases.)

"Wide use for" is not always the same as "useful." Sometimes "useful" is
something not easily doable in other ways, or something that completes a
feature set.

>> psql -h remote masterdb -c "COPY (select * from mytable where ID <
>> xxlastxx) as mytable TO '/replicate_backup/mytable-060602.pgc'"
>>
>> This runs completely in the background and can serve as a running
>> backup.
>
> And you are sure it would be much faster than a server-local psql just
> dumping the result of a query?

No, I can't be sure of that at all, but.... The COPY command has a
specific use that is understood, and an operation that is separate from
the normal query mechanism.

> (And you could more easily avoid race conditions, in contrast to several
> remote clients trying to trigger your above backup.)

Again, the examples may not have been precise in presenting "why"; the
focus was mostly "what" so it could be discussed. As a generic feature it
has many potential uses. Trying to debate and defend a specific use
limits the potential scope of the feature.

Why have COPY anyway? Why not just use "SELECT * FROM TABLE"?
>> Allow COPY to output from views
>> Another idea would be to allow actual SELECT statements in a COPY.
>>
>> Personally I strongly favor the second option as being more flexible
>> than the first.
>
> I second that - allowing arbitrary SELECT statements as a COPY source
> seems much more powerful and flexible than just supporting COPY FROM VIEW.
>
> Stefan

Not to be a sour apple or anything, but I don't see how any of this is
needed in the backend, since we can easily use psql to do it, or pretty
much any other tool.

Joshua D. Drake

--
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/
>>>> Allow COPY to output from views
>>>> Another idea would be to allow actual SELECT statements in a COPY.
>>>>
>>>> Personally I strongly favor the second option as being more flexible
>>>> than the first.
>>>
>>> I second that - allowing arbitrary SELECT statements as a COPY source
>>> seems much more powerful and flexible than just supporting COPY FROM
>>> VIEW.
>>
>> Not to be a sour apple or anything, but I don't see how any of this is
>> needed in the backend, since we can easily use psql to do it, or pretty
>> much any other tool.

There is an important difference between a capability in the backend vs.
one synthesized in the frontend.
>> Not to be a sour apple or anything, but I don't see how any of this is
>> needed in the backend, since we can easily use psql to do it, or pretty
>> much any other tool.
>
> There is an important difference between a capability in the backend vs.
> one synthesized in the frontend.

And that would be? The suspense is killing me :)

Joshua D. Drake

--
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/
Mark Woodward wrote:
>>>> Allow COPY to output from views
>>>> Another idea would be to allow actual SELECT statements in a COPY.
>>>>
>>>> Personally I strongly favor the second option as being more flexible
>>>> than the first.
>>>
>>> I second that - allowing arbitrary SELECT statements as a COPY source
>>> seems much more powerful and flexible than just supporting COPY FROM
>>> VIEW.
>> Not to be a sour apple or anything, but I don't see how any of this is
>> needed in the backend, since we can easily use psql to do it, or pretty
>> much any other tool.
>
> There is an important difference between a capability in the backend vs.
> one synthesized in the frontend.

To be clear, I am not suggesting that we implement COPY TO *in* the
client. I am suggesting I don't see the purpose of COPY TO when we can
just use the psql query capability to achieve the same thing.

In talking with Andrew, he is saying there is overhead in pulling these
types of result sets through the client, but people that are using this
feature... won't they be connecting remotely anyway? I mean... why are we
using COPY TO on the same server? What purpose does it serve?

What I see COPY TO for is a fast way to pull out bulk reports or
something like that... all of which I am NOT going to do on the server.

Joshua D. Drake

--
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/
>>> VIEW.
>> Not to be a sour apple or anything, but I don't see how any of this is
>> needed in the backend, since we can easily use psql to do it, or pretty
>> much any other tool.
>
> There is an important difference between a capability in the backend vs.
> one synthesized in the frontend.

After much patience from Andrew, I withdraw my objection :) Please
continue as if I never popped my head out of my hole.

Sincerely,

Joshua D. Drake

--
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/
Tino Wildenhain <tino@wildenhain.de> writes:
> Tom Lane wrote:
>> You're not seriously suggesting we reimplement evaluation of WHERE clauses
>> on the client side, are you?

> No, did I? But what is wrong with something like:

> \COPY 'SELECT foo,bar,baz FROM footable WHERE baz=5 ORDER BY foo' TO
> file|stdout

> which would just run the query (in the backend of course) and
> format the output just like copy would...

Oh, I see what you have in mind. But that's hardly free either. It
requires psql to be able to translate between the frontend SELECT data
format and COPY format; which is far from a trivial thing, especially
when you consider all those CSV options ;-). And then we get to maintain
that code in parallel with the backend code anytime someone wants another
COPY feature. And then if you want the feature in a different client,
you get to do it all over again.

On balance I think doing it in the backend is more flexible and requires
less duplication of code.

			regards, tom lane
> I was also vaguely pondering whether all the DML commands could be
> generalized to receive or send COPY-formatted data for repeated
> execution. It would be neat to be able to prepare an UPDATE with
> placeholders and stream data in COPY format as parameters to the UPDATE,
> to execute it thousands or millions of times without any protocol
> overhead or network pipeline stalls.

MySQL already does this for INSERT:

INSERT INTO x (a,b) VALUES (1,2), (3,4), (5,6)...;

> allowing arbitrary SELECT statements as a COPY source
> seems much more powerful and flexible than just supporting COPY FROM
> VIEW.

MySQL already does this:

SELECT INTO OUTFILE blah FROM table...

Now, in both cases the MySQL syntax sucks, but it's still quite
practical, and the multi-row INSERT saves some overhead (parsing,
acquiring locks...) and is quite a bit faster than a regular INSERT.
PFC <lists@peufeu.com> writes:
>> I was also vaguely pondering whether all the DML commands could be
>> generalized to receive or send COPY-formatted data for repeated execution.
>> It would be neat to be able to prepare an UPDATE with placeholders and
>> stream data in COPY format as parameters to the UPDATE to execute it
>> thousands or millions of times without any protocol overhead or network
>> pipeline stalls.
>
> MySQL already does this for INSERT:
> INSERT INTO x (a,b) VALUES (1,2), (3,4), (5,6)...;

Does MySQL really let you stream that? Trying to do syntax like that in
Postgres wouldn't work because the parser would try to build up a parse
tree for the whole statement before running the command.

--
greg
Greg Stark <gsstark@mit.edu> writes:
> PFC <lists@peufeu.com> writes:
>> MySQL already does this for INSERT:
>> INSERT INTO x (a,b) VALUES (1,2), (3,4), (5,6)...;

> Does MySQL really let you stream that? Trying to do syntax like that in
> Postgres wouldn't work because the parser would try to build up a parse tree
> for the whole statement before running the command.

A quick look at MySQL's grammar doesn't show any indication that they
don't build a full parse tree too.

The above syntax is SQL-standard, so we ought to support it sometime,
performance benefits or no. I agree it might be tricky to avoid eating
an unreasonable amount of memory for a very long list in Postgres :-(

Supporting VALUES only in INSERT would be relatively trivial BTW,
but the spec actually thinks it should be allowed as a <table spec>
in FROM ...

			regards, tom lane
>> MySQL already does this for INSERT:
>> INSERT INTO x (a,b) VALUES (1,2), (3,4), (5,6)...;
>
> Does MySQL really let you stream that? Trying to do syntax like that in
> Postgres wouldn't work because the parser would try to build up a parse
> tree for the whole statement before running the command.

Hehe, I don't know, but I suppose it's parsed in one shot and then
executed, not streamed, because:

- you can append modifiers at the end of the statement (IGNORE...),
- mysql barfs if the complete SQL including data is larger than the
  query buffer specified in the config file.

The second point leads to an interesting fact, ie. dumps generated by
phpmyadmin and mysqldump need a parameter specifying how long, in bytes,
the insert commands can be, so that hopefully they can be reloaded later.
If one of the inserted values violates a "constraint", it is substituted
by "some other default value".

Still, it's useful; and one interesting part is that everything happens
in the same SQL command (wrt concurrency).
Mark Woodward wrote:
>> Mark Woodward wrote: ...
>>> This runs completely in the background and can serve as a running
>>> backup.
>> And you are sure it would be much faster than a server-local psql just
>> dumping the result of a query?
>
> No, I can't be sure of that at all, but.... The COPY command has a
> specific use that is understood, and an operation that is separate from
> the normal query mechanism.

Unless you change it to actually execute a query ;)

>> (And you could more easily avoid race conditions, in contrast to several
>> remote clients trying to trigger your above backup.)
>
> Again, the examples may not have been precise in presenting "why"; the
> focus was mostly "what" so it could be discussed. As a generic feature it
> has many potential uses. Trying to debate and defend a specific use
> limits the potential scope of the feature.

That's why I'm asking. I'm still wondering which use case actually
justifies integrating the result-set formatter into the backend vs. just
doing it in the frontend (in both places there are already some routines
which could be used to implement it).

> Why have COPY anyway? Why not just use "SELECT * FROM TABLE"?

Because the special SELECT * FROM TABLE case can apparently be optimized.
Ah yes, and if the usual result fetch requires storing the result set in
server RAM, it would be nicer to change that if possible. I think we run
SELECT ... much more often than COPY ;-)

(And I hope nobody comes up with the idea, if COPY were implemented to
execute queries, to generally use COPY instead of SELECT for large result
sets in applications. Goodbye portability...)
Tom Lane <tgl@sss.pgh.pa.us> writes:
> Supporting VALUES only in INSERT would be relatively trivial BTW,
> but the spec actually thinks it should be allowed as a <table spec>
> in FROM ...

How does that syntax work?

INSERT INTO x (a,b)
  from select x,y,z from t
  from select x2,y2,z2 from t

? Doesn't seem to be very sensible?

--
greg
Greg Stark <gsstark@mit.edu> writes:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> Supporting VALUES only in INSERT would be relatively trivial BTW,
>> but the spec actually thinks it should be allowed as a <table spec>
>> in FROM ...

> How does that syntax work?

If you look at SQL92, INSERT ... VALUES is actually not a direct
production in their BNF. They define <insert statement> as

	<insert statement> ::=
		INSERT INTO <table name> <insert columns and source>

	<insert columns and source> ::=
		[ <left paren> <insert column list> <right paren> ]
		<query expression>
		| DEFAULT VALUES

	<insert column list> ::= <column name list>

and then one of the alternatives for <query expression> is
<table value constructor>, which is

	<table value constructor> ::=
		VALUES <table value constructor list>

	<table value constructor list> ::=
		<row value constructor> [ { <comma> <row value constructor> }... ]

(Another alternative for <query expression> is <query specification>,
which accounts for the INSERT ... SELECT syntax.)

The interesting point here is that a <subquery> is defined as a
parenthesized <query expression>, which means that you ought to be able
to use a parenthesized VALUES list anyplace you could use a parenthesized
SELECT. So FROM lists, IN clauses, = ANY and friends, etc all really
ought to be able to support this. (A quick look at mysql's grammar
suggests that they don't handle those cases.)

The trouble with supporting it for any case other than INSERT is that
you have to work out what the column datatypes of the construct ought
to be. This is the same as the equivalent problem for UNION constructs,
but the UNION type resolution algorithm looks kinda ugly for thousands
of inputs :-(

			regards, tom lane
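Concretely, those productions permit a parenthesized VALUES list wherever
a parenthesized SELECT can appear; a sketch of what that would look like
(none of this was accepted by PostgreSQL when this thread was written):

    -- VALUES as a table in FROM, with a column alias list
    SELECT v.id, v.name
      FROM (VALUES (1, 'one'), (2, 'two'), (3, 'three')) AS v(id, name);

    -- VALUES as a subquery in an IN clause
    SELECT * FROM mytable WHERE id IN (VALUES (1), (2), (3));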
Tom Lane wrote:
> Greg Stark <gsstark@mit.edu> writes:
>> PFC <lists@peufeu.com> writes:
>>> MySQL already does this for INSERT:
>>> INSERT INTO x (a,b) VALUES (1,2), (3,4), (5,6)...;
>
> The above syntax is SQL-standard, so we ought to support it sometime,
> performance benefits or no. I agree it might be tricky to avoid eating
> an unreasonable amount of memory for a very long list in Postgres :-(
>
> Supporting VALUES only in INSERT would be relatively trivial BTW,
> but the spec actually thinks it should be allowed as a <table spec>
> in FROM ...

Can we just start with the simple case?

cheers
andrew
Tom Lane <tgl@sss.pgh.pa.us> writes:
> The interesting point here is that a <subquery> is defined as a
> parenthesized <query expression>, which means that you ought to be able to
> use a parenthesized VALUES list anyplace you could use a parenthesized
> SELECT. So FROM lists, IN clauses, = ANY and friends, etc all really ought
> to be able to support this.

That's actually pretty neat. I've occasionally had to write queries with
the idiom

SELECT ...
  FROM (SELECT a,b,c
        UNION ALL SELECT d,e,f
        UNION ALL SELECT g,h,i)
 WHERE ...

That's pretty awful. It would have been awfully nice to be able to do

SELECT ... FROM (VALUES (a,b,c),(d,e,f),(g,h,i))

> The trouble with supporting it for any case other than INSERT is that
> you have to work out what the column datatypes of the construct ought
> to be. This is the same as the equivalent problem for UNION constructs,
> but the UNION type resolution algorithm looks kinda ugly for thousands
> of inputs :-(

I always thought UNION just decided on the type based on the first branch
and then coerced all the others to that type. I always cast all the
columns on the first union branch just in case.

--
greg
Greg Stark wrote:
> It would have been awfully nice to be able to do
>
> SELECT ... FROM (VALUES (a,b,c),(d,e,f),(g,h,i))
>
>> The trouble with supporting it for any case other than INSERT is that
>> you have to work out what the column datatypes of the construct ought
>> to be. This is the same as the equivalent problem for UNION constructs,
>> but the UNION type resolution algorithm looks kinda ugly for thousands
>> of inputs :-(
>
> I always thought UNION just decided on the type based on the first branch
> and then coerced all the others to that type. I always cast all the
> columns on the first union branch just in case.

Could we get away with requiring an explicit type expression where
there's some ambiguity or uncertainty, like this?

SELECT ... FROM (VALUES (a,b,c),(d,e,f),(g,h,i)) as (a int, b text, c float)

That's what you have to do with an SRF that returns SETOF RECORD in the
same situation, after all.

cheers
andrew
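The SRF precedent Andrew mentions, sketched with contrib's dblink, whose
functions return SETOF RECORD and therefore require exactly such a column
definition list (the connection string and remote query are hypothetical):

    -- the caller supplies the result shape, just as proposed above for VALUES
    SELECT *
      FROM dblink('dbname=otherdb', 'SELECT id, name FROM remote_table')
           AS t(id integer, name text);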
In article <44804377.8070706@dunslane.net>,
Andrew Dunstan <andrew@dunslane.net> writes:

> Mark Woodward wrote:
>> Tom had posted a question about file compression with copy. I thought
>> about it, and I want to throw this out and see if anyone thinks it is a
>> good idea.
>>
>> Currently, the COPY command only copies a table; what if it could operate
>> with a query, as:
>>
>> COPY (select * from mytable where foo='bar') as BAR TO stdout

> Isn't this already being worked on? The TODO list says:

> Allow COPY to output from views

IIRC Karel Zak posted a patch for that.

> Another idea would be to allow actual SELECT statements in a COPY.

> Personally I strongly favor the second option as being more flexible
> than the first.

How so? I see that

  psql -h somehost somedb -c "copy 'somequery' to stdout" >localfile

would be more terse than

  psql -h somehost somedb -c "create temp view tmp as somequery; copy tmp to stdout" >localfile

but what's more flexible there?
On Tue, Jun 06, 2006 at 04:47:40PM +0200, Harald Fuchs wrote:
> In article <44804377.8070706@dunslane.net>,
> Andrew Dunstan <andrew@dunslane.net> writes:
>
> > Mark Woodward wrote:
> >> Tom had posted a question about file compression with copy. I thought
> >> about it, and I want to throw this out and see if anyone thinks it is a
> >> good idea.
> >>
> >> Currently, the COPY command only copies a table; what if it could operate
> >> with a query, as:
> >>
> >> COPY (select * from mytable where foo='bar') as BAR TO stdout
>
> > Isn't this already being worked on? The TODO list says:
>
> > Allow COPY to output from views
>
> IIRC Karel Zak posted a patch for that.
>
> > Another idea would be to allow actual SELECT statements in a COPY.
>
> > Personally I strongly favor the second option as being more flexible
> > than the first.
>
> How so? I see that
>
>   psql -h somehost somedb -c "copy 'somequery' to stdout" >localfile
>
> would be more terse than
>
>   psql -h somehost somedb -c "create temp view tmp as somequery; copy tmp to stdout" >localfile
>
> but what's more flexible there?

Flexibility aside, doing this via a temporary view is a very
non-intuitive way to go about it. AFAIK CREATE TEMP VIEW is also regular
DDL, which means more overhead in the system catalogs, along with more
need to vacuum.

I really fail to see why we shouldn't support copying from a query unless
there's some serious technical challenge. If there were a serious
technical challenge that using a temporary view solved, we should do the
work of creating the temporary view for the user.

--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461
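The two approaches Jim is comparing, side by side as a sketch (the view
form assumes the COPY-from-view patch mentioned upthread; the direct form
assumes the syntax proposed at the top of the thread; neither was in a
released PostgreSQL at the time):

    -- via a temporary view (assumes COPY-from-view support)
    CREATE TEMP VIEW tmp AS SELECT * FROM mytable WHERE foo = 'bar';
    COPY tmp TO STDOUT;
    DROP VIEW tmp;

    -- directly from a query (the proposal in this thread)
    COPY (SELECT * FROM mytable WHERE foo = 'bar') TO STDOUT;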