Thread: Can COPY skip columns?
Hi,

Is it possible for the COPY command to read data from a file, but skip specific columns?

Thanks

Adam
Adam Witney wrote:
> Hi,
>
> Is it possible for the COPY command to read data from a file, but skip
> specific columns?

\h copy

COPY tablename [ ( column [, ...] ) ]
    TO { 'filename' | STDOUT }
    [ [ WITH ]
          [ BINARY ]
          [ OIDS ]
          [ DELIMITER [ AS ] 'delimiter' ]
          [ NULL [ AS ] 'null string' ]
          [ CSV [ QUOTE [ AS ] 'quote' ]
                [ ESCAPE [ AS ] 'escape' ]
                [ FORCE QUOTE column [, ...] ] ]

Yes... you just have to specify the columns.

Sincerely,

Joshua D. Drake
Hi Joshua,

Sorry, I meant skip a column in the file, not the database table, or is that what you meant?

Thanks

adam

> Adam Witney wrote:
>> Is it possible for the COPY command to read data from a file, but skip
>> specific columns?
>
> Yes... you just have to specify the columns.
Adam Witney <awitney@sghms.ac.uk> writes:
> Is it possible for the COPY command to read data from a file, but skip
> specific columns?

Nope. When you get into significant massaging of the input data, usually the best bet is to COPY into a temp table that exactly matches the format of the data file, and then do your rearrangements using an INSERT/SELECT into the final target table.

			regards, tom lane
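[A minimal sketch of the temp-table approach Tom describes, written in PHP since that is the environment that comes up later in the thread; the connection string, table, column, and file names are all invented for illustration:

    <?php
    // Hypothetical sketch: COPY the whole file into a staging table that
    // mirrors the file layout, then pick out only the wanted columns.
    $db = pg_connect("dbname=mydb") or die("connect failed");

    // The staging table matches the data file exactly, unwanted column included.
    pg_query($db, "CREATE TEMP TABLE staging (name text, junk text, score integer)");

    // Server-side COPY: the file must live on the database server and the
    // session needs the privilege to read it; tab-delimited by default.
    pg_query($db, "COPY staging FROM '/tmp/datafile.txt'");

    // Rearrange into the final target table, skipping the unwanted column.
    pg_query($db, "INSERT INTO results (name, score) SELECT name, score FROM staging");
    ?>
]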
Joshua D. Drake wrote:
> Adam Witney wrote:
>> Is it possible for the COPY command to read data from a file, but skip
>> specific columns?

Crap, I read your message backwards. Sorry. No, you can't do this.

Sincerely,

Joshua D. Drake
Adam Witney wrote:
> Sorry, I meant skip a column in the file, not the database table, or is
> that what you meant?

No, I read your question backwards. I am sorry. As Tom Lane said, copy the file into a temporary table and then you can deal with it from there.

Sincerely,

Joshua D. Drake
On 18/11/04 5:15 pm, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
> Adam Witney <awitney@sghms.ac.uk> writes:
>> Is it possible for the COPY command to read data from a file, but skip
>> specific columns?
>
> Nope. When you get into significant massaging of the input data,
> usually the best bet is to COPY into a temp table that exactly matches
> the format of the data file, and then do your rearrangements using an
> INSERT/SELECT into the final target table.

Ah, ok. Can COPY be made to skip the first row then, to avoid the column headers?

Thanks

adam
On Nov 18, 2004, at 11:53 AM, Adam Witney wrote:
> Is it possible for the COPY command to read data from a file, but skip
> specific columns?

You can use awk to skip fields and create an intermediate file, or better yet, just pipe the output to COPY. Here is a trivial example (FS is set in a BEGIN block so that it also applies to the very first record):

    awk 'BEGIN { FS = OFS = "\t" } { print $1, $3 }' inputdatafile

This sets the input and output field separators to tab and outputs the first and third fields from inputdatafile. If you want to skip the first record, just add the following if statement:

    awk 'BEGIN { FS = OFS = "\t" } { if (NR > 1) print $1, $3 }' inputdatafile

Patrick B. Kelly
------------------------------------------------------
http://patrickbkelly.org
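[The "pipe it straight to COPY" variant Patrick mentions could look like the following; mydb, mytable, and the column names are placeholders, and psql passes its standard input through to COPY FROM STDIN:

    awk 'BEGIN { FS = OFS = "\t" } { if (NR > 1) print $1, $3 }' inputdatafile \
        | psql -d mydb -c "COPY mytable (col1, col3) FROM STDIN"
]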
Another thing you can do, if you know how to use Perl, is to write a load procedure for the ASCII file. We are dumping data from a Progress database, so there was quite a lot of massaging to do, especially with some tables. We read the Progress dump in and write it back out in the format we need for PostgreSQL. Then we use COPY. Perl is excellent for this type of work.

On Thursday 18 November 2004 12:15 pm, Tom Lane saith:
> Nope. When you get into significant massaging of the input data,
> usually the best bet is to COPY into a temp table that exactly matches
> the format of the data file, and then do your rearrangements using an
> INSERT/SELECT into the final target table.

-- 
Work: 1-336-372-6812 Cell: 1-336-363-4719
email: terry@esc1.com
Following on from my question yesterday... can COPY then be made to skip a header line (the first line of the file, say)?

The problem is this: I need to allow a user to upload a data file through a web browser (PHP driven). This is then processed, the selected file columns are mapped to fields in a database, and then the data file is uploaded. I can do this fine with small files... but if I get above 1000 rows it takes so long it times out.

I can upload a 10,000-row equivalent file using COPY from psql in 2 seconds, so the time is down to the PHP processing (really all it does is send it to pg_put_line).

I liked Tom's idea to create a temporary table, but I need to be able to get rid of the header row then... Is there any way of avoiding getting PHP involved in the file processing?

Any ideas?

Thanks

Adam
Wouldn't it be easier to have PHP remove the first row? And PHP has to be there because Apache is sending all the data to it.

Hope this helps,

On Fri, Nov 19, 2004 at 03:43:18PM +0000, Adam Witney wrote:
> Following on from my question yesterday... can COPY then be made to skip
> a header line (the first line of the file, say)?
>
> [...]
>
> I liked Tom's idea to create a temporary table, but I need to be able to
> get rid of the header row then... Is there any way of avoiding getting
> PHP involved in the file processing?

-- 
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
Hi Martijn,

Can I get PHP to remove the first row without reading the whole file in? If there was a way where PHP would just chop the first row off, that would be ideal...

Thanks

adam

> Wouldn't it be easier to have PHP remove the first row?
>
> And PHP has to be there because Apache is sending all the data to it.
> I liked Tom's idea to create a temporary table, but I need to be able to
> get rid of the header row then... Is there any way of avoiding getting
> PHP involved in the file processing?

You could write a server-side function to do the processing. You could even use plPHP if that is your preferred language.

Sincerely,

Joshua D. Drake
Hi,

> Can I get PHP to remove the first row without reading the whole file in?
> If there was a way where PHP would just chop the first row off, that
> would be ideal...

If you are using "COPY table FROM stdin" and pg_put_line(), it's not that difficult to skip the first line ;)

Best regards

-- 
Matteo Beccati
http://phpadsnew.com
http://phppgads.com
> can do this fine with small files.... But if I get above 1000 rows it
> takes so long it times out.

PHP is slow, but not *that* slow; you have a problem somewhere!

> I can upload a 10,000-row equivalent file using COPY from psql in 2
> seconds, so the time is down to the PHP processing (really all it does
> is send it to pg_put_line)

If you read the whole file into memory, the server will kick your script (I think the default limit is 8 megabytes or something)... So I'd advise reading the file line by line using fgets() (dunno how it is spelled in PHP), skipping the first line, and pg_put_line()-ing the rest. This way you just use memory for one line at a time. Also you can echo (and flush) messages like 'XX lines inserted...' to the user while it crunches.

If you're really stuck, and have command execution privileges, why not system() a command line like "awk -blah your file | psql copy to your table", or even launch it as a background process?
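[A minimal sketch of the approach Matteo and Pierre-Frédéric describe, modeled on the standard pg_put_line() pattern; the connection string, file name, and staging table (which could be the temp table from Tom's suggestion) are made up for the example:

    <?php
    // Hypothetical sketch: stream the data file into COPY line by line,
    // throwing away the header row first.
    $db = pg_connect("dbname=mydb") or die("connect failed");
    $fh = fopen("/tmp/datafile.txt", "r") or die("open failed");

    pg_query($db, "COPY staging FROM STDIN");

    fgets($fh, 4096);                      // read and discard the header line
    while (($line = fgets($fh, 4096)) !== false) {
        pg_put_line($db, $line);           // one line at a time: constant memory
    }

    pg_put_line($db, "\\.\n");             // COPY end-of-data marker
    pg_end_copy($db);
    fclose($fh);
    ?>

Even oversized lines are safe here, since pg_put_line() just streams raw bytes and the server splits rows on newlines itself; only the header skip assumes the first line fits within the fgets() buffer.]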
Since we can specify the order of columns in COPY, how hard would it be (he asked, naively) to specify a column name that points to /dev/null (or the PostgreSQL internals equivalent)? Sybase's copy utility is very similar to our COPY, and has a function that you can specify in your column list instead of a real column name that ignores the data. Something like:

    copy mytable (col1, col2, ignore(), col3) from '/tmp/dump';

Not a big deal, but kind of handy.

Ian Harding
Programmer/Analyst II
Tacoma-Pierce County Health Department
iharding@tpchd.org
Phone: (253) 798-3549
Pager: (253) 754-0002

>>> Tom Lane <tgl@sss.pgh.pa.us> 11/18/04 9:15 AM >>>
> Nope. When you get into significant massaging of the input data,
> usually the best bet is to COPY into a temp table that exactly matches
> the format of the data file, and then do your rearrangements using an
> INSERT/SELECT into the final target table.
MS SQL Server has a BULK INSERT command that can take a "start row" parameter, so that you can skip as many rows as you want; this is also available using the command-line bcp utility. This would be a nice feature for PostgreSQL to have, and I can't see it on any of the to-do lists, unless it is already implemented well enough in a third-party utility or contrib.

I shelled out a fair bit to get hold of the EMS Hi Tech Postgres Manager bundle, and whilst there is a lot to like about it generally, at least part of the reason was the improved bulk insert tools.

Adam Witney wrote:
> Can I get PHP to remove the first row without reading the whole file in?
> If there was a way where PHP would just chop the first row off, that
> would be ideal...
On 19/11/04 6:46 pm, "Pierre-Frédéric Caillaud" <lists@boutiquenumerique.com> wrote:
>> can do this fine with small files.... But if I get above 1000 rows it
>> takes so long it times out.
>
> PHP is slow, but not *that* slow; you have a problem somewhere!

Aha, yes. I was reading the file doing this:

    fgets($fh, 1048576)

But changing it to:

    fgets($fh, 1024)

which reads a large enough amount of data for my needs, improved the speed dramatically (presumably the one-megabyte buffer was being handled for every single line read)!

Thanks to all those that replied

adam