Re: Different length lines in COPY CSV - Mailing list pgsql-hackers
From | Mike G. |
---|---|
Subject | Re: Different length lines in COPY CSV |
Date | |
Msg-id | 20051212191428.GA13700@localhost.localdomain Whole thread Raw |
In response to | Re: Different length lines in COPY CSV (Martijn van Oosterhout <kleptog@svana.org>) |
List | pgsql-hackers |
I too have to deal with this issue daily since I deal with Windows daily. It sounds like the source of the csv files wereoriginally Excel files. I have never been able to figure out how Excel determines to quit putting null values in emptycolumns after X rows and resume again later on. If the file has less than 65000 rows I would suggest using OpenOffice 2.0 instead of Excel. OpenOffice does not stop fillingthe empty columns and with 2.0 it now supports the same maximum number of rows that Excel does. I use Perl constantly to "reformat" files and import them as a csv using the COPY command. I think the original poster wouldprefer a php solution though... While it is not a problem for me I do have other less technical users who don't know perl and this makes postgres much moredifficult for them to use. Most of them come from a M$ Access background which can handle importing of Excel files directlythus don't have to deal with this issue. A file conversion utility would be very helpful for supporting Postgres with Windows especially if it could handle Excelfiles in their native format. Mike On Mon, Dec 12, 2005 at 07:58:52PM +0100, Martijn van Oosterhout wrote: > On Mon, Dec 12, 2005 at 10:15:03AM -0500, Pollard, Mike wrote: > > Tom Lane wrote: > > > What's been suggested in the past is some sort of standalone > > > file-format-conversion utility, which could deal with this sort of > > > stuff without having to also deal with all the backend-internal > > > considerations that COPY must handle. So (at least in theory) it'd > > > be simpler and more maintainable. That still seems like a good > > > idea to me --- in fact, given my druthers I would rather have seen > > > CSV support done in such an external program. > > > > Why not add hooks into COPY to call the user's massage functions? That > > way you don't have to read and write the data, then read it again to > > load it into the database. > > Well, it does make you wonder about supporting something like (perl > style): > > \copy foo FROM 'myfilter dodgy-data.csv |' > > or maybe > > \copy foo from pipe 'myfilter dodgy-data.csv' > > or possibly > > \pipe foo from dodgy-data.csv using autodetecting-format-filter.pl > > Which would cause psql to fork/exec the filter and pass the output data > to the server. We could then provide all sorts of parsers for > format-of-the-week. This would probably include the converse: > > \pipe foo to table.xls using make-excel-spreadsheet.pl > > Have a nice day, > -- > Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > > tool for doing 5% of the work and then sitting around waiting for someone > > else to do the other 95% so you can sue them.
pgsql-hackers by date: