Thread: Note that spaces between QUOTE and DELIMITER are included in the field during CSV COPY.
From: Darcy Buskermolen

--
Darcy Buskermolen
Wavefire Technologies Corp.
http://www.wavefire.com
ph: 250.717.0200
fx: 250.763.1759
Attachment
Darcy Buskermolen wrote:

> + CSV mode will include all characters between <literal>QUOTE</> and
> + <literal>DELIMITER</> in the value for the field, this is of special
> + attention to those who use CSV mode to import data from other RDBMS
> + systems that create fixed width CSV files.

First, this needs some grammar cleanup. But more importantly, it's not quite a correct formulation. CSV mode splits a line on (unquoted) delimiters. Within each chunk dequoting is done, and within quoted sections de-escaping is done. But nothing is discarded.

That is, with the quote char as '"', 'foo"bar"baz' becomes 'foobarbaz' and ' "x" ' becomes ' x '.

I understand Darcy's problem has been that Oracle pads CSV lines with spaces. Perhaps we need to warn specifically about that - I suspect most people for whom it might be important will miss the significance otherwise.

I'll work on some better wording.

cheers

andrew
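The splitting rule described in the message above can be illustrated with a short Python sketch. This is not PostgreSQL's COPY code; the function name `split_csv_line` and its parameters are hypothetical, and the sketch assumes a single-line record with the escape character defaulting to the quote character.

```python
def split_csv_line(line, delimiter=",", quote='"', escape='"'):
    """Split one CSV line on unquoted delimiters, keeping every other character.

    Hypothetical illustration only: quote characters are removed, escaped
    characters inside quotes are unescaped, and nothing else is discarded.
    """
    fields = []
    current = []
    in_quotes = False
    i = 0
    while i < len(line):
        ch = line[i]
        if in_quotes:
            # Inside quotes, escape + (quote or escape) yields the literal character.
            if ch == escape and i + 1 < len(line) and line[i + 1] in (quote, escape):
                current.append(line[i + 1])
                i += 2
                continue
            if ch == quote:
                in_quotes = False      # closing quote: dropped, field continues
            else:
                current.append(ch)     # delimiters inside quotes are data
        else:
            if ch == quote:
                in_quotes = True       # opening quote: dropped
            elif ch == delimiter:
                fields.append("".join(current))
                current = []
            else:
                current.append(ch)     # includes spaces outside the quotes
        i += 1
    fields.append("".join(current))
    return fields


print(split_csv_line('foo"bar"baz'))   # ['foobarbaz']
print(split_csv_line(' "x" '))         # [' x ']
print(split_csv_line('"a,b"   ,c'))    # ['a,b   ', 'c']
```

The last call shows the padding case: the spaces between the closing quote and the delimiter end up in the field value.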
> + <note>
> +  <para>
> +   CSV mode will include all characters between <literal>QUOTE</> and
> +   <literal>DELIMITER</> in the value for the field, this is of special
> +   attention to those who use CSV mode to import data from other RDBMS
> +   systems that create fixed width CSV files.
> +  </para>
> + </note>

Sorry, I do not understand the issue you are describing above. Can you supply an example?

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
I wrote:

> Darcy Buskermolen wrote:
>
>> + CSV mode will include all characters between <literal>QUOTE</> and
>> + <literal>DELIMITER</> in the value for the field, this is of special
>> + attention to those who use CSV mode to import data from other RDBMS
>> + systems that create fixed width CSV files.
>
> First, this needs some grammar cleanup. But more importantly, it's not
> quite a correct formulation. CSV mode splits a line on (unquoted)
> delimiters. Within each chunk dequoting is done, and within quoted
> sections de-escaping is done. But nothing is discarded.
>
> That is, with the quote char as '"', 'foo"bar"baz' becomes 'foobarbaz' and
> ' "x" ' becomes ' x '.
>
> I understand Darcy's problem has been that Oracle pads CSV lines with
> spaces. Perhaps we need to warn specifically about that - I suspect
> most people for whom it might be important will miss the significance
> otherwise.
>
> I'll work on some better wording.

How about this?

In CSV mode all characters are significant. A quoted value surrounded by white space, or any characters other than <literal>DELIMITER</>, will include those characters. This can cause errors if you import data from a system that pads CSV lines with white space out to some fixed width. If such a situation arises you might need to preprocess the CSV file to remove the trailing white space, before importing the data into Postgres.

cheers

andrew
I wrote:

> In CSV mode all characters are significant. A quoted value surrounded
> by white space, or any characters other than <literal>DELIMITER</>,
> will include those characters. This can cause errors if you import
> data from a system that pads CSV lines with white space out to some
> fixed width. If such a situation arises you might need to preprocess
> the CSV file to remove the trailing white space, before importing the
> data into Postgres.

Patch applied with pretty much this wording.

cheers

andrew
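The preprocessing step suggested in the applied wording could look something like the following Python sketch. The function name `strip_trailing_padding` is hypothetical, and the sketch assumes the padding is made of spaces or tabs at the end of each line and that no quoted field spans multiple lines.

```python
import io


def strip_trailing_padding(src, dst):
    """Copy lines from src to dst, dropping trailing spaces and tabs.

    Hypothetical helper: assumes each record sits on one line, so trailing
    white space is padding rather than part of a quoted multi-line value.
    """
    for line in src:
        body = line.rstrip("\r\n")        # separate the line ending
        newline = line[len(body):]        # keep the original ending as-is
        dst.write(body.rstrip(" \t") + newline)


# Example with in-memory files standing in for the exported CSV.
padded = io.StringIO('"x"      \n"a","b"   \n')
cleaned = io.StringIO()
strip_trailing_padding(padded, cleaned)
print(repr(cleaned.getvalue()))           # '"x"\n"a","b"\n'
```

This only removes padding at the end of a line. White space between a closing quote and a delimiter in the middle of a line is left alone and would still become part of the field on import.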