Thread: Bug in COPY from CSV?
I think I’ve found a bug in PG 8.0 that occurs while copying from CSV files.
I checked the bugs list but didn’t see anything similar.
This occurs when reading a CSV file where one of the ‘text’ fields has
a blank line. I included an example that shows the problem below.
I found that if I change the blank line to have one space, pg reads
it just fine.
If this is indeed a bug, if someone could let me know the best
way to address it, I would appreciate it.
--EXAMPLE
create table t (
id integer,
description text
);
copy t from stdin with null as '' csv quote as '"';
1,"Now is the time"
2,"for all good men
to come
to the
aid of their party"
3,"The quick brown fox"
\.
--END EXAMPLE
The above example produces this output:
CREATE TABLE
psql:test2.sql:8: ERROR: unterminated CSV quoted field
CONTEXT: COPY t, line 5: ""
I did some research on this and it turns out it is one of our TODO items. It is: o Allow COPY FROM ... CSV to interpret newlines and carriage returns in data This would require major refactoring of the copy source code. We are actually careful to warn people who dump out data with newlines in CSV format: test=> insert into t values (4, 'lkjasdf test'> lkjasdf test'> test'> test'> test'> aaaa'); INSERT 542038 1 test=> copy t to '/bjm/3' with null as '' csv quote as '"'; WARNING: CSV fields with embedded linefeed or carriage return characters might not be able to be reimported COPY Someone has come up with a patch which might allow this so it might work in 8.1. I am attaching a reproducable case of your report. --------------------------------------------------------------------------- Rick Schumeyer wrote: > I think I've found a bug in PG 8.0 that occurs while copying from CSV files. > > I checked the bugs list but didn't see anything similar. > > > > This occurs when reading a CSV file where one of the 'text' fields has > > a blank line. I included an example that shows the problem below. > > I found that if I change the blank line to have one space, pg reads > > it just fine. > > > > If this is indeed a bug, if someone could let me know the best > > way to address it, I would appreciate it. > > > > --EXAMPLE > > create table t ( > > id integer, > > description text > > ); > > > > copy t from stdin with null as '' csv quote as '"'; > > 1,"Now is the time" > > 2,"for all good men > > to come > > to the > > > > aid of their party" > > 3,"The quick brown fox" > > \. > > --END EXAMPLE > > > > The above example produces this output: > > > > CREATE TABLE > > psql:test2.sql:8: ERROR: unterminated CSV quoted field > > CONTEXT: COPY t, line 5: "" > > > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073 CREATE TABLE t ( id INTEGER, description TEXT ); COPY T FROM stdin WITH NULL AS '' CSV QUOTE AS '"'; 1,"Now is the time" 2,"for all good men to come to the aid of their party" 3,"The quick brown fox" \.
If you have command line support for PHP it is fairly easy to write a program that would use "pg_escape_string" to condition the data. You could even use "pg_copy_to" to send the data to the db as well. On Mon, 2005-14-02 at 22:30 -0500, Bruce Momjian wrote: > I did some research on this and it turns out it is one of our TODO > items. It is: > > o Allow COPY FROM ... CSV to interpret newlines and carriage > returns in data > > This would require major refactoring of the copy source code. > > We are actually careful to warn people who dump out data with newlines > in CSV format: > > test=> insert into t values (4, 'lkjasdf > test'> lkjasdf > test'> > test'> > test'> > test'> aaaa'); > INSERT 542038 1 > test=> copy t to '/bjm/3' with null as '' csv quote as '"'; > WARNING: CSV fields with embedded linefeed or carriage return characters might not be able to be reimported > COPY > > Someone has come up with a patch which might allow this so it might work > in 8.1. > > I am attaching a reproducable case of your report. > > --------------------------------------------------------------------------- > > Rick Schumeyer wrote: > > I think I've found a bug in PG 8.0 that occurs while copying from CSV files. > > > > I checked the bugs list but didn't see anything similar. > > > > > > > > This occurs when reading a CSV file where one of the 'text' fields has > > > > a blank line. I included an example that shows the problem below. > > > > I found that if I change the blank line to have one space, pg reads > > > > it just fine. > > > > > > > > If this is indeed a bug, if someone could let me know the best > > > > way to address it, I would appreciate it. > > > > > > > > --EXAMPLE > > > > create table t ( > > > > id integer, > > > > description text > > > > ); > > > > > > > > copy t from stdin with null as '' csv quote as '"'; > > > > 1,"Now is the time" > > > > 2,"for all good men > > > > to come > > > > to the > > > > > > > > aid of their party" > > > > 3,"The quick brown fox" > > > > \. > > > > --END EXAMPLE > > > > > > > > The above example produces this output: > > > > > > > > CREATE TABLE > > > > psql:test2.sql:8: ERROR: unterminated CSV quoted field > > > > CONTEXT: COPY t, line 5: "" > > > > > > > > plain text document attachment (/bjm/0) > CREATE TABLE t ( > id INTEGER, > description TEXT > ); > COPY T FROM stdin WITH NULL AS '' CSV QUOTE AS '"'; > 1,"Now is the time" > 2,"for all good men > to come > to the > > aid of their party" > 3,"The quick brown fox" > \. > ---------------------------(end of broadcast)--------------------------- > TIP 3: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly -- Guy Fraser Network Administrator The Internet Centre 1-888-450-6787 (780)450-6787