Thread: Bug in COPY from CSV?

Bug in COPY from CSV?

From
"Rick Schumeyer"
Date:

I think I’ve found a bug in PG 8.0 that occurs while copying from CSV files.

I checked the bugs list but didn’t see anything similar.

 

This occurs when reading a CSV file where one of the ‘text’ fields has

a blank line.  I included an example that shows the problem below.

I found that if I change the blank line to have one space, pg reads

it just fine.

 

If this is indeed a bug, if someone could let me know the best

way to address it, I would appreciate it.

 

--EXAMPLE

 create table t (

      id integer,

      description text

);

 

copy t from stdin with null as '' csv quote as '"';

1,"Now is the time"

2,"for all good men

to come

to the

 

aid of their party"

3,"The quick brown fox"

\.

--END EXAMPLE

 

The above example produces this output:

 

CREATE TABLE

psql:test2.sql:8: ERROR:  unterminated CSV quoted field

CONTEXT:  COPY t, line 5: ""

 

Re: Bug in COPY from CSV?

From
Bruce Momjian
Date:
I did some research on this and it turns out it is one of our TODO
items.  It is:

        o Allow COPY FROM ... CSV to interpret newlines and carriage
          returns in data

          This would require major refactoring of the copy source code.

We are actually careful to warn people who dump out data with newlines
in CSV format:

    test=>  insert into t values (4, 'lkjasdf
    test'> lkjasdf
    test'>
    test'>
    test'>
    test'> aaaa');
    INSERT 542038 1
    test=> copy t to '/bjm/3' with null as '' csv quote as '"';
    WARNING:  CSV fields with embedded linefeed or carriage return characters might not be able to be reimported
    COPY

Someone has come up with a patch which might allow this so it might work
in 8.1.

I am attaching a reproducable case of your report.

---------------------------------------------------------------------------

Rick Schumeyer wrote:
> I think I've found a bug in PG 8.0 that occurs while copying from CSV files.
>
> I checked the bugs list but didn't see anything similar.
>
>
>
> This occurs when reading a CSV file where one of the 'text' fields has
>
> a blank line.  I included an example that shows the problem below.
>
> I found that if I change the blank line to have one space, pg reads
>
> it just fine.
>
>
>
> If this is indeed a bug, if someone could let me know the best
>
> way to address it, I would appreciate it.
>
>
>
> --EXAMPLE
>
>  create table t (
>
>       id integer,
>
>       description text
>
> );
>
>
>
> copy t from stdin with null as '' csv quote as '"';
>
> 1,"Now is the time"
>
> 2,"for all good men
>
> to come
>
> to the
>
>
>
> aid of their party"
>
> 3,"The quick brown fox"
>
> \.
>
> --END EXAMPLE
>
>
>
> The above example produces this output:
>
>
>
> CREATE TABLE
>
> psql:test2.sql:8: ERROR:  unterminated CSV quoted field
>
> CONTEXT:  COPY t, line 5: ""
>
>
>

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
CREATE TABLE t (
      id INTEGER,
      description TEXT
);
COPY T FROM stdin WITH NULL AS '' CSV QUOTE AS '"';
1,"Now is the time"
2,"for all good men
to come
to the

aid of their party"
3,"The quick brown fox"
\.

Re: Bug in COPY from CSV?

From
Guy Fraser
Date:
If you have command line support for PHP it is fairly easy to
write a program that would use "pg_escape_string" to condition
the data. You could even use "pg_copy_to" to send the data
to the db as well.

On Mon, 2005-14-02 at 22:30 -0500, Bruce Momjian wrote:
> I did some research on this and it turns out it is one of our TODO
> items.  It is:
>
>         o Allow COPY FROM ... CSV to interpret newlines and carriage
>           returns in data
>
>           This would require major refactoring of the copy source code.
>
> We are actually careful to warn people who dump out data with newlines
> in CSV format:
>
>     test=>  insert into t values (4, 'lkjasdf
>     test'> lkjasdf
>     test'>
>     test'>
>     test'>
>     test'> aaaa');
>     INSERT 542038 1
>     test=> copy t to '/bjm/3' with null as '' csv quote as '"';
>     WARNING:  CSV fields with embedded linefeed or carriage return characters might not be able to be reimported
>     COPY
>
> Someone has come up with a patch which might allow this so it might work
> in 8.1.
>
> I am attaching a reproducable case of your report.
>
> ---------------------------------------------------------------------------
>
> Rick Schumeyer wrote:
> > I think I've found a bug in PG 8.0 that occurs while copying from CSV files.
> >
> > I checked the bugs list but didn't see anything similar.
> >
> >
> >
> > This occurs when reading a CSV file where one of the 'text' fields has
> >
> > a blank line.  I included an example that shows the problem below.
> >
> > I found that if I change the blank line to have one space, pg reads
> >
> > it just fine.
> >
> >
> >
> > If this is indeed a bug, if someone could let me know the best
> >
> > way to address it, I would appreciate it.
> >
> >
> >
> > --EXAMPLE
> >
> >  create table t (
> >
> >       id integer,
> >
> >       description text
> >
> > );
> >
> >
> >
> > copy t from stdin with null as '' csv quote as '"';
> >
> > 1,"Now is the time"
> >
> > 2,"for all good men
> >
> > to come
> >
> > to the
> >
> >
> >
> > aid of their party"
> >
> > 3,"The quick brown fox"
> >
> > \.
> >
> > --END EXAMPLE
> >
> >
> >
> > The above example produces this output:
> >
> >
> >
> > CREATE TABLE
> >
> > psql:test2.sql:8: ERROR:  unterminated CSV quoted field
> >
> > CONTEXT:  COPY t, line 5: ""
> >
> >
> >
>
> plain text document attachment (/bjm/0)
> CREATE TABLE t (
>       id INTEGER,
>       description TEXT
> );
> COPY T FROM stdin WITH NULL AS '' CSV QUOTE AS '"';
> 1,"Now is the time"
> 2,"for all good men
> to come
> to the
>
> aid of their party"
> 3,"The quick brown fox"
> \.
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
>       subscribe-nomail command to majordomo@postgresql.org so that your
>       message can get through to the mailing list cleanly
--
Guy Fraser
Network Administrator
The Internet Centre
1-888-450-6787
(780)450-6787