Thread: Help with copy (loading TSV file into table as text)

Help with copy (loading TSV file into table as text)

From
Allan Kamau
Date:
Hi,
I have a tab delimited file with over a thousand fields (columns)
which I would like to import into postgreSQL.

I have opted to import the entire record (line) of this file into a
single field in a table, one table record per file line. Later split
the contents of the field accordingly into an array for further
processing and decomposition.

I am using the COPY command as follows
#COPY a.t(raw_data)FROM '/data/tmp/t.txt';

and it reports "ERROR:  extra data after last expected column".


#COPY a.t(raw_data)FROM '/data/tmp/t.txt' WITH FORMAT text;

yields "ERROR:  syntax error at or near "FORMAT"


I am running PostgreSQL 9.1.1 compiled from source and deployed on
Fedora Core 14 64bit. Below is the output of version();

m=> SELECT VERSION();
                                                   version
--------------------------------------------------------------------------------------------------------------
 PostgreSQL 9.1.1 on x86_64-unknown-linux-gnu, compiled by gcc (GCC)
4.4.5 20101112 (Red Hat 4.4.5-2), 64-bit
(1 row)


Allan.

Re: Help with copy (loading TSV file into table as text)

From
Raymond O'Donnell
Date:
On 24/10/2011 20:23, Allan Kamau wrote:
> Hi,
> I have a tab delimited file with over a thousand fields (columns)
> which I would like to import into postgreSQL.
>
> I have opted to import the entire record (line) of this file into a
> single field in a table, one table record per file line. Later split
> the contents of the field accordingly into an array for further
> processing and decomposition.
>
> I am using the COPY command as follows
> #COPY a.t(raw_data)FROM '/data/tmp/t.txt';
>
> and it reports "ERROR:  extra data after last expected column".

I'd guess that there's a character in the input which PG is interpreting
as a column delimiter. Use the WITH DELIMITER option to set the expected
delimiter to something that won't appear in the input.

Ray.


--
Raymond O'Donnell :: Galway :: Ireland
rod@iol.ie

Re: Help with copy (loading TSV file into table as text)

From
Allan Kamau
Date:
On Mon, Oct 24, 2011 at 11:29 PM, Raymond O'Donnell <rod@iol.ie> wrote:
> On 24/10/2011 20:23, Allan Kamau wrote:
>> Hi,
>> I have a tab delimited file with over a thousand fields (columns)
>> which I would like to import into postgreSQL.
>>
>> I have opted to import the entire record (line) of this file into a
>> single field in a table, one table record per file line. Later split
>> the contents of the field accordingly into an array for further
>> processing and decomposition.
>>
>> I am using the COPY command as follows
>> #COPY a.t(raw_data)FROM '/data/tmp/t.txt';
>>
>> and it reports "ERROR:  extra data after last expected column".
>
> I'd guess that there's a character in the input which PG is interpreting
> as a column delimiter. Use the WITH DELIMITER option to set the expected
> delimiter to something that won't appear in the input.
>
> Ray.
>
>
> --
> Raymond O'Donnell :: Galway :: Ireland
> rod@iol.ie
>

Yes PG senses the tab and assumes it to be a field delimiter. I have
now used the '=' character as my "csv" delimiter after grepping the
file on '=' and seeing no lines were returned.

Out of curiosity, what would be the correct use of "FORMAT text"
option in COPY command. The command below does give me an error as
indicated in my previous email.

COPY a.t(raw_data)FROM '/data/tmp/t.txt' FORMAT text;

Allan.

Re: Help with copy (loading TSV file into table as text)

From
Raymond O'Donnell
Date:
On 24/10/2011 22:39, Allan Kamau wrote:
> On Mon, Oct 24, 2011 at 11:29 PM, Raymond O'Donnell <rod@iol.ie> wrote:
>> On 24/10/2011 20:23, Allan Kamau wrote:
>>> Hi,
>>> I have a tab delimited file with over a thousand fields (columns)
>>> which I would like to import into postgreSQL.
>>>
>>> I have opted to import the entire record (line) of this file into a
>>> single field in a table, one table record per file line. Later split
>>> the contents of the field accordingly into an array for further
>>> processing and decomposition.
>>>
>>> I am using the COPY command as follows
>>> #COPY a.t(raw_data)FROM '/data/tmp/t.txt';
>>>
>>> and it reports "ERROR:  extra data after last expected column".
>>
>> I'd guess that there's a character in the input which PG is interpreting
>> as a column delimiter. Use the WITH DELIMITER option to set the expected
>> delimiter to something that won't appear in the input.
>>
>> Ray.
>>
>>
>> --
>> Raymond O'Donnell :: Galway :: Ireland
>> rod@iol.ie
>>
>
> Yes PG senses the tab and assumes it to be a field delimiter. I have
> now used the '=' character as my "csv" delimiter after grepping the
> file on '=' and seeing no lines were returned.
>
> Out of curiosity, what would be the correct use of "FORMAT text"
> option in COPY command. The command below does give me an error as
> indicated in my previous email.
>
> COPY a.t(raw_data)FROM '/data/tmp/t.txt' FORMAT text;

I honestly don't know - it looks fine to me. The docs suggest that it'd
give an error on pre-9.0 Postgres; but you said you're on 9.1...

<shrug>.


Ray.


--
Raymond O'Donnell :: Galway :: Ireland
rod@iol.ie

Re: Help with copy (loading TSV file into table as text)

From
"Albe Laurenz"
Date:
Allan Kamau wrote:
> #COPY a.t(raw_data)FROM '/data/tmp/t.txt' WITH FORMAT text;
>
> yields "ERROR:  syntax error at or near "FORMAT"

You'll have to use the syntax as documented:

COPY ... FROM ... WITH (FORMAT 'text');

Yours,
Laurenz Albe