COPY table_name (single_column) FROM 'iso-8859-1.txt' DELIMITER E'\n' - Mailing list pgsql-hackers

From Joel Jacobson
Subject COPY table_name (single_column) FROM 'iso-8859-1.txt' DELIMITER E'\n'
Date
Msg-id f901c82d-5144-4d3e-92a0-54207cf1d9a7@www.fastmail.com
Whole thread Raw
Responses Re: COPY table_name (single_column) FROM 'iso-8859-1.txt' DELIMITER E'\n'
Re: COPY table_name (single_column) FROM 'iso-8859-1.txt' DELIMITER E'\n'
List pgsql-hackers
Hi,

From time to time, I need to deal with bizarrely delimited text files,
having to use tools such as awk/sed/perl to reformat the files
so that they can be copied into PostgreSQL.

If such files could be imported to a table with a single text column,
we could then use PostgreSQL's now really fast regex-engine
to do data cleaning/reformatting, instead of having to rely on external awk-ward tools.

Furthermore, sometimes you don't want to clean/reformat the data at all,
but simply import the text lines "as is" without modifications,
such as when wanting to import unformatted log files,
where the log lines can contain any characters.

Could it be an idea to exploit the fact that DELIMITER E'\n' is currently an error?

    ERROR:  COPY delimiter cannot be newline or carriage return

That is, to change E'\n' to be a valid delimiter, which would simply read each line
delimited by newlines, as a single column.

The hack I'm currently abusing is to find some one-byte character that is not present anywhere in the text file,
and then to use that character as a delimiter. This doesn't work when needing to deal with a text file
which content is unknown at the time when writing the code though, so it's mostly useful for throwaway one-off queries.

Thoughts?

/Joel

MySQL seems to already support using \n as a delimiter (I haven't verified it myself though) [1]



pgsql-hackers by date:

Previous
From: Craig Ringer
Date:
Subject: Re: Is txid_status() actually safe? / What is 011_crash_recovery.pl testing?
Next
From: Laurenz Albe
Date:
Subject: Re: pg_receivewal makes a bad daemon