Re: importing db as text files - Mailing list pgsql-general

From Jason Godden
Subject Re: importing db as text files
Date
Msg-id 200308140734.56108.jasongodden@optushome.com.au
Whole thread Raw
In response to importing db as text files  (expect <expect@ihubbell.com>)
Responses Re: importing db as text files  (expect <expect@ihubbell.com>)
List pgsql-general
Hi expect,

Best way in my opinion is to use the copy table command.  This way Pg will
actually 'massage' the data (string escapes and all) for you.

If you have complex files best to use a combo of sed/awk to pre-massage the
field delimiters and import the data to a temporary table.  In most instances
it is best to pipe the data to the psql command using copy table from stdin
as from file requires that you are the process owner.

Always use a temporary import table and perform validation/further conversion
in that table (IMO).

eg (from one of my earlier posts)

cat unprocesseddata.txt | sed "s/\",\"/$TAB/g" | sed "s/\"//g" | sed "s/.$//g"
| psql -dmydatabase -c "copy importtable from stdin delimiter '\t';"

The first sed will replace all instances of "," with a tab character
(I achieve tab in my shell scripts with a line like TAB=$(printf "\t"))

The second sed will remove the start and end " because the first only checks
for ",".

The third sed will remove the Windows carriage returns (if any - depends on
the platform the file was generated from.

Note here that my file format never contains any " in a field so I can safely
run the second sed.  If your import file contains these then you will need to
change the regex or use awk.

Regards,

Jason

On Thu, 14 Aug 2003 07:14 am, expect wrote:
> What's the big deal with importing text files?  I have a 70 MB file to
> import and it's been one problem after another.  I used the copy command
> and it appears that it's just not possible. I finally massaged the file
> into a .sql file and ran that using \i db.sql but that failed too because I
> overlooked ' in names like D'Adario.  The other problem I encountered was
> that a numeric field had to have data in it, pg would not default to the
> default value.  So instead of massaging all the data again I decided to
> change the data type for that column. This is my first experience with
> postgresql and I'm wondering if I should expect to encounter similar pain
> as I go further into this?  So far it's been very painful trying to do what
> I thought would be easy and what I think should be easy.
>
> PostgreSQL 7.3.4 on linux redhat 9
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 7: don't forget to increase your free space map settings


pgsql-general by date:

Previous
From: "Dann Corbit"
Date:
Subject: Re: importing db as text files
Next
From: Dennis Gearon
Date:
Subject: Re: Sorting Problem