Date format for bulk copy - Mailing list pgsql-general

From David Rysdam
Subject Date format for bulk copy
Date
Msg-id 416D3682.9070708@ll.mit.edu
Whole thread Raw
Responses Re: Date format for bulk copy  (Greg Stark <gsstark@mit.edu>)
Re: Date format for bulk copy  (Michael Fuhr <mike@fuhr.org>)
List pgsql-general
I have a large amount of data that I copy in and out of Sybase very
often.  Now I also want to copy this data in and out of postgres.  I
have an existing script that creates the entire database(s) from scratch
in Sybase and then uses the Sybase bulk copy tool "bcp" to copy the data
in.

I already did a brute force port of this script to postgres once, but
I'm trying to do it more elegantly now that I know what issues I'm going
to run into.  One of them is date formats in the bcp files.  Sybase bulk
copies the date fields out in this format:

Mar  4 1973 10:28:00:000AM

Postgresql's COPY (or psql \copy) doesn't like that format.  In
particular, it doesn't like the millisecond field at the end.  If I
understand the docs correctly, postgres wants the millisecond field to
be proceeded by a decimal point instead of a colon.  In my brute force
port, I just bulk copied the date fields into temporary tables and then
did a to_timestamp(field, 'Mon DD YYYY HH:MI:SS:MSAM').

That worked, but required a lot of additional logic in my script to
handle the temp tables and conversions.  I'd hate to have to keep all
that overhead in there to basically handle a conversion of a colon to a
decimal point.

So my questions are these:

0) I thought of creating a user-defined data type for this, but it seems
like overkill, especially if I'd have to provide all kinds of helper
functions for things like date incrementation or comparison or
whatever.  Am I off track?
1) Are there any tools out there that allow for specifying the field
format of a COPY?
2) If not, is it reasonable or unreasonable to modify the postgresql
source (I'm running Beta 3) to handle a colon as a millisecond
delimiter?  (If so, where do I look?)
3) If I did create such a patch, would the postgresql accept it into the
tree?

I have a similarish problem with another field type.  In Sybase it's a
binary format.  In postgres it is a binary format (bytea).  But Sybase
bcps the data out in ASCII.  Sybase recognizes that when it is a binary
field and auto-converts the ASCII back to binary.  Postgres doesn't.
Again, I created a temporary table and did a decode(field, 'hex') to the
real table.  It seems reasonable to expect to be able to bulk copy
ASCII-encoded binary values into binary fields.  Probably this field is
best described by a user-defined type....?

pgsql-general by date:

Previous
From: "ruben20@superguai.com"
Date:
Subject: Recovering data from corrupted table. Urgent Help!!
Next
From: Gaetano Mendola
Date:
Subject: Re: could not access status of transaction 4244329