Re: Export tab delimited from mysql to postgres. - Mailing list pgsql-sql

From Pierre-Frédéric Caillaud
Subject Re: Export tab delimited from mysql to postgres.
Date
Msg-id opsfqxb7stcq72hf@musicbox
Whole thread Raw
In response to Re: Export tab delimited from mysql to postgres.  (Theo Galanakis <Theo.Galanakis@lonelyplanet.com.au>)
List pgsql-sql
A tested example in Python :
Basically it counts the \t and accumulates the lines until it has enough  
and then prints the line.
Note : as an exercise you could add a test so that there are exactly  
(columns-1) delimiters and not >=(columns-1).

def grouplines( in_stream, columns, delimiter ):num_delimiters = columns - 1accum = ''for line in in_stream:    accum
+=line    if accum.count( delimiter ) >= num_delimiters:        print accum.replace( "\n", "\\n" )        accum = ''
   if accum:    print "Last line unterminated."
 

grouplines( open( 'data.in' ), 3, "\t" ):


Input data (I added a column over your example):
1       What a day!     A
2       What a week it has
been!   B
3       What the!       C

Output :
1       What a day!     A\n
2       What a week it has\nbeen!       B\n
3       What the!       C

Have fun with your copy !

On Tue, 12 Oct 2004 15:33:46 +1000, Theo Galanakis  
<Theo.Galanakis@lonelyplanet.com.au> wrote:

>
> Thanks for all your comments,
>
>     I have beent trying the insert within a transaction block, however
> it does not seem to reduce the time it takes to process each records.  
> Mind
> you there are 80 column and the insert statement explicitly defines the
> column to insert into.
>
>     I need any tip I can get help me transform the text file into a
> format postgres copy will successfully read.
>
>
> Here is sample of the current format of a mysql tab delimited dump..
>
> columnA    columnB
>
> 1        What a day!
> 2        What a week it has
> been!
> 3        What the!
>
> As you can see row 2 has a value that holds a CR value which ends up
> wrapping around onto the third line. Postgres copy command does not like
> this and mysql is unable to replace the value with another type of
> delimiter, like a \r.
>
> So I gather I have to some how manually replace the carriage return with
> something postgres understand \r...
>
>
> columnA    columnB
>
> 1        What a day!
> 2        What a week it has \r been!
> 3        What the!
>
> How do I do this without getting a text file that looks like this
>
> 1    What a day! \r\n2        What a week it has \r been!\r\n3
> What the!\r\n
>
> Any help would be appreciated.
>
> Theo
>
> -----Original Message-----
> From: Christopher Browne [mailto:cbbrowne@acm.org]
>
> Sent: Tuesday, 12 October 2004 10:46 AM
> To: pgsql-sql@postgresql.org
> Subject: Re: [SQL] Export tab delimited from mysql to postgres.
>
>
> Quoth Theo.Galanakis@lonelyplanet.com.au (Theo Galanakis):
>>         Could you provide a example of how to do this?
>>
>>         I actually ended up exporting the data as Insert statements,
>> which strips out cf/lf within varchars. However it takes an eternity
>
>> to import 200,000 records... 24 hours infact???? Is this normal?
>
> I expect that this results from each INSERT being a separate transaction.
>
> If you put a BEGIN at the start and a COMMIT at the end, you'd doubtless  
> see
> an ENORMOUS improvement.
>
> That's not even the _big_ improvement, either.  The _big_ improvement  
> would
> involve reformatting the data so that you could use the COPY statement,
> which is _way_ faster than a bunch of INSERTs.  Take a look at the
> documentation to see the formatting that is needed:
>
> http://techdocs.postgresql.org/techdocs/usingcopy.php
> http://www.faqs.org/docs/ppbook/x5504.htm
> http://www.postgresql.org/docs/7.4/static/sql-copy.html
> --
>
> output = ("cbbrowne" "@" "ntlug.org")
> http://www3.sympatico.ca/cbbrowne/lsf.html
> Question: How many surrealists does it take to change a light bulb?
>
> Answer: Two, one to hold the giraffe, and the other to fill the bathtub
>         with brightly colored machine tools.
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
>     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
>
>
> ______________________________________________________________________
> This email, including attachments, is intended only for the addressee
> and may be confidential, privileged and subject to copyright.  If you
> have received this email in error, please advise the sender and delete
> it.  If you are not the intended recipient of this email, you must not
> use, copy or disclose its content to anyone.  You must not copy or
>
> communicate to others content that is confidential or subject to
>
> copyright, unless you have the consent of the content owner.




pgsql-sql by date:

Previous
From: Theo Galanakis
Date:
Subject: Re: Export tab delimited from mysql to postgres.
Next
From: Gerald.Laurent@bull.net
Date:
Subject: Information about the command SQL " create synonym".