Thread: Export tab delimited from mysql to postgres.
Has anyone been able to export and import a tab delimited file from mysql to postgres successfully? This is my dilemma, which I have posted on mysql sites with no response; however postgres users may have come across the same issue:

I am trying to export a tab delimited file with the following command, however it does not appear to strip out the CR/LF within a varchar column.

    select * into outfile '/tmp/users.txt' FIELDS ESCAPED BY '\\' LINES TERMINATED BY '\r\n' from users;

However, when I use the command below to export as insert statements, it escapes all the CR/LF within each column.

    mysqldump --database mydatabase --password --user myusername --verbose --complete-insert --tables users > myfile.sql

Does anyone have any insight into why the above select does not escape CR/LF within varchar columns?

I really want to get the export as tab delimited working to import into postgres, which is much faster than insert statements.

Theo

______________________________________________________________________
This email, including attachments, is intended only for the addressee
and may be confidential, privileged and subject to copyright. If you
have received this email in error, please advise the sender and delete
it. If you are not the intended recipient of this email, you must not
use, copy or disclose its content to anyone. You must not copy or
communicate to others content that is confidential or subject to
copyright, unless you have the consent of the content owner.
In article <82E30406384FFB44AFD1012BAB230B5505F40D3A@shiva.au.lpint.net>,
Theo Galanakis <Theo.Galanakis@lonelyplanet.com.au> writes:
> Has anyone been able to export and import a tab delimited file from mysql to
> postgres successfully?
> Trying to export a tab delimited file with the following command, however it
> does not appear to strip out the CR/LF within a varchar column.
> select * into outfile '/tmp/users.txt' FIELDS ESCAPED BY '\\'
> LINES TERMINATED BY '\r\n' from users;

It's a MySQL bug/misfeature that it doesn't translate CRs into '\r'. Nevertheless, you can import a MySQL tab delimited file into PostgreSQL - just pipe it through a small sed script.
Could you provide an example of how to do this?

I actually ended up exporting the data as insert statements, which strips out the CR/LF within varchars. However, it takes an eternity to import 200,000 records... 24 hours, in fact. Is this normal?

Theo

-----Original Message-----
From: Harald Fuchs [mailto:hf0722x@protecting.net]
Sent: Monday, 11 October 2004 7:34 PM
To: pgsql-sql@postgresql.org
Subject: Re: [SQL] Export tab delimited from mysql to postgres.

> It's a MySQL bug/misfeature that it doesn't translate CRs into '\r'.
> Nevertheless, you can import a MySQL tab delimited file into PostgreSQL -
> just pipe it through a small sed script.

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster
On Tue, Oct 12, 2004 at 09:23:41 +1000, Theo Galanakis <Theo.Galanakis@lonelyplanet.com.au> wrote:
> Could you provide an example of how to do this?
>
> I actually ended up exporting the data as insert statements, which
> strips out the CR/LF within varchars. However, it takes an eternity to
> import 200,000 records... 24 hours, in fact. Is this normal?

Are you doing the load in one transaction?
Quoth Theo.Galanakis@lonelyplanet.com.au (Theo Galanakis):
> Could you provide an example of how to do this?
>
> I actually ended up exporting the data as insert statements,
> which strips out the CR/LF within varchars. However, it takes an eternity
> to import 200,000 records... 24 hours, in fact. Is this normal?

I expect that this results from each INSERT being a separate transaction.

If you put a BEGIN at the start and a COMMIT at the end, you'd doubtless see an ENORMOUS improvement.

That's not even the _big_ improvement, either. The _big_ improvement would involve reformatting the data so that you could use the COPY statement, which is _way_ faster than a bunch of INSERTs. Take a look at the documentation to see the formatting that is needed:

http://techdocs.postgresql.org/techdocs/usingcopy.php
http://www.faqs.org/docs/ppbook/x5504.htm
http://www.postgresql.org/docs/7.4/static/sql-copy.html
--
output = ("cbbrowne" "@" "ntlug.org") http://www3.sympatico.ca/cbbrowne/lsf.html
Question: How many surrealists does it take to change a light bulb?
Answer: Two, one to hold the giraffe, and the other to fill the bathtub
        with brightly colored machine tools.
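[Editor's note: a minimal sketch of the BEGIN/COMMIT wrapping suggested above. The file name myfile.sql comes from the mysqldump command earlier in the thread; the sample insert written here is only a stand-in so the script is runnable.]

```python
# Create a stand-in for the mysqldump output (one insert per row).
with open('myfile.sql', 'w') as f:
    f.write("INSERT INTO users (id, body) VALUES (1, 'What a day!');\n")

# Wrap the whole dump in a single transaction, so 200,000 inserts commit
# once instead of once per row.  Load the result with:
#   psql mydatabase -f myfile_tx.sql
with open('myfile.sql') as src, open('myfile_tx.sql', 'w') as dst:
    dst.write('BEGIN;\n')
    dst.write(src.read())
    dst.write('COMMIT;\n')
```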
Thanks for all your comments,

I have been trying the insert within a transaction block, however it does not seem to reduce the time it takes to process each record. Mind you, there are 80 columns and the insert statement explicitly defines the columns to insert into.

I need any tip I can get to help me transform the text file into a format the postgres copy command will successfully read.

Here is a sample of the current format of a mysql tab delimited dump:

    columnA  columnB
    1        What a day!
    2        What a week it has
    been!
    3        What the!

As you can see, row 2 has a value that holds a CR value, which ends up wrapping around onto the third line. The postgres copy command does not like this, and mysql is unable to replace the value with another type of delimiter, like a \r.

So I gather I have to somehow manually replace the carriage return with something postgres understands, \r...

    columnA  columnB
    1        What a day!
    2        What a week it has \r been!
    3        What the!

How do I do this without getting a text file that looks like this?

    1        What a day!\r\n2        What a week it has \r been!\r\n3        What the!\r\n

Any help would be appreciated.

Theo

-----Original Message-----
From: Christopher Browne [mailto:cbbrowne@acm.org]
Sent: Tuesday, 12 October 2004 10:46 AM
To: pgsql-sql@postgresql.org
Subject: Re: [SQL] Export tab delimited from mysql to postgres.

> I expect that this results from each INSERT being a separate transaction.
> If you put a BEGIN at the start and a COMMIT at the end, you'd doubtless
> see an ENORMOUS improvement. [...]
A tested example in Python:

Basically it counts the \t and accumulates the lines until it has enough, and then prints the line.

Note: as an exercise you could add a test so that there are exactly (columns-1) delimiters and not >= (columns-1).

    def grouplines(in_stream, columns, delimiter):
        num_delimiters = columns - 1
        accum = ''
        for line in in_stream:
            accum += line
            if accum.count(delimiter) >= num_delimiters:
                print(accum.replace("\n", "\\n"))
                accum = ''
        if accum:
            print("Last line unterminated.")

    grouplines(open('data.in'), 3, "\t")

Input data (I added a column over your example):

    1    What a day!               A
    2    What a week it has
    been!    B
    3    What the!                 C

Output:

    1    What a day!                   A\n
    2    What a week it has\nbeen!     B\n
    3    What the!                     C

Have fun with your copy!

On Tue, 12 Oct 2004 15:33:46 +1000, Theo Galanakis <Theo.Galanakis@lonelyplanet.com.au> wrote:
> Thanks for all your comments,
>
> I have been trying the insert within a transaction block, however it
> does not seem to reduce the time it takes to process each record. [...]
Theo Galanakis <Theo.Galanakis@lonelyplanet.com.au> writes:
> As you can see, row 2 has a value that holds a CR value which ends up
> wrapping around onto the third line. Postgres copy command does not like
> this and mysql is unable to replace the value with another type of
> delimiter, like a \r.
> So I gather I have to somehow manually replace the carriage return with
> something postgres understands, \r...
>
> columnA  columnB
> 1        What a day!
> 2        What a week it has \r been!
> 3        What the!
>
> How do I do this without getting a text file that looks like this
> 1        What a day!\r\n2        What a week it has \r been!\r\n3        What the!\r\n

Looks like a simple sed problem to me:

    cat myfile | sed 's/\r$//' | sed 's/\r/\\r/g' >newfile

(Most likely you can do both steps in one sed process, but this is easy to follow.)

This assumes that you only have newlines (\n) at the real ends of lines, else you need to think harder about how to tell the difference between data and formatting.

			regards, tom lane
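[Editor's note: the same transformation the sed pipeline above performs, sketched in Python with made-up sample data. Like the sed version, it assumes real rows end in \n and that embedded breaks inside a field are bare CRs; note the CRLF replacement must run first, or line-ending CRs would also get escaped.]

```python
def clean(text):
    # Drop the CR of every CRLF line ending first, then escape any CR
    # left inside a field as the two characters \r, which COPY understands.
    return text.replace('\r\n', '\n').replace('\r', '\\r')

# Hypothetical two-row dump: row 2 contains an embedded bare CR.
sample = '1\tWhat a day!\r\n2\tWhat a week it has\rbeen!\r\n'
print(clean(sample), end='')
```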
Thank you all for your advice,

Looking further into the issue, I have realised that "data and formatting" lines are denoted by the same CR/LF symbols. So if I were to replace all CR/LF symbols, it would create a file with one line... I have replaced the CR/LF for reading purposes below with \r\n, or alternatively 0D 0A (hex):

    columnA  columnB
    1        What a day!\r\n
    2        What a week it has\r\n
    been!\r\n
    3        What the!\r\n

What this means is that it is impossible to move line three back onto line two to look something like this:

    columnA  columnB
    1        What a day!\r\n
    2        What a week it has\rbeen!\r\n
    3        What the!\r\n

I was thinking of writing code that checked each line to see if it has the correct number of tab delimited elements (in this case 2 elements); if this was not the case, replace the \r\n with \r.

Theo

-----Original Message-----
From: Harald Fuchs [mailto:hf0722x@protecting.net]
Sent: Wednesday, 13 October 2004 1:38 AM
To: Theo Galanakis
Subject: Re: Export tab delimited from mysql to postgres.

> Could you provide an example of how to do this?

Depends on what exactly your dump file contains. Probably something like

    sed 's/\\\r/\\r/g'

This means: replace all carriage returns preceded by a backslash with '\r'.
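[Editor's note: a sketch of the line-joining idea described in the thread; the function name and the \r stand-in are assumptions, not from the posts. A physical line with fewer than the expected number of tab delimiters must have been split by an embedded line break, so it is glued to the next line with an escaped \r. As noted earlier in the thread, counting delimiters cannot work when the wrapped value sits in the last column of a two-column row, since such a line already has its full quota of tabs before the break; the sample therefore uses three columns.]

```python
def join_wrapped_rows(lines, columns, delimiter='\t'):
    rows = []
    buf = ''
    for line in lines:
        buf += line.rstrip('\r\n')
        if buf.count(delimiter) >= columns - 1:
            rows.append(buf)   # logical row complete
            buf = ''
        else:
            buf += '\\r'       # mark the embedded break for COPY
    return rows

sample = ['1\tWhat a day!\tA\r\n',
          '2\tWhat a week it has\r\n',  # wrapped row, continued below
          'been!\tB\r\n',
          '3\tWhat the!\tC\r\n']
for row in join_wrapped_rows(sample, 3):
    print(row)
```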