Thread: Importing *huge* mysql database into pgsql
Hello,

I would like to convert a mysql database with 5 million records and
growing, to a pgsql database.

All the stuff I have come across on the net has things like
"mysqldump" and "psql -f", which sounds like I will be sitting forever
getting this to work.

Is there anything else?

Thanks.
> I would like to convert a mysql database with 5 million records and
> growing, to a pgsql database.
>
> All the stuff I have come across on the net has things like
> "mysqldump" and "psql -f", which sounds like I will be sitting forever
> getting this to work.

Have you tried it? 5 million rows seems doable. In Postgres, make sure you drop your indexes and disable constraint checks while you do the import, and use bulk COPY rather than individual INSERTs.

How long is forever? Can you go offline? If you only need to do it once, it probably won't be too painful.

W
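P.S. Roughly what I mean, with made-up table and index names (untested sketch; adjust to your own schema and paths):

  # drop secondary indexes, bulk-load with COPY, then rebuild and analyze
  psql -d mydb -c "DROP INDEX orders_customer_idx;"
  psql -d mydb -c "\copy orders from '/tmp/orders.txt'"
  psql -d mydb -c "CREATE INDEX orders_customer_idx ON orders (customer_id);"
  psql -d mydb -c "ANALYZE orders;"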
> I would like to convert a mysql database with 5 million records and
> growing, to a pgsql database.

I wouldn't qualify 5 million as "huge". We have several tables here in the 100-million-row range, and I still think of it as a medium-sized DB...

> All the stuff I have come across on the net has things like
> "mysqldump" and "psql -f", which sounds like I will be sitting forever
> getting this to work.

Why do you think that would run forever? It should be fast enough. Or do you mean it's "forever" measured in development effort? Either way, I'd say you should give it a go; it shouldn't be that bad...

Cheers,
Csaba.
.ep wrote:
> Hello,
>
> I would like to convert a mysql database with 5 million records and
> growing, to a pgsql database.

And where's the huge database?

> All the stuff I have come across on the net has things like
> "mysqldump" and "psql -f", which sounds like I will be sitting forever
> getting this to work.

Well, there's not much of an alternative to exporting from one system and importing to another. If you do find a better way, patent it!

This is probably a sensible place to start for converting schemas:
  http://pgfoundry.org/projects/mysql2pgsql

Then you'll face two problems:
1. Invalid data in your mysql dump (e.g. dates like 0000-00-00)
2. MySQL-specific usage in your application

Then you might want to examine any performance issues (where your application code has been tuned to work well with MySQL but not necessarily PG).

Shouldn't be more than a day's work, maybe just 1/2 a day. I like to build these things up as sets of perl scripts. That way, when I notice "one more thing", I can re-run my scripts from wherever the problem was.

Oh - if you come up with any improvements in mysql2pgsql then let the developers know - I'm sure they'll be interested.

Good luck!
--
  Richard Huxton
  Archonet Ltd
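P.S. For the 0000-00-00 dates, a quick pass over the dump with sed usually does the trick. Untested sketch, and the file names are made up - also check that the pattern can't match legitimate text data in your rows:

  # turn MySQL's zero dates into NULLs in an INSERT-style dump
  sed -e "s/'0000-00-00 00:00:00'/NULL/g" \
      -e "s/'0000-00-00'/NULL/g" dump.sql > dump_clean.sql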
.ep wrote:
> Hello,
>
> I would like to convert a mysql database with 5 million records and
> growing, to a pgsql database.
>
> All the stuff I have come across on the net has things like
> "mysqldump" and "psql -f", which sounds like I will be sitting forever
> getting this to work.
>
> Is there anything else?
>
> Thanks.

I used this perl script (you'll need to modify it a little for your setup). It generates a psql script that uses COPY instead of INSERT, so it runs much faster.

-Andy

#!/usr/bin/perl
# call like: ./mydump.pl dbname filename.sql

use strict;
use DBI;

my $outfile  = pop;
my $database = pop;

print "dumping db: $database to file $outfile\n";

my $host = 'servername';
my $port = 33060;
my $dsn  = "DBI:mysql:database=$database;host=$host;port=$port";
my $db   = DBI->connect($dsn, 'username', 'password') or die;

# dump a single table as a psql COPY block
sub dumptable
{
    my $tbl = pop;
    print "Dumping table: $tbl\n";

    my $q = $db->prepare("select * from $tbl");
    #$q->{"mysql_use_result"} = 1;
    $q->execute();

    my $names     = $q->{'NAME'};
    my $type      = $q->{'mysql_type_name'};
    my $numFields = $q->{'NUM_OF_FIELDS'};

    print OUT "\\echo Table: $tbl\n";
    print OUT "delete from $tbl;\n";
    print OUT "copy $tbl (";
    for (my $i = 0; $i < $numFields; $i++) {
        printf(OUT "%s%s", $i ? ", " : "", $$names[$i]);
    }
    print OUT ") FROM stdin;\n";

    my($s);
    while (my $ref = $q->fetchrow_arrayref) {
        for (my $i = 0; $i < $numFields; $i++) {
            if (defined($$ref[$i])) {
                $s = $$ref[$i];
                $s =~ s{\\}[\\\\]g;
                # things that should not be double slashed
                $s =~ s/\n/\\n/g;
                $s =~ s/\r/\\r/g;
                $s =~ s/\t/\\t/g;
                $s =~ s/\000/\\000/g;
            } else {
                $s = '\\N';
            }
            printf(OUT "%s%s", $i ? "\t" : "", $s);
        }
        print OUT "\n";
    }
    print OUT "\\.\n";
    $q = undef;
}

# dump every table reported by 'show tables', except those listed in $dont
sub dumpall
{
    open(OUT, ">$outfile") or die;

    # tables you don't want to dump
    my $dont = {'junk'    => 1,
                'temp'    => 1,
                'temp2'   => 1,
                'tempetc' => 1,
    };

    my $q = $db->prepare('show tables');
    $q->execute();
    while (my $x = $q->fetchrow_arrayref) {
        if (! exists($dont->{$$x[0]}) ) {
            #print "dump $$x[0]\n";
            dumptable($$x[0]);
        }
    }
    $q = undef;

    print OUT "VACUUM VERBOSE ANALYZE;\n";
    close(OUT);
}

#open(OUT, '>out.sql') or die;
#dumptable('note');
#close(OUT);

dumpall();
$db->disconnect();
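Invocation is roughly like this (database names here are just placeholders - adjust for your setup):

  ./mydump.pl my_mysql_db dump.sql
  psql -d my_pg_db -f dump.sql

The second step loads the generated COPY script into the postgres database.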
".ep" <erick.papa@gmail.com> writes: > Hello, > > I would like to convert a mysql database with 5 million records and > growing, to a pgsql database. > > All the stuff I have come across on the net has things like > "mysqldump" and "psql -f", which sounds like I will be sitting forever > getting this to work. > > Is there anything else? Where's the "*huge*" database? 5 million records is nothing; I'll run that on my laptop, forget about having a real server... If memory serves, the mysqldump will generate a dump consisting of: 1. Schema information - which will need to get edited a bit to get rid of manifest MySQL-isms. For instance... "TYPE=ISAM PACK_KEYS=1" needs to be trimmed out... There may be some column types that exist in MySQL that do not have the same names in PostgreSQL. Those will need to be changed. 2. It will then consist of a series of INSERT statements. Those will insert mighty slowly if loaded as a transaction apiece. If you add a BEGIN; every once in a while followed by a COMMIT;, perhaps surrounding each table's data, that will cause all that data to be loaded as a single transaction, which will be much quicker. If you could dump out each table in something like tab-delimited form, PostgreSQL could use COPY to load the data, which tends to be way, way, faster. But you only have 5 million records, so it's hardly a large database requiring special measures. -- (reverse (concatenate 'string "ofni.secnanifxunil" "@" "enworbbc")) http://linuxfinances.info/info/finances.html Rules of the Evil Overlord #189. "I will never tell the hero "Yes I was the one who did it, but you'll never be able to prove it to that incompetent old fool." Chances are, that incompetent old fool is standing behind the curtain." <http://www.eviloverlord.com/>
On 06/03/2007 16:53, Chris Browne wrote:
> Where's the "*huge*" database? 5 million records is nothing; I'll run

...unless they're all biggish BLOBs..... ;-P

Ray.

---------------------------------------------------------------
Raymond O'Donnell, Director of Music, Galway Cathedral, Ireland
rod@iol.ie
---------------------------------------------------------------
.ep wrote:
> Hello,
>
> I would like to convert a mysql database with 5 million records and
> growing, to a pgsql database.
>
> All the stuff I have come across on the net has things like
> "mysqldump" and "psql -f", which sounds like I will be sitting forever
> getting this to work.

If you can convert the database schema, then in mysql do a dump of each table like this:

  select * from table into outfile '/tmp/filename';

(see http://dev.mysql.com/doc/refman/4.1/en/select.html)

and then import it into postgres like this:

  \copy table from '/tmp/filename'

(see http://www.postgresql.org/docs/8.2/interactive/sql-copy.html)

That's much better because it creates a CSV-like (tab-delimited) file which postgres can process in one go. Using complete inserts to do a conversion is horribly slow, because postgres runs each insert as its own transaction - so either wrap a number of inserts inside a transaction, or do a copy like this (copy is best).

--
Postgresql & php tutorials
http://www.designmagick.com/
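P.S. Per table, the round trip looks something like this from the shell (untested; "customers", the credentials and the paths are placeholders, and it assumes mysql and postgres run on the same box, since INTO OUTFILE writes on the MySQL server):

  mysql -u user -p mydb -e "select * from customers into outfile '/tmp/customers.txt'"
  psql -d mydb -c "\copy customers from '/tmp/customers.txt'"

The default INTO OUTFILE format (tab-delimited, NULL written as \N) lines up reasonably well with what \copy expects, but test a small table first to check NULLs and escaping.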
In article <1173191066.416664.320470@n33g2000cwc.googlegroups.com>,
".ep" <erick.papa@gmail.com> writes:

> Hello,
> I would like to convert a mysql database with 5 million records and
> growing, to a pgsql database.
> All the stuff I have come across on the net has things like
> "mysqldump" and "psql -f", which sounds like I will be sitting forever
> getting this to work.
> Is there anything else?

If you really want to convert a *huge* MySQL database (and not your tiny 5M record thingie), I'd suggest "mysqldump -T". This creates for each table an .sql file containing just the DDL, and a .txt file containing the data.

Then edit all .sql files:

* Fix type and index definitions etc.
* Append a "COPY thistbl FROM 'thispath/thistbl.txt';"

Then run all .sql files with psql, in an order dictated by foreign keys.
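Roughly, the whole round trip is something like this (untested; paths and the database name are made up, and the glob ignores the foreign-key ordering, so list the files by hand if that matters):

  # per-table DDL (.sql) and tab-delimited data (.txt) land in /tmp/dump;
  # the directory must be writable by the mysqld server
  mysqldump -T /tmp/dump mydb

  # hand-edit each /tmp/dump/*.sql: fix the types/indexes and append
  #   COPY thistbl FROM '/tmp/dump/thistbl.txt';

  # then load them, parents before children
  for f in /tmp/dump/*.sql; do
      psql -d mydb -f "$f"
  done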