Thread: MySQL -> Postgres migration tools?
Hello,

Given that the EU has approved Oracle's acquisition of Sun (and MySQL with it), it's fairly likely that a number of people and companies will be looking to move from MySQL to Postgres in the coming months. Does anyone know of any good, current migration tools out there? A search for mysql on pgFoundry only turns up a few projects that haven't seen any recent updates...

Erik Jones, Database Administrator
Engine Yard
Support, Scalability, Reliability
866.518.9273 x 260
Location: US/Pacific
IRC: mage2k
Hi,

I googled for 'migrate mysql to postgresql'; the first 5 results seem very useful. The most useful link I found was http://www.raditha.com/mysql/mysql2pgsql.php

HTH

Regards,

Serge Fonville

On Fri, Jan 22, 2010 at 8:15 PM, Erik Jones <ejones@engineyard.com> wrote:
> Hello,
>
> Given that the EU has approved Oracle's acquisition of Sun (and MySQL with it), it's fairly likely that a number of people and companies will be looking to move from MySQL to Postgres in the coming months. Does anyone know of any good, current migration tools out there? A search for mysql on pgFoundry only turns up a few projects that haven't seen any recent updates...
>
> Erik Jones, Database Administrator
> Engine Yard
> Support, Scalability, Reliability
> 866.518.9273 x 260
> Location: US/Pacific
> IRC: mage2k

--
http://www.sergefonville.nl

Convince Google!! They need to support Adsense over SSL
https://www.google.com/adsense/support/bin/answer.py?hl=en&answer=10528
http://www.google.com/support/forum/p/AdSense/thread?tid=1884bc9310d9f923&hl=en
On Fri, Jan 22, 2010 at 7:15 PM, Erik Jones <ejones@engineyard.com> wrote:
> Hello,
>
> Given that the EU has approved Oracle's acquisition of Sun (and MySQL with it), it's fairly likely that a number of people and companies will be looking to move from MySQL to Postgres in the coming months. Does anyone know of any good, current migration tools out there? A search for mysql on pgFoundry only turns up a few projects that haven't seen any recent updates...
>

The problem there is not that there are no such tools, but that the database might need a bit of refinement when you move away from as simple a database engine as MySQL. Most often - either through lack of knowledge, or sometimes because you are forced to - the MySQL database won't use even basic features of PostgreSQL (sequences would be one grand example). On the app side, queries will need refinement. Some of the code can be moved out of the app and into PL/pgSQL. Tables can be normalised better, since you can successfully run somewhat more complicated queries on the database side. Etc.

So to summarise, it is really a two-stage process. The first stage - migration - should be followed by database and code refinement before even the first 'after migration' release. And I believe there are numerous sources out there to help people with that. Plus, we are always trying to be helpful on the list.

--
GJ
On 23/01/2010 3:31 PM, Grzegorz Jaśkiewicz wrote:
> On Fri, Jan 22, 2010 at 7:15 PM, Erik Jones <ejones@engineyard.com> wrote:
>> Hello,
>>
>> Given that the EU has approved Oracle's acquisition of Sun (and MySQL with it), it's fairly likely that a number of people and companies will be looking to move from MySQL to Postgres in the coming months. Does anyone know of any good, current migration tools out there? A search for mysql on pgFoundry only turns up a few projects that haven't seen any recent updates...
>>
>
> The problem there is not that there are no such tools, but that the
> database might need a bit of refinement when you move away from as
> simple a database engine as MySQL. Most often - either through lack of
> knowledge, or sometimes because you are forced to - the MySQL database
> won't use even basic features of PostgreSQL (sequences would be one
> grand example).

An InnoDB `AUTO_INCREMENT' column under MySQL 5.x is pretty similar to a sequence.

I increasingly think it's pretty safe to just
's/AUTO_INCREMENT/SERIAL/g'
in DDL. Apps that use MyISAM aren't going to be fussy about how it works, and apps that use InnoDB get Pg-like behaviour anyway.

Modern (5.x-something-late) MySQL with the InnoDB storage engine is a very different beast from MySQL 3.x/4.x or MySQL 5.x with MyISAM. It still has some scary data-conversion habits, and you need to turn on several different kinds of strict and behave-in-standard-ways modes to get it to play nice, but if you have done that and you're moving over to Pg you shouldn't actually have too rough a time.

(Unless you use some of the MySQL features that just have no equivalent in Pg, like its upsert commands REPLACE and INSERT ... ON DUPLICATE KEY UPDATE .... Or you're relying on scary MySQL-isms like 0000-00-00 dates.)

> So to summarise, it is really a two-stage process. The first stage -
> migration - should be followed by database and code refinement before
> even the first 'after migration' release. And I believe there are
> numerous sources out there to help people with that. Plus, we are
> always trying to be helpful on the list.

That I tend to agree with. It's not just about converting your data and schema. You *will* need to adjust apps with any non-trivial queries. Even if you can hack it to work, you can often learn things as part of the porting process that can improve your results on *both* databases.

--
Craig Ringer
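To make that substitution concrete, here is a minimal sketch of what the 's/AUTO_INCREMENT/SERIAL/g' rewrite amounts to; the table and column names are hypothetical, purely for illustration:

```sql
-- MySQL/InnoDB original (hypothetical):
--   CREATE TABLE users (
--       id   INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
--       name VARCHAR(100) NOT NULL
--   ) ENGINE=InnoDB;

-- Rough PostgreSQL equivalent: SERIAL creates a sequence (users_id_seq),
-- makes it the column default, and ties its lifetime to the column.
CREATE TABLE users (
    id   serial PRIMARY KEY,
    name varchar(100) NOT NULL
);

-- The generated key behaves much like an InnoDB AUTO_INCREMENT value:
INSERT INTO users (name) VALUES ('alice');
SELECT currval('users_id_seq');   -- the id just assigned in this session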
2010/1/23 Craig Ringer <craig@postnewspapers.com.au>:
>
> An InnoDB `AUTO_INCREMENT' column under MySQL 5.x is pretty similar to a
> sequence.
>
> I increasingly think it's pretty safe to just
> 's/AUTO_INCREMENT/SERIAL/g'
> in DDL. Apps that use MyISAM aren't going to be fussy about how it works,
> and apps that use InnoDB get Pg-like behaviour anyway.

Yes, but I have seen it many times where people needed a sequence that drives more than just one column/table. You can't do that with the simplistic MySQL approach.

> Modern (5.x-something-late) MySQL with the InnoDB storage engine is a very
> different beast from MySQL 3.x/4.x or MySQL 5.x with MyISAM. It still has
> some scary data-conversion habits, and you need to turn on several different
> kinds of strict and behave-in-standard-ways modes to get it to play nice,
> but if you have done that and you're moving over to Pg you shouldn't actually
> have too rough a time.
>
> (Unless you use some of the MySQL features that just have no equivalent in Pg,
> like its upsert commands REPLACE and INSERT ... ON DUPLICATE KEY UPDATE ....
> Or you're relying on scary MySQL-isms like 0000-00-00 dates.)

That is easily done with triggers, and again is more powerful. MySQL was designed to be easy to grasp for someone with limited time/skill in SQL. And, like everything that is meant to be simple, it falls short quickly if you need to use it for something more realistic/complete/insert your word here.

Let's be honest: people usually use MySQL with the default backend because InnoDB is just too slow. I dumped MySQL a long time ago, but I still hear from people that it suffers from the same problems. Hence, I still recommend that people switch to PostgreSQL.

Btw, if you want to help, please ask as many hosting providers as you can about PostgreSQL support, mentioning that you would probably buy it from them if they offered PostgreSQL at the same price as MySQL (read: for free, included in the cheapest package, usually). This is so far the biggest problem to overcome, and the major reason people choose MySQL in the first place.

--
GJ
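Two rough sketches of the points made above, with hypothetical table, column and trigger names throughout: first a single sequence feeding keys into more than one table, then a plpgsql BEFORE INSERT trigger standing in for MySQL's INSERT ... ON DUPLICATE KEY UPDATE. The naive trigger version ignores concurrency; under concurrent load you would add retry logic or locking.

```sql
-- (1) One sequence driving keys in more than one table, so ids never
--     collide across them (not possible with a per-table AUTO_INCREMENT):
CREATE SEQUENCE global_id_seq;

CREATE TABLE invoices (
    id    bigint PRIMARY KEY DEFAULT nextval('global_id_seq'),
    total numeric(12,2) NOT NULL
);

CREATE TABLE credit_notes (
    id    bigint PRIMARY KEY DEFAULT nextval('global_id_seq'),
    total numeric(12,2) NOT NULL
);

-- (2) Emulating INSERT ... ON DUPLICATE KEY UPDATE with a trigger:
--     if the key already exists, update the row and suppress the INSERT.
CREATE TABLE counters (
    name text PRIMARY KEY,
    hits integer NOT NULL
);

CREATE FUNCTION counters_upsert() RETURNS trigger AS $$
BEGIN
    UPDATE counters SET hits = hits + NEW.hits WHERE name = NEW.name;
    IF FOUND THEN
        RETURN NULL;   -- existing row updated; skip the insert
    END IF;
    RETURN NEW;        -- no existing row; let the insert proceed
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER counters_upsert_trg
    BEFORE INSERT ON counters
    FOR EACH ROW EXECUTE PROCEDURE counters_upsert();

-- A plain INSERT now behaves as an upsert:
INSERT INTO counters VALUES ('page_view', 1);  -- inserts the row
INSERT INTO counters VALUES ('page_view', 1);  -- updates hits to 2
```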
On Fri, 2010-01-22 at 11:15 -0800, Erik Jones wrote:
> Does anyone know of any good, current migration tools out there?

There is an open source tool developed by EnterpriseDB: MigrationWizard

http://www.enterprisedb.com/openDownloads.do?productId=407&redirectReason=true&productVersion=otherDownload

If you are using RHEL/CentOS 5 or Fedora 11+, you can find RPM packages here:

http://yum.pgsqlrpms.org/8.4/redhat/rhel-5Server-i386/repoview/migrationwizard.html
http://yum.pgsqlrpms.org/8.4/fedora/fedora-11-i386/repoview/migrationwizard.html
http://yum.pgsqlrpms.org/8.4/fedora/fedora-12-i386/repoview/migrationwizard.html

Regards,
--
Devrim GÜNDÜZ, RHCE
Command Prompt - http://www.CommandPrompt.com
devrim~gunduz.org, devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
http://www.gunduz.org  Twitter: http://twitter.com/devrimgunduz
Hello, I have a piece of software that uses Postgres 8.4.2 on Windows. I have a database with data split into schemas, so that every schema replicates the same set of tables.

One of the tables is called "code": it has 16 columns, almost all numeric except for a character varying(1024) and two text fields. It usually holds a few thousand records at most, so the file size of the table is usually around a few hundred kilobytes.

In only one case so far, the "code" table with 442 records has a size of 18MB. If I run a VACUUM FULL and a REINDEX it shrinks to less than 100KB. If I use the software to delete the rows and reinsert the same records it explodes again to 18MB.

I have backed up the table (from a Windows 7 instance running on a virtual machine) and restored it into another database (running on a "real" Windows Vista) and tried the same things with the same results. Then I have dropped the table, recreated it and reinserted the records: the anomaly has disappeared.

I really don't know what to look for. I cannot ignore the problem because schemas are created by the software at the users' request, and it could reappear in any schema. I really don't have a clue: I would be happy to study the problem further, but I don't know which direction to go in. Can someone point me in some (hopefully good) direction?

--
==================================================
dott. Ivano Mario Luberti
Archimede Informatica societa' cooperativa a r. l.
Sede Operativa
Via Gereschi 36 - 56126 - Pisa
tel.: +39-050-580959
tel/fax: +39-050-9711344
web: www.archicoop.it
==================================================
Sorry to post this again, but I have seen no response at all and that is strange for this list. Maybe I did not submit my question properly?

I also wish to add another data point: the size problem is usually accompanied by the following log messages:

2010-02-02 00:00:14 GMT LOG:  checkpoints are occurring too frequently (15 seconds apart)
2010-02-02 00:00:14 GMT HINT:  Consider increasing the configuration parameter "checkpoint_segments".

where the number of seconds apart of course changes.

---------------------

Hello, I have a piece of software that uses Postgres 8.4.2 on Windows. I have a database with data split into schemas, so that every schema replicates the same set of tables.

One of the tables is called "code": it has 16 columns, almost all numeric except for a character varying(1024) and two text fields. It usually holds a few thousand records at most, so the file size of the table is usually around a few hundred kilobytes.

In only one case so far, the "code" table with 442 records has a size of 18MB. If I run a VACUUM FULL and a REINDEX it shrinks to less than 100KB. If I use the software to delete the rows and reinsert the same records it explodes again to 18MB.

I have backed up the table (from a Windows 7 instance running on a virtual machine) and restored it into another database (running on a "real" Windows Vista) and tried the same things with the same results. Then I have dropped the table, recreated it and reinserted the records: the anomaly has disappeared.

I really don't know what to look for. I cannot ignore the problem because schemas are created by the software at the users' request, and it could reappear in any schema. I really don't have a clue: I would be happy to study the problem further, but I don't know which direction to go in. Can someone point me in some (hopefully good) direction?

--
==================================================
dott. Ivano Mario Luberti
Archimede Informatica societa' cooperativa a r. l.
Sede Operativa
Via Gereschi 36 - 56126 - Pisa
tel.: +39-050-580959
tel/fax: +39-050-9711344
web: www.archicoop.it
==================================================
On 02/02/10 14:46, Ivano Luberti wrote:
> Sorry to post this again, but I have seen no response at all and that is
> strange for this list. Maybe I did not submit my question properly?

You've replied to an existing question, which means your message is hidden amidst the replies to that.

> I also wish to add another data point: the size problem is usually
> accompanied by the following log messages:
>
> 2010-02-02 00:00:14 GMT LOG:  checkpoints are occurring too frequently (15 seconds apart)
> 2010-02-02 00:00:14 GMT HINT:  Consider increasing the configuration parameter "checkpoint_segments".
>
> where the number of seconds apart of course changes.

Not directly related, although you might want to do as it says.

> Hello, I have a piece of software that uses Postgres 8.4.2 on Windows.
> I have a database with data split into schemas, so that every schema
> replicates the same set of tables.
> One of the tables is called "code": it has 16 columns, almost all numeric
> except for a character varying(1024) and two text fields. It usually
> holds a few thousand records at most, so the file size of the table is
> usually around a few hundred kilobytes.
>
> In only one case so far, the "code" table with 442 records has a size of
> 18MB. If I run a VACUUM FULL and a REINDEX it shrinks to less than 100KB.
> If I use the software to delete the rows and reinsert the same records
> it explodes again to 18MB.

That suggests the autovacuum system isn't checking the table often enough. Or, perhaps, that you have a long-lived transaction that is preventing it from reclaiming space.

Autovacuum is discussed at the bottom of this page:
http://www.postgresql.org/docs/8.4/static/routine-vacuuming.html
The "storage parameters" link has details on setting vacuum parameters for a single table.

If your application is sitting there holding open a transaction without doing anything, stop doing that. It means the system can't be sure it's safe to reclaim the space used by old versions of rows.

--
Richard Huxton
Archonet Ltd
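For reference, on 8.4 the per-table vacuum settings that page describes are set as storage parameters on the table. A minimal sketch for the table in question; the specific values below are only illustrative, not a recommendation:

```sql
-- Ask autovacuum to process this table much sooner than the default
-- (which waits until roughly 20% of the rows are dead).
ALTER TABLE code SET (
    autovacuum_vacuum_scale_factor  = 0.02,  -- vacuum after ~2% dead rows
    autovacuum_vacuum_threshold     = 50,    -- ...plus a small fixed floor
    autovacuum_analyze_scale_factor = 0.02
);
```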
Richard Huxton ha scritto:
> On 02/02/10 14:46, Ivano Luberti wrote:
>> Sorry to post this again, but I have seen no response at all and that is
>> strange for this list. Maybe I did not submit my question properly?
>
> You've replied to an existing question, which means your message is
> hidden amidst the replies to that.

I don't want to bore the list with this, but I don't understand: I posted a new message to the list, I didn't reply to anything... or at least I didn't mean to.

>> In only one case so far, the "code" table with 442 records has a size of
>> 18MB. If I run a VACUUM FULL and a REINDEX it shrinks to less than 100KB.
>> If I use the software to delete the rows and reinsert the same records
>> it explodes again to 18MB.
>
> That suggests the autovacuum system isn't checking the table often
> enough. Or, perhaps, that you have a long-lived transaction that is
> preventing it from reclaiming space.
>
> Autovacuum is discussed at the bottom of this page:
> http://www.postgresql.org/docs/8.4/static/routine-vacuuming.html
> The "storage parameters" link has details on setting vacuum parameters
> for a single table.
>
> If your application is sitting there holding open a transaction without
> doing anything, stop doing that. It means the system can't be sure it's
> safe to reclaim the space used by old versions of rows.

No, the application is doing what it is supposed to do: inserting records. But when the size of the table is so large, inserts become really slow, so indeed autovacuum has been cancelled a few times. Moreover, when autovacuum runs while the application is idle, it is able to run but not to reclaim space.

Then if I run VACUUM manually with the full, freeze and analyze options checked, and also run REINDEX, everything returns to normal.

What really worries and puzzles me is that the size of the table is not consistent with other copies of the same table that have a similar number of records.

--
==================================================
dott. Ivano Mario Luberti
Archimede Informatica societa' cooperativa a r. l.
Sede Operativa
Via Gereschi 36 - 56126 - Pisa
tel.: +39-050-580959
tel/fax: +39-050-9711344
web: www.archicoop.it
==================================================
In response to Ivano Luberti <luberti@archicoop.it>:
>
> >> In only one case so far, the "code" table with 442 records has a size of
> >> 18MB. If I run a VACUUM FULL and a REINDEX it shrinks to less than 100KB.
> >> If I use the software to delete the rows and reinsert the same records
> >> it explodes again to 18MB.
> >
> > That suggests the autovacuum system isn't checking the table often
> > enough. Or, perhaps, that you have a long-lived transaction that is
> > preventing it from reclaiming space.
> >
> > Autovacuum is discussed at the bottom of this page:
> > http://www.postgresql.org/docs/8.4/static/routine-vacuuming.html
> > The "storage parameters" link has details on setting vacuum parameters
> > for a single table.
> >
> > If your application is sitting there holding open a transaction without
> > doing anything, stop doing that. It means the system can't be sure it's
> > safe to reclaim the space used by old versions of rows.
>
> No, the application is doing what it is supposed to do: inserting records.
> But when the size of the table is so large, inserts become really slow, so
> indeed autovacuum has been cancelled a few times.

This is another strong indicator that your autovacuum is not configured properly. You should probably make the settings more aggressive.

> Moreover, when autovacuum runs while the application is idle, it is able
> to run but not to reclaim space.

I don't understand what that comment means.

> Then if I run VACUUM manually with the full, freeze and analyze options
> checked, and also run REINDEX, everything returns to normal.

Again, this indicates that autovacuum is not configured to do what you want it to do.

> What really worries and puzzles me is that the size of the table is not
> consistent with other copies of the same table that have a similar number
> of records.

Have you read the section on how MVCC and autovacuum work? The size of the table is not a function of the # of records. It's the # of records, plus the number of "dead rows" (which are records that have been deleted or updated and thus replaced with new rows).

If this table sees frequent updates, then it will _NEVER_ be the size that you expect it to be if you just multiply #rows * size of row, because it will _always_ have some dead rows in the table. This is OK, it's how the system is designed to work, and frequent VACUUM FULL is just wasting time as the table will just enlarge again.

The key is that autovacuum runs often enough that the size stabilizes based on usage. For example, if you have a table that usually has 500 rows in it and it's common for all the rows to be deleted and replaced, then you can expect the table to _always_ be big enough to hold 1000 rows. However, if all the rows are updated 5 times between each vacuum run, the table will be big enough to hold 2500 rows most of the time. Again, this is typical, it's how the system is designed to run.

I'm guessing (although a lot of the original email has been trimmed) that the actual problem you're having is that autovacuum is taking too long, and is slowing down modifications to table data.

If autovacuum is taking too long and is slowing down other operations, you have a few options (you can do one or many of these):

* Run autovacuum more frequently.
* Modify your application so it modifies fewer rows (yes, sometimes this is possible and the best solution. The fact that the table blows up to 180x the minimum size is a hint that you may be updating very inefficiently.)
* Tune PostgreSQL to make more efficient use of RAM for caching (which will speed everything up)
* Get faster hardware
* Upgrade to a newer version of PostgreSQL that has more efficient vacuum code (if you mentioned which version you are using, it was lost when the message was trimmed)

--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/
OK, that is definitely food for thought, and it is what I was looking for, as stated in my first message. Thanks to Richard and Bill for that. But there are a few things that I still don't understand and that I think I had not outlined well enough.

>> What really worries and puzzles me is that the size of the table is not
>> consistent with other copies of the same table that have a similar number
>> of records.
>
> Have you read the section on how MVCC and autovacuum work? The size of
> the table is not a function of the # of records. It's the # of records,
> plus the number of "dead rows" (which are records that have been deleted
> or updated and thus replaced with new rows).

I was not assuming a proportion between size and number of records. I was simply comparing the size of the table that troubles me with the size of other identical tables that have similar usage but live in other schemas inside the same DB.

> The key is that autovacuum runs often enough that the size stabilizes
> based on usage. For example, if you have a table that usually has 500
> rows in it and it's common for all the rows to be deleted and replaced,
> then you can expect the table to _always_ be big enough to hold 1000
> rows. However, if all the rows are updated 5 times between each vacuum
> run, the table will be big enough to hold 2500 rows most of the time.
> Again, this is typical, it's how the system is designed to run.

In this table and all her sister tables in the other schemas, records are only inserted and deleted. No updates.

> I'm guessing (although a lot of the original email has been trimmed) that
> the actual problem you're having is that autovacuum is taking too long,
> and is slowing down modifications to table data.

This is what I was thinking, but today I was able to look at the running processes while a client was doing a bunch of inserts. There was no autovacuum running, and every insert was taking many seconds to be executed. Moreover, every insert seemed to increase the size of the table a lot - much more than the size of a record plus the related index entries. Then, after the inserts, autovacuum started and it took a lot of time to complete.

So the following suggestions are certainly of interest to me. By the way, we are running 8.4.2 on Windows 2003 Server, but the same problem has occurred on Windows (S)vista and Windows 7. So no chance to upgrade :-)

> If autovacuum is taking too long and is slowing down other operations,
> you have a few options (you can do one or many of these):
>
> * Run autovacuum more frequently.
> * Modify your application so it modifies fewer rows (yes, sometimes this
>   is possible and the best solution. The fact that the table blows up to
>   180x the minimum size is a hint that you may be updating very
>   inefficiently.)
> * Tune PostgreSQL to make more efficient use of RAM for caching (which
>   will speed everything up)
> * Get faster hardware
> * Upgrade to a newer version of PostgreSQL that has more efficient
>   vacuum code (if you mentioned which version you are using, it was
>   lost when the message was trimmed)

--
==================================================
dott. Ivano Mario Luberti
Archimede Informatica societa' cooperativa a r. l.
Sede Operativa
Via Gereschi 36 - 56126 - Pisa
tel.: +39-050-580959
tel/fax: +39-050-9711344
web: www.archicoop.it
==================================================
On Tue, Feb 2, 2010 at 1:45 PM, Ivano Luberti <luberti@archicoop.it> wrote:
> In this table and all her sister tables in the other schemas, records are
> only inserted and deleted. No updates.

Same difference. In pgsql an update equals a delete plus an insert; deleted rows use up space just like the former versions of updated rows do. If you're deleting a lot of rows then you'll have a lot of bloat.
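A rough way to watch that bloat accumulate is to check the dead-row count in the statistics view for the table; a minimal sketch for the table from this thread:

```sql
-- Dead rows pile up after DELETEs exactly as after UPDATEs; the space is
-- not reusable until (auto)vacuum has processed the table.
SELECT relname,
       n_live_tup,
       n_dead_tup,
       last_autovacuum
FROM pg_stat_user_tables
WHERE relname = 'code';
```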
Ivano Luberti wrote:
> This is what I was thinking, but today I was able to look at the running
> processes while a client was doing a bunch of inserts. There was no
> autovacuum running, and every insert was taking many seconds to be
> executed.

Have you done any basic tuning of the database parameters? From your earlier message, it sounds like you might have checkpoint_segments at its default, which can contribute to these inserts taking so long. See http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server for a guide to where to start.

Also, if you want to get a better idea of what's actually going on with your data, you should be looking at "select * from pg_stat_user_tables"; that will give you information about things like how many dead rows there are in the table, when autovacuum last did some work, etc. It should be possible to figure out what's different about the usage pattern of this table compared to the ones you say work as expected by analyzing that data.

P.S. To clean up after one of these messes you might try CLUSTER instead of VACUUM FULL.

--
Greg Smith    2ndQuadrant US    Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us
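As a sketch of that P.S., assuming the table has an index to order by (the index name below, code_pkey, is hypothetical):

```sql
-- CLUSTER rewrites the table in the order of the given index and rebuilds
-- the indexes, returning the bloated space to the operating system in one
-- pass. Like VACUUM FULL, it takes an exclusive lock while it runs.
CLUSTER code USING code_pkey;

-- Subsequent runs can reuse the remembered clustering index:
CLUSTER code;
```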