On Tue, Jun 27, 2017 at 10:17:49AM +1200, Craig de Stigter wrote:
> We're doing a large migration on our site which involves changing most of
> the primary key values. We've noticed this is a *very* slow process.
Indeed.
Does the database need to be online when this is happening?
If it were me, I'd try to find a way to dump it, modify the data in a
dump file, and then reload it. I think that'd be faster.
Another way you might try, if you need to be online while doing this,
is to write the whole thing into a new SQL schema. Make the mods you
need. When you think you're close to done, put a trigger in the "old
schema" to update data in the new schema, then do a last pass to catch
anything you missed in the interim, then cut your application over to
the new schema (update the search_path, force everything to disconnect
and reconnect, and when they reconnect they have the new data in
place). A variation on this technique is also useful for gradual
roll-out of new features -- you don't have to upgrade everything at
once and you have a natural rollback strategy (but you need a more
complicated set of triggers that keeps the two schemas in sync during
cutover period).
This second approach isn't faster, it's hard on I/O and disk space,
but it keeps you up and you can do the changes at a leisurely pace.
Just make sure you have the I/O and space before you do it :)
Hope that helps,
A
--
Andrew Sullivan
ajs@crankycanuck.ca