Re: Why dump/restore to upgrade? - Mailing list pgsql-hackers

From Justin Clift
Subject Re: Why dump/restore to upgrade?
Date
Msg-id 3C63D2D6.E815279C@postgresql.org
In response to Why dump/restore to upgrade?  (mlw <markw@mohawksoft.com>)
List pgsql-hackers
Hi everyone,

mlw wrote:
> 
> Tom Lane wrote:
<snip> 
> For a mission critical installation, this is really unacceptable.
> 
> >
> > I would like to see more attention paid to supporting cross-version
> > upgrades via pg_upgrade (or some improved version thereof) when
> > practical, which it should be more often than not.  But to bind
> > ourselves forever to the current on-disk format is sheer folly.
> > And if you have to convert the datafile format then you might as
> > well dump and reload.
> 
> The backup/restore to upgrade will be a deal breaker for many installations. If
> you want more people using PostgreSQL, you need to accept that this is a very
> real problem, and one which should be addressed as an unacceptable behavior.
> 
> I don't want to say "Other databases do it, why can't PostgreSQL" because that
> isn't the point. Databases can be HUGE, pg_dumpall can take an hour or more to
> run. Then, it takes longer to restore because indexes have to be recreated.

Here's a thought for a method which doesn't exist yet, but as we're
about to start work on the next version of PostgreSQL it might be worth
considering.

a) Do a pg_dump of the old-version database in question, noting the
latest transaction numbers in progress at the time of the snapshot.

b) On a separate machine, start restoring the data into the new-version
database.

c) Take the old-version database out of production, so no new
transactions are run against it.

d) Run a yet-to-be-written utility which compares the two databases and
updates the new-version database with the entries that have changed
since the initial snapshot (a rough sketch of what such a utility might
look like follows this list).

e) Put the new-version database into production, assuming the
applications hitting it have already been tested for compatibility with
the new version.
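
Just to make the idea concrete, here's a very rough sketch of what the
step d) utility might look like.  It's nothing more than an
illustration: it assumes psycopg2-style database access, that every
table carries a last_modified timestamp column (kept up to date by a
trigger or by the application), and a single-column primary key per
table -- none of which the proposal above guarantees.

    # Hypothetical sketch of the step d) "catch-up" utility.
    import psycopg2

    TABLES = {            # table name -> primary key column (assumed)
        "customers": "id",
        "orders":    "id",
    }

    def catch_up(old_dsn, new_dsn, snapshot_time):
        """Copy rows changed on the old server since snapshot_time."""
        old = psycopg2.connect(old_dsn)
        new = psycopg2.connect(new_dsn)
        old_cur, new_cur = old.cursor(), new.cursor()

        for table, pk in TABLES.items():
            # Table names are interpolated directly for brevity only.
            old_cur.execute(
                "SELECT * FROM %s WHERE last_modified > %%s" % table,
                (snapshot_time,))
            columns = [d[0] for d in old_cur.description]
            placeholders = ", ".join(["%s"] * len(columns))
            for row in old_cur.fetchall():
                values = dict(zip(columns, row))
                # Delete-then-insert stands in for an upsert, so the same
                # row can be replayed on a later pass without duplicates.
                new_cur.execute(
                    "DELETE FROM %s WHERE %s = %%s" % (table, pk),
                    (values[pk],))
                new_cur.execute(
                    "INSERT INTO %s (%s) VALUES (%s)"
                    % (table, ", ".join(columns), placeholders),
                    row)

        new.commit()
        old.close()
        new.close()

Rows deleted on the old server would slip past this; handling them
would need something extra, e.g. comparing primary keys on both sides.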

If a large window of changes has built up, you could defer step c)
(taking the old database offline) and run the synchronisation step with
it still online; then, once the differences are minimal, take the
old-version database offline and do a final synchronisation for the
remaining differences.
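
In that variation the same catch-up pass would simply be iterated until
the remaining delta is small, and only then would you take the outage
for the final pass.  A sketch, reusing the hypothetical catch_up()
function and last_modified column from above:

    import time
    import datetime
    import psycopg2

    def count_pending(old_dsn, since, tables=("customers", "orders")):
        """Count rows on the old server modified after `since`."""
        conn = psycopg2.connect(old_dsn)
        cur = conn.cursor()
        total = 0
        for table in tables:
            cur.execute(
                "SELECT count(*) FROM %s WHERE last_modified > %%s" % table,
                (since,))
            total += cur.fetchone()[0]
        conn.close()
        return total

    def converge(old_dsn, new_dsn, first_snapshot, catch_up, threshold=100):
        """Keep syncing while the old server stays online; return the
        cut-off timestamp once the remaining delta is small enough to
        finish during a short outage."""
        since = first_snapshot
        while True:
            pass_started = datetime.datetime.now()
            catch_up(old_dsn, new_dsn, since)   # e.g. the sketch above
            since = pass_started
            if count_pending(old_dsn, since) < threshold:
                return since                    # caller takes the old DB
            time.sleep(60)                      # offline, then runs one
                                                # final catch_up() pass
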

However, I get the feeling a lot of this kind of thing is very similar
to some of the approaches to replication already around or being
developed.

It might be simpler to make the old-version database a replication
master, make the new-version database a slave of it, then once they're
in sync cut over to the new system (again assuming the applications
using it have been tested for compatibility with the new version).
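
Taking that route, the interesting bit becomes deciding when the slave
has caught up enough to cut over.  One way -- purely a sketch, assuming
psycopg2 again plus a small upgrade_heartbeat table that gets replicated
along with everything else, and taking the replication mechanism itself
entirely on faith -- is to write a heartbeat row on the master and wait
for it to appear on the slave:

    import time
    import uuid
    import psycopg2

    def wait_until_in_sync(master_dsn, slave_dsn, poll_seconds=5):
        """Block until a heartbeat written on the master shows up on
        the slave."""
        token = str(uuid.uuid4())

        master = psycopg2.connect(master_dsn)
        cur = master.cursor()
        cur.execute("INSERT INTO upgrade_heartbeat (token) VALUES (%s)",
                    (token,))
        master.commit()
        master.close()

        slave = psycopg2.connect(slave_dsn)
        slave.autocommit = True   # each poll sees freshly replicated rows
        cur = slave.cursor()
        while True:
            cur.execute("SELECT 1 FROM upgrade_heartbeat WHERE token = %s",
                        (token,))
            if cur.fetchone() is not None:
                break             # the slave has caught up to the heartbeat
            time.sleep(poll_seconds)
        slave.close()

Once that returns, writes on the master are stopped, the check is run
once more to drain the last changes, and the applications are pointed at
the new-version server.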

The theory sounds alright, but in practice it might not be that easy.
We can live in hope, however.  :)

Regards and best wishes,

Justin Clift



-- 
"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."  - Indira Gandhi

