Re: [HACKERS] PG_UPGRADE status? - Mailing list pgsql-hackers

From Lamar Owen
Subject Re: [HACKERS] PG_UPGRADE status?
Msg-id 37D6DE0D.DE0B1BF8@wgcr.org
In response to Re: [HACKERS] PG_UPGRADE status?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [HACKERS] PG_UPGRADE status?
List pgsql-hackers
Tom Lane wrote:
> Lamar Owen <lamar.owen@wgcr.org> writes:
> > [ messiness required to upgrade versions by piping data from a
> > pg_dumpall to a psql talking to the new version ]
> 
> It'd be considerably less messy, and safer, if you were willing to
> stick the pg_dump output into a file rather than piping it on the fly.
> Then (a) you wouldn't need to run both versions concurrently, and
> (b) you'd have a dump backup if something went wrong during the install.

Pipe or file, both versions have to be installed at the same time, so
it's messy either way.  You are right, though, that putting it in a file
(which is how I update manually now) is a little less hairy, but not by
much.

> > You can see how pg_upgrade would be useful in such a scenario, no?
> 
> We may have lost the option of pg_upgrade-like upgrades anyway.
> I'm still waiting to hear Vadim's opinion about whether pg_upgrade
> can be made safe under MVCC.

I'm curious as to how difficult it would be to rewrite pg_upgrade to be
substantially more intelligent in its work.  Thanks to CVS, we can
access the on-disk formats for any version since creation -- ergo, why
can't a program be written that can understand all of those formats and
convert to the latest and greatest without a backend running?  All of
the code to deal with any version is out there in CVS already.  It's
just a matter of writing conversion routines that:

0.)    Back up PGDATA.
1.)    Determine the source PGDATA version.
2.)    Load a storage manager (for reading) corresponding to that version.
3.)    Load a storage manager (for writing) corresponding to the latest
version.
4.)    Transfer tuples sequentially from old to new.
5.)    Walk the PGDATA hierarchy for each and every database directory,
then update PG_VERSION and other needed files.
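
The steps above can be sketched roughly as follows.  This is only a toy
illustration, not a working converter: the reader and writer functions,
the READERS registry, the single "data" file per database, and the
version strings are all hypothetical stand-ins, and the sketch assumes a
PGDATA layout with a top-level PG_VERSION file and one directory per
database under base/.  A real tool would have to decode each version's
actual on-disk page format.

```python
from pathlib import Path
import shutil

LATEST = "6.5"  # hypothetical "latest" version string


def read_pipe_format(db_dir):
    """Stand-in reader for an old format: one tuple per line, '|' separated."""
    for line in (db_dir / "data").read_text().splitlines():
        yield tuple(line.split("|"))


def write_latest(db_dir, tuples):
    """Stand-in writer for the latest format: one tuple per line, tab separated."""
    (db_dir / "data").write_text("".join("\t".join(t) + "\n" for t in tuples))


# Hypothetical registry: source version -> reader for that on-disk format.
READERS = {"6.4": read_pipe_format}


def upgrade(pgdata, backup):
    pgdata, backup = Path(pgdata), Path(backup)
    shutil.copytree(pgdata, backup)                        # 0. back up PGDATA
    version = (pgdata / "PG_VERSION").read_text().strip()  # 1. source version
    reader = READERS[version]                              # 2. old-format reader
    for db_dir in sorted((pgdata / "base").iterdir()):     # 5. walk each database
        tuples = list(reader(db_dir))                      # 4. read every tuple...
        write_latest(db_dir, tuples)                       # 3./4. ...write new format
        (db_dir / "PG_VERSION").write_text(LATEST + "\n")
    (pgdata / "PG_VERSION").write_text(LATEST + "\n")
    return version
```

Calling upgrade("/path/to/pgdata", "/path/to/backup") would copy the
whole tree first, so a failed conversion leaves the backup untouched.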

What am I missing (in concept -- I know there are a lot of details that
I'm skimming over)?  The hard part is getting storage readers for every
major version -- and there haven't been THAT many on-disk format changes,
have there?

Now, I realize that this upgrading would HAVE to be done with no
backends running and no transactions outstanding -- IOW, you only want
the latest version of a tuple anyway.  Was this the issue with
pg_upgrade and MVCC, or am I misunderstanding it?
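
To make the "latest version of a tuple" idea concrete, here is a minimal
sketch under simplifying assumptions: each stored tuple version carries
the ID of the transaction that inserted it (xmin) and of the one that
deleted or replaced it (xmax, with 0 meaning "none"), and with no
transactions outstanding every transaction ID is either in a known
committed set or aborted.  The TupleVersion type and the live_versions
function are hypothetical names for illustration, not anything in the
backend.

```python
from collections import namedtuple

# Hypothetical, simplified view of one stored tuple version.
TupleVersion = namedtuple("TupleVersion", "xmin xmax data")


def live_versions(versions, committed):
    """Yield versions whose inserter committed and whose deleter, if any, did not."""
    for v in versions:
        inserted = v.xmin in committed
        deleted = v.xmax != 0 and v.xmax in committed
        if inserted and not deleted:
            yield v
```

Only the versions this filter yields would need to be carried over to
the new format; superseded versions and leftovers from aborted
transactions would be dropped.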

Just the ramblings of a packager trying to make upgrades a little less
painful for the masses.

Lamar Owen
WGCR Internet Radio

