Re: [HACKERS] PG_UPGRADE status? - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [HACKERS] PG_UPGRADE status?
Date
Msg-id 1377.936829369@sss.pgh.pa.us
Whole thread Raw
In response to Re: [HACKERS] PG_UPGRADE status?  (Lamar Owen <lamar.owen@wgcr.org>)
Responses Re: [HACKERS] PG_UPGRADE status
List pgsql-hackers
Lamar Owen <lamar.owen@wgcr.org> writes:
> Tom Lane wrote:
>> It'd be considerably less messy, and safer, if you were willing to
>> stick the pg_dump output into a file rather than piping it on the fly.
>> Then (a) you wouldn't need to run both versions concurrently, and
>> (b) you'd have a dump backup if something went wrong during the install.

> Pipe or file, both versions have to be installed at the same time, so,
> either way, it's messy.

Er, no, that's the whole point.  The easy way to attack this is(1) While running old installation, pg_dumpall into a
file.(2)Shut down old postmaster, blow away old database files.(3) Install new version, initdb, start new
postmaster.(4)Restore from pg_dump output file.
 

> I'm curious as to how difficult it would be to rewrite pg_upgrade to be
> substantially more intelligent in its work.  Thanks to CVS, we can
> access the on-disk formats for any version since creation -- ergo, why
> can't a program be written that can understand all of those formats and
> convert to the latest and greatest without a backend running?  All of
> the code to deal with any version is out there in CVS already.

Go for it ;-).

> Now, I realize that this upgrading would HAVE to be done with no
> backends running and no transactions outstanding -- IOW, you only want
> the latest version of a tuple anyway.  Was this the issue with
> pg_upgrade and MVCC, or am I misunderstanding it?

The issue with MVCC is that the state of a tuple isn't solely determined
by what is in the disk file for its table; you have to also consult
pg_log to see whether recent transactions have been committed or not.
pg_upgrade doesn't import the old pg_log into the new database (and
can't very easily, since the new database will have its own), so there's
a problem with recent tuples possibly getting lost.

OTOH, it seems to me that this was true in older releases as well
(pg_log has always been critical data), so I guess I'm not clear on
why pg_upgrade worked at all, ever...
        regards, tom lane


pgsql-hackers by date:

Previous
From: Lamar Owen
Date:
Subject: Re: [HACKERS] PG_UPGRADE status?
Next
From: Tom Lane
Date:
Subject: Re: [HACKERS] Postgres Performance