Re: Proposal: In-Place upgrade concept - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Proposal: In-Place upgrade concept
Msg-id 12029.1183488806@sss.pgh.pa.us
In response to Re: Proposal: In-Place upgrade concept  (Martijn van Oosterhout <kleptog@svana.org>)
Responses Re: Proposal: In-Place upgrade concept  (Gregory Stark <stark@enterprisedb.com>)
List pgsql-hackers
Martijn van Oosterhout <kleptog@svana.org> writes:
> On Tue, Jul 03, 2007 at 11:36:03AM -0400, Tom Lane wrote:
>> ... (Thought experiment: a page is read in during crash recovery
>> or PITR slave operation, and discovered to have the old format.)

> Hmm, actually, what's the problem with PITR restoring a page in the old
> format? As long as it's clear it's the old format it'll get fixed when
> the page is actually used.

Well, what I'm concerned about is something like a WAL record providing
a new-format tuple to be inserted into a page, and then you find that
the page contains old-format tuples.

[ thinks some more... ]  Actually, so long as we are willing to posit that

1. You're only allowed to upgrade a DB that's been cleanly shut down
(no replay of old-format WAL logs allowed)

2. Page format conversion is WAL-logged as a complete page replacement

then AFAICS WAL-reading operations should never have to apply any
updates to an old-format page; the first touch of any old page in the
WAL sequence should be a page replacement that updates it to new format.
This is not different from the argument why full_page_writes ensures
recovery from write failures.
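
To make the argument above concrete, here is a toy model (not PostgreSQL code) of the two assumptions. Every name in it (Page, WalRecord, convert, run_live, replay) is hypothetical and exists only for illustration. It assumes the base state at clean shutdown contains only old-format pages, and that a live transaction converts a page on first touch and logs the converted page as a full replacement before logging anything else for it.

```python
from dataclasses import dataclass, field

OLD, NEW = "old", "new"

@dataclass
class Page:
    fmt: str
    tuples: list = field(default_factory=list)

@dataclass
class WalRecord:
    kind: str        # "page_replace" or "insert"
    page_no: int
    payload: object  # a full Page image for page_replace, one tuple for insert

def convert(page):
    """Hypothetical conversion: re-tag every tuple as new-format."""
    return Page(NEW, [("new", data) for _, data in page.tuples])

def run_live(disk, wal, page_no, data):
    """Live transaction: convert on first touch, log the whole page, then insert."""
    page = disk[page_no]
    if page.fmt == OLD:
        page = convert(page)
        disk[page_no] = page
        # Assumption 2: the conversion is logged as a complete page replacement.
        wal.append(WalRecord("page_replace", page_no, Page(NEW, list(page.tuples))))
    page.tuples.append(("new", data))
    wal.append(WalRecord("insert", page_no, ("new", data)))

def replay(disk, wal):
    """Recovery: apply records in order; an insert must never see an old page."""
    for rec in wal:
        if rec.kind == "page_replace":
            disk[rec.page_no] = Page(rec.payload.fmt, list(rec.payload.tuples))
        else:
            page = disk[rec.page_no]
            assert page.fmt == NEW, "insert record hit an old-format page"
            page.tuples.append(rec.payload)

def base_state():
    # Assumption 1: the upgrade starts from a cleanly shut down, all-old database.
    return {0: Page(OLD, [("old", "a")]), 1: Page(OLD, [("old", "b")])}

live, wal = base_state(), []
run_live(live, wal, 0, "x")
run_live(live, wal, 0, "y")

recovered = base_state()
replay(recovered, wal)
```

In this model the replay loop never has to know how to convert a page: the first record that mentions page 0 is its replacement image, and page 1, which no record touches, simply stays in the old format until a live transaction reaches it.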

So in principle the page-conversion stuff should always operate in a
live transaction.  (Which is good, because now that I think about it
we couldn't emit a WAL record for the page conversion in those other
contexts.)  I still feel pretty twitchy about letting it do catalog
access, though, because it has to operate at such a low level of the
system.  bufmgr.c has no business invoking anything that might do
catalog access.  If nothing else there are deadlock issues.

On the whole I think we could define format conversions for user-defined
types as "not our problem".  A new version of a UDT that has an
incompatible representation on disk can simply be treated as a new type
with a different OID, exactly as Zdenek was suggesting for index AMs.
To upgrade a database containing such a column, you install
"my_udt_old.so" that services the old representation, ALTER TYPE my_udt
RENAME TO my_udt_old, then install new type my_udt and start using that.
Anyway that seems good enough for version 1.0 --- I don't recall that
we've ever changed the on-disk representation of any contrib/ types,
so how important is this scenario in the real world?
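
The rename procedure above can be pictured with a toy type catalog. This is only a sketch: TypeCatalog, decode_old, decode_new, the byte layouts, and the starting OID value are all made up for illustration and are not PostgreSQL internals. The point it tries to show is that a rename keeps the OID, so existing columns keep resolving to the old code, while the newly installed type gets a different OID.

```python
import itertools

def decode_old(raw):
    """Stand-in for my_udt_old.so: old on-disk layout, e.g. b"3,4"."""
    x, y = raw.decode().split(",")
    return int(x), int(y)

def decode_new(raw):
    """Stand-in for the new my_udt library: new on-disk layout, e.g. b"3|4"."""
    x, y = raw.decode().split("|")
    return int(x), int(y)

class TypeCatalog:
    def __init__(self):
        self._next_oid = itertools.count(16384)  # arbitrary starting value
        self.oid_by_name = {}
        self.handler_by_oid = {}

    def create_type(self, name, handler):
        if name in self.oid_by_name:
            raise ValueError(f"type {name!r} already exists")
        oid = next(self._next_oid)
        self.oid_by_name[name] = oid
        self.handler_by_oid[oid] = handler
        return oid

    def rename_type(self, old_name, new_name):
        # Renaming changes only the name; the OID and its handler are untouched.
        if new_name in self.oid_by_name:
            raise ValueError(f"type {new_name!r} already exists")
        self.oid_by_name[new_name] = self.oid_by_name.pop(old_name)

catalog = TypeCatalog()
old_oid = catalog.create_type("my_udt", decode_old)
column_type = {"t.c": old_oid}            # an existing column stores the type OID

catalog.rename_type("my_udt", "my_udt_old")
new_oid = catalog.create_type("my_udt", decode_new)

# Existing data is still read through the old representation's code.
old_value = catalog.handler_by_oid[column_type["t.c"]](b"3,4")
new_value = catalog.handler_by_oid[new_oid](b"3|4")
```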
        regards, tom lane

