Re: Patch for disaster recovery - Mailing list pgsql-patches

From Tom Lane
Subject Re: Patch for disaster recovery
Date
Msg-id 12145.1108884060@sss.pgh.pa.us
In response to Re: Patch for disaster recovery  (Michael Fuhr <mike@fuhr.org>)
List pgsql-patches
Michael Fuhr <mike@fuhr.org> writes:
> Hmmm...after seeing Tom's reply, I suppose I should have first
> asked, "Gee, looks simple, but does it work?"  Should I even bother
> experimenting with it in a test environment?

Experiment away, but I have a hard time visualizing how you'll find it
useful.

Just brainstorming here, but it seems to me that what might make some
kind of sense is a command to "undelete all tuples in this table".
You do that, you look through them, you delete the versions you don't
want, you're happy.  The problem with the patch as-is is that (a)
"deleting the versions you don't want" is a no-op, so you cannot keep
straight what you've done in terms of filtering out garbage; and (b)
when you revert to a non-broken postmaster, the stuff you couldn't see
before goes back to being unseeable, because after all you didn't change
its state.
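
To make the difference concrete, here is a toy model in Python. It is only a sketch, not PostgreSQL code: RowVersion, visible_hacked, undelete_all and the other names are made up for illustration, visibility is reduced to "inserter committed, deleter did not", and the sketch assumes an undelete would only touch row versions whose inserting transaction committed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RowVersion:
    value: str
    xmin: int                   # id of the inserting transaction
    xmax: Optional[int] = None  # id of the deleting transaction, if any


def visible_normal(t, committed):
    """Ordinary rule: inserter committed and no committed deleter."""
    return t.xmin in committed and (t.xmax is None or t.xmax not in committed)


def visible_hacked(t, committed):
    """Stand-in for a visibility hack: every stored row version is shown."""
    return True


def delete(t, xid):
    """Mark a row version as deleted by transaction xid."""
    t.xmax = xid


def undelete_all(table, committed):
    """Hypothetical command: clear the deleter, but only where the insert committed."""
    for t in table:
        if t.xmin in committed:
            t.xmax = None


def visible_values(table, rule, committed):
    return [t.value for t in table if rule(t, committed)]


if __name__ == "__main__":
    committed = {1, 2, 3}

    # Visibility hack: stored state is never changed.
    table = [RowVersion("v1", 1, xmax=2), RowVersion("v2", 2), RowVersion("junk", 9)]
    delete(table[1], 3)  # try to filter out v2
    print(visible_values(table, visible_hacked, committed))  # ['v1', 'v2', 'junk']
    print(visible_values(table, visible_normal, committed))  # []

    # Undelete-all: stored state changes, so later deletes mean something.
    table = [RowVersion("v1", 1, xmax=2), RowVersion("v2", 2), RowVersion("junk", 9)]
    undelete_all(table, committed)
    delete(table[1], 3)  # filter out v2, keep v1
    print(visible_values(table, visible_normal, committed))  # ['v1']
```

In the first run the delete changes nothing you can see while the hack is active, and v1 is invisible again under the normal rule. In the second run v1 survives the return to the normal rule because its stored state was actually changed.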

With either the snapshot kluge or the undelete-all kluge, you have an
issue in that constraints are broken wholesale --- you can see lots of
duplicate row versions that violate unique constraints, deleted versions
that violate FK constraints because they reference also-deleted master
rows, deleted versions that violate later-added CHECK constraints, etc.
I'd sort of like to see the system flip into some mode that says "we're
not promising constraints are honored", and then you can't go back to
normal operation without going through some pushup that checks all the
remaining rows satisfy the declared constraints.
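
The kind of check such a mode would need before returning to normal operation might look like the following Python sketch. It is purely illustrative: the row layout, the function names and the two constraint kinds covered (unique and foreign key) are assumptions of the example, not anything the server provides.

```python
from collections import Counter


def unique_violations(rows, key):
    """Return key values that appear in more than one visible row."""
    counts = Counter(row[key] for row in rows)
    return sorted(k for k, n in counts.items() if n > 1)


def fk_violations(children, fk, parents, pk):
    """Return child rows whose reference has no matching parent row."""
    parent_keys = {p[pk] for p in parents}
    return [c for c in children if c[fk] not in parent_keys]


def may_resume_normal_operation(children, fk, parents, pk, unique_key):
    """Allow normal operation only if no violations remain."""
    return (not unique_violations(children, unique_key)
            and not fk_violations(children, fk, parents, pk))


if __name__ == "__main__":
    customers = [{"id": 1}]
    # After an undelete: two versions of order 10, and an order whose
    # customer row is still deleted.
    orders = [
        {"id": 10, "customer_id": 1},
        {"id": 10, "customer_id": 1},
        {"id": 11, "customer_id": 2},
    ]
    print(unique_violations(orders, "id"))                        # [10]
    print(fk_violations(orders, "customer_id", customers, "id"))  # [{'id': 11, 'customer_id': 2}]
    print(may_resume_normal_operation(orders, "customer_id", customers, "id", "id"))  # False
```

Once the unwanted versions have been deleted, the same check passes and the table could leave the "constraints not promised" mode.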

In any case I suspect it's a bad idea to treat tuples as good if their
originating transaction did not commit.  For starters, such a tuple
might not possess all the index entries it should (if the originating
transaction failed before inserting said entries).  I think what we
want to think about is overriding delete commands, not overriding
insert failures.

Not sure where this leads to, but it's not leading to an undocumented
one-line hack in tqual.c, and definitely not *that* one-line hack.

            regards, tom lane
