Re: POC: Cleaning up orphaned files using undo logs - Mailing list pgsql-hackers

From Antonin Houska
Subject Re: POC: Cleaning up orphaned files using undo logs
Date
Msg-id 1362.1605271163@antos
In response to Re: POC: Cleaning up orphaned files using undo logs  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers
Thomas Munro <thomas.munro@gmail.com> wrote:

> On Thu, Nov 12, 2020 at 10:15 PM Antonin Houska <ah@cybertec.at> wrote:

> I saw that -- great news! -- and have been meaning to write for a
> while.  I think I am nearly ready to talk about it again.

I'm looking forward to it :-)

> 100% that it's worth trying to do something much simpler than a new
> access manager, and this was the simplest useful feature solving a
> real-world-problem-that-people-actually-have we could come up with
> (based on an idea from Robert).  I think it needs a convincing
> explanation for why there is no scenario where the relfilenode is
> recycled for a new unlucky table before the rollback is executed,
> which might depend on details that you might be working on/changing
> (scenarios where you execute undo twice because you forgot you already
> did it).

Oh, I haven't thought about this problem yet. That might be another reason for
the undo log infrastructure to record the progress somehow.
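
To make the concern concrete, here is a minimal sketch of how a progress
marker could keep a second execution of the same undo record from removing a
file whose relfilenode has meanwhile been reused. All names (DemoUndoRecord,
demo_apply_undo, demo_file_exists) are hypothetical and not taken from the
patch, and a real marker would have to be durable rather than kept in memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical undo record for "relation file created"; not the patch's layout. */
typedef struct DemoUndoRecord
{
    uint32_t relfilenode;   /* file created by the aborted transaction */
    bool     applied;       /* progress marker: has this undo already run? */
} DemoUndoRecord;

/* Toy stand-in for storage: which relfilenodes currently exist on disk. */
#define DEMO_MAX_FILES 16
static bool demo_file_exists[DEMO_MAX_FILES];

/*
 * Apply the undo action (unlink the orphaned file) at most once.
 * Returns true if this call removed the file, false if it was a no-op.
 */
static bool
demo_apply_undo(DemoUndoRecord *rec)
{
    assert(rec->relfilenode < DEMO_MAX_FILES);

    if (rec->applied)
        return false;       /* already executed; do not touch the file again */

    demo_file_exists[rec->relfilenode] = false;
    rec->applied = true;    /* assumption: a real system would persist this */
    return true;
}
```

Without the applied flag, a repeated call after the relfilenode was recycled
would remove the new table's file.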

> > No background undo
> > ------------------
> >
> > Reduced complexity of the patch seems to be the priority at the moment. Amit
> > suggested that cleanup of an orphaned relation file is simple enough to be
> > done in the foreground and I agree.
> >
> > "undo worker" is still there, but it only processes undo requests after server
> > restart because relation data can only be changed in a transaction - it seems
> > cleaner to launch a background worker for this than to hack the startup
> > process.
>
> I suppose the simplest useful system would be one that does the work at
> startup before allowing connections, and also in regular backends, and
> panics if a backend ever exits while it has pending undo (panic =
> "goto crash recovery").  Then you don't have to deal with undo workers
> running at the same time as regular sessions which might run into
> trouble reacquiring locks (for an AM I mean), or due to OIDs being
> recycled with multiple checkpoints, or undo work that gets deferred
> until the next restart of the server.

I think that zheap can recognize that a page has unapplied undo, so we don't
need to reacquire any page lock on restart. However I agree that the
background undo might introduce other concurrency issues. At least for now
it's worth trying to move the cleanup into the startup process. We can
reconsider this when implementing more expensive undo actions, especially the
zheap rollback.
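
The simpler scheme discussed here could be sketched roughly as follows: all
leftover undo is executed at startup before connections are allowed, and a
backend that exits with pending undo forces crash recovery. This is only an
illustration under those assumptions; DemoServer, demo_startup and
demo_backend_exit are made-up names, not PostgreSQL functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical server-wide state; not PostgreSQL's actual data structures. */
typedef struct DemoServer
{
    int  pending_undo;            /* undo requests left over from before restart */
    bool accepting_connections;
} DemoServer;

typedef enum DemoExitAction
{
    DEMO_EXIT_NORMAL,
    DEMO_EXIT_CRASH_RECOVERY      /* panic = "goto crash recovery" */
} DemoExitAction;

/* Startup: execute every pending undo request, then open for connections. */
static void
demo_startup(DemoServer *srv)
{
    while (srv->pending_undo > 0)
        srv->pending_undo--;      /* stand-in for executing one undo request */

    assert(srv->pending_undo == 0);
    srv->accepting_connections = true;
}

/* Backend exit: pending undo is never deferred, it forces crash recovery. */
static DemoExitAction
demo_backend_exit(int backend_pending_undo)
{
    return backend_pending_undo > 0 ? DEMO_EXIT_CRASH_RECOVERY
                                    : DEMO_EXIT_NORMAL;
}
```

With this arrangement no undo runs concurrently with regular sessions, which
is the property the quoted proposal is after.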

> > Since the concept of undo requests is closely related to the undo worker, I
> > removed undorequest.c too. The new (much simpler) undo worker gets the
> > information on incomplete / aborted transactions from the undo log as
> > mentioned above.
> >
> > SMGR enhancement
> > ----------------
> >
> > I used the 0001 patch from [3] rather than [4], although it's more invasive
> > because I noticed somewhere in the discussion that there should be no reserved
> > database OID for the undo log. (InvalidOid cannot be used because it's already
> > in use for shared catalogs.)
>
> I gave up thinking about the colour of the BufferTag shed and went
> back to magic database 9, mainly because there seemed to be more
> pressing matters.  I don't even think it's that crazy to store this
> type of system-wide data in pseudo databases, and I know of other
> systems that do similar sorts of things without blinking...

ok

--
Antonin Houska
Web: https://www.cybertec-postgresql.com


