Re: POC: Cleaning up orphaned files using undo logs - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: POC: Cleaning up orphaned files using undo logs
Date
Msg-id CAA4eK1KrNt2ThSvC6Cz4cWby81-ygFvzpd_U-921dmBETPnA1A@mail.gmail.com
In response to Re: POC: Cleaning up orphaned files using undo logs  (Antonin Houska <ah@cybertec.at>)
Responses Re: POC: Cleaning up orphaned files using undo logs  (Amit Kapila <amit.kapila16@gmail.com>)
Re: POC: Cleaning up orphaned files using undo logs  (Antonin Houska <ah@cybertec.at>)
List pgsql-hackers
On Fri, Nov 13, 2020 at 6:02 PM Antonin Houska <ah@cybertec.at> wrote:
>
> Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> > On Thu, Nov 12, 2020 at 2:45 PM Antonin Houska <ah@cybertec.at> wrote:
> > >
> > >
> > > No background undo
> > > ------------------
> > >
> > > Reduced complexity of the patch seems to be the priority at the moment. Amit
> > > suggested that cleanup of an orphaned relation file is simple enough to be
> > > done in the foreground and I agree.
> > >
> >
> > Yeah, I think we should try and see if we can make it work but I
> > noticed that there are a few places like AbortOutOfAnyTransaction where
> > we have the assumption that undo will be executed in the background.
> > We need to deal with it.
>
> I think this is o.k. if we always check for unapplied undo during startup.
>

Hmm, how is it OK to leave the undo around (and rely on startup) unless
it is a PANIC error? IIRC, this path is invoked for non-PANIC errors as
well. Basically, we won't be able to discard such undo, which doesn't
seem like a good idea.
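
To be a bit more concrete, what I have in mind is that the abort paths
themselves execute the undo in the foreground, roughly along the lines
of the sketch below. All the undo-related names here are hypothetical
and only illustrate the flow; none of this is existing code:

    /*
     * Hypothetical sketch: apply the undo of the failed transaction in
     * the foreground from AbortTransaction()/AbortOutOfAnyTransaction(),
     * so that nothing is left behind for startup except after a PANIC.
     * All function names below are made up.
     */
    static void
    ApplyUndoOnAbort(void)
    {
        UndoRecPtr  begin = GetCurrentTransactionUndoStart();  /* hypothetical */
        UndoRecPtr  end = GetCurrentTransactionUndoEnd();      /* hypothetical */

        if (!UndoRecPtrIsValid(begin))
            return;                 /* this transaction wrote no undo */

        /* Execute the undo actions, then allow that range to be discarded. */
        ExecuteUndoActions(begin, end);                        /* hypothetical */
        UndoLogDiscard(end);                                   /* hypothetical */
    }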

> > > "undo worker" is still there, but it only processes undo requests after server
> > > restart because relation data can only be changed in a transaction - it seems
> > > cleaner to launch a background worker for this than to hack the startup
> > > process.
> > >
> >
> > But, I see there are still multiple undoworkers that are getting
> > launched and I am not sure if that works correctly because a
> > particular undoworker is connected to a database and then it starts
> > processing all the pending undo.
>
> Each undo worker applies only transactions for its own database, see
> ProcessExistingUndoRequests():
>
>         /* We can only process undo of the database we are connected to. */
>         if (xact_hdr.dboid != MyDatabaseId)
>                 continue;
>
> Nevertheless, as I've just mentioned in my response to Thomas, I admit that we
> should try to live w/o the undo worker altogether.
>

Okay, but keep in mind that there could be a large amount of undo
(unlike redo, which is bounded because we only replay it from the last
checkpoint) that needs to be processed, but it might be okay to live
with that for now. Another thing is that it seems we need to connect to
the database to perform it, which might appear a bit odd: we don't
allow users to connect to the database, but internally we are
connecting to it. These are just some points to consider while
finalizing the solution to this.
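
If we do keep an internal connection for applying the undo, I imagine
it would look roughly like the sketch below.
BackgroundWorkerInitializeConnectionByOid() and
BGWORKER_BYPASS_ALLOWCONN are existing APIs; UndoWorkerMain() is only
an illustration, and ProcessExistingUndoRequests() is the function from
your patch:

    /* Sketch only: per-database worker entry point for applying undo. */
    #include "postgres.h"

    #include "postmaster/bgworker.h"

    void
    UndoWorkerMain(Datum main_arg)
    {
        Oid         dboid = DatumGetObjectId(main_arg);

        BackgroundWorkerUnblockSignals();

        /*
         * Connect internally to the target database (bypassing
         * datallowconn, in case user connections to it are not allowed) --
         * this is the "odd" internal connection mentioned above.
         */
        BackgroundWorkerInitializeConnectionByOid(dboid, InvalidOid,
                                                  BGWORKER_BYPASS_ALLOWCONN);

        /* Apply all pending undo that belongs to this database. */
        ProcessExistingUndoRequests();
    }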

> > > Discarding
> > > ----------
> > >
> > > There's no discard worker for the URS infrastructure yet. I thought about
> > > discarding the undo log during checkpoint, but checkpoint should probably do
> > > more straightforward tasks than the calculation of a new discard pointer for
> > > each undo log, so a background worker is needed. A few notes on that:
> > >
> > >   * until the zheap AM gets added, only the transaction that creates the undo
> > >     records needs to access them. This assumption should make the discarding
> > >     algorithm a bit simpler. Note that with zheap, the other transactions need
> > >     to look for old versions of tuples, so the concept of oldestXidHavingUndo
> > >     variable is needed there.
> > >
> > >   * it's rather simple to pass the URS pointer to the discard worker
> > >     when the transaction has either committed or the undo has been
> > >     executed.
> > >
> >
> > Why can't we have a separate discard worker which keeps on scanning
> > the undo records and discards accordingly? Putting the onus on the
> > foreground process might be tricky because, say, the discard worker
> > is not up to speed and we run out of space to pass such information
> > for each commit/abort request.
>
> Sure, there should be a discard worker. The question is how to make its work
> efficient. The initial run after restart probably needs to scan everything
> between 'discard' and 'insert' pointers,
>

Yeah, such an initial scan would be helpful to identify pending aborts
and allow them to be processed.
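
To illustrate what I mean by that initial pass: walk each undo log from
its 'discard' pointer to its 'insert' pointer, one transaction at a
time, discard the parts belonging to committed transactions, and
register the rest for rollback. In the sketch below only the two
pointers correspond to the undo log meta-data; TransactionIdDidCommit()
is existing code, and all the other names are made up:

    /* Hypothetical sketch of the discard worker's first pass after restart. */
    static void
    DiscardWorkerInitialScan(void)
    {
        UndoLogControl *log;

        for (log = UndoLogFirst(); log != NULL; log = UndoLogNext(log))
        {
            UndoRecPtr  cur = UndoLogGetDiscardPtr(log);        /* hypothetical */
            UndoRecPtr  insert = UndoLogGetInsertPtr(log);      /* hypothetical */

            while (cur < insert)
            {
                UnpackedUndoRecord *rec = UndoFetchRecord(cur); /* hypothetical */

                if (TransactionIdDidCommit(rec->uur_xid))
                    UndoLogDiscardUpTo(rec);    /* committed: safe to discard */
                else
                    RegisterUndoRequest(rec);   /* pending abort: needs rollback */

                /* jump to the first record of the next transaction */
                cur = UndoGetNextTransactionStart(rec);         /* hypothetical */
            }
        }
    }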

> but then it should process only the
> parts created by individual transactions.
>

Yeah, it needs to process the undo transaction-by-transaction to see
which parts we can discard. Also, note that in single-user mode we need
to discard the undo after commit. I think we also need to maintain
oldestXidHavingUndo for CLOG truncation and transaction wraparound. We
can't allow CLOG truncation for a transaction whose undo is not yet
discarded, as its status could still be required by some other
transaction. For similar reasons, we can't allow transaction-id
wraparound past it, and we need to integrate this into the existing
xid-allocation mechanism. I have found one of the old patches
(Allow-execution-and-discard-of-undo-by-background-wo) attached where
all these concepts were implemented. Unless you have a reason why we
don't need these things, you might want to refer to the attached patch
to either re-use these ideas or use them as a reference. There are a
few other things in it, like undorequest and some of the undoworker
mechanism, which you can ignore.
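
To show what I mean by integrating with CLOG truncation: before
truncating CLOG (and before moving the wraparound limits), the cutoff
must never advance past the oldest transaction that still has
undiscarded undo, because its commit status may still be needed to
decide whether that undo has to be applied. A minimal sketch, assuming
oldestXidHavingUndo is kept as an atomic in shared memory; the field
placement and the helper name are assumptions, not existing code:

    /* Sketch only: clamp the CLOG-truncation/wraparound cutoff by undo. */
    static TransactionId
    ClampCutoffByUndo(TransactionId cutoff)
    {
        TransactionId oldest_xid_having_undo =
            (TransactionId) pg_atomic_read_u32(&ProcGlobal->oldestXidHavingUndo);

        if (TransactionIdIsValid(oldest_xid_having_undo) &&
            TransactionIdPrecedes(oldest_xid_having_undo, cutoff))
            cutoff = oldest_xid_having_undo;

        /* The caller then uses this for TruncateCLOG()/SetTransactionIdLimit(). */
        return cutoff;
    }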

-- 
With Regards,
Amit Kapila.


