Re: POC: Cleaning up orphaned files using undo logs - Mailing list pgsql-hackers

From Antonin Houska
Subject Re: POC: Cleaning up orphaned files using undo logs
Date
Msg-id 49584.1605699331@antos
Whole thread Raw
In response to Re: POC: Cleaning up orphaned files using undo logs  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: POC: Cleaning up orphaned files using undo logs
Re: POC: Cleaning up orphaned files using undo logs
List pgsql-hackers
Amit Kapila <amit.kapila16@gmail.com> wrote:

> On Fri, Nov 13, 2020 at 6:02 PM Antonin Houska <ah@cybertec.at> wrote:
> >
> > Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > > On Thu, Nov 12, 2020 at 2:45 PM Antonin Houska <ah@cybertec.at> wrote:
> > > >
> > > >
> > > > No background undo
> > > > ------------------
> > > >
> > > > Reduced complexity of the patch seems to be the priority at the moment. Amit
> > > > suggested that cleanup of an orphaned relation file is simple enough to be
> > > > done on foreground and I agree.
> > > >
> > >
> > > Yeah, I think we should try and see if we can make it work but I
> > > noticed that there are few places like AbortOutOfAnyTransaction where
> > > we have the assumption that undo will be executed in the background.
> > > We need to deal with it.
> >
> > I think this is o.k. if we always check for unapplied undo during startup.
> >
>
> Hmm, how it is ok to leave undo (and rely on startup) unless it is a
> PANIC error. IIRC, this path is invoked in non-panic errors as well.
> Basically, we won't be able to discard such an undo which doesn't seem
> like a good idea.

Since failure to apply leaves unconsistent data, I assume it should always
cause PANIC, shouldn't it? (Thomas seems to assume the same in [1].)

> > > > "undo worker" is still there, but it only processes undo requests after server
> > > > restart because relation data can only be changed in a transaction - it seems
> > > > cleaner to launch a background worker for this than to hack the startup
> > > > process.
> > > >
> > >
> > > But, I see there are still multiple undoworkers that are getting
> > > launched and I am not sure if that works correctly because a
> > > particular undoworker is connected to a database and then it starts
> > > processing all the pending undo.
> >
> > Each undo worker applies only transactions for its own database, see
> > ProcessExistingUndoRequests():
> >
> >         /* We can only process undo of the database we are connected to. */
> >         if (xact_hdr.dboid != MyDatabaseId)
> >                 continue;
> >
> > Nevertheless, as I've just mentioned in my response to Thomas, I admit that we
> > should try to live w/o the undo worker altogether.
> >
>
> Okay, but keep in mind that there could be a large amount of undo
> (unlike redo which has some limit as we can replay it from the last
> checkpoint) which needs to be processed but it might be okay to live
> with that for now.

Yes, the information to remove relation file does not take much space in the
undo log.

> Another thing is that it seems we need to connect to the database to perform
> it which might appear a bit odd that we don't allow users to connect to the
> database but internally we are connecting it.

I think the implementation will need to follow the outcome of the part of the
discussion that starts at [2], but I see your concern. I'm thinking why
database connection is not needed to apply WAL but is needed for UNDO. I think
locks make the difference. So maybe we can make the RMGR specific callbacks
(rm_undo) aware of the fact that the cluster is still in the startup state, so
the relations should be opened in NoLock mode?

> > > > Discarding
> > > > ----------
> > > >
> > > > There's no discard worker for the URS infrastructure yet. I thought about
> > > > discarding the undo log during checkpoint, but checkpoint should probably do
> > > > more straightforward tasks than the calculation of a new discard pointer for
> > > > each undo log, so a background worker is needed. A few notes on that:
> > > >
> > > >   * until the zheap AM gets added, only the transaction that creates the undo
> > > >     records needs to access them. This assumption should make the discarding
> > > >     algorithm a bit simpler. Note that with zheap, the other transactions need
> > > >     to look for old versions of tuples, so the concept of oldestXidHavingUndo
> > > >     variable is needed there.
> > > >
> > > >   * it's rather simple to pass pointer the URS pointer to the discard worker
> > > >     when transaction either committed or the undo has been executed.
> > > >
> > >
> > > Why can't we have a separate discard worker which keeps on scanning
> > > the undorecords and discard accordingly? Giving the onus of foreground
> > > process might be tricky because say discard worker is not up to speed
> > > and we ran out of space to pass such information for each commit/abort
> > > request.
> >
> > Sure, there should be a discard worker. The question is how to make its work
> > efficient. The initial run after restart probably needs to scan everything
> > between 'discard' and 'insert' pointers,
> >
>
> Yeah, such an initial scan would be helpful to identify pending aborts
> and allow them to be processed.
>
> > but then it should process only the
> > parts created by individual transactions.
> >
>
> Yeah, it needs to process transaction-by-transaction to see which all
> we can discard. Also, note that in Single-User mode we need to discard
> undo after commit.

ok, I missed this problem so far.

> I think we also need to maintain oldestXidHavingUndo for CLOG truncation and
> transaction-wraparound. We can't allow CLOG truncation for the transaction
> whose undo is not discarded as that could be required by some other
> transaction.

Good point. Even the discard worker might need to check the transaction status
when deciding whether undo log of that transaction should be discarded.

> For similar reasons, we can't allow transaction-wraparound and
> we need to integrate this into the existing xid-allocation mechanism. I have
> found one of the old patch
> (Allow-execution-and-discard-of-undo-by-background-wo) attached where all
> these concepts were implemented. Unless you have a reason why we don't these
> things, you might want to refer to the attached patch to either re-use or
> refer to these ideas. There are a few other things like undorequest and some
> undoworker mechanism which you can ignore.

Thanks.

[1] https://www.postgresql.org/message-id/CA%2BhUKGJL4X1em70rxN1d_EC3rxiVhVd1woHviydW%3DHr2PeGBpg%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAH2-Wzk06ypb40z3B8HFiSsTVg961%3DE0%3DuQvqARJgT8_4QB2Mg%40mail.gmail.com

--
Antonin Houska
Web: https://www.cybertec-postgresql.com



pgsql-hackers by date:

Previous
From: Magnus Hagander
Date:
Subject: Re: Devel docs on website reloading
Next
From: Amit Kapila
Date:
Subject: Re: [Patch] Optimize dropping of relation buffers using dlist