Re: POC: Cleaning up orphaned files using undo logs - Mailing list pgsql-hackers

From Antonin Houska
Subject Re: POC: Cleaning up orphaned files using undo logs
Date
Msg-id 49372.1632838107@antos
Whole thread Raw
In response to Re: POC: Cleaning up orphaned files using undo logs  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: POC: Cleaning up orphaned files using undo logs
List pgsql-hackers
Amit Kapila <amit.kapila16@gmail.com> wrote:

> On Mon, Sep 20, 2021 at 10:55 AM Antonin Houska <ah@cybertec.at> wrote:
> >
> > Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > > On Thu, Sep 9, 2021 at 8:33 PM Antonin Houska <ah@cybertec.at> wrote:
> >
> > > > The problem of the temporary undo log is that it's loaded into local buffers
> > > > and that backend can exit w/o flushing local buffers to disk, and thus we are
> > > > not guaranteed to find enough information when trying to discard the undo log
> > > > the backend wrote. I'm thinking about the following solutions:
> > > >
> > > > 1. Let the backend manage temporary undo log on its own (even the slot
> > > >    metadata would stay outside the shared memory, and in particular the
> > > >    insertion pointer could start from 1 for each session) and remove the
> > > >    segment files at the same moment the temporary relations are removed.
> > > >
> > > >    However, by moving the temporary undo slots away from the shared memory,
> > > >    computation of oldestFullXidHavingUndo (see the PROC_HDR structure) would
> > > >    be affected. It might seem that a transaction which only writes undo log
> > > >    for temporary relations does not need to affect oldestFullXidHavingUndo,
> > > >    but it needs to be analyzed thoroughly. Since oldestFullXidHavingUndo
> > > >    prevents transactions to be truncated from the CLOG too early, I wonder if
> > > >    the following is possible (This scenario is only applicable to the zheap
> > > >    storage engine [1], which is not included in this patch, but should already
> > > >    be considered.):
> > > >
> > > >    A transaction creates a temporary table, does some (many) changes and then
> > > >    gets rolled back. The undo records are being applied and it takes some
> > > >    time. Since XID of the transaction did not affect oldestFullXidHavingUndo,
> > > >    the XID can disappear from the CLOG due to truncation.
> > > >
> > >
> > > By above do you mean to say that in zheap code, we don't consider XIDs
> > > that operate on temp table/undo for oldestFullXidHavingUndo?
> >
> > I was referring to the code
> >
> >                 /* We can't process temporary undo logs. */
> >                 if (log->meta.persistence == UNDO_TEMP)
> >                         continue;
> >
> > in undodiscard.c:UndoDiscard().
> >
>
> Here, I think it will just skip undo of temporary undo logs and
> oldestFullXidHavingUndo should be advanced after skipping it.

Right, it'll be adavanced, but the transaction XID (if the transaction wrote
only to temporary relations) might still be needed.

> > >
> > > > However zundo.c in
> > > >    [1] indicates that the transaction status *is* checked during undo
> > > >    execution, so we might have a problem.
> > > >
> > >
> > > It would be easier to follow if you can tell which exact code are you
> > > referring here?
> >
> > In meant the call of TransactionIdDidCommit() in
> > zundo.c:zheap_exec_pending_rollback().
> >
>
> IIRC, this should be called for temp tables after they have exited as
> this is only to apply the pending undo actions if any, and in case of
> temporary undo after session exit, we shouldn't need it.

I see (had to play with debugger a bit). Currently this works because the
temporary relations are dropped by AbortTransaction() ->
smgrDoPendingDeletes(), before the undo execution starts. The situation will
change as soon as the file removal will also be handled by the undo subsystem,
however I'm still not sure how to hit the TransactionIdDidCommit() call for
the XID already truncated from CLOG.

I'm starting to admint that there's no issue here: temporary undo is always
applied immediately in foreground, and thus the zheap_exec_pending_rollback()
function never needs to examine XID which no longer exists in the CLOG.

> I am not able to understand what exact problem you are facing for temp
> tables after the session exit. Can you please explain it a bit more?

The problem is that the temporary undo buffers are loaded into backend-local
buffers. Thus there's no guarantee that we'll find a consistent information in
the undo file even if the backend exited cleanly (local buffers are not
flushed at backend exit and there's no WAL for them). However, we need to read
the undo file to find out if (part of) it can be discarded.

I'm trying to find out whether we can ignore the temporary undo when trying to
advance oldestFullXidHavingUndo or not. If we can, then each backend can mange
its temporary undo on its own and - instead of checking which chunks can be
discarded - simply delete the undo files on exit as a whole, just like it
deletes temporary relations. Thus we wouldn't need to pay any special
attention to discarding. Also, if backends managed the temporary undo this
way, it wouldn't be necessary to track it via the shared memory
(UndoLogMetaData).

(With this approach, the undo record to delete the temporary relation must not
be temporary, but this should not be an issue.)

--
Antonin Houska
Web: https://www.cybertec-postgresql.com



pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: verify_heapam for sequences?
Next
From: Andrew Dunstan
Date:
Subject: Re: how to distinguish between using the server as a standby or for executing a targeted recovery in PG 11?