Re: POC: Cleaning up orphaned files using undo logs - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: POC: Cleaning up orphaned files using undo logs |
Date | |
Msg-id | CAA4eK1+UtutcnUY4LgfS_ndA81tEDr5F67WVFixLYejGObW0Og@mail.gmail.com |
In response to | Re: POC: Cleaning up orphaned files using undo logs (Antonin Houska <ah@cybertec.at>) |
Responses | Re: POC: Cleaning up orphaned files using undo logs (Antonin Houska <ah@cybertec.at>) |
List | pgsql-hackers |
On Fri, Sep 24, 2021 at 4:44 PM Antonin Houska <ah@cybertec.at> wrote:
>
> Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> > On Mon, Sep 20, 2021 at 10:24 AM Antonin Houska <ah@cybertec.at> wrote:
> > >
> > > Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > > On Fri, Sep 17, 2021 at 9:50 PM Dmitry Dolgov <9erthalion6@gmail.com> wrote:
> > > > >
> > > > > > On Tue, Sep 14, 2021 at 10:51:42AM +0200, Antonin Houska wrote:
> > > > >
> > > > > * What happened with the idea of abandoning discard worker for the sake
> > > > > of simplicity? From what I see limiting everything to foreground undo
> > > > > could reduce the core of the patch series to the first four patches
> > > > > (forgetting about test and docs, but I guess it would be enough at
> > > > > least for the design review), which is already less overwhelming.
> > > > What we can miss, at least for the cleanup of the orphaned files, is the *undo
> > > worker*. In this patch series the cleanup is handled by the startup process.
> > >
> >
> > Okay, I think various people at different point of times has suggested
> > that idea. I think one thing we might need to consider is what to do
> > in case of a FATAL error? In case of FATAL error, it won't be
> > advisable to execute undo immediately, so would we upgrade the error
> > to PANIC in such cases. I remember vaguely that for clean up of
> > orphaned files that can happen rarely and someone has suggested
> > upgrading the error to PANIC in such a case but I don't remember the
> > exact details.
>
> Do you mean FATAL error during normal operation?
>

Yes.

> As far as I understand, even
> zheap does not rely on immediate UNDO execution (otherwise it'd never
> introduce the undo worker), so FATAL only means that the undo needs to be
> applied later so it can be discarded.
>

Yeah, zheap either applies undo later via a background worker or the next
time before a DML operation, if there is a need.

> As for the orphaned files cleanup feature with no undo worker, we might need
> PANIC to ensure that the undo is applied during restart and that it can be
> discarded, otherwise the unapplied undo log would stay there until the next
> (regular) restart and it would block discarding. However upgrading FATAL to
> PANIC just because the current transaction created a table seems quite
> rude.
>

True, I guess, but we can first check in which scenarios a FATAL can be
generated during that operation.

> So the undo worker might be needed even for this patch?
>

I think we can keep the undo worker as a separate patch and, for the base
patch, keep the idea of promoting FATAL to PANIC. This will at the very least
make the review easier.

> Or do you mean FATAL error when executing the UNDO?
>

No.

--
With Regards,
Amit Kapila.
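
For readers following the thread, the FATAL-to-PANIC promotion discussed above could be sketched roughly as follows. This is only an illustration of the idea, not code from the patch series: the enum `SketchLevel`, the struct `SketchXactState`, its field `has_unapplied_undo`, and the function `sketch_promote_level` are all hypothetical names, and the levels are not PostgreSQL's real `elevel` values. The reasoning is that, with no undo worker, a FATAL would leave the undo unapplied until the next restart, whereas a PANIC forces a restart in which the startup process applies it.

```c
#include <stdbool.h>

/* Hypothetical severity levels for this sketch only (not PostgreSQL's elevels). */
typedef enum SketchLevel
{
    SKETCH_ERROR,
    SKETCH_FATAL,
    SKETCH_PANIC
} SketchLevel;

/* Hypothetical per-transaction state: has this transaction written undo
 * records (e.g. for a newly created relation file) that are not yet applied? */
typedef struct SketchXactState
{
    bool has_unapplied_undo;
} SketchXactState;

/*
 * Promote FATAL to PANIC only when the transaction has unapplied undo.
 * All other levels, and FATAL without pending undo, pass through unchanged.
 */
static SketchLevel
sketch_promote_level(const SketchXactState *xact, SketchLevel level)
{
    if (level == SKETCH_FATAL && xact->has_unapplied_undo)
        return SKETCH_PANIC;
    return level;
}
```

Restricting the promotion to transactions with pending undo keeps ordinary FATAL exits untouched, which bears on the concern that promoting every FATAL in a table-creating transaction would be "quite rude".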