Re: POC: Cleaning up orphaned files using undo logs - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: POC: Cleaning up orphaned files using undo logs |
Date | |
Msg-id | CAA4eK1L5bVTHkCQtZauAEk3CfSZy9Z9eUeBDCkJNkE3uEhJirQ@mail.gmail.com Whole thread Raw |
In response to | Re: POC: Cleaning up orphaned files using undo logs (Antonin Houska <ah@cybertec.at>) |
Responses |
Re: POC: Cleaning up orphaned files using undo logs
|
List | pgsql-hackers |
On Wed, Nov 18, 2020 at 4:03 PM Antonin Houska <ah@cybertec.at> wrote: > > Amit Kapila <amit.kapila16@gmail.com> wrote: > > > On Fri, Nov 13, 2020 at 6:02 PM Antonin Houska <ah@cybertec.at> wrote: > > > > > > Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > On Thu, Nov 12, 2020 at 2:45 PM Antonin Houska <ah@cybertec.at> wrote: > > > > > > > > > > > > > > > No background undo > > > > > ------------------ > > > > > > > > > > Reduced complexity of the patch seems to be the priority at the moment. Amit > > > > > suggested that cleanup of an orphaned relation file is simple enough to be > > > > > done on foreground and I agree. > > > > > > > > > > > > > Yeah, I think we should try and see if we can make it work but I > > > > noticed that there are few places like AbortOutOfAnyTransaction where > > > > we have the assumption that undo will be executed in the background. > > > > We need to deal with it. > > > > > > I think this is o.k. if we always check for unapplied undo during startup. > > > > > > > Hmm, how it is ok to leave undo (and rely on startup) unless it is a > > PANIC error. IIRC, this path is invoked in non-panic errors as well. > > Basically, we won't be able to discard such an undo which doesn't seem > > like a good idea. > > Since failure to apply leaves unconsistent data, I assume it should always > cause PANIC, shouldn't it? > But how can we ensure that AbortOutOfAnyTransaction will be called only in that scenario? > (Thomas seems to assume the same in [1].) > > > > > > "undo worker" is still there, but it only processes undo requests after server > > > > > restart because relation data can only be changed in a transaction - it seems > > > > > cleaner to launch a background worker for this than to hack the startup > > > > > process. > > > > > > > > > > > > > But, I see there are still multiple undoworkers that are getting > > > > launched and I am not sure if that works correctly because a > > > > particular undoworker is connected to a database and then it starts > > > > processing all the pending undo. > > > > > > Each undo worker applies only transactions for its own database, see > > > ProcessExistingUndoRequests(): > > > > > > /* We can only process undo of the database we are connected to. */ > > > if (xact_hdr.dboid != MyDatabaseId) > > > continue; > > > > > > Nevertheless, as I've just mentioned in my response to Thomas, I admit that we > > > should try to live w/o the undo worker altogether. > > > > > > > Okay, but keep in mind that there could be a large amount of undo > > (unlike redo which has some limit as we can replay it from the last > > checkpoint) which needs to be processed but it might be okay to live > > with that for now. > > Yes, the information to remove relation file does not take much space in the > undo log. > > > Another thing is that it seems we need to connect to the database to perform > > it which might appear a bit odd that we don't allow users to connect to the > > database but internally we are connecting it. > > I think the implementation will need to follow the outcome of the part of the > discussion that starts at [2], but I see your concern. I'm thinking why > database connection is not needed to apply WAL but is needed for UNDO. I think > locks make the difference. > Yeah, it would be probably a good idea to see if we can make undo apply work without db-connection especially if we want to do before allowing connections. The other possibility could be to let discard worker do this work lazily after allowing connections. -- With Regards, Amit Kapila.
pgsql-hackers by date: