Re: POC: Cleaning up orphaned files using undo logs - Mailing list pgsql-hackers
From | Antonin Houska |
---|---|
Subject | Re: POC: Cleaning up orphaned files using undo logs |
Date | |
Msg-id | 57137.1630413990@antos Whole thread Raw |
In response to | Re: POC: Cleaning up orphaned files using undo logs (Dmitry Dolgov <9erthalion6@gmail.com>) |
Responses |
Re: POC: Cleaning up orphaned files using undo logs
Re: POC: Cleaning up orphaned files using undo logs |
List | pgsql-hackers |
Dmitry Dolgov <9erthalion6@gmail.com> wrote: > Hi, > > I'm crawling through the patch set like even slower creature than a snail, > sorry for long absence. I'm reading the latest version posted here and, > although it's hard to give any high level design comments on it yet, I thought > it could be useful to post a few findings and questions in the meantime. > > * One question about the functionality: > > > On Fri, Jan 29, 2021 at 06:30:15PM +0100, Antonin Houska wrote: > > Attached is the next version. Changes done: > > > > * Removed the progress tracking and implemented undo discarding in a simpler > > way. Now, instead of maintaining the pointer to the last record applied, > > only a boolean field in the chunk header is set when ROLLBACK is > > done. This helps to determine whether the undo of a non-committed > > transaction can be discarded. > > Just to clarify, the whole feature was removed for the sake of > simplicity, right? Amit Kapila told me that zheap can recognize that particular undo record was already applied and I could eventually find the corresponding code. So I removed the tracking from the undo log layer, although I still think it'd fit there. However then I found out that at least a boolean flag in the chunk header is needed to handle the discarding, so I implemented it. > * By throwing at the patchset `make installcheck` I'm getting from time to time > and error on the restart: > > TRAP: FailedAssertion("BufferIsValid(buffers[nbuffers].buffer)", > File: "undorecordset.c", Line: 1098, PID: 6055) > > From what I see XLogReadBufferForRedoExtended finds an invalid buffer and > returns BLK_NOTFOUND. The commentary says: > > If the block was not found, then it must be discarded later in > the WAL. > > and continues with skip = false, but fails to get a page from an invalid > buffer few lines later. It seems that the skip flag is supposed to be used > this situation, should it also guard the BufferGetPage part? I could see this sometime too, but can't reproduce it now. It's also not clear to me how XLogReadBufferForRedoExtended() can return BLK_NOTFOUND, as the whole undo log segment is created at once, even if only part of it is needed - see allocate_empty_undo_segment(). > * Another interesting issue I've found happened inside > DropUndoLogsInTablespace, when the process got SIGTERM. It seems processing > stuck on: > > slist_foreach_modify(iter, &UndoLogShared->shared_free_lists[i]) > > iterating on the same element over and over. My guess is > clear_private_free_lists was called and caused such unexpected outcome, > should the access to shared_free_lists be somehow protected? Well, I could get this error on repeated test run too. Thanks for the report. The list is protected by UndoLogLock. I found out that the problem was that free_undo_log_slot() "freed" the slot but didn't remove it from the shared freelist. Then some other backend thought it's free, picked it from the shared slot array, used and pushed again to the shared freelist. If the same item is already at the list head, slist_push_head() makes the initial node point to itself. I fixed it by removing the slot from the freelist before calling free_undo_log_slot() from CheckPointUndoLogs(). (The other call site DropUndoLogsInTablespace() was o.k.) > * I also wonder about the segments in base/undo, the commentary in pg_undodump > says: > > Since the UNDO log is a continuous stream of changes, any hole > terminates processing. > > It looks like it's relatively easy to end up with such holes, and pg_undodump > ends up with a message (found is added by me and contains a found offset > which do not match the expected value): > > pg_undodump: error: segment 0000000000 missing in log 2, found 0000100000 > > This seems to be not causing any real issues, but it's not clear for me if > such situation with gaps is fine or is it a problem? ok, I missed the point that the initial segment (or the initial sequence of segments) of the log can be missing due to discarding and segments recycling. I've fixed that, but if a segment is missing in the middle, it's still considered an error. > Other than that one more time thank you for this tremendous work, I find that > the topic is of extreme importance. I'm just trying to continue the tremendous work of others :-) Thanks for your review! -- Antonin Houska Web: https://www.cybertec-postgresql.com
Attachment
pgsql-hackers by date: