Re: POC: Cleaning up orphaned files using undo logs - Mailing list pgsql-hackers
| From | Amit Kapila |
|---|---|
| Subject | Re: POC: Cleaning up orphaned files using undo logs |
| Date | |
| Msg-id | CAA4eK1+Np54a7=VAqQOUKo+qzPXVk+ZM=3OyZbm-Ky2L-f9PRA@mail.gmail.com |
| In response to | Re: POC: Cleaning up orphaned files using undo logs (Robert Haas <robertmhaas@gmail.com>) |
| Responses | Re: POC: Cleaning up orphaned files using undo logs |
| List | pgsql-hackers |
On Tue, Jul 16, 2019 at 2:09 AM Robert Haas <robertmhaas@gmail.com> wrote:
>
> PerformUndoActions() also thinks that there is a possibility of
> failing to insert a failed request into the error queue, and makes
> reference to such requests being rediscovered by the discard worker,
> but I thought (as I said in my previous email) that we had abandoned
> that approach in favor of always having enough space in shared memory
> to record everything. Among other problems, if you want
> oldestXidHavingUndo to be calculated based on the information in
> shared memory, then you have to have all the records in shared
> memory, not lose some of them temporarily and have them get
> re-inserted into the error queue.
>

The idea is that the queues can get full, but not the rollback hash
table. If the error queue gets full, we mark the entry as invalid in
the hash table; later, when the discard worker encounters this request
again, it adds it to the queue if space is available and marks the
hash table entry as valid. This allows us to keep the information
about all transactions having pending undo in shared memory (see the
first sketch below).

> It also feels to me like there may be a conflict
> between the everything-must-fit approach and the
> one-request-per-persistence level thing you've got here. I believe
> Andres's idea was one-request-per-transaction, so the idea is
> something like:
>
> - When your transaction first tries to attach to an undo log, you
> make a hash table entry.
..
..
> - If you commit, you remove the entry, because your transaction does
> not need to be undone.

I think this can regress performance when there are many concurrent
sessions, unless there is a way to add/remove a request without taking
a lock. As of now, we don't enter any request or reserve any space in
shared memory for pending undo until there is an error or the user
explicitly rolls back the transaction. We can surely do it some other
way as well, but this way there is no overhead in the commit path of a
successful transaction.

> If you have one request per persistence level, you could make an
> entry for the first persistence level, and then find that you are out
> of room when trying to make an entry for the second persistence
> level. I guess that doesn't break anything: the changes from the
> first persistence level would get undone, and the second persistence
> level wouldn't get any undo. Maybe that's OK, but again it doesn't
> seem all that nice, so maybe we need to think about it some more.
>

Coming back to the question of whether we need a single entry or one
entry per persistence level: the reason discussed so far is that the
discard worker can register the requests for the different persistence
levels at different times while scanning the undo logs. There is one
more consideration: while applying the actions, the actions for the
logged tables may succeed while those for the unlogged tables fail.
Keeping the requests separate allows better handling: if one fails, we
register its request in the error queue and still try to process the
request for the other persistence level. Also, the undo for the
different persistence levels is kept in separate logs, which makes
processing them separately easier (see the second sketch below).

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
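To make the queue-full handling above concrete, here is a minimal,
self-contained sketch. All names (RollbackRequest, error_queue_insert,
push_failed_request, maybe_requeue) are hypothetical stand-ins, not
the patch's actual API; the point illustrated is that every
pending-undo transaction stays in shared memory, while error-queue
membership is tracked with a valid flag that the discard worker can
restore once queue space frees up:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define ERROR_QUEUE_SIZE 2      /* deliberately tiny to force overflow */

typedef struct RollbackRequest
{
    uint32_t xid;               /* transaction whose undo is pending */
    bool     valid;             /* false => known, but not queued */
} RollbackRequest;

static RollbackRequest *error_queue[ERROR_QUEUE_SIZE];
static int queue_len = 0;

/* Returns false when the error queue has no free slot. */
static bool
error_queue_insert(RollbackRequest *req)
{
    if (queue_len >= ERROR_QUEUE_SIZE)
        return false;
    error_queue[queue_len++] = req;
    return true;
}

/* Called when applying undo actions fails for a transaction.  The
 * hash table entry itself is never dropped; only queue membership is
 * lost when the queue is full. */
static void
push_failed_request(RollbackRequest *req)
{
    req->valid = error_queue_insert(req);
}

/* Discard worker: re-queue an invalid entry it re-encounters. */
static void
maybe_requeue(RollbackRequest *req)
{
    if (!req->valid && error_queue_insert(req))
        req->valid = true;
}

int
main(void)
{
    RollbackRequest reqs[3] = {{100, false}, {101, false}, {102, false}};

    for (int i = 0; i < 3; i++)
        push_failed_request(&reqs[i]);  /* third insert overflows */

    queue_len = 1;                      /* pretend one request was consumed */
    maybe_requeue(&reqs[2]);            /* discard worker retries it */

    for (int i = 0; i < 3; i++)
        printf("xid %u valid=%d\n", (unsigned) reqs[i].xid,
               (int) reqs[i].valid);
    return 0;
}
```

The key property is that a full queue only clears the valid flag; the
entry itself survives, so oldestXidHavingUndo can still be computed
entirely from shared memory.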
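A second sketch, again with stand-in names rather than the patch's
actual functions, illustrates the per-persistence-level argument:
because undo for logged and unlogged relations lives in separate undo
logs, a failure at one level can be registered in the error queue
without blocking undo for the other level. apply_undo_actions() and
the enum values here are assumptions for illustration only:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef enum { UNDO_LOGGED, UNDO_UNLOGGED, UNDO_NLEVELS } UndoPersistence;

/* Stand-in: pretend applying the unlogged undo fails for this xact. */
static bool
apply_undo_actions(uint32_t xid, UndoPersistence level)
{
    (void) xid;
    return level != UNDO_UNLOGGED;
}

/* Stand-in for registering a failed request in the error queue. */
static void
error_queue_push(uint32_t xid, UndoPersistence level)
{
    printf("queued xid %u, level %d for retry\n",
           (unsigned) xid, (int) level);
}

/* Each persistence level has its own undo log, so each level's
 * request is processed on its own; one failed level does not block
 * applying undo for the other. */
static void
perform_undo_per_level(uint32_t xid)
{
    for (int level = 0; level < UNDO_NLEVELS; level++)
    {
        if (!apply_undo_actions(xid, (UndoPersistence) level))
            error_queue_push(xid, (UndoPersistence) level);
    }
}

int
main(void)
{
    perform_undo_per_level(200);
    return 0;
}
```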