Re: POC: Cleaning up orphaned files using undo logs - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: POC: Cleaning up orphaned files using undo logs
Date
Msg-id CAA4eK1LHq2T3T4pQOnaO0A=pOeCGc5gpCTuPYp8_Sc7nLV7tig@mail.gmail.com
Whole thread Raw
In response to Re: POC: Cleaning up orphaned files using undo logs  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: POC: Cleaning up orphaned files using undo logs  (Amit Kapila <amit.kapila16@gmail.com>)
Re: POC: Cleaning up orphaned files using undo logs  (Dilip Kumar <dilipbalaut@gmail.com>)
List pgsql-hackers
On Mon, Jul 1, 2019 at 1:24 PM Thomas Munro <thomas.munro@gmail.com> wrote:
>
> Another small change/review: the function UndoLogGetNextInsertPtr()
> previously took a transaction ID, but I'm not sure if that made sense,
> I need to think about it some more.
>

The changes you have made related to UndoLogGetNextInsertPtr() doesn't
seem correct to me.

@@ -854,7 +854,9 @@ FindUndoEndLocationAndSize(UndoRecPtr start_urecptr,
  * has already started in this log then lets re-fetch the undo
  * record.
  */
- next_insert = UndoLogGetNextInsertPtr(slot->logno, uur->uur_xid);
+ next_insert = UndoLogGetNextInsertPtr(slot->logno);
+
+ /* TODO this can't happen */
  if (!UndoRecPtrIsValid(next_insert))

I think this is a possible case.  Say while the discard worker tries
to register the rollback request from some log and after it fetches
the undo record corresponding to start location in this function,
another backend adds the new transaction undo.  The same is mentioned
in comments as well.   Can you explain what makes you think that this
can't happen?  If we don't want to pass the xid to
UndoLogGetNextInsertPtr, then I think we need to get the insert
location before fetching the record.  I will think more on it to see
if there is any other problem with the same.

2.
@@ -167,25 +205,14 @@ UndoDiscardOneLog(UndoLogSlot *slot,
TransactionId xmin, bool *hibernate)
+ if (!TransactionIdIsValid(wait_xid) && !pending_abort)
  {
  UndoRecPtr next_insert = InvalidUndoRecPtr;
- /*
- * If more undo has been inserted since we checked last, then
- * we can process that as well.
- */
- next_insert = UndoLogGetNextInsertPtr(logno, undoxid);
- if (!UndoRecPtrIsValid(next_insert))
- continue;
+ next_insert = UndoLogGetNextInsertPtr(logno);

This change is also not safe.  It can lead to discarding the undo of
some random transaction because a new undo records from some other
transaction would have been added since we last fetched the undo
record.  This can be fixed by just removing the call to
UndoLogGetNextInsertPtr.  I have done so in undoprocessing branch and
added the comment as well.

I think the common problem with the above changes is that it assumes
that new undo can't be added to undo logs while discard worker is
processing them.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Amit Khandekar
Date:
Subject: Re: Minimal logical decoding on standbys
Next
From: Amit Khandekar
Date:
Subject: Re: Minimal logical decoding on standbys