Re: POC: Cleaning up orphaned files using undo logs - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: POC: Cleaning up orphaned files using undo logs
Date
Msg-id CAA4eK1J55LU=F4MY4rpx5yOZD7YxWp1buf-+uGUfToEejE1crg@mail.gmail.com
Whole thread Raw
In response to Re: POC: Cleaning up orphaned files using undo logs  (Amit Khandekar <amitdkhan.pg@gmail.com>)
List pgsql-hackers
On Mon, Jul 22, 2019 at 8:39 PM Amit Khandekar <amitdkhan.pg@gmail.com> wrote:
>
> On Mon, 22 Jul 2019 at 14:21, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> I have started review of
> 0014-Allow-execution-and-discard-of-undo-by-background-wo.patch. Below
> are some quick comments to start with:
>
> +++ b/src/backend/access/undo/undoworker.c
>
> +#include "access/xact.h"
> +#include "access/undorequest.h"
> Order is not alphabetical
>

Fixed this and a few others.

> + * Each undo worker then start reading from one of the queue the requests for
> start=>starts
> queue=>queues
>
> -------------
>
> + rc = WaitLatch(MyLatch,
> +    WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
> +    10L, WAIT_EVENT_BGWORKER_STARTUP);
> +
> + /* emergency bailout if postmaster has died */
> + if (rc & WL_POSTMASTER_DEATH)
> + proc_exit(1);
> I think now, thanks to commit cfdf4dc4fc9635a, you don't have to
> explicitly handle postmaster death; instead you can use
> WL_EXIT_ON_PM_DEATH. Please check at all such places where this is
> done in this patch.
>
> -------------
>

Fixed both of the above issues.

> +UndoWorkerGetSlotInfo(int slot, UndoRequestInfo *urinfo)
> +{
> + /* Block concurrent access. */
> + LWLockAcquire(UndoWorkerLock, LW_EXCLUSIVE);
> +
> + MyUndoWorker = &UndoApplyCtx->workers[slot];
> Not sure why MyUndoWorker is used here. Can't we use a local variable
> ? Or do we intentionally attach to the slot as a side-operation ?
>
> -------------
>

I have changed the code around this such that we first attach to the
slot and then get the required info.  Also, I don't see the need of
exclusive lock here, so changed it to shared lock.

> + * Get the dbid where the wroker should connect to and get the worker
> wroker=>worker
>
> -------------
>
> + BackgroundWorkerInitializeConnectionByOid(urinfo.dbid, 0, 0);
> 0, 0 => InvalidOid, 0
>
> + * Set the undo worker request queue from which the undo worker start
> + * looking for a work.
> start => should start
> a work => work
>
> --------------
>

Fixed both of these.

> + if (!InsertRequestIntoErrorUndoQueue(urinfo))
> I was thinking what happens if for some reason
> InsertRequestIntoErrorUndoQueue() itself errors out. In that case, the
> entry will not be marked invalid, and so there will be no undo action
> carried out because I think the undo worker will exit. What happens
> next with this entry ?

I think this will change after integration with Robert's latest patch
on queues, so will address along with it if required.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Kyotaro Horiguchi
Date:
Subject: Re: Small const correctness patch
Next
From: Heikki Linnakangas
Date:
Subject: default_table_access_method is not in sample config file