Re: POC: Cleaning up orphaned files using undo logs - Mailing list pgsql-hackers

From vignesh C
Subject Re: POC: Cleaning up orphaned files using undo logs
Date
Msg-id CALDaNm3sP-gZ-s=0hKY3cjwLy_vA-zWWKtmVL4vmjjP4DBTH6g@mail.gmail.com
Whole thread Raw
In response to Re: POC: Cleaning up orphaned files using undo logs  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: POC: Cleaning up orphaned files using undo logs  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
Hi,

I have done some review of undolog patch series
and here are my comments:
0003-Add-undo-log-manager.patch

1) As undo log is being created in tablespace,
 if the tablespace is dropped later, will it have any impact?
+void
+UndoLogDirectory(Oid tablespace, char *dir)
+{
+ if (tablespace == DEFAULTTABLESPACE_OID ||
+ tablespace == InvalidOid)
+ snprintf(dir, MAXPGPATH, "base/undo");
+ else
+ snprintf(dir, MAXPGPATH, "pg_tblspc/%u/%s/undo",
+ tablespace, TABLESPACE_VERSION_DIRECTORY);
+}

2) Header file exclusion
a) The following headers can be excluded in undolog.c
+#include "access/transam.h"
+#include "access/undolog.h"
+#include "access/xlogreader.h"
+#include "catalog/catalog.h"
+#include "nodes/execnodes.h"
+#include "storage/buf.h"
+#include "storage/bufmgr.h"
+#include "storage/fd.h"
+#include "storage/lwlock.h"
+#include "storage/shmem.h"
+#include "storage/standby.h"
+#include "storage/sync.h"
+#include "utils/memutils.h"

b) The following headers can be excluded from undofile.c
+#include "access/undolog.h"
+#include "catalog/database_internal.h"
+#include "miscadmin.h"
+#include "postmaster/bgwriter.h"
+#include "storage/fd.h"
+#include "storage/smgr.h"
+#include "utils/memutils.h"

3) Some macro replacement.
a)Session.h

+++ b/src/include/access/session.h
@@ -17,6 +17,9 @@
 /* Avoid including typcache.h */
 struct SharedRecordTypmodRegistry;

+/* Avoid including undolog.h */
+struct UndoLogSlot;
+
 /*
  * A struct encapsulating some elements of a user's session.  For now this
  * manages state that applies to parallel query, but it principle it could
@@ -27,6 +30,10 @@ typedef struct Session
  dsm_segment *segment; /* The session-scoped DSM segment. */
  dsa_area   *area; /* The session-scoped DSA area. */

+ /* State managed by undolog.c. */
+ struct UndoLogSlot *attached_undo_slots[4]; /* UndoLogCategories */
+ bool need_to_choose_undo_tablespace;
+

Should we change 4 to UndoLogCategories or suitable macro?
b)
+static inline size_t
+UndoLogNumSlots(void)
+{
+ return MaxBackends * 4;
+}
Should we change 4 to UndoLogCategories  or suitable macro

c)
+allocate_empty_undo_segment(UndoLogNumber logno, Oid tablespace,
+ UndoLogOffset end)
+{
+ struct stat stat_buffer;
+ off_t size;
+ char path[MAXPGPATH];
+ void   *zeroes;
+ size_t nzeroes = 8192;
+ int fd;

should we use BLCKSZ instead of 8192?

4) Should we add a readme file for undolog as it does a fair amount of work
    and is core part of the undo system?

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com


On Wed, Jul 24, 2019 at 5:44 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Jul 24, 2019 at 2:45 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Jul 18, 2019 at 5:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > 7.
> > +attach_undo_log(UndoLogCategory category, Oid tablespace)
> > {
> > ..
> > if (candidate->meta.tablespace == tablespace)
> > + {
> > + logno = *place;
> > + slot = candidate;
> > + *place = candidate->next_free;
> > + break;
> > + }
> >
> > Here, the code is breaking from the loop, so why do we need to set
> > *place?  Am I missing something obvious?
> >
>
> I think I know what I was missing.  It seems here you are removing an
> element from the freelist.
>
> One point related to detach_current_undo_log.
>
> + LWLockAcquire(&slot->mutex, LW_EXCLUSIVE);
> + slot->pid = InvalidPid;
> + slot->meta.unlogged.xid = InvalidTransactionId;
> + if (full)
> + slot->meta.status = UNDO_LOG_STATUS_FULL;
> + LWLockRelease(&slot->mutex);
>
> If I read the comments in structure UndoLogMetaData, it is mentioned
> that 'status' is changed by explicit WAL record whereas there is no
> WAL record in code to change the status.  I see the problem as well if
> we don't WAL log this change.  Suppose after changing the status of
> this log, we allocate a new log and insert some records in that log as
> well for the same transaction for which we have inserted records in
> the log which we just marked as FULL.  Now, here we form the link
> between two logs as the same transaction has overflowed into a new
> log.  Say, we crash after this.  Now, after recovery the log won't be
> marked as FULL which means there is a chance that it can be used for
> some other transaction, if that happens, then our link for a
> transaction spanning to different log will break and we won't be able
> to access the data in another log.  In short, I think it is important
> to WAL log this status change unless I am missing something.
>
> --
> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com
>
>


--
Regards,
vignesh
                          Have a nice day



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Seek failure at end of FSM file during WAL replay (in 11)
Next
From: Tom Lane
Date:
Subject: Re: Speed up transaction completion faster after many relations are accessed in a transaction