Re: POC: Cleaning up orphaned files using undo logs - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: POC: Cleaning up orphaned files using undo logs
Date
Msg-id CAA4eK1L6FJPr+JJwKJsms+o-NARagd6Byh_S19xqA4wAE6q-aA@mail.gmail.com
Whole thread Raw
In response to Re: POC: Cleaning up orphaned files using undo logs  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Wed, Jul 24, 2019 at 2:45 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Jul 18, 2019 at 5:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > > Yep, that was completely wrong.  Here's a new version.
> > >
> >
> > One comment/question related to
> > 0022-Use-undo-based-rollback-to-clean-up-files-on-abort.patch.
> >
>
> I have done some more review of undolog patch series and here are my comments:
> 0003-Add-undo-log-manager.patch
>

Some more review of the same patch:
1.
+typedef struct UndoLogSharedData
+{
+ UndoLogNumber free_lists[UndoLogCategories];
+ UndoLogNumber low_logno;


What is the use of low_logno?  I don't see anywhere in the code this
being assigned any value.  Is it for some future use?

2.
+void
+CheckPointUndoLogs(XLogRecPtr checkPointRedo, XLogRecPtr priorCheckPointRedo)
{
..
+ /* Compute header checksum. */
+ INIT_CRC32C(crc);
+ COMP_CRC32C(crc, &UndoLogShared->low_logno, sizeof(UndoLogShared->low_logno));
+ COMP_CRC32C(crc, &UndoLogShared->next_logno,
sizeof(UndoLogShared->next_logno));
+ COMP_CRC32C(crc, &num_logs, sizeof(num_logs));
+ FIN_CRC32C(crc);
+
+ /* Write out the number of active logs + crc. */
+ if ((write(fd, &UndoLogShared->low_logno,
sizeof(UndoLogShared->low_logno)) != sizeof(UndoLogShared->low_logno))
||
+ (write(fd, &UndoLogShared->next_logno,
sizeof(UndoLogShared->next_logno)) !=
sizeof(UndoLogShared->next_logno)) ||

Is it safe to read UndoLogShared without UndoLogLock?  All other
places accessing UndoLogShared uses UndoLogLock, so if this usage is
safe, maybe it is better to add a comment.

3.
UndoLogAllocateInRecovery()
{
..
/*
+ * Otherwise we need to do our own transaction tracking
+ * whenever we see a new xid, to match the logic in
+ * UndoLogAllocate().
+ */
+ if (xid != slot->meta.unlogged.xid)
+ {
+ slot->meta.unlogged.xid = xid;
+ if (slot->meta.unlogged.this_xact_start != slot->meta.unlogged.insert)
+ slot->meta.unlogged.last_xact_start =
+ slot->meta.unlogged.this_xact_start;
+ slot->meta.unlogged.this_xact_start =
+ slot->meta.unlogged.insert;

The code doesn't follow the comment.  In UndoLogAllocate, both
last_xact_start and this_xact_start are assigned in if block, so the
should be the case here.

4.
UndoLogAllocateInRecovery()
{
..
+ /*
+ * Just as in UndoLogAllocate(), the caller may be extending an existing
+ * allocation before committing with UndoLogAdvance().
+ */
+ if (context->try_location != InvalidUndoRecPtr)
+ {
..
}

I am not sure how will this work because unlike UndoLogAllocate, this
function doesn't set try_location initially.  It will be set later by
UndoLogAdvance which can easily go wrong because that doesn't include
UndoLogBlockHeaderSize.

5.
+UndoLogAdvance(UndoLogAllocContext *context, size_t size)
+{
+ context->try_location = UndoLogOffsetPlusUsableBytes(context->try_location,
+ size);
+}

Here, you are using UndoRecPtr whereas UndoLogOffsetPlusUsableBytes
expects offset.

6.
UndoLogAllocateInRecovery()
{
..
+ /*
+ * At this stage we should have an undo log that can handle this
+ * allocation.  If we don't, something is screwed up.
+ */
+ if (UndoLogOffsetPlusUsableBytes(slot->meta.unlogged.insert, size) >
slot->meta.end)
+ elog(ERROR,
+ "cannot allocate %d bytes in undo log %d",
+ (int) size, slot->logno);
..
}

Similar to point-5, here you are using a pointer instead of offset.

7.
UndoLogAllocateInRecovery()
{
..
+ /* We found a reference to a different (or first) undo log. */
+ slot = find_undo_log_slot(logno, false);
..
+ /* TODO: check locking against undo log slot recycling? */
..
}

I think it is better to have an Assert here that slot can't be NULL.
AFAICS, slot can't be NULL unless there is some bug.  I don't
understand this 'TODO' comment.

8.
+ {
+ {"undo_tablespaces", PGC_USERSET,
CLIENT_CONN_STATEMENT,
+ gettext_noop("Sets the
tablespace(s) to use for undo logs."),
+ NULL,
+
GUC_LIST_INPUT | GUC_LIST_QUOTE
+ },
+
&undo_tablespaces,
+ "",
+ check_undo_tablespaces,
assign_undo_tablespaces, NULL
+ },

It seems you need to update variable_is_guc_list_quote for this variable.

9.
+extend_undo_log(UndoLogNumber logno, UndoLogOffset new_end)
{
..
+ if (!InRecovery)
+ {
+ xl_undolog_extend xlrec;
+ XLogRecPtr ptr;
+
+ xlrec.logno = logno;
+ xlrec.end = end;
+
+ XLogBeginInsert();
+ XLogRegisterData((char *) &xlrec, sizeof(xlrec));
+ ptr = XLogInsert(RM_UNDOLOG_ID, XLOG_UNDOLOG_EXTEND);
+ XLogFlush(ptr);
+ }
..
}

Do we need it for temporary/unlogged persistence level?  Similarly,
there is a WAL logging in attach_undo_log which I can't understand why
it would be required for temporary/unlogged persistence levels.

10.
+choose_undo_tablespace(bool force_detach, Oid *tablespace)
{
..
+ oid = get_tablespace_oid(name, true);
+ if (oid == InvalidOid)
..
}

Do we need to check permissions to see if the current user is allowed
to create in this tablespace?

11.
+static bool
+choose_undo_tablespace(bool force_detach, Oid *tablespace)
+{
+ char   *rawname;
+ List   *namelist;
+ bool
need_to_unlock;
+ int length;
+ int
i;
+
+ /* We need a modifiable copy of string. */
+ rawname =
pstrdup(undo_tablespaces);

I don't see the usage of rawname outside this function, isn't it
better to free it?  I understand that this function won't be called
frequently enough to matter, but still, there is some theoretical
danger if the user continuously changes undo_tablespaces.

12.
+find_undo_log_slot(UndoLogNumber logno, bool locked)
{
..
+ * TODO: We could track the lowest known undo log
number, to reduce
+ * the negative cache entry bloat.
+ */
+ if (result == NULL)
+ {
..
}

Do we have any mechanism to clear this bloat or will it stay till the
end of the session?  If it is later, then I think it might be good to
take care of this TODO.  I think this is not a blocker, but good to
have kind of stuff.

13.
+static void
+allocate_empty_undo_segment(UndoLogNumber logno, Oid tablespace,
+ UndoLogOffset end)
{
..
}

What will happen if the transaction creating undolog segment rolls
back?  Do we want to have pendingDeletes stuff as we have for normal
relation files?  This might also help in clearing the shared memory
state (undo log slots) if any.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Sergei Kornilov
Date:
Subject: Re: Add parallelism and glibc dependent only options to reindexdb
Next
From: Julien Rouhaud
Date:
Subject: Re: Add parallelism and glibc dependent only options to reindexdb