Re: Undo logs - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Undo logs
Date
Msg-id CAA4eK1+_R8Mgr0un=wr-v6rre3e_suQvQaWwit451TFY3-qXOw@mail.gmail.com
Whole thread Raw
In response to Re: Undo logs  (Dilip Kumar <dilipbalaut@gmail.com>)
Responses Re: Undo logs
List pgsql-hackers
On Fri, Nov 16, 2018 at 9:46 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, Nov 15, 2018 at 12:14 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> Updated patch (merged latest code from the zheap main branch [1]).
>

Review comments:
-------------------------------
1.
UndoRecordPrepareTransInfo()
{
..
+ /*
+ * The absence of previous transaction's undo indicate that this backend
+ * is preparing its first undo in which case we have nothing to update.
+ */
+ if (!UndoRecPtrIsValid(prev_xact_urp))
+ return;
..
}

It is expected that the caller of UndoRecPtrIsValid should have
discard lock, but I don't see that how this the call from this place
ensures the same?

2.
UndoRecordPrepareTransInfo()
{
..
/*
+ * The absence of previous transaction's undo indicate that this backend
+ * is preparing its first undo in which case we have nothing to update.
+ */
+ if (!UndoRecPtrIsValid(prev_xact_urp))
+ return;
+
+ /*
+ * Acquire the discard lock before accessing the undo record so that
+ * discard worker doesn't remove the record while we are in process of
+ * reading it.
+ */
+ LWLockAcquire(&log->discard_lock, LW_SHARED);
+
+ if (!UndoRecordIsValid(log, prev_xact_urp))
+ return;
..
}

I don't understand this logic where you are checking the same
information with and without a lock, is there any reason for same?  It
seems we don't need the first call to  UndoRecPtrIsValid is not
required.

3.
UndoRecordPrepareTransInfo()
{
..
+ while (true)
+ {
+ bufidx = InsertFindBufferSlot(rnode, cur_blk,
+   RBM_NORMAL,
+   log->meta.persistence);
+ prev_txn_info.prev_txn_undo_buffers[index] = bufidx;
+ buffer = undo_buffer[bufidx].buf;
+ page = BufferGetPage(buffer);
+ index++;
+
+ if (UnpackUndoRecord(&prev_txn_info.uur, page, starting_byte,
+ &already_decoded, true))
+ break;
+
+ starting_byte = UndoLogBlockHeaderSize;
+ cur_blk++;
+ }


Can you write some commentary on what this code is doing?

There is no need to use index++; as a separate statement, you can do
it while assigning the buffer in that index.

4.
+UndoRecordPrepareTransInfo(UndoRecPtr urecptr, bool log_switched)
+{
+ UndoRecPtr prev_xact_urp;

I think you can simply name this variable as xact_urp.  All this and
related prev_* terminology used for variables seems confusing to me. I
understand that you are trying to update the last transactions undo
record information, but you can explain that via comments.  Keeping
such information as part of variable names not only makes their length
longer, but is also confusing.

5.
/*
+ * Structure to hold the previous transaction's undo update information.
+ */
+typedef struct PreviousTxnUndoRecord
+{
+ UndoRecPtr prev_urecptr; /* prev txn's starting urecptr */
+ int prev_txn_undo_buffers[MAX_BUFFER_PER_UNDO];
+ UnpackedUndoRecord uur; /* prev txn's first undo record. */
+} PreviousTxnInfo;
+
+static PreviousTxnInfo prev_txn_info;

Due to reasons mentioned in point-4, lets name the structure and it's
variables as below:

typedef struct XactUndoRecordInfo
{
UndoRecPtr start_urecptr; /* prev txn's starting urecptr */
int idx_undo_buffers[MAX_BUFFER_PER_UNDO];
UnpackedUndoRecord first_uur; /* prev txn's first undo record. */
} XactUndoRecordInfo;

static XactUndoRecordInfo xact_ur_info;

6.
+static int
+InsertFindBufferSlot(RelFileNode rnode,

The name of this function is not clear, can we change it to
UndoGetBufferSlot or UndoGetBuffer?

7.
+UndoRecordAllocateMulti(UnpackedUndoRecord *undorecords, int nrecords,
+ UndoPersistence upersistence, TransactionId txid)
{
..
+ /*
+ * If this is the first undo record for this transaction then set the
+ * uur_next to the SpecialUndoRecPtr.  This is the indication to allocate
+ * the space for the transaction header and the valid value of the uur_next
+ * will be updated while preparing the first undo record of the next
+ * transaction.
+ */
+ first_rec_in_recovery = InRecovery && IsTransactionFirstRec(txid);
..
}

I think it will be better if we move this comment few lines down:
+ if (need_start_undo && i == 0)
+ {
+ urec->uur_next = SpecialUndoRecPtr;

BTW, is the only reason set a special value (SpecialUndoRecPtr) for
uur_next is for allocating transaction header? If so, can't we
directly set the corresponding flag (UREC_INFO_TRANSACTION) in
uur_info and then remove it from UndoRecordSetInfo?

I think it would have been better if there is one central location to
set uur_info, but as that is becoming tricky,
we should not try to add some special stuff to make it possible.

8.
+UndoRecordExpectedSize(UnpackedUndoRecord *uur)
+{
+ Size size;
+
+ /* Fixme : Temporary hack to allow zheap to set some value for uur_info. */
+ /* if (uur->uur_info == 0) */
+ UndoRecordSetInfo(uur);

Can't we move UndoRecordSetInfo in it's caller
(UndoRecordAllocateMulti)?  It seems another caller of this function
doesn't expect this.  If we do that way, then we can have an Assert
for non-zero uur_info in UndoRecordExpectedSize.

9.
bool
+InsertUndoRecord(UnpackedUndoRecord *uur, Page page,
+ int starting_byte, int *already_written, bool header_only)
+{
+ char   *writeptr = (char *) page + starting_byte;
+ char   *endptr = (char *) page + BLCKSZ;
+ int my_bytes_written = *already_written;
+
+ if (uur->uur_info == 0)
+ UndoRecordSetInfo(uur);

Do we really need UndoRecordSetInfo here?  If not, then just add an
assert for non-zero uur_info?

10
UndoRecordAllocateMulti()
{
..
else
+ {
+ /*
+ * It is okay to initialize these variables as these are used only
+ * with the first record of transaction.
+ */
+ urec->uur_next = InvalidUndoRecPtr;
+ urec->uur_xidepoch = 0;
+ urec->uur_dbid = 0;
+ urec->uur_progress = 0;
+ }
+
+
+ /* calculate the size of the undo record. */
+ size += UndoRecordExpectedSize(urec);
+ }

Remove one extra line before comment "calculate the size of ..".

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: New function pg_stat_statements_reset_query() to resetstatistics of a specific query
Next
From: Hakan Kocaman
Date:
Subject: Re: zheap: a new storage format for PostgreSQL