Re: Undo logs - Mailing list pgsql-hackers
From | Dilip Kumar |
---|---|
Subject | Re: Undo logs |
Date | |
Msg-id | CAFiTN-v-uusoMw3b-qRLuegry8-ygPa5YkDeweL9kTZ57Fo68w@mail.gmail.com Whole thread Raw |
In response to | Re: Undo logs (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
Re: Undo logs
|
List | pgsql-hackers |
On Sat, Nov 10, 2018 at 9:27 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, Nov 5, 2018 at 5:10 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > [review for undo record layer (0003-undo-interface-v3)] > > I might sound repeating myself, but just to be clear, I was involved > in the design of this patch as well and I have given a few high-level > inputs for this patch. I have used this interface in the zheap > development, but haven't done any sort of detailed review which I am > doing now. I encourage others also to review this patch. Thanks for the review, please find my reply inline. > > 1. > * NOTES: > + * Handling multilog - > + * It is possible that the undo record of a transaction can be spread across > + * multiple undo log. And, we need some special handling while inserting the > + * undo for discard and rollback to work sanely. > > I think before describing how the undo record is spread across > multiple logs, you can explain how it is laid out when that is not the > case. You can also explain how undo record headers are linked. I am > not sure file header is the best place or it should be mentioned in > README, but I think for now we can use file header for this purpose > and later we can move it to README if required. Added in the header. > > 2. > +/* > + * FIXME: Do we want to support undo tuple size which is more than the BLCKSZ > + * if not than undo record can spread across 2 buffers at the max. > + */ > > +#define MAX_BUFFER_PER_UNDO 2 > > I think here the right question is what is the possibility of undo > record to be greater than BLCKSZ? For zheap, as of today, we don' > have any such requirement as the largest undo record is written for > update or multi_insert and in both cases we don't exceed the limit of > BLCKSZ. I guess some user other than zheap could probably have such > requirement and I don't think it is impossible to enhance this if we > have any requirement. > > If anybody else has an opinion here, please feel to share it. Should we remove this FIXME or lets wait for some other opinion. As of now I have kept it as it is. > > 3. > +/* > + * FIXME: Do we want to support undo tuple size which is more than the BLCKSZ > + * if not than undo record can spread across 2 buffers at the max. > + */ > +#define MAX_BUFFER_PER_UNDO 2 > + > +/* > + * Consider buffers needed for updating previous transaction's > + * starting undo record. Hence increased by 1. > + */ > +#define MAX_UNDO_BUFFERS (MAX_PREPARED_UNDO + 1) * MAX_BUFFER_PER_UNDO > + > +/* Maximum number of undo record that can be prepared before calling insert. */ > +#define MAX_PREPARED_UNDO 2 > > I think it is better to define MAX_PREPARED_UNDO before > MAX_UNDO_BUFFERS as the first one is used in the definition of a > second. Done > > 4. > +/* > + * Call PrepareUndoInsert to tell the undo subsystem about the undo record you > + * intended to insert. Upon return, the necessary undo buffers are pinned. > + * This should be done before any critical section is established, since it > + * can fail. > + * > + * If not in recovery, 'xid' should refer to the top transaction id because > + * undo log only stores mapping for the top most transactions. > + * If in recovery, 'xid' refers to the transaction id stored in WAL. > + */ > +UndoRecPtr > +PrepareUndoInsert(UnpackedUndoRecord *urec, UndoPersistence upersistence, > + TransactionId xid) > > This function locks the buffer as well which is right as we need to do > that before critical section, but the function header comments doesn't > indicate it. You can modify it as: > "Upon return, the necessary undo buffers are pinned and locked." > > Note that similar modification is required in .h file as well. Done > > 5. > +/* > + * Insert a previously-prepared undo record. This will lock the buffers > + * pinned in the previous step, write the actual undo record into them, > + * and mark them dirty. For persistent undo, this step should be performed > + * after entering a critical section; it should never fail. > + */ > +void > +InsertPreparedUndo(void) > > Here, the comments are wrong as buffers are already locked in the > previous step. A similar change is required in .h file as well. Fixed > > 6. > +InsertPreparedUndo(void) > { > .. > /* > + * Try to insert the record into the current page. If it doesn't > + * succeed then recall the routine with the next page. > + */ > + if (InsertUndoRecord(uur, page, starting_byte, &already_written, false)) > + { > + undo_len += already_written; > + MarkBufferDirty(buffer); > + break; > + } > + > + MarkBufferDirty(buffer); > .. > } > > Here, you are writing into a shared buffer and marking it dirty, isn't > it a good idea to Assert for being in the critical section? Done > > 7. > +/* Maximum number of undo record that can be prepared before calling insert. */ > +#define MAX_PREPARED_UNDO 2 > > /record/records > > I think this definition doesn't define the maximum number of undo > records that can be prepared as the caller can use UndoSetPrepareSize > to change it. I think you can modify the comment as below or > something on those lines: > "This defines the number of undo records that can be prepared before > calling insert by default. If you need to prepare more than > MAX_PREPARED_UNDO undo records, then you must call UndoSetPrepareSize > first." Fixed > > 8. > + * Caller should call this under log->discard_lock > + */ > +static bool > +IsPrevTxnUndoDiscarded(UndoLogControl *log, UndoRecPtr prev_xact_urp) > +{ > + if (log->oldest_data == InvalidUndoRecPtr) > .. > > Isn't it a good idea to have an Assert that we already have discard_lock? Done > > 9. > + UnpackedUndoRecord uur; /* prev txn's first undo record.*/ > +} PreviousTxnInfo; > > Extra space at the of comment is required. Done > > 10. > +/* > + * Check if previous transactions undo is already discarded. > + * > + * Caller should call this under log->discard_lock > + */ > +static bool > +IsPrevTxnUndoDiscarded(UndoLogControl *log, UndoRecPtr prev_xact_urp) > +{ > > The name suggests that this function is doing something special for > the previous transaction whereas this it just checks whether undo is > discarded corresponding to a particular undo location. Isn't it > better if we name it as UndoRecordExists or UndoRecordIsValid? Then > explain in comments when do you consider particular record exists. > Changed to UndoRecordIsValid > Another point to note is that you are not releasing the lock in all > paths, so it is better to mention in comments when will it be released > and when not. Done > > > 11. > +IsPrevTxnUndoDiscarded(UndoLogControl *log, UndoRecPtr prev_xact_urp) > +{ > + if (log->oldest_data == InvalidUndoRecPtr) > + { > + /* > + * oldest_data is not yet initialized. We have to check > + * UndoLogIsDiscarded and if it's already discarded then we have > + * nothing to do. > + */ > + LWLockRelease(&log->discard_lock); > + if (UndoLogIsDiscarded(prev_xact_urp)) > + return true; > > The comment in above code is just trying to write the code in words. > I think here you should tell why we need to call UndoLogIsDiscarded > when oldest_data is not initialized and or the scenario when > oldest_data will not be initialized. Fixed > > 12. > + * The actual update will be done by UndoRecordUpdateTransInfo under the > + * critical section. > + */ > +void > +PrepareUndoRecordUpdateTransInfo(UndoRecPtr urecptr, bool log_switched) > +{ > .. > > I find this function name bit awkward. How about UndoRecordPrepareTransInfo? Changed as per the suggestion > > 13. > +PrepareUndoRecordUpdateTransInfo(UndoRecPtr urecptr, bool log_switched) > { > .. > + /* > + * TODO: For now we don't know how to build a transaction chain for > + * temporary undo logs. That's because this log might have been used by a > + * different backend, and we can't access its buffers. What should happen > + * is that the undo data should be automatically discarded when the other > + * backend detaches, but that code doesn't exist yet and the undo worker > + * can't do it either. > + */ > + if (log->meta.persistence == UNDO_TEMP) > + return; > > Aren't we already dealing with this case in the other patch [1]? > Basically, I think we should discard it at commit time and or when the > backend is detached. Changed > > 14. > +PrepareUndoRecordUpdateTransInfo(UndoRecPtr urecptr, bool log_switched) > { > .. > /* > + * If previous transaction's urp is not valid means this backend is > + * preparing its first undo so fetch the information from the undo log > + * if it's still invalid urp means this is the first undo record for this > + * log and we have nothing to update. > + */ > + if (!UndoRecPtrIsValid(prev_xact_urp)) > + return; > .. > > This comment is confusing. It appears to be saying same thing twice. > You can write it along something like: > > "The absence of previous transaction's undo indicate that this backend > is preparing its first undo in which case we have nothing to update.". Done as per the suggestion > > 15. > +PrepareUndoRecordUpdateTransInfo(UndoRecPtr urecptr, bool log_switched) > /* > + * Acquire the discard lock before accessing the undo record so that > + * discard worker doen't remove the record while we are in process of > + * reading it. > + */ > > Typo doen't/doesn't. > > I think you can use 'can't' instead of doesn't. Fixed > > This is by no means a complete review, rather just noticed a few > things while reading the patch. > > [1] - https://www.postgresql.org/message-id/CAFiTN-t8fv-qYG9zynhS-1jRrvt_o5C-wCMRtzOsK8S%3DMXvKKw%40mail.gmail.com > > -- > With Regards, > Amit Kapila. > EnterpriseDB: http://www.enterprisedb.com -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Attachment
pgsql-hackers by date: