Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions
Date
Msg-id CAFiTN-t4-D88VY6rV7Y1bH96kVfOjvgMSTXvgH+HsSXs4h4mNA@mail.gmail.com
Whole thread Raw
In response to Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions  (Dilip Kumar <dilipbalaut@gmail.com>)
Responses Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions
List pgsql-hackers
On Thu, Jun 25, 2020 at 7:10 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Wed, Jun 24, 2020 at 4:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Jun 23, 2020 at 7:00 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > Here is the POC patch to discuss the idea of a cleanup of shared
> > > fileset on proc exit.  As discussed offlist,  here I am maintaining
> > > the list of shared fileset.  First time when the list is NULL I am
> > > registering the cleanup function with on_proc_exit routine.  After
> > > that for subsequent fileset, I am just appending it to filesetlist.
> > > There is also an interface to unregister the shared file set from the
> > > cleanup list and that is done by the caller whenever we are deleting
> > > the shared fileset manually.  While explaining it here, I think there
> > > could be one issue if we delete all the element from the list will
> > > become NULL and on next SharedFileSetInit we will again register the
> > > function.  Maybe that is not a problem but we can avoid registering
> > > multiple times by using some flag in the file
> > >
> >
> > I don't understand what you mean by "using some flag in the file".
> >
> > Review comments on various patches.
> >
> > poc_shared_fileset_cleanup_on_procexit
> > =================================
> > 1.
> > - ent->subxact_fileset =
> > - MemoryContextAlloc(ApplyContext, sizeof(SharedFileSet));
> > + MemoryContext oldctx;
> >
> > + /* Shared fileset handle must be allocated in the persistent context */
> > + oldctx = MemoryContextSwitchTo(ApplyContext);
> > + ent->subxact_fileset = palloc(sizeof(SharedFileSet));
> >   SharedFileSetInit(ent->subxact_fileset, NULL);
> > + MemoryContextSwitchTo(oldctx);
> >   fd = BufFileCreateShared(ent->subxact_fileset, path);
> >
> > Why is this change required for this patch and why we only cover
> > SharedFileSetInit in the Apply context and not BufFileCreateShared?
> > The comment is also not very clear on this point.
>
> Added the comments for the same.
>
> > 2.
> > +void
> > +SharedFileSetUnregister(SharedFileSet *input_fileset)
> > +{
> > + bool found = false;
> > + ListCell *l;
> > +
> > + Assert(filesetlist != NULL);
> > +
> > + /* Loop over all the pending shared fileset entry */
> > + foreach (l, filesetlist)
> > + {
> > + SharedFileSet *fileset = (SharedFileSet *) lfirst(l);
> > +
> > + /* remove the entry from the list and delete the underlying files */
> > + if (input_fileset->number == fileset->number)
> > + {
> > + SharedFileSetDeleteAll(fileset);
> > + filesetlist = list_delete_cell(filesetlist, l);
> >
> > Why are we calling SharedFileSetDeleteAll here when in the caller we
> > have already deleted the fileset as per below code?
> > BufFileDeleteShared(ent->stream_fileset, path);
> > + SharedFileSetUnregister(ent->stream_fileset);
>
> That's wrong I have removed this.
>
>
> > I think it will be good if somehow we can remove the fileset from
> > filesetlist during BufFileDeleteShared.  If that is possible, then we
> > don't need a separate API for SharedFileSetUnregister.
>
> I have done as discussed on later replies, basically called
> SharedFileSetUnregister from BufFileDeleteShared.
>
> > 3.
> > +static List * filesetlist = NULL;
> > +
> >  static void SharedFileSetOnDetach(dsm_segment *segment, Datum datum);
> > +static void SharedFileSetOnProcExit(int status, Datum arg);
> >  static void SharedFileSetPath(char *path, SharedFileSet *fileset, Oid
> > tablespace);
> >  static void SharedFilePath(char *path, SharedFileSet *fileset, const
> > char *name);
> >  static Oid ChooseTablespace(const SharedFileSet *fileset, const char *name);
> > @@ -76,6 +80,13 @@ SharedFileSetInit(SharedFileSet *fileset, dsm_segment *seg)
> >   /* Register our cleanup callback. */
> >   if (seg)
> >   on_dsm_detach(seg, SharedFileSetOnDetach, PointerGetDatum(fileset));
> > + else
> > + {
> > + if (filesetlist == NULL)
> > + on_proc_exit(SharedFileSetOnProcExit, 0);
> >
> > We use NIL for list initialization and comparison.  See lock_files usage.
>
> Done
>
> > 4.
> > +SharedFileSetOnProcExit(int status, Datum arg)
> > +{
> > + ListCell *l;
> > +
> > + /* Loop over all the pending  shared fileset entry */
> > + foreach (l, filesetlist)
> > + {
> > + SharedFileSet *fileset = (SharedFileSet *) lfirst(l);
> > + SharedFileSetDeleteAll(fileset);
> > + }
> >
> > We can initialize filesetlist as NIL after the for loop as it will
> > make the code look clean.
>
> Right.
>
> > Comments on other patches:
> > =========================
> > 5.
> > > 3. On concurrent abort we are truncating all the changes including
> > > some incomplete changes,  so later when we get the complete changes we
> > > don't have the previous changes,  e.g, if we had specinsert in the
> > > last stream and due to concurrent abort detection if we delete that
> > > changes later we will get spec_confirm without spec insert.  We could
> > > have simply avoided deleting all the changes, but I think the better
> > > fix is once we detect the concurrent abort for any transaction, then
> > > why do we need to collect the changes for that, we can simply avoid
> > > that.  So I have put that fix. (0006)
> > >
> >
> > On similar lines, I think we need to skip processing message, see else
> > part of code in ReorderBufferQueueMessage.
>
> Basically, ReorderBufferQueueMessage also calls the
> ReorderBufferQueueChange internally for transactional changes.  But,
> having said that, I realize the idea of skipping the changes in
> ReorderBufferQueueChange is not good,  because by then we have already
> allocated the memory for the change and the tuple and it's not a
> correct to ReturnChanges because it will update the memory accounting.
> So I think we can do it at a more centralized place and before we
> process the change,  maybe in LogicalDecodingProcessRecord, before
> going to the switch we can call a function from the reorderbuffer.c
> layer to see whether this transaction is detected as aborted or not.
> But I have to think more on this line that can we skip all the
> processing of that record or not.
>
> Your other comments look fine to me so I will send in the next patch
> set and reply on them individually.

I think we can not put this check, in the higher-level functions like
LogicalDecodingProcessRecord or DecodeXXXOp because we need to process
that xid at least for abort,  so I think it is good to keep the check,
inside ReorderBufferQueueChange only and we can free the memory of the
change if the abort is detected.  Also, if just skip those changes in
ReorderBufferQueueChange then the effect will be localized to that
particular transaction which is already aborted.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: min_safe_lsn column in pg_replication_slots view
Next
From: Masahiko Sawada
Date:
Subject: Re: Transactions involving multiple postgres foreign servers, take 2