Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions
Date
Msg-id CAFiTN-uCVcUKJRBM5-F+KxTPFeEfaGk6h8x19CgytZzPozrb=Q@mail.gmail.com
Whole thread Raw
In response to Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions
Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions
List pgsql-hackers
On Wed, Jun 24, 2020 at 4:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Jun 23, 2020 at 7:00 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > Here is the POC patch to discuss the idea of a cleanup of shared
> > fileset on proc exit.  As discussed offlist,  here I am maintaining
> > the list of shared fileset.  First time when the list is NULL I am
> > registering the cleanup function with on_proc_exit routine.  After
> > that for subsequent fileset, I am just appending it to filesetlist.
> > There is also an interface to unregister the shared file set from the
> > cleanup list and that is done by the caller whenever we are deleting
> > the shared fileset manually.  While explaining it here, I think there
> > could be one issue if we delete all the element from the list will
> > become NULL and on next SharedFileSetInit we will again register the
> > function.  Maybe that is not a problem but we can avoid registering
> > multiple times by using some flag in the file
> >
>
> I don't understand what you mean by "using some flag in the file".
>
> Review comments on various patches.
>
> poc_shared_fileset_cleanup_on_procexit
> =================================
> 1.
> - ent->subxact_fileset =
> - MemoryContextAlloc(ApplyContext, sizeof(SharedFileSet));
> + MemoryContext oldctx;
>
> + /* Shared fileset handle must be allocated in the persistent context */
> + oldctx = MemoryContextSwitchTo(ApplyContext);
> + ent->subxact_fileset = palloc(sizeof(SharedFileSet));
>   SharedFileSetInit(ent->subxact_fileset, NULL);
> + MemoryContextSwitchTo(oldctx);
>   fd = BufFileCreateShared(ent->subxact_fileset, path);
>
> Why is this change required for this patch and why we only cover
> SharedFileSetInit in the Apply context and not BufFileCreateShared?
> The comment is also not very clear on this point.

Added the comments for the same.

> 2.
> +void
> +SharedFileSetUnregister(SharedFileSet *input_fileset)
> +{
> + bool found = false;
> + ListCell *l;
> +
> + Assert(filesetlist != NULL);
> +
> + /* Loop over all the pending shared fileset entry */
> + foreach (l, filesetlist)
> + {
> + SharedFileSet *fileset = (SharedFileSet *) lfirst(l);
> +
> + /* remove the entry from the list and delete the underlying files */
> + if (input_fileset->number == fileset->number)
> + {
> + SharedFileSetDeleteAll(fileset);
> + filesetlist = list_delete_cell(filesetlist, l);
>
> Why are we calling SharedFileSetDeleteAll here when in the caller we
> have already deleted the fileset as per below code?
> BufFileDeleteShared(ent->stream_fileset, path);
> + SharedFileSetUnregister(ent->stream_fileset);

That's wrong I have removed this.


> I think it will be good if somehow we can remove the fileset from
> filesetlist during BufFileDeleteShared.  If that is possible, then we
> don't need a separate API for SharedFileSetUnregister.

I have done as discussed on later replies, basically called
SharedFileSetUnregister from BufFileDeleteShared.

> 3.
> +static List * filesetlist = NULL;
> +
>  static void SharedFileSetOnDetach(dsm_segment *segment, Datum datum);
> +static void SharedFileSetOnProcExit(int status, Datum arg);
>  static void SharedFileSetPath(char *path, SharedFileSet *fileset, Oid
> tablespace);
>  static void SharedFilePath(char *path, SharedFileSet *fileset, const
> char *name);
>  static Oid ChooseTablespace(const SharedFileSet *fileset, const char *name);
> @@ -76,6 +80,13 @@ SharedFileSetInit(SharedFileSet *fileset, dsm_segment *seg)
>   /* Register our cleanup callback. */
>   if (seg)
>   on_dsm_detach(seg, SharedFileSetOnDetach, PointerGetDatum(fileset));
> + else
> + {
> + if (filesetlist == NULL)
> + on_proc_exit(SharedFileSetOnProcExit, 0);
>
> We use NIL for list initialization and comparison.  See lock_files usage.

Done

> 4.
> +SharedFileSetOnProcExit(int status, Datum arg)
> +{
> + ListCell *l;
> +
> + /* Loop over all the pending  shared fileset entry */
> + foreach (l, filesetlist)
> + {
> + SharedFileSet *fileset = (SharedFileSet *) lfirst(l);
> + SharedFileSetDeleteAll(fileset);
> + }
>
> We can initialize filesetlist as NIL after the for loop as it will
> make the code look clean.

Right.

> Comments on other patches:
> =========================
> 5.
> > 3. On concurrent abort we are truncating all the changes including
> > some incomplete changes,  so later when we get the complete changes we
> > don't have the previous changes,  e.g, if we had specinsert in the
> > last stream and due to concurrent abort detection if we delete that
> > changes later we will get spec_confirm without spec insert.  We could
> > have simply avoided deleting all the changes, but I think the better
> > fix is once we detect the concurrent abort for any transaction, then
> > why do we need to collect the changes for that, we can simply avoid
> > that.  So I have put that fix. (0006)
> >
>
> On similar lines, I think we need to skip processing message, see else
> part of code in ReorderBufferQueueMessage.

Basically, ReorderBufferQueueMessage also calls the
ReorderBufferQueueChange internally for transactional changes.  But,
having said that, I realize the idea of skipping the changes in
ReorderBufferQueueChange is not good,  because by then we have already
allocated the memory for the change and the tuple and it's not a
correct to ReturnChanges because it will update the memory accounting.
So I think we can do it at a more centralized place and before we
process the change,  maybe in LogicalDecodingProcessRecord, before
going to the switch we can call a function from the reorderbuffer.c
layer to see whether this transaction is detected as aborted or not.
But I have to think more on this line that can we skip all the
processing of that record or not.

Your other comments look fine to me so I will send in the next patch
set and reply on them individually.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachment

pgsql-hackers by date:

Previous
From: Darafei "Komяpa" Praliaskouski
Date:
Subject: Re: CUBE_MAX_DIM
Next
From: Masahiko Sawada
Date:
Subject: Re: xid wraparound danger due to INDEX_CLEANUP false