Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Date
Msg-id CAFiTN-vQoPJt9jaDeJf3V8A8oeQ2BTg+6U7Suv3hiB4f4azGQg@mail.gmail.com
Whole thread Raw
In response to Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Wed, Aug 19, 2020 at 10:11 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Aug 17, 2020 at 6:29 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> >
> > In last patch v49-0001, there is one issue,  Basically, I have called
> > BufFileFlush in all the cases.  But, ideally, we can not call this if
> > the underlying files are deleted/truncated because those files/blocks
> > might not exist now.  So I think if the truncate position is within
> > the same buffer we just need to adjust the buffer,  otherwise we just
> > need to set the currFile and currOffset to the absolute number and set
> > the pos and nbytes 0.  Attached patch fixes this issue.
> >
>
> Few comments on the latest patch v50-0001-Extend-the-BufFile-interface
> 1.
> +
> + /*
> + * If the truncate point is within existing buffer then we can just
> + * adjust pos-within-buffer, without flushing buffer.  Otherwise,
> + * we don't need to do anything because we have already deleted/truncated
> + * the underlying files.
> + */
> + if (curFile == file->curFile &&
> + curOffset >= file->curOffset &&
> + curOffset <= file->curOffset + file->nbytes)
> + {
> + file->pos = (int) (curOffset - file->curOffset);
> + return;
> + }
>
> I think in this case you have set the position correctly but what
> about file->nbytes? In BufFileSeek, it was okay not to update 'nbytes'
> because the contents of the buffer are still valid but I don't think
> the same is true here.

Right, I think we need to set nbytes to new file->pos as shown below

> + file->pos = (int) (curOffset - file->curOffset);
>  file->nbytes = file->pos


> 2.
> + int curFile = file->curFile;
> + off_t curOffset = file->curOffset;
>
> I find the previous naming (newFile, newOffset) was better as it
> distinguishes them from BufFile variables.

Ok

> 3.
> +void
> +SharedFileSetUnregister(SharedFileSet *input_fileset)
> +{
> ..
> + /* Delete all files in the set */
> + SharedFileSetDeleteAll(input_fileset);
> ..
> }
>
> I am not sure if this is completely correct because we call this
> function (SharedFileSetUnregister) from BufFileDeleteShared which
> would have already removed all the required files. This raises the
> question in my mind whether it is correct to call
> SharedFileSetUnregister from BufFileDeleteShared from the API
> perspective as one might not want to remove the entire fileset at that
> point of time. It will work for your use case (where while removing
> buffile you also want to remove the entire fileset) but not sure if it
> is generic enough. For your case, I wonder if we can directly call
> SharedFileSetDeleteAll and we can have a call like
> SharedFileSetUnregister which will be called from it.

Yeah this make more sense to me that we can directly call
SharedFileSetDeleteAll, instead of calling BufFileDeleteShared and we
can call SharedFileSetUnregister from SharedFileSetDeleteAll.

I will make these changes and send the patch after some testing.




--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Masahiko Sawada
Date:
Subject: Re: display offset along with block number in vacuum errors
Next
From: Dilip Kumar
Date:
Subject: Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions