Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions
Date
Msg-id CAFiTN-tHpd+zXVemo9WqQUJS50p9m8jD=AWjsugKZQ4F-K8Pbw@mail.gmail.com
Whole thread Raw
In response to Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions
List pgsql-hackers
On Thu, May 28, 2020 at 5:22 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, May 27, 2020 at 8:22 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Tue, May 26, 2020 at 7:45 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > >
> > > Okay, sending again.
> >
> > While reviewing/testing I have found a couple of problems in 0005 and
> > 0006 which I have fixed in the attached version.
> >
>
> I haven't reviewed the new fixes yet but I have some comments on
> 0008-Add-support-for-streaming-to-built-in-replicatio.patch.
> 1.
> I think the temporary files (and or handles) used for storing the
> information of changes and subxacts are getting leaked in the patch.
> At some places, it is taken care to close the file but cases like
> apply_handle_stream_commit where if any error occurred in
> apply_dispatch(), the file might not get closed.  The other place is
> in apply_handle_stream_abort() where if there is an error in ftruncate
> the file won't be closed.   Now, the bigger problem is with changes
> related file which is opened in apply_handle_stream_start and closed
> in apply_handle_stream_stop and if there is any error in-between, we
> won't close it.
>
> OTOH, I think the worker will exit on an error so it might not matter
> but then why we are at few other places we are closing it before the
> error?  I think on error these temporary files should be removed
> instead of relying on them to get removed next time when we receive
> changes for the same transaction which I feel is what we do in other
> cases where we use temporary files like for sorts or hashjoins.
>
> Also, what if the changes file size overflows "OS file size limit"?
> If we agree that the above are problems then do you think we should
> explore using BufFile interface (see storage/file/buffile.c) to avoid
> all such problems?

I also think that the file size is a problem.  I think we can use
BufFile with some modifications.  We can not use the
BufFileCreateTemp, because of few reasons
1) files get deleted on close, but we have to open/close on every
stream start/stop.
2) even if we try to avoid closing we need to the BufFile pointers
(which take 8192k per file) because there is no option to pass the
file name.

I thin for our use case BufFileCreateShared is more suitable.  I think
we need to do some modifications so that we can use these apps without
SharedFileSet. Otherwise, we need to unnecessarily need to create
SharedFileSet for each transaction and also need to maintain it in xid
array or xid hash until transaction commit/abort.  So I suggest
following modifications in shared files set so that we can
conveniently use it.
1. ChooseTablespace(const SharedFileSet fileset, const char name)
  if fileset is NULL then select the DEFAULTTABLESPACEOID
2. SharedFileSetPath(char path, SharedFileSet fileset, Oid tablespace)
If fileset is NULL then in directory path we can use MyProcPID or
something instead of fileset->creator_pid.
3. Pass some parameter to BufFileOpenShared, so that it can open the
file in RW mode instead of read-only mode.


> 2.
> apply_handle_stream_abort()
> {
> ..
> + /* discard the subxacts added later */
> + nsubxacts = subidx;
> +
> + /* write the updated subxact list */
> + subxact_info_write(MyLogicalRepWorker->subid, xid);
> ..
> }
>
> Here, if subxacts becomes zero, then also subxact_info_write will
> create a new file and write checksum.

How, will it create the new file, in fact it will write nsubxacts as 0
in the existing file, and I think we need to do that right so that in
next open we will know that the nsubxact is 0.

  I think subxact_info_write
> should have a check for nsubxacts > 0 before writing to the file.

But, even if nsubxacts become 0 we want to write the file so that we
can overwrite the previous info.

> 3.
> apply_handle_stream_commit(StringInfo s)
> {
> ..
> + /*
> + * send feedback to upstream
> + *
> + * XXX Probably should send a valid LSN. But which one?
> + */
> + send_feedback(InvalidXLogRecPtr, false, false);
> ..
> }
>
> Why do we need to send the feedback at this stage after applying each
> message?  If we see a non-streamed case, we never send_feedback after
> each message. So, following that, I don't see the need to send it here
> but if you see any specific reason then do let me know?  And if we
> have to send feedback, then we need to decide the appropriate values
> as well.

Let me put more thought on this and then I will revert back to you.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Fujii Masao
Date:
Subject: Re: elog(DEBUG2 in SpinLocked section.
Next
From: Amit Kapila
Date:
Subject: Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions