Thread: Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

From
Andres Freund
Date:
Hi,

On 2021-08-12 11:46:09 +0530, Amit Kapila wrote:
> On Thu, Aug 12, 2021 at 11:38 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > On Thu, Aug 12, 2021 at 7:39 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > It seems to me that moving the shared fileset cleanup to
> > > before_shmem_exit() is the right approach to fix this problem. The
> > > issue is fixed by the attached patch.
> >
> > +1, the fix makes sense to me.

I'm not so sure. Why does sharedfileset have its own proc exit hook in the
first place? ISTM that this should be dealt with using resowners, rathers than
a sharedfileset specific mechanism?

That said, I think it's fine to go for the ordering change in the short term.


> I have also tested and fix works for me. The fix works because
> pgstat_initialize() is called before we register clean up in
> SharedFileSetInit(). I am not sure if we need an Assert to ensure that
> and if so how we can do that? Any suggestions?

I don't think we need to assert that - we'd see failures soon enough if
that rule were violated...

Greetings,

Andres Freund



Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

From
Amit Kapila
Date:
On Thu, Aug 12, 2021 at 1:52 PM Andres Freund <andres@anarazel.de> wrote:
>
> On 2021-08-12 11:46:09 +0530, Amit Kapila wrote:
> > On Thu, Aug 12, 2021 at 11:38 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > On Thu, Aug 12, 2021 at 7:39 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > It seems to me that moving the shared fileset cleanup to
> > > > before_shmem_exit() is the right approach to fix this problem. The
> > > > issue is fixed by the attached patch.
> > >
> > > +1, the fix makes sense to me.
>
> I'm not so sure. Why does sharedfileset have its own proc exit hook in the
> first place? ISTM that this should be dealt with using resowners, rathers than
> a sharedfileset specific mechanism?
>

The underlying temporary files need to be closed at xact end but need
to survive across transactions. These are registered with the resource
owner via PathNameOpenTemporaryFile/PathNameCreateTemporaryFile and
then closed at xact end. So, we need a way to remove the files used by
the process (apply worker in this particular case) before process exit
and used this proc_exit hook (possibly on the lines of
AtProcExit_Files).

> That said, I think it's fine to go for the ordering change in the short term.
>
>
> > I have also tested and fix works for me. The fix works because
> > pgstat_initialize() is called before we register clean up in
> > SharedFileSetInit(). I am not sure if we need an Assert to ensure that
> > and if so how we can do that? Any suggestions?
>
> I don't think we need to assert that - we'd see failures soon enough if
> that rule were violated...
>

Fair enough.

-- 
With Regards,
Amit Kapila.



Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

From
Amit Kapila
Date:
On Thu, Aug 12, 2021 at 1:52 PM Andres Freund <andres@anarazel.de> wrote:
>
> On 2021-08-12 11:46:09 +0530, Amit Kapila wrote:
> > On Thu, Aug 12, 2021 at 11:38 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > On Thu, Aug 12, 2021 at 7:39 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > It seems to me that moving the shared fileset cleanup to
> > > > before_shmem_exit() is the right approach to fix this problem. The
> > > > issue is fixed by the attached patch.
> > >
> > > +1, the fix makes sense to me.
>
> I'm not so sure. Why does sharedfileset have its own proc exit hook in the
> first place? ISTM that this should be dealt with using resowners, rathers than
> a sharedfileset specific mechanism?
>

The underlying temporary files need to be closed at xact end but need
to survive across transactions. These are registered with the resource
owner via PathNameOpenTemporaryFile/PathNameCreateTemporaryFile and
then closed at xact end. So, we need a way to remove the files used by
the process (apply worker in this particular case) before process exit
and used this proc_exit hook (possibly on the lines of
AtProcExit_Files).

> That said, I think it's fine to go for the ordering change in the short term.
>
>
> > I have also tested and fix works for me. The fix works because
> > pgstat_initialize() is called before we register clean up in
> > SharedFileSetInit(). I am not sure if we need an Assert to ensure that
> > and if so how we can do that? Any suggestions?
>
> I don't think we need to assert that - we'd see failures soon enough if
> that rule were violated...
>

Fair enough.

-- 
With Regards,
Amit Kapila.



Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

From
Andres Freund
Date:
Hi,

On 2021-08-12 15:06:23 +0530, Amit Kapila wrote:
> On Thu, Aug 12, 2021 at 1:52 PM Andres Freund <andres@anarazel.de> wrote:
> > I'm not so sure. Why does sharedfileset have its own proc exit hook in the
> > first place? ISTM that this should be dealt with using resowners, rathers than
> > a sharedfileset specific mechanism?
> >

> The underlying temporary files need to be closed at xact end but need
> to survive across transactions.

Why do they need to be closed at xact end? To avoid wasting memory due to too
many buffered files?


> These are registered with the resource owner via
> PathNameOpenTemporaryFile/PathNameCreateTemporaryFile and then closed
> at xact end. So, we need a way to remove the files used by the process
> (apply worker in this particular case) before process exit and used
> this proc_exit hook (possibly on the lines of AtProcExit_Files).

What I'm wondering is why it is a good idea to have a SharedFileSet specific
cleanup mechanism. One that only operates on process lifetime level, rather
than something more granular. I get that the of the files here needs to be
longer than a transaction, but that can easily be addressed by having a longer
lived resource owner.

Process lifetime may work well for the current worker.c, but even there it
doesn't seem optimal. One e.g. could easily imagine that we'd want to handle
connection errors or configuration changes without restarting the worker, in
which case process lifetime obviously isn't a good idea anymore.


I think SharedFileSetInit() needs a comment explaining that it needs to be
called in a process-lifetime memory context if used without dsm
segments. Because otherwise SharedFileSetDeleteOnProcExit() will access
already freed memory (both for filesetlist and the SharedFileSet itself).

Greetings,

Andres Freund



Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

From
Andres Freund
Date:
Hi,

On 2021-08-12 15:06:23 +0530, Amit Kapila wrote:
> On Thu, Aug 12, 2021 at 1:52 PM Andres Freund <andres@anarazel.de> wrote:
> > I'm not so sure. Why does sharedfileset have its own proc exit hook in the
> > first place? ISTM that this should be dealt with using resowners, rathers than
> > a sharedfileset specific mechanism?
> >

> The underlying temporary files need to be closed at xact end but need
> to survive across transactions.

Why do they need to be closed at xact end? To avoid wasting memory due to too
many buffered files?


> These are registered with the resource owner via
> PathNameOpenTemporaryFile/PathNameCreateTemporaryFile and then closed
> at xact end. So, we need a way to remove the files used by the process
> (apply worker in this particular case) before process exit and used
> this proc_exit hook (possibly on the lines of AtProcExit_Files).

What I'm wondering is why it is a good idea to have a SharedFileSet specific
cleanup mechanism. One that only operates on process lifetime level, rather
than something more granular. I get that the of the files here needs to be
longer than a transaction, but that can easily be addressed by having a longer
lived resource owner.

Process lifetime may work well for the current worker.c, but even there it
doesn't seem optimal. One e.g. could easily imagine that we'd want to handle
connection errors or configuration changes without restarting the worker, in
which case process lifetime obviously isn't a good idea anymore.


I think SharedFileSetInit() needs a comment explaining that it needs to be
called in a process-lifetime memory context if used without dsm
segments. Because otherwise SharedFileSetDeleteOnProcExit() will access
already freed memory (both for filesetlist and the SharedFileSet itself).

Greetings,

Andres Freund



Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

From
Andres Freund
Date:
Hi,

On 2021-08-12 05:48:19 -0700, Andres Freund wrote:
> I think SharedFileSetInit() needs a comment explaining that it needs to be
> called in a process-lifetime memory context if used without dsm
> segments. Because otherwise SharedFileSetDeleteOnProcExit() will access
> already freed memory (both for filesetlist and the SharedFileSet itself).

Oh. And I think it's not ok that SharedFileSetDeleteAll() unconditionally does
SharedFileSetUnregister(). SharedFileSetUnregister() asserts out if there's no
match, but DSM based sets are never entered into filesetlist. So one cannot
have a non-DSM and DSM set coexisting. Which seems surprising.

Greetings,

Andres Freund



Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

From
Andres Freund
Date:
Hi,

On 2021-08-12 05:48:19 -0700, Andres Freund wrote:
> I think SharedFileSetInit() needs a comment explaining that it needs to be
> called in a process-lifetime memory context if used without dsm
> segments. Because otherwise SharedFileSetDeleteOnProcExit() will access
> already freed memory (both for filesetlist and the SharedFileSet itself).

Oh. And I think it's not ok that SharedFileSetDeleteAll() unconditionally does
SharedFileSetUnregister(). SharedFileSetUnregister() asserts out if there's no
match, but DSM based sets are never entered into filesetlist. So one cannot
have a non-DSM and DSM set coexisting. Which seems surprising.

Greetings,

Andres Freund



Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

From
Amit Kapila
Date:
On Thu, Aug 12, 2021 at 6:18 PM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2021-08-12 15:06:23 +0530, Amit Kapila wrote:
> > On Thu, Aug 12, 2021 at 1:52 PM Andres Freund <andres@anarazel.de> wrote:
> > > I'm not so sure. Why does sharedfileset have its own proc exit hook in the
> > > first place? ISTM that this should be dealt with using resowners, rathers than
> > > a sharedfileset specific mechanism?
> > >
>
> > The underlying temporary files need to be closed at xact end but need
> > to survive across transactions.
>
> Why do they need to be closed at xact end? To avoid wasting memory due to too
> many buffered files?
>

Yes.

>
> > These are registered with the resource owner via
> > PathNameOpenTemporaryFile/PathNameCreateTemporaryFile and then closed
> > at xact end. So, we need a way to remove the files used by the process
> > (apply worker in this particular case) before process exit and used
> > this proc_exit hook (possibly on the lines of AtProcExit_Files).
>
> What I'm wondering is why it is a good idea to have a SharedFileSet specific
> cleanup mechanism. One that only operates on process lifetime level, rather
> than something more granular. I get that the of the files here needs to be
> longer than a transaction, but that can easily be addressed by having a longer
> lived resource owner.
>
> Process lifetime may work well for the current worker.c, but even there it
> doesn't seem optimal. One e.g. could easily imagine that we'd want to handle
> connection errors or configuration changes without restarting the worker, in
> which case process lifetime obviously isn't a good idea anymore.
>

I don't deny that we can't make this at a more granular level. IIRC,
at that time, we tried to follow AtProcExit_Files which cleans up temp
files at proc exit and we needed something similar for temporary files
used via SharedFileSet. I think we can extend this API but I guess it
is better to then do it for dsm-based as well so that these get
tracked via resowner.

>
> I think SharedFileSetInit() needs a comment explaining that it needs to be
> called in a process-lifetime memory context if used without dsm
> segments.
>

We already have a comment about proc_exit clean up of files but will
extend that a bit about memory context.

-- 
With Regards,
Amit Kapila.



Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

From
Amit Kapila
Date:
On Thu, Aug 12, 2021 at 6:18 PM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2021-08-12 15:06:23 +0530, Amit Kapila wrote:
> > On Thu, Aug 12, 2021 at 1:52 PM Andres Freund <andres@anarazel.de> wrote:
> > > I'm not so sure. Why does sharedfileset have its own proc exit hook in the
> > > first place? ISTM that this should be dealt with using resowners, rathers than
> > > a sharedfileset specific mechanism?
> > >
>
> > The underlying temporary files need to be closed at xact end but need
> > to survive across transactions.
>
> Why do they need to be closed at xact end? To avoid wasting memory due to too
> many buffered files?
>

Yes.

>
> > These are registered with the resource owner via
> > PathNameOpenTemporaryFile/PathNameCreateTemporaryFile and then closed
> > at xact end. So, we need a way to remove the files used by the process
> > (apply worker in this particular case) before process exit and used
> > this proc_exit hook (possibly on the lines of AtProcExit_Files).
>
> What I'm wondering is why it is a good idea to have a SharedFileSet specific
> cleanup mechanism. One that only operates on process lifetime level, rather
> than something more granular. I get that the of the files here needs to be
> longer than a transaction, but that can easily be addressed by having a longer
> lived resource owner.
>
> Process lifetime may work well for the current worker.c, but even there it
> doesn't seem optimal. One e.g. could easily imagine that we'd want to handle
> connection errors or configuration changes without restarting the worker, in
> which case process lifetime obviously isn't a good idea anymore.
>

I don't deny that we can't make this at a more granular level. IIRC,
at that time, we tried to follow AtProcExit_Files which cleans up temp
files at proc exit and we needed something similar for temporary files
used via SharedFileSet. I think we can extend this API but I guess it
is better to then do it for dsm-based as well so that these get
tracked via resowner.

>
> I think SharedFileSetInit() needs a comment explaining that it needs to be
> called in a process-lifetime memory context if used without dsm
> segments.
>

We already have a comment about proc_exit clean up of files but will
extend that a bit about memory context.

-- 
With Regards,
Amit Kapila.



Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

From
Amit Kapila
Date:
On Thu, Aug 12, 2021 at 6:24 PM Andres Freund <andres@anarazel.de> wrote:
>
> On 2021-08-12 05:48:19 -0700, Andres Freund wrote:
> > I think SharedFileSetInit() needs a comment explaining that it needs to be
> > called in a process-lifetime memory context if used without dsm
> > segments. Because otherwise SharedFileSetDeleteOnProcExit() will access
> > already freed memory (both for filesetlist and the SharedFileSet itself).
>
> Oh. And I think it's not ok that SharedFileSetDeleteAll() unconditionally does
> SharedFileSetUnregister(). SharedFileSetUnregister() asserts out if there's no
> match, but DSM based sets are never entered into filesetlist. So one cannot
> have a non-DSM and DSM set coexisting. Which seems surprising.
>

Oops, it should be allowed to have both non-DSM and DSM set
coexisting. I think we can remove Assert from
SharedFileSetUnregister(). The other way could be to pass a parameter
to SharedFileSetDeleteAll() to tell whether to unregister or not.

-- 
With Regards,
Amit Kapila.



Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

From
Amit Kapila
Date:
On Thu, Aug 12, 2021 at 6:24 PM Andres Freund <andres@anarazel.de> wrote:
>
> On 2021-08-12 05:48:19 -0700, Andres Freund wrote:
> > I think SharedFileSetInit() needs a comment explaining that it needs to be
> > called in a process-lifetime memory context if used without dsm
> > segments. Because otherwise SharedFileSetDeleteOnProcExit() will access
> > already freed memory (both for filesetlist and the SharedFileSet itself).
>
> Oh. And I think it's not ok that SharedFileSetDeleteAll() unconditionally does
> SharedFileSetUnregister(). SharedFileSetUnregister() asserts out if there's no
> match, but DSM based sets are never entered into filesetlist. So one cannot
> have a non-DSM and DSM set coexisting. Which seems surprising.
>

Oops, it should be allowed to have both non-DSM and DSM set
coexisting. I think we can remove Assert from
SharedFileSetUnregister(). The other way could be to pass a parameter
to SharedFileSetDeleteAll() to tell whether to unregister or not.

-- 
With Regards,
Amit Kapila.



Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

From
Andres Freund
Date:
Hi,

(dropping -committers to avoid moderation stalls due xposting to multiple lists -
I find that more annoying than helpful)

On 2021-08-13 14:38:37 +0530, Amit Kapila wrote:
> > What I'm wondering is why it is a good idea to have a SharedFileSet specific
> > cleanup mechanism. One that only operates on process lifetime level, rather
> > than something more granular. I get that the of the files here needs to be
> > longer than a transaction, but that can easily be addressed by having a longer
> > lived resource owner.
> >
> > Process lifetime may work well for the current worker.c, but even there it
> > doesn't seem optimal. One e.g. could easily imagine that we'd want to handle
> > connection errors or configuration changes without restarting the worker, in
> > which case process lifetime obviously isn't a good idea anymore.
> >
> 
> I don't deny that we can't make this at a more granular level. IIRC,
> at that time, we tried to follow AtProcExit_Files which cleans up temp
> files at proc exit and we needed something similar for temporary files
> used via SharedFileSet.

The comparison to AtProcExit_Files isn't convincing to me - normally temp
files are cleaned up long before that via resowner cleanup.

To me the reuse of SharedFileSet for worker.c as executed seems like a bad
design. As things stand there's little code shared between dsm/non-dsm shared
file sets. The cleanup semantics are entirely different. Several functions
don't work if used on the "wrong kind" of set (e.g. SharedFileSetAttach()).


> I think we can extend this API but I guess it is better to then do it
> for dsm-based as well so that these get tracked via resowner.

DSM segments are resowner managed already, so it's not obvious that that'd buy
us much? Although I guess there could be a few edge cases that'd look cleaner,
because we could reliably trigger cleanup in the leader instead of relying on
dsm detach hooks + refcounts to manage when a set is physically deleted?

Greetings,

Andres Freund



Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

From
Dilip Kumar
Date:
On Fri, Aug 13, 2021 at 9:29 PM Andres Freund <andres@anarazel.de> wrote:

> > I think we can extend this API but I guess it is better to then do it
> > for dsm-based as well so that these get tracked via resowner.
>
> DSM segments are resowner managed already, so it's not obvious that that'd buy
> us much? Although I guess there could be a few edge cases that'd look cleaner,
> because we could reliably trigger cleanup in the leader instead of relying on
> dsm detach hooks + refcounts to manage when a set is physically deleted?
>

In an off-list discussion with Thomas and Amit, we tried to discuss
how to clean up the shared files set in the current use case.  Thomas
suggested that instead of using individual shared fileset for storing
the data for each XID why don't we just create a single shared fileset
for complete worker lifetime and when the worker is exiting we can
just remove that shared fileset.  And for storing XID data, we can
just create the files under the same shared fileset and delete those
files when we longer need them.  I like this idea and it looks much
cleaner, after this, we can get rid of the special cleanup mechanism
using 'filesetlist'.  I have attached a patch for the same.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachment

Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

From
Amit Kapila
Date:
On Mon, Aug 16, 2021 at 8:18 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Fri, Aug 13, 2021 at 9:29 PM Andres Freund <andres@anarazel.de> wrote:
>
> > > I think we can extend this API but I guess it is better to then do it
> > > for dsm-based as well so that these get tracked via resowner.
> >
> > DSM segments are resowner managed already, so it's not obvious that that'd buy
> > us much? Although I guess there could be a few edge cases that'd look cleaner,
> > because we could reliably trigger cleanup in the leader instead of relying on
> > dsm detach hooks + refcounts to manage when a set is physically deleted?
> >
>
> In an off-list discussion with Thomas and Amit, we tried to discuss
> how to clean up the shared files set in the current use case.  Thomas
> suggested that instead of using individual shared fileset for storing
> the data for each XID why don't we just create a single shared fileset
> for complete worker lifetime and when the worker is exiting we can
> just remove that shared fileset.  And for storing XID data, we can
> just create the files under the same shared fileset and delete those
> files when we longer need them.  I like this idea and it looks much
> cleaner, after this, we can get rid of the special cleanup mechanism
> using 'filesetlist'.  I have attached a patch for the same.
>

It seems to me that this idea would obviate any need for resource
owners as we will have only one fileset now. I have a few initial
comments on the patch:

1.
+ /* do cleanup on worker exit (e.g. after DROP SUBSCRIPTION) */
+ on_shmem_exit(worker_cleanup, (Datum) 0);

It should be registered with before_shmem_exit() hook to allow sending
stats for file removal.

2. After these changes, the comments atop stream_open_file and
SharedFileSetInit need to be changed.

3. In function subxact_info_write(), we are computing subxact file
path twice which doesn't seem to be required.

4.
+ if (missing_ok)
+ return NULL;
  ereport(ERROR,
  (errcode_for_file_access(),
- errmsg("could not open temporary file \"%s\" from BufFile \"%s\": %m",
+ errmsg("could not open temporary file \"%s\" from BufFile \"%s\": %m",
  segment_name, name)));

There seems to be a formatting issue with errmsg. Also, it is better
to keep an empty line before ereport.

5. How can we provide a strict mechanism to not allow to use dsm APIs
for non-dsm FileSet? One idea could be that we can have a variable
(probably bool) in SharedFileSet structure which will be initialized
in SharedFileSetInit based on whether the caller has provided dsm
segment. Then in other DSM-based APIs, we can check if it is used for
the wrong type.

-- 
With Regards,
Amit Kapila.



Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

From
Amit Kapila
Date:
On Tue, Aug 17, 2021 at 10:54 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Aug 16, 2021 at 8:18 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Fri, Aug 13, 2021 at 9:29 PM Andres Freund <andres@anarazel.de> wrote:
> >
> > > > I think we can extend this API but I guess it is better to then do it
> > > > for dsm-based as well so that these get tracked via resowner.
> > >
> > > DSM segments are resowner managed already, so it's not obvious that that'd buy
> > > us much? Although I guess there could be a few edge cases that'd look cleaner,
> > > because we could reliably trigger cleanup in the leader instead of relying on
> > > dsm detach hooks + refcounts to manage when a set is physically deleted?
> > >
> >
> > In an off-list discussion with Thomas and Amit, we tried to discuss
> > how to clean up the shared files set in the current use case.  Thomas
> > suggested that instead of using individual shared fileset for storing
> > the data for each XID why don't we just create a single shared fileset
> > for complete worker lifetime and when the worker is exiting we can
> > just remove that shared fileset.  And for storing XID data, we can
> > just create the files under the same shared fileset and delete those
> > files when we longer need them.  I like this idea and it looks much
> > cleaner, after this, we can get rid of the special cleanup mechanism
> > using 'filesetlist'.  I have attached a patch for the same.
> >
>
> It seems to me that this idea would obviate any need for resource
> owners as we will have only one fileset now. I have a few initial
> comments on the patch:
>

One more comment:
@@ -2976,39 +2952,17 @@ subxact_info_write(Oid subid, TransactionId xid)
..
+ /* Try to open the subxact file, if it doesn't exist then create it */
+ fd = BufFileOpenShared(xidfileset, path, O_RDWR, true);
+ if (fd == NULL)
+ fd = BufFileCreateShared(xidfileset, path);
..

Instead of trying to create the file here based on whether it exists
or not, can't we create it in subxact_info_add where we are first time
allocating memory for subxacts? If that works then in the above code,
the file should always exist.

-- 
With Regards,
Amit Kapila.



Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

From
Dilip Kumar
Date:
On Tue, Aug 17, 2021 at 12:06 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

> One more comment:
> @@ -2976,39 +2952,17 @@ subxact_info_write(Oid subid, TransactionId xid)
> ..
> + /* Try to open the subxact file, if it doesn't exist then create it */
> + fd = BufFileOpenShared(xidfileset, path, O_RDWR, true);
> + if (fd == NULL)
> + fd = BufFileCreateShared(xidfileset, path);
> ..
>
> Instead of trying to create the file here based on whether it exists
> or not, can't we create it in subxact_info_add where we are first time
> allocating memory for subxacts? If that works then in the above code,
> the file should always exist.

One problem with this approach is that for now we delay creating the
subxact file till the end of the stream and if by end of the stream
all the subtransactions got aborted within the same stream then we
don't even create that file.  But with this suggestion, we will always
create the file as soon as we get the first subtransaction.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

From
Dilip Kumar
Date:
On Tue, Aug 17, 2021 at 10:54 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Aug 16, 2021 at 8:18 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Fri, Aug 13, 2021 at 9:29 PM Andres Freund <andres@anarazel.de> wrote:
> >
> > > > I think we can extend this API but I guess it is better to then do it
> > > > for dsm-based as well so that these get tracked via resowner.
> > >
> > > DSM segments are resowner managed already, so it's not obvious that that'd buy
> > > us much? Although I guess there could be a few edge cases that'd look cleaner,
> > > because we could reliably trigger cleanup in the leader instead of relying on
> > > dsm detach hooks + refcounts to manage when a set is physically deleted?
> > >
> >
> > In an off-list discussion with Thomas and Amit, we tried to discuss
> > how to clean up the shared files set in the current use case.  Thomas
> > suggested that instead of using individual shared fileset for storing
> > the data for each XID why don't we just create a single shared fileset
> > for complete worker lifetime and when the worker is exiting we can
> > just remove that shared fileset.  And for storing XID data, we can
> > just create the files under the same shared fileset and delete those
> > files when we longer need them.  I like this idea and it looks much
> > cleaner, after this, we can get rid of the special cleanup mechanism
> > using 'filesetlist'.  I have attached a patch for the same.
> >
>
> It seems to me that this idea would obviate any need for resource
> owners as we will have only one fileset now. I have a few initial
> comments on the patch:
>
> 1.
> + /* do cleanup on worker exit (e.g. after DROP SUBSCRIPTION) */
> + on_shmem_exit(worker_cleanup, (Datum) 0);
>
> It should be registered with before_shmem_exit() hook to allow sending
> stats for file removal.

Done

> 2. After these changes, the comments atop stream_open_file and
> SharedFileSetInit need to be changed.

Done

> 3. In function subxact_info_write(), we are computing subxact file
> path twice which doesn't seem to be required.

Fixed

> 4.
> + if (missing_ok)
> + return NULL;
>   ereport(ERROR,
>   (errcode_for_file_access(),
> - errmsg("could not open temporary file \"%s\" from BufFile \"%s\": %m",
> + errmsg("could not open temporary file \"%s\" from BufFile \"%s\": %m",
>   segment_name, name)));
>
> There seems to be a formatting issue with errmsg. Also, it is better
> to keep an empty line before ereport.

Done

> 5. How can we provide a strict mechanism to not allow to use dsm APIs
> for non-dsm FileSet? One idea could be that we can have a variable
> (probably bool) in SharedFileSet structure which will be initialized
> in SharedFileSetInit based on whether the caller has provided dsm
> segment. Then in other DSM-based APIs, we can check if it is used for
> the wrong type.

Yeah, we can do something like that, can't we just use an existing
variable instead of adding new, e.g. refcnt is required only when
multiple processes are attached, so maybe if dsm segment is not passed
then we can keep refcnt as 0 and based on we can give an error.  For
example, if we try to call SharedFileSetAttach for the SharedFileSet
which has refcnt as 0 then we error out?

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachment

Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

From
Amit Kapila
Date:
On Tue, Aug 17, 2021 at 1:30 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Tue, Aug 17, 2021 at 10:54 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
>
> > 5. How can we provide a strict mechanism to not allow to use dsm APIs
> > for non-dsm FileSet? One idea could be that we can have a variable
> > (probably bool) in SharedFileSet structure which will be initialized
> > in SharedFileSetInit based on whether the caller has provided dsm
> > segment. Then in other DSM-based APIs, we can check if it is used for
> > the wrong type.
>
> Yeah, we can do something like that, can't we just use an existing
> variable instead of adding new, e.g. refcnt is required only when
> multiple processes are attached, so maybe if dsm segment is not passed
> then we can keep refcnt as 0 and based on we can give an error.  For
> example, if we try to call SharedFileSetAttach for the SharedFileSet
> which has refcnt as 0 then we error out?
>

But as of now, we treat refcnt as 0 for SharedFileSet that is already
destroyed. See SharedFileSetAttach.

-- 
With Regards,
Amit Kapila.



Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

From
Andres Freund
Date:
Hi,

On 2021-08-17 10:54:30 +0530, Amit Kapila wrote:
> 5. How can we provide a strict mechanism to not allow to use dsm APIs
> for non-dsm FileSet? One idea could be that we can have a variable
> (probably bool) in SharedFileSet structure which will be initialized
> in SharedFileSetInit based on whether the caller has provided dsm
> segment. Then in other DSM-based APIs, we can check if it is used for
> the wrong type.

Well, isn't the issue here that it's not a shared file set in case you
explicitly don't want to share it? ISTM that the proper way to address
this would be to split out a FileSet from SharedFileSet that's then used
for worker.c and sharedfileset.c. Rather than making sharedfileset.c
support a non-shared mode.

Greetings,

Andres Freund



RE: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

From
"houzj.fnst@fujitsu.com"
Date:
Hi,

I took a quick look at the v2 patch and noticed a typo.

+ * backends and render it read-only.  If missing_ok is true then it will return
+ * NULL if file doesn not exist otherwise error.
  */
doesn not=> doesn't

Best regards,
Houzj



RE: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

From
"houzj.fnst@fujitsu.com"
Date:
On Wed, Aug 18, 2021 9:17 AM houzj.fnst@fujitsu.com wrote:
> Hi,
> 
> I took a quick look at the v2 patch and noticed a typo.
> 
> + * backends and render it read-only.  If missing_ok is true then it will return
> + * NULL if file doesn not exist otherwise error.
>   */
> doesn not=> doesn't
>

Here are some other comments:

1).
+BufFileOpenShared(SharedFileSet *fileset, const char *name, int mode,
+                  bool missing_ok)
 {
     BufFile    *file;
     char        segment_name[MAXPGPATH];
    ...
    files = palloc(sizeof(File) * capacity);
    ...
@@ -318,10 +320,15 @@ BufFileOpenShared(SharedFileSet *fileset, const char *name, int mode)
      * name.
      */
     if (nfiles == 0)
+    {
+        if (missing_ok)
+            return NULL;
+

I think it might be better to pfree(files) when return NULL.


2).
     /* Delete the subxact file and release the memory, if it exist */
-    if (ent->subxact_fileset)
-    {
-        subxact_filename(path, subid, xid);
-        SharedFileSetDeleteAll(ent->subxact_fileset);
-        pfree(ent->subxact_fileset);
-        ent->subxact_fileset = NULL;
-    }
-
-    /* Remove the xid entry from the stream xid hash */
-    hash_search(xidhash, (void *) &xid, HASH_REMOVE, NULL);
+    subxact_filename(path, subid, xid);
+    SharedFileSetDelete(xidfileset, path, true);

Without the patch it doesn't throw an error if not exist,
But with the patch, it pass error_on_failure=true to SharedFileSetDelete().
Was it intentional ?

Best regards,
Houzj



Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

From
Amit Kapila
Date:
On Wed, Aug 18, 2021 at 8:23 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Wed, Aug 18, 2021 9:17 AM houzj.fnst@fujitsu.com wrote:
> > Hi,
> >
> > I took a quick look at the v2 patch and noticed a typo.
> >
> > + * backends and render it read-only.  If missing_ok is true then it will return
> > + * NULL if file doesn not exist otherwise error.
> >   */
> > doesn not=> doesn't
> >
>
> Here are some other comments:
>
> 1).
> +BufFileOpenShared(SharedFileSet *fileset, const char *name, int mode,
> +                                 bool missing_ok)
>  {
>         BufFile    *file;
>         char            segment_name[MAXPGPATH];
>         ...
>         files = palloc(sizeof(File) * capacity);
>         ...
> @@ -318,10 +320,15 @@ BufFileOpenShared(SharedFileSet *fileset, const char *name, int mode)
>          * name.
>          */
>         if (nfiles == 0)
> +       {
> +               if (missing_ok)
> +                       return NULL;
> +
>
> I think it might be better to pfree(files) when return NULL.
>
>
> 2).
>         /* Delete the subxact file and release the memory, if it exist */
> -       if (ent->subxact_fileset)
> -       {
> -               subxact_filename(path, subid, xid);
> -               SharedFileSetDeleteAll(ent->subxact_fileset);
> -               pfree(ent->subxact_fileset);
> -               ent->subxact_fileset = NULL;
> -       }
> -
> -       /* Remove the xid entry from the stream xid hash */
> -       hash_search(xidhash, (void *) &xid, HASH_REMOVE, NULL);
> +       subxact_filename(path, subid, xid);
> +       SharedFileSetDelete(xidfileset, path, true);
>
> Without the patch it doesn't throw an error if not exist,
> But with the patch, it pass error_on_failure=true to SharedFileSetDelete().
>

Don't we need to use BufFileDeleteShared instead of
SharedFileSetDelete as you have used to remove the changes file?

-- 
With Regards,
Amit Kapila.



Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

From
Dilip Kumar
Date:
On Wed, Aug 18, 2021 at 9:30 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Aug 18, 2021 at 8:23 AM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> >
> > On Wed, Aug 18, 2021 9:17 AM houzj.fnst@fujitsu.com wrote:
> > > Hi,
> > >
> > > I took a quick look at the v2 patch and noticed a typo.
> > >
> > > + * backends and render it read-only.  If missing_ok is true then it will return
> > > + * NULL if file doesn not exist otherwise error.
> > >   */
> > > doesn not=> doesn't
> > >
> >
> > Here are some other comments:
> >
> > 1).
> > +BufFileOpenShared(SharedFileSet *fileset, const char *name, int mode,
> > +                                 bool missing_ok)
> >  {
> >         BufFile    *file;
> >         char            segment_name[MAXPGPATH];
> >         ...
> >         files = palloc(sizeof(File) * capacity);
> >         ...
> > @@ -318,10 +320,15 @@ BufFileOpenShared(SharedFileSet *fileset, const char *name, int mode)
> >          * name.
> >          */
> >         if (nfiles == 0)
> > +       {
> > +               if (missing_ok)
> > +                       return NULL;
> > +
> >
> > I think it might be better to pfree(files) when return NULL.
> >
> >
> > 2).
> >         /* Delete the subxact file and release the memory, if it exist */
> > -       if (ent->subxact_fileset)
> > -       {
> > -               subxact_filename(path, subid, xid);
> > -               SharedFileSetDeleteAll(ent->subxact_fileset);
> > -               pfree(ent->subxact_fileset);
> > -               ent->subxact_fileset = NULL;
> > -       }
> > -
> > -       /* Remove the xid entry from the stream xid hash */
> > -       hash_search(xidhash, (void *) &xid, HASH_REMOVE, NULL);
> > +       subxact_filename(path, subid, xid);
> > +       SharedFileSetDelete(xidfileset, path, true);
> >
> > Without the patch it doesn't throw an error if not exist,
> > But with the patch, it pass error_on_failure=true to SharedFileSetDelete().
> >
>
> Don't we need to use BufFileDeleteShared instead of
> SharedFileSetDelete as you have used to remove the changes file?

Yeah, it should be BufFileDeleteShared.


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

From
Amit Kapila
Date:
On Tue, Aug 17, 2021 at 4:34 PM Andres Freund <andres@anarazel.de> wrote:
>
> On 2021-08-17 10:54:30 +0530, Amit Kapila wrote:
> > 5. How can we provide a strict mechanism to not allow to use dsm APIs
> > for non-dsm FileSet? One idea could be that we can have a variable
> > (probably bool) in SharedFileSet structure which will be initialized
> > in SharedFileSetInit based on whether the caller has provided dsm
> > segment. Then in other DSM-based APIs, we can check if it is used for
> > the wrong type.
>
> Well, isn't the issue here that it's not a shared file set in case you
> explicitly don't want to share it? ISTM that the proper way to address
> this would be to split out a FileSet from SharedFileSet that's then used
> for worker.c and sharedfileset.c.
>

Okay, but note that to accomplish the same, we need to tweak the
BufFile (buffile.c) APIs as well so that they can work with FileSet.
As per the initial analysis, there doesn't seem to be any problem with
that though.

-- 
With Regards,
Amit Kapila.



Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

From
Dilip Kumar
Date:
On Wed, Aug 18, 2021 at 11:24 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Aug 17, 2021 at 4:34 PM Andres Freund <andres@anarazel.de> wrote:
> >
> > On 2021-08-17 10:54:30 +0530, Amit Kapila wrote:
> > > 5. How can we provide a strict mechanism to not allow to use dsm APIs
> > > for non-dsm FileSet? One idea could be that we can have a variable
> > > (probably bool) in SharedFileSet structure which will be initialized
> > > in SharedFileSetInit based on whether the caller has provided dsm
> > > segment. Then in other DSM-based APIs, we can check if it is used for
> > > the wrong type.
> >
> > Well, isn't the issue here that it's not a shared file set in case you
> > explicitly don't want to share it? ISTM that the proper way to address
> > this would be to split out a FileSet from SharedFileSet that's then used
> > for worker.c and sharedfileset.c.
> >
>
> Okay, but note that to accomplish the same, we need to tweak the
> BufFile (buffile.c) APIs as well so that they can work with FileSet.
> As per the initial analysis, there doesn't seem to be any problem with
> that though.

I was looking into this, so if we want to do that I think the outline
will look like this

- There will be a fileset.c and fileset.h files, and we will expose a
new structure FileSet, which will be the same as SharedFileSet, except
mutext and refcount.  The fileset.c will expose FileSetInit(),
FileSetCreate(), FileSetOpen(), FileSetDelete() and FileSetDeleteAll()
interfaces.

- sharefileset.c will internally call the fileset.c's interfaces.  The
SharedFileSet structure will also contain FileSet and other members
i.e. mutex and refcount.

- the buffile.c's interfaces which are ending with Shared e.g.
BufFileCreateShared, BufFileOpenShared, should be converted to
BufFileCreate and
BufFileOpen respectively.  And the input to these interfaces can be
converted to FileSet instead of SharedFileSet.


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

From
Amit Kapila
Date:
On Wed, Aug 18, 2021 at 3:45 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Wed, Aug 18, 2021 at 11:24 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Aug 17, 2021 at 4:34 PM Andres Freund <andres@anarazel.de> wrote:
> > >
> > > On 2021-08-17 10:54:30 +0530, Amit Kapila wrote:
> > > > 5. How can we provide a strict mechanism to not allow to use dsm APIs
> > > > for non-dsm FileSet? One idea could be that we can have a variable
> > > > (probably bool) in SharedFileSet structure which will be initialized
> > > > in SharedFileSetInit based on whether the caller has provided dsm
> > > > segment. Then in other DSM-based APIs, we can check if it is used for
> > > > the wrong type.
> > >
> > > Well, isn't the issue here that it's not a shared file set in case you
> > > explicitly don't want to share it? ISTM that the proper way to address
> > > this would be to split out a FileSet from SharedFileSet that's then used
> > > for worker.c and sharedfileset.c.
> > >
> >
> > Okay, but note that to accomplish the same, we need to tweak the
> > BufFile (buffile.c) APIs as well so that they can work with FileSet.
> > As per the initial analysis, there doesn't seem to be any problem with
> > that though.
>
> I was looking into this, so if we want to do that I think the outline
> will look like this
>
> - There will be a fileset.c and fileset.h files, and we will expose a
> new structure FileSet, which will be the same as SharedFileSet, except
> mutext and refcount.  The fileset.c will expose FileSetInit(),
> FileSetCreate(), FileSetOpen(), FileSetDelete() and FileSetDeleteAll()
> interfaces.
>
> - sharefileset.c will internally call the fileset.c's interfaces.  The
> SharedFileSet structure will also contain FileSet and other members
> i.e. mutex and refcount.
>
> - the buffile.c's interfaces which are ending with Shared e.g.
> BufFileCreateShared, BufFileOpenShared, should be converted to
> BufFileCreate and
> BufFileOpen respectively.
>

The other alternative to name buffile APIs could be
BufFileCreateFileSet, BufFileOpenFileSet, etc.

-- 
With Regards,
Amit Kapila.



Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

From
Dilip Kumar
Date:
On Wed, Aug 18, 2021 at 3:45 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Wed, Aug 18, 2021 at 11:24 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Aug 17, 2021 at 4:34 PM Andres Freund <andres@anarazel.de> wrote:
> > >
> > > On 2021-08-17 10:54:30 +0530, Amit Kapila wrote:
> > > > 5. How can we provide a strict mechanism to not allow to use dsm APIs
> > > > for non-dsm FileSet? One idea could be that we can have a variable
> > > > (probably bool) in SharedFileSet structure which will be initialized
> > > > in SharedFileSetInit based on whether the caller has provided dsm
> > > > segment. Then in other DSM-based APIs, we can check if it is used for
> > > > the wrong type.
> > >
> > > Well, isn't the issue here that it's not a shared file set in case you
> > > explicitly don't want to share it? ISTM that the proper way to address
> > > this would be to split out a FileSet from SharedFileSet that's then used
> > > for worker.c and sharedfileset.c.
> > >
> >
> > Okay, but note that to accomplish the same, we need to tweak the
> > BufFile (buffile.c) APIs as well so that they can work with FileSet.
> > As per the initial analysis, there doesn't seem to be any problem with
> > that though.
>
> I was looking into this, so if we want to do that I think the outline
> will look like this
>
> - There will be a fileset.c and fileset.h files, and we will expose a
> new structure FileSet, which will be the same as SharedFileSet, except
> mutext and refcount.  The fileset.c will expose FileSetInit(),
> FileSetCreate(), FileSetOpen(), FileSetDelete() and FileSetDeleteAll()
> interfaces.
>
> - sharefileset.c will internally call the fileset.c's interfaces.  The
> SharedFileSet structure will also contain FileSet and other members
> i.e. mutex and refcount.
>
> - the buffile.c's interfaces which are ending with Shared e.g.
> BufFileCreateShared, BufFileOpenShared, should be converted to
> BufFileCreate and
> BufFileOpen respectively.  And the input to these interfaces can be
> converted to FileSet instead of SharedFileSet.

Here is the first draft based on the idea we discussed, 0001, splits
sharedfileset.c in sharedfileset.c and fileset.c and 0002 is same
patch I submitted earlier(use single fileset throughout the worker),
just it is rebased on top of 0001.  Please let me know your thoughts.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachment

RE: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

From
"houzj.fnst@fujitsu.com"
Date:
On Sat, Aug 21, 2021 8:38 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> On Wed, Aug 18, 2021 at 3:45 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > I was looking into this, so if we want to do that I think the outline
> > will look like this
> >
> > - There will be a fileset.c and fileset.h files, and we will expose a
> > new structure FileSet, which will be the same as SharedFileSet, except
> > mutext and refcount.  The fileset.c will expose FileSetInit(),
> > FileSetCreate(), FileSetOpen(), FileSetDelete() and FileSetDeleteAll()
> > interfaces.
> >
> > - sharefileset.c will internally call the fileset.c's interfaces.  The
> > SharedFileSet structure will also contain FileSet and other members
> > i.e. mutex and refcount.
> >
> > - the buffile.c's interfaces which are ending with Shared e.g.
> > BufFileCreateShared, BufFileOpenShared, should be converted to
> > BufFileCreate and BufFileOpen respectively.  And the input to these
> > interfaces can be converted to FileSet instead of SharedFileSet.
> 
> Here is the first draft based on the idea we discussed, 0001, splits
> sharedfileset.c in sharedfileset.c and fileset.c and 0002 is same patch I
> submitted earlier(use single fileset throughout the worker), just it is rebased on
> top of 0001.  Please let me know your thoughts.

Hi,

Here are some comments for the new version patches.

1)
+    TempTablespacePath(tempdirpath, tablespace);
+    snprintf(path, MAXPGPATH, "%s/%s%lu.%u.sharedfileset",
+             tempdirpath, PG_TEMP_FILE_PREFIX,
+             (unsigned long) fileset->creator_pid, fileset->number);

do we need to use different filename for shared and un-shared fileset ?

2)
I think we can remove or adjust the following comments in sharedfileset.c.

----
 * SharedFileSets can be used by backends when the temporary files need to be
 * opened/closed multiple times and the underlying files need to survive across
 * transactions.
----
 * We can also use this interface if the temporary files are used only by
 * single backend but the files need to be opened and closed multiple times
 * and also the underlying files need to survive across transactions.  For
----

3)
The 0002 patch still used the word "shared fileset" in some places, I think we
should change it to "fileset".

4)
-extern File SharedFileSetCreate(SharedFileSet *fileset, const char *name);
-extern File SharedFileSetOpen(SharedFileSet *fileset, const char *name,
-                              int mode);
-extern bool SharedFileSetDelete(SharedFileSet *fileset, const char *name,
-                                bool error_on_failure);
 extern void SharedFileSetDeleteAll(SharedFileSet *fileset);
-extern void SharedFileSetUnregister(SharedFileSet *input_fileset);

I noticed the patch delete part of public api, is it better to keep the old api and
let them invoke new api internally ? Having said that, I didn’t find any open source
extension use these old api, so it might be fine to delete them.

Best regards,
Hou zj

Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

From
Dilip Kumar
Date:
On Mon, Aug 23, 2021 at 9:11 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:

> 4)
> -extern File SharedFileSetCreate(SharedFileSet *fileset, const char *name);
> -extern File SharedFileSetOpen(SharedFileSet *fileset, const char *name,
> -                                                         int mode);
> -extern bool SharedFileSetDelete(SharedFileSet *fileset, const char *name,
> -                                                               bool error_on_failure);
>  extern void SharedFileSetDeleteAll(SharedFileSet *fileset);
> -extern void SharedFileSetUnregister(SharedFileSet *input_fileset);
>
> I noticed the patch delete part of public api, is it better to keep the old api and
> let them invoke new api internally ? Having said that, I didn’t find any open source
> extension use these old api, so it might be fine to delete them.

Right, those were internally used by buffile.c but now we have changed
buffile.c to directly use the fileset interfaces, so we better remove
them.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



On Mon, Aug 23, 2021 at 9:53 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Mon, Aug 23, 2021 at 9:11 AM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
>
> > 4)
> > -extern File SharedFileSetCreate(SharedFileSet *fileset, const char *name);
> > -extern File SharedFileSetOpen(SharedFileSet *fileset, const char *name,
> > -                                                         int mode);
> > -extern bool SharedFileSetDelete(SharedFileSet *fileset, const char *name,
> > -                                                               bool error_on_failure);
> >  extern void SharedFileSetDeleteAll(SharedFileSet *fileset);
> > -extern void SharedFileSetUnregister(SharedFileSet *input_fileset);
> >
> > I noticed the patch delete part of public api, is it better to keep the old api and
> > let them invoke new api internally ? Having said that, I didn’t find any open source
> > extension use these old api, so it might be fine to delete them.
>
> Right, those were internally used by buffile.c but now we have changed
> buffile.c to directly use the fileset interfaces, so we better remove
> them.
>

I also don't see any reason to keep those exposed from
sharedfileset.h. I see that even in the original commit dc6c4c9dc2,
these APIs seem to be introduced to be used by buffile. Andres/Thomas,
do let us know if you think otherwise?

One more comment:
I think v1-0001-Sharedfileset-refactoring doesn't have a way for
cleaning up worker.c temporary files on error/exit. It is better to
have that to make it an independent patch.

--
With Regards,
Amit Kapila.



On Mon, Aug 23, 2021 at 11:43 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Aug 23, 2021 at 9:53 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Mon, Aug 23, 2021 at 9:11 AM houzj.fnst@fujitsu.com
> > <houzj.fnst@fujitsu.com> wrote:
> >
> > > 4)
> > > -extern File SharedFileSetCreate(SharedFileSet *fileset, const char *name);
> > > -extern File SharedFileSetOpen(SharedFileSet *fileset, const char *name,
> > > -                                                         int mode);
> > > -extern bool SharedFileSetDelete(SharedFileSet *fileset, const char *name,
> > > -                                                               bool error_on_failure);
> > >  extern void SharedFileSetDeleteAll(SharedFileSet *fileset);
> > > -extern void SharedFileSetUnregister(SharedFileSet *input_fileset);
> > >
> > > I noticed the patch delete part of public api, is it better to keep the old api and
> > > let them invoke new api internally ? Having said that, I didn’t find any open source
> > > extension use these old api, so it might be fine to delete them.
> >
> > Right, those were internally used by buffile.c but now we have changed
> > buffile.c to directly use the fileset interfaces, so we better remove
> > them.
> >
>
> I also don't see any reason to keep those exposed from
> sharedfileset.h. I see that even in the original commit dc6c4c9dc2,
> these APIs seem to be introduced to be used by buffile. Andres/Thomas,
> do let us know if you think otherwise?
>
> One more comment:
> I think v1-0001-Sharedfileset-refactoring doesn't have a way for
> cleaning up worker.c temporary files on error/exit. It is better to
> have that to make it an independent patch.

I think we should handle that in worker.c itself, by adding a
before_dsm_detach function before_shmem_exit right?  Or you are
thinking that in FileSetInit, we keep the mechanism of filesetlist
like we were doing in SharedFileSetInit?  I think that will just add
unnecessary complexity in the first patch which will eventually go
away in the second patch.  And if we do that then SharedFileSetInit
can not directly use the FileSetInit, otherwise, the dsm based fileset
will also get registered for cleanup in filesetlist so for that we
might need to pass one parameter to the FileSetInit() that whether to
register for cleanup or not and that will again not look clean because
now we are again adding the conditional cleanup, IMHO that is the same
problem what we are trying to cleanup in SharedFileSetInit by
introducing a new FileSetInit.

I think what we can do is, introduce a new function
FileSetInitInternal(), that will do what FileSetInit() is doing today
and now both SharedFileSetInit() and the FileSetInit() will call this
function, and along with that SharedFileSetInit(), will register the
dsm based cleanup and FileSetInit() will do the filesetlist based
cleanup.  But IMHO, we should try to go in this direction only if we
are sure that we want to commit the first patch and not the second.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

From
Masahiko Sawada
Date:
On Sat, Aug 21, 2021 at 9:38 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Wed, Aug 18, 2021 at 3:45 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Wed, Aug 18, 2021 at 11:24 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Tue, Aug 17, 2021 at 4:34 PM Andres Freund <andres@anarazel.de> wrote:
> > > >
> > > > On 2021-08-17 10:54:30 +0530, Amit Kapila wrote:
> > > > > 5. How can we provide a strict mechanism to not allow to use dsm APIs
> > > > > for non-dsm FileSet? One idea could be that we can have a variable
> > > > > (probably bool) in SharedFileSet structure which will be initialized
> > > > > in SharedFileSetInit based on whether the caller has provided dsm
> > > > > segment. Then in other DSM-based APIs, we can check if it is used for
> > > > > the wrong type.
> > > >
> > > > Well, isn't the issue here that it's not a shared file set in case you
> > > > explicitly don't want to share it? ISTM that the proper way to address
> > > > this would be to split out a FileSet from SharedFileSet that's then used
> > > > for worker.c and sharedfileset.c.
> > > >
> > >
> > > Okay, but note that to accomplish the same, we need to tweak the
> > > BufFile (buffile.c) APIs as well so that they can work with FileSet.
> > > As per the initial analysis, there doesn't seem to be any problem with
> > > that though.
> >
> > I was looking into this, so if we want to do that I think the outline
> > will look like this
> >
> > - There will be a fileset.c and fileset.h files, and we will expose a
> > new structure FileSet, which will be the same as SharedFileSet, except
> > mutext and refcount.  The fileset.c will expose FileSetInit(),
> > FileSetCreate(), FileSetOpen(), FileSetDelete() and FileSetDeleteAll()
> > interfaces.
> >
> > - sharefileset.c will internally call the fileset.c's interfaces.  The
> > SharedFileSet structure will also contain FileSet and other members
> > i.e. mutex and refcount.
> >
> > - the buffile.c's interfaces which are ending with Shared e.g.
> > BufFileCreateShared, BufFileOpenShared, should be converted to
> > BufFileCreate and
> > BufFileOpen respectively.  And the input to these interfaces can be
> > converted to FileSet instead of SharedFileSet.
>
> Here is the first draft based on the idea we discussed, 0001, splits
> sharedfileset.c in sharedfileset.c and fileset.c and 0002 is same
> patch I submitted earlier(use single fileset throughout the worker),
> just it is rebased on top of 0001.  Please let me know your thoughts.

Here are some comments on 0001 patch:

+/*
+ * Initialize a space for temporary files. This API can be used by shared
+ * fileset as well as if the temporary files are used only by single backend
+ * but the files need to be opened and closed multiple times and also the
+ * underlying files need to survive across transactions.
+ *
+ * Files will be distributed over the tablespaces configured in
+ * temp_tablespaces.
+ *
+ * Under the covers the set is one or more directories which will eventually
+ * be deleted.
+ */

I think it's better to mention cleaning up here like we do in the
comment of SharedFileSetInit().

---
I think we need to clean up both stream_fileset and subxact_fileset on
proc exit. In 0002 patch cleans up the fileset but I think we need to
do that also in 0001 patch.

---
There still are some comments using "shared fileset" in both buffile.c
and worker.c.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



On Mon, Aug 23, 2021 at 12:52 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Mon, Aug 23, 2021 at 11:43 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Aug 23, 2021 at 9:53 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Mon, Aug 23, 2021 at 9:11 AM houzj.fnst@fujitsu.com
> > > <houzj.fnst@fujitsu.com> wrote:
> > >
> > > > 4)
> > > > -extern File SharedFileSetCreate(SharedFileSet *fileset, const char *name);
> > > > -extern File SharedFileSetOpen(SharedFileSet *fileset, const char *name,
> > > > -                                                         int mode);
> > > > -extern bool SharedFileSetDelete(SharedFileSet *fileset, const char *name,
> > > > -                                                               bool error_on_failure);
> > > >  extern void SharedFileSetDeleteAll(SharedFileSet *fileset);
> > > > -extern void SharedFileSetUnregister(SharedFileSet *input_fileset);
> > > >
> > > > I noticed the patch delete part of public api, is it better to keep the old api and
> > > > let them invoke new api internally ? Having said that, I didn’t find any open source
> > > > extension use these old api, so it might be fine to delete them.
> > >
> > > Right, those were internally used by buffile.c but now we have changed
> > > buffile.c to directly use the fileset interfaces, so we better remove
> > > them.
> > >
> >
> > I also don't see any reason to keep those exposed from
> > sharedfileset.h. I see that even in the original commit dc6c4c9dc2,
> > these APIs seem to be introduced to be used by buffile. Andres/Thomas,
> > do let us know if you think otherwise?
> >
> > One more comment:
> > I think v1-0001-Sharedfileset-refactoring doesn't have a way for
> > cleaning up worker.c temporary files on error/exit. It is better to
> > have that to make it an independent patch.
>
> I think we should handle that in worker.c itself, by adding a
> before_dsm_detach function before_shmem_exit right?
>

Yeah, I thought of handling it in worker.c similar to what you've in 0002 patch.

--
With Regards,
Amit Kapila.



On Mon, Aug 23, 2021 at 1:43 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

Note: merge comments from multiple mails

> > I think we should handle that in worker.c itself, by adding a
> > before_dsm_detach function before_shmem_exit right?
> >
>
> Yeah, I thought of handling it in worker.c similar to what you've in 0002 patch.
>

I have done handling in worker.c

On Mon, Aug 23, 2021 at 9:11 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Sat, Aug 21, 2021 8:38 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> 1)
> +       TempTablespacePath(tempdirpath, tablespace);
> +       snprintf(path, MAXPGPATH, "%s/%s%lu.%u.sharedfileset",
> +                        tempdirpath, PG_TEMP_FILE_PREFIX,
> +                        (unsigned long) fileset->creator_pid, fileset->number);
>
> do we need to use different filename for shared and un-shared fileset ?

I was also thinking about the same, does it make sense to name it just
""%s/%s%lu.%u.fileset"?

On Mon, Aug 23, 2021 at 1:08 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

> Here are some comments on 0001 patch:
>
> +/*
> + * Initialize a space for temporary files. This API can be used by shared
> + * fileset as well as if the temporary files are used only by single backend
> + * but the files need to be opened and closed multiple times and also the
> + * underlying files need to survive across transactions.
> + *
> + * Files will be distributed over the tablespaces configured in
> + * temp_tablespaces.
> + *
> + * Under the covers the set is one or more directories which will eventually
> + * be deleted.
> + */
>
> I think it's better to mention cleaning up here like we do in the
> comment of SharedFileSetInit().

Right, done

>
> ---
> I think we need to clean up both stream_fileset and subxact_fileset on
> proc exit. In 0002 patch cleans up the fileset but I think we need to
> do that also in 0001 patch.

Done

> ---
> There still are some comments using "shared fileset" in both buffile.c
> and worker.c.

Done


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachment
On Mon, Aug 23, 2021 at 3:13 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Mon, Aug 23, 2021 at 1:43 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> Note: merge comments from multiple mails
>
> > > I think we should handle that in worker.c itself, by adding a
> > > before_dsm_detach function before_shmem_exit right?
> > >
> >
> > Yeah, I thought of handling it in worker.c similar to what you've in 0002 patch.
> >
>
> I have done handling in worker.c
>
> On Mon, Aug 23, 2021 at 9:11 AM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> >
> > On Sat, Aug 21, 2021 8:38 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > 1)
> > +       TempTablespacePath(tempdirpath, tablespace);
> > +       snprintf(path, MAXPGPATH, "%s/%s%lu.%u.sharedfileset",
> > +                        tempdirpath, PG_TEMP_FILE_PREFIX,
> > +                        (unsigned long) fileset->creator_pid, fileset->number);
> >
> > do we need to use different filename for shared and un-shared fileset ?
>
> I was also thinking about the same, does it make sense to name it just
> ""%s/%s%lu.%u.fileset"?
>

I think it is reasonable to use .fileset as proposed by you.

Few other comments:
=================
1.
+ /*
+ * Register before-shmem-exit hook to ensure filesets are dropped while we
+ * can still report stats for underlying temporary files.
+ */
+ before_shmem_exit(worker_cleanup, (Datum) 0);
+

Do we really need to register a new callback here? Won't the existing
logical replication worker exit routine (logicalrep_worker_onexit) be
sufficient for this patch's purpose?

2.
- SharedFileSet *fileset; /* space for segment files if shared */
- const char *name; /* name of this BufFile if shared */
+ FileSet    *fileset; /* space for fileset for fileset based file */
+ const char *name; /* name of this BufFile */

The comments for the above two variables can be written as (a) space
for fileset based segment files, (b) name of fileset based BufFile.

3.
 /*
- * Create a new segment file backing a shared BufFile.
+ * Create a new segment file backing a fileset BufFile.
  */
 static File
-MakeNewSharedSegment(BufFile *buffile, int segment)
+MakeNewSegment(BufFile *buffile, int segment)

I think it is better to name this function as MakeNewFileSetSegment.
You can slightly change the comment as: "Create a new segment file
backing a fileset based BufFile."

4.
/*
  * It is possible that there are files left over from before a crash
- * restart with the same name.  In order for BufFileOpenShared() not to
+ * restart with the same name.  In order for BufFileOpen() not to
  * get confused about how many segments there are, we'll unlink the next

Typo. /BufFileOpen/BufFileOpenFileSet

5.
 static void
-SharedSegmentName(char *name, const char *buffile_name, int segment)
+SegmentName(char *name, const char *buffile_name, int segment)

Can we name this as FileSetSegmentName?

6.
  *
- * The naming scheme for shared BufFiles is left up to the calling code.  The
+ * The naming scheme for fileset BufFiles is left up to the calling code.

Isn't it better to say "... fileset based BufFiles .."?

7.
+ * FileSets provide a temporary namespace (think directory) so that files can
+ * be discovered by name

A full stop is missing at the end of the statement.

8.
+ *
+ * The callers are expected to explicitly remove such files by using
+ * SharedFileSetDelete/ SharedFileSetDeleteAll.
+ *
+ * Files will be distributed over the tablespaces configured in
+ * temp_tablespaces.
+ *
+ * Under the covers the set is one or more directories which will eventually
+ * be deleted.
+ */
+void
+FileSetInit(FileSet *fileset)

Is there a need to mention 'Shared' in API names (SharedFileSetDelete/
SharedFileSetDeleteAll) in the comments? Also, there doesn't seem to
be a need for extra space before *DeleteAll API in comments.

9.
 * Files will be distributed over the tablespaces configured in
 * temp_tablespaces.
 *
 * Under the covers the set is one or more directories which will eventually
 * be deleted.
 */
void
SharedFileSetInit(SharedFileSet *fileset, dsm_segment *seg)

I think we can remove the part of the above comment where it says
"Files will be distributed over ..." as that is already mentioned atop
FileSetInit.

-- 
With Regards,
Amit Kapila.



On Tue, Aug 24, 2021 at 12:26 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

> > I was also thinking about the same, does it make sense to name it just
> > ""%s/%s%lu.%u.fileset"?

Done

> I think it is reasonable to use .fileset as proposed by you.
>
> Few other comments:
> =================
> 1.
> + /*
> + * Register before-shmem-exit hook to ensure filesets are dropped while we
> + * can still report stats for underlying temporary files.
> + */
> + before_shmem_exit(worker_cleanup, (Datum) 0);
> +
>
> Do we really need to register a new callback here? Won't the existing
> logical replication worker exit routine (logicalrep_worker_onexit) be
> sufficient for this patch's purpose?

Right, we don't need an extra function for this.

> 2.
> - SharedFileSet *fileset; /* space for segment files if shared */
> - const char *name; /* name of this BufFile if shared */
> + FileSet    *fileset; /* space for fileset for fileset based file */
> + const char *name; /* name of this BufFile */
>
> The comments for the above two variables can be written as (a) space
> for fileset based segment files, (b) name of fileset based BufFile.

Done

> 3.
>  /*
> - * Create a new segment file backing a shared BufFile.
> + * Create a new segment file backing a fileset BufFile.
>   */
>  static File
> -MakeNewSharedSegment(BufFile *buffile, int segment)
> +MakeNewSegment(BufFile *buffile, int segment)
>
> I think it is better to name this function as MakeNewFileSetSegment.
> You can slightly change the comment as: "Create a new segment file
> backing a fileset based BufFile."

 Make sense

> 4.
> /*
>   * It is possible that there are files left over from before a crash
> - * restart with the same name.  In order for BufFileOpenShared() not to
> + * restart with the same name.  In order for BufFileOpen() not to
>   * get confused about how many segments there are, we'll unlink the next
>
> Typo. /BufFileOpen/BufFileOpenFileSet

Fixed

> 5.
>  static void
> -SharedSegmentName(char *name, const char *buffile_name, int segment)
> +SegmentName(char *name, const char *buffile_name, int segment)
>
> Can we name this as FileSetSegmentName?

Done

> 6.
>   *
> - * The naming scheme for shared BufFiles is left up to the calling code.  The
> + * The naming scheme for fileset BufFiles is left up to the calling code.
>
> Isn't it better to say "... fileset based BufFiles .."?

Done

> 7.
> + * FileSets provide a temporary namespace (think directory) so that files can
> + * be discovered by name
>
> A full stop is missing at the end of the statement.

Fixed

> 8.
> + *
> + * The callers are expected to explicitly remove such files by using
> + * SharedFileSetDelete/ SharedFileSetDeleteAll.
> + *
> + * Files will be distributed over the tablespaces configured in
> + * temp_tablespaces.
> + *
> + * Under the covers the set is one or more directories which will eventually
> + * be deleted.
> + */
> +void
> +FileSetInit(FileSet *fileset)
>
> Is there a need to mention 'Shared' in API names (SharedFileSetDelete/
> SharedFileSetDeleteAll) in the comments? Also, there doesn't seem to
> be a need for extra space before *DeleteAll API in comments.

Fixed

> 9.
>  * Files will be distributed over the tablespaces configured in
>  * temp_tablespaces.
>  *
>  * Under the covers the set is one or more directories which will eventually
>  * be deleted.
>  */
> void
> SharedFileSetInit(SharedFileSet *fileset, dsm_segment *seg)
>
> I think we can remove the part of the above comment where it says
> "Files will be distributed over ..." as that is already mentioned atop
> FileSetInit.

Done

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachment
On Tuesday, August 24, 2021 6:25 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> On Tue, Aug 24, 2021 at 12:26 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> 
> > > I was also thinking about the same, does it make sense to name it
> > > just ""%s/%s%lu.%u.fileset"?
> 
> Done
> 
> > I think it is reasonable to use .fileset as proposed by you.
> >
> > Few other comments:
> Done

After applying the patch, I tested the impacted features
(parallel hashjoin/btree build) with different settings
(log_temp_files/maintenance_work_mem/parallel_leader_participation/workers)
, and the patch works well.

One thing I noticed is that when enable log_temp_files, the pathname in log
changed from "xxx.sharedfileset" to "xxx.fileset", and I also noticed some
blogs[1] reference the old path name. Although, I think it doesn't matter, just
share the information here in case someone has different opinions.


Some minor thing I noticed in the patch:

1)
it seems we can add "FileSet" to typedefs.list

2)
The commit message in 0002 used "shared fileset" which should be "fileset",
---
Instead of using a separate shared fileset for each xid, use one shared
fileset for whole lifetime of the worker...
---

[1] https://blog.dbi-services.com/about-temp_tablespaces-in-postgresql/

Best regards,
Hou zj


On Tue, Aug 24, 2021 at 3:55 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Tue, Aug 24, 2021 at 12:26 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>

The first patch looks good to me. I have made minor changes to the
attached patch. The changes include: fixing compilation warning, made
some comment changes, ran pgindent, and few other cosmetic changes. If
you are fine with the attached, then kindly rebase the second patch
atop it.

-- 
With Regards,
Amit Kapila.

Attachment
On Wed, Aug 25, 2021 at 9:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Aug 24, 2021 at 3:55 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Tue, Aug 24, 2021 at 12:26 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
>
> The first patch looks good to me. I have made minor changes to the
> attached patch. The changes include: fixing compilation warning, made
> some comment changes, ran pgindent, and few other cosmetic changes. If
> you are fine with the attached, then kindly rebase the second patch
> atop it.

The patch looks good to me.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



On Wed, Aug 25, 2021 at 5:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Aug 24, 2021 at 3:55 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Tue, Aug 24, 2021 at 12:26 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>

The first patch looks good to me. I have made minor changes to the
attached patch. The changes include: fixing compilation warning, made
some comment changes, ran pgindent, and few other cosmetic changes. If
you are fine with the attached, then kindly rebase the second patch
atop it.


The patch looks good to me, I have rebased 0002 atop this patch and also done some cosmetic fixes in 0002. 


--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Attachment
On Thu, Aug 26, 2021 at 11:47 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Wed, Aug 25, 2021 at 5:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> On Tue, Aug 24, 2021 at 3:55 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>> >
>> > On Tue, Aug 24, 2021 at 12:26 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>> >
>>
>> The first patch looks good to me. I have made minor changes to the
>> attached patch. The changes include: fixing compilation warning, made
>> some comment changes, ran pgindent, and few other cosmetic changes. If
>> you are fine with the attached, then kindly rebase the second patch
>> atop it.
>>
>
> The patch looks good to me,
>

Thanks, Sawada-San and Dilip for confirmation. I would like to commit
this and the second patch (the second one still needs some more
testing and review) for PG-15 as there is no bug per-se related to
this work in PG-14 but I see an argument to commit this for PG-14 to
keep the code (APIs) consistent. What do you think? Does anybody else
have any opinion on this?

Below is a summary of each of the patches for those who are not
following this closely:
Patch-1: The purpose of this patch is to refactor sharedfileset.c to
separate out fileset implementation. Basically, this moves the fileset
related implementation out of sharedfileset.c to allow its usage by
backends that don't want to share filesets among different processes.
After this split, fileset infrastructure is used by both
sharedfileset.c and worker.c for the named temporary files that
survive across transactions. This is suggested by Andres to have a
cleaner API and its usage.

Patch-2: Allow to use single fileset in worker.c (for the lifetime of
worker) instead of using a separate fileset for each remote
transaction. After this, the files to record changes for each remote
transaction will be created under the same fileset and the files will
be deleted after the transaction is completed. This is suggested by
Thomas to allow better resource usage.

-- 
With Regards,
Amit Kapila.



On Thu, Aug 26, 2021 at 2:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

Thanks, Sawada-San and Dilip for confirmation. I would like to commit
this and the second patch (the second one still needs some more
testing and review) for PG-15 as there is no bug per-se related to
this work in PG-14 but I see an argument to commit this for PG-14 to
keep the code (APIs) consistent. What do you think? Does anybody else
have any opinion on this?


IMHO, this is a fair amount of refactoring and this is actually an improvement patch so we should push it to v15.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Thu, Aug 26, 2021 2:18 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

> The patch looks good to me, I have rebased 0002 atop
> this patch and also done some cosmetic fixes in 0002. 

Here are some comments for the 0002 patch.

1)

- * toplevel transaction. Each subscription has a separate set of files.
+ * toplevel transaction. Each subscription has a separate files.

a separate files => a separate file

2)
+     * Otherwise, just open it file for writing, in append mode.
      */

open it file => open the file


3)
     if (subxact_data.nsubxacts == 0)
     {
-        if (ent->subxact_fileset)
-        {
-            cleanup_subxact_info();
-            FileSetDeleteAll(ent->subxact_fileset);
-            pfree(ent->subxact_fileset);
-            ent->subxact_fileset = NULL;
-        }
+        cleanup_subxact_info();
+        BufFileDeleteFileSet(stream_fileset, path, true);
+


Before applying the patch, the code only invoke cleanup_subxact_info() when the
file exists. After applying the patch, it will invoke cleanup_subxact_info()
either the file exists or not. Is it correct ?


4)
     /*
-     * If this is the first streamed segment, the file must not exist, so make
-     * sure we're the ones creating it. Otherwise just open the file for
-     * writing, in append mode.
+     * If this is the first streamed segment, create the changes file.
+     * Otherwise, just open it file for writing, in append mode.
      */
     if (first_segment)
-    {
    ...
-        if (found)
-            ereport(ERROR,
-                    (errcode(ERRCODE_PROTOCOL_VIOLATION),
-                     errmsg_internal("incorrect first-segment flag for streamed replication transaction")));
    ...
-    }
+        stream_fd = BufFileCreateFileSet(stream_fileset, path);


Since the function BufFileCreateFileSet() doesn't check the file's existence,
the change here seems remove the file existence check which the old code did.


Best regards,
Hou zj

On Fri, Aug 27, 2021 1:25 PM Dilip Kumar <dilipbalaut@gmail.com>  wrote:
> On Thu, Aug 26, 2021 at 2:10 PM Amit Kapila <mailto:amit.kapila16@gmail.com> wrote:
> 
>> Thanks, Sawada-San and Dilip for confirmation. I would like to commit
>> this and the second patch (the second one still needs some more
>> testing and review) for PG-15 as there is no bug per-se related to
>> this work in PG-14 but I see an argument to commit this for PG-14 to
>> keep the code (APIs) consistent. What do you think? Does anybody else
>> have any opinion on this?

> IMHO, this is a fair amount of refactoring and this is actually an
> improvement patch so we should push it to v15.

+1

Best regards,
Hou zj

On Fri, Aug 27, 2021 at 10:56 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Thu, Aug 26, 2021 2:18 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> > The patch looks good to me, I have rebased 0002 atop
> > this patch and also done some cosmetic fixes in 0002.
>
> Here are some comments for the 0002 patch.
>
> 1)
>
> - * toplevel transaction. Each subscription has a separate set of files.
> + * toplevel transaction. Each subscription has a separate files.
>
> a separate files => a separate file

Done

> 2)
> +        * Otherwise, just open it file for writing, in append mode.
>          */
>
> open it file => open the file

Done

>
> 3)
>         if (subxact_data.nsubxacts == 0)
>         {
> -               if (ent->subxact_fileset)
> -               {
> -                       cleanup_subxact_info();
> -                       FileSetDeleteAll(ent->subxact_fileset);
> -                       pfree(ent->subxact_fileset);
> -                       ent->subxact_fileset = NULL;
> -               }
> +               cleanup_subxact_info();
> +               BufFileDeleteFileSet(stream_fileset, path, true);
> +
>
> Before applying the patch, the code only invoke cleanup_subxact_info() when the
> file exists. After applying the patch, it will invoke cleanup_subxact_info()
> either the file exists or not. Is it correct ?

I think this is just structure resetting at the end of the stream.
Earlier the hash was telling us whether we have ever dirtied that
structure or not but now we are not maintaining that hash so we just
reset it at the end of the stream.  I don't think its any bad, in fact
I think this is much cheaper compared to maining the hash.

>
> 4)
>         /*
> -        * If this is the first streamed segment, the file must not exist, so make
> -        * sure we're the ones creating it. Otherwise just open the file for
> -        * writing, in append mode.
> +        * If this is the first streamed segment, create the changes file.
> +        * Otherwise, just open it file for writing, in append mode.
>          */
>         if (first_segment)
> -       {
>         ...
> -               if (found)
> -                       ereport(ERROR,
> -                                       (errcode(ERRCODE_PROTOCOL_VIOLATION),
> -                                        errmsg_internal("incorrect first-segment flag for streamed replication
transaction")));
>         ...
> -       }
> +               stream_fd = BufFileCreateFileSet(stream_fileset, path);
>
>
> Since the function BufFileCreateFileSet() doesn't check the file's existence,
> the change here seems remove the file existence check which the old code did.

Not really, we were just doing a sanity check of the in memory hash
entry, now we don't maintain that so we don't need to do that.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachment
On Fri, Aug 27, 2021 at 2:25 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, Aug 26, 2021 at 2:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>>
>> Thanks, Sawada-San and Dilip for confirmation. I would like to commit
>> this and the second patch (the second one still needs some more
>> testing and review) for PG-15 as there is no bug per-se related to
>> this work in PG-14 but I see an argument to commit this for PG-14 to
>> keep the code (APIs) consistent. What do you think? Does anybody else
>> have any opinion on this?
>>
>
> IMHO, this is a fair amount of refactoring and this is actually an improvement patch so we should push it to v15.

I think so too.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



On Fri, Aug 27, 2021 at 3:34 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Fri, Aug 27, 2021 at 10:56 AM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> >
> > On Thu, Aug 26, 2021 2:18 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > > The patch looks good to me, I have rebased 0002 atop
> > > this patch and also done some cosmetic fixes in 0002.
> >
> > Here are some comments for the 0002 patch.
> >
> > 1)
> >
> > - * toplevel transaction. Each subscription has a separate set of files.
> > + * toplevel transaction. Each subscription has a separate files.
> >
> > a separate files => a separate file
>
> Done
>
> > 2)
> > +        * Otherwise, just open it file for writing, in append mode.
> >          */
> >
> > open it file => open the file
>
> Done
>
> >
> > 3)
> >         if (subxact_data.nsubxacts == 0)
> >         {
> > -               if (ent->subxact_fileset)
> > -               {
> > -                       cleanup_subxact_info();
> > -                       FileSetDeleteAll(ent->subxact_fileset);
> > -                       pfree(ent->subxact_fileset);
> > -                       ent->subxact_fileset = NULL;
> > -               }
> > +               cleanup_subxact_info();
> > +               BufFileDeleteFileSet(stream_fileset, path, true);
> > +
> >
> > Before applying the patch, the code only invoke cleanup_subxact_info() when the
> > file exists. After applying the patch, it will invoke cleanup_subxact_info()
> > either the file exists or not. Is it correct ?
>
> I think this is just structure resetting at the end of the stream.
> Earlier the hash was telling us whether we have ever dirtied that
> structure or not but now we are not maintaining that hash so we just
> reset it at the end of the stream.  I don't think its any bad, in fact
> I think this is much cheaper compared to maining the hash.
>
> >
> > 4)
> >         /*
> > -        * If this is the first streamed segment, the file must not exist, so make
> > -        * sure we're the ones creating it. Otherwise just open the file for
> > -        * writing, in append mode.
> > +        * If this is the first streamed segment, create the changes file.
> > +        * Otherwise, just open it file for writing, in append mode.
> >          */
> >         if (first_segment)
> > -       {
> >         ...
> > -               if (found)
> > -                       ereport(ERROR,
> > -                                       (errcode(ERRCODE_PROTOCOL_VIOLATION),
> > -                                        errmsg_internal("incorrect first-segment flag for streamed replication
transaction")));
> >         ...
> > -       }
> > +               stream_fd = BufFileCreateFileSet(stream_fileset, path);
> >
> >
> > Since the function BufFileCreateFileSet() doesn't check the file's existence,
> > the change here seems remove the file existence check which the old code did.
>
> Not really, we were just doing a sanity check of the in memory hash
> entry, now we don't maintain that so we don't need to do that.

Thank you for updating the patch!

The patch looks good to me except for the below comment:

+        /* Delete the subxact file, if it exist. */
+        subxact_filename(path, subid, xid);
+        BufFileDeleteFileSet(stream_fileset, path, true);

s/it exist/it exists/

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



On Mon, Aug 30, 2021 at 6:42 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Fri, Aug 27, 2021 at 2:25 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Thu, Aug 26, 2021 at 2:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >>
> >>
> >> Thanks, Sawada-San and Dilip for confirmation. I would like to commit
> >> this and the second patch (the second one still needs some more
> >> testing and review) for PG-15 as there is no bug per-se related to
> >> this work in PG-14 but I see an argument to commit this for PG-14 to
> >> keep the code (APIs) consistent. What do you think? Does anybody else
> >> have any opinion on this?
> >>
> >
> > IMHO, this is a fair amount of refactoring and this is actually an improvement patch so we should push it to v15.
>
> I think so too.
>

I have pushed the first patch in this series.

-- 
With Regards,
Amit Kapila.



From Mon, Aug 30, 2021 2:15 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> On Fri, Aug 27, 2021 at 3:34 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Fri, Aug 27, 2021 at 10:56 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
> > >
> > > On Thu, Aug 26, 2021 2:18 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > > The patch looks good to me, I have rebased 0002 atop this patch
> > > > and also done some cosmetic fixes in 0002.
> 
> Thank you for updating the patch!
> 
> The patch looks good to me except for the below comment:
> 
> +        /* Delete the subxact file, if it exist. */
> +        subxact_filename(path, subid, xid);
> +        BufFileDeleteFileSet(stream_fileset, path, true);
> 
> s/it exist/it exists/

Except for Sawada-san's comment, the v6-0002 patch looks good to me.

Best regards,
Hou zj

On Tue, Aug 31, 2021 at 7:39 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
From Mon, Aug 30, 2021 2:15 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> On Fri, Aug 27, 2021 at 3:34 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Fri, Aug 27, 2021 at 10:56 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
> > >
> > > On Thu, Aug 26, 2021 2:18 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > > The patch looks good to me, I have rebased 0002 atop this patch
> > > > and also done some cosmetic fixes in 0002.
>
> Thank you for updating the patch!
>
> The patch looks good to me except for the below comment:
>
> +        /* Delete the subxact file, if it exist. */
> +        subxact_filename(path, subid, xid);
> +        BufFileDeleteFileSet(stream_fileset, path, true);
>
> s/it exist/it exists/

Except for Sawada-san's comment, the v6-0002 patch looks good to me.

Thanks, I will wait for a day to see if there are any other comments on this, after that I will fix this one issue and post the updated patch.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Fri, Aug 27, 2021 at 12:04 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>

Few comments on v6-0002*
=========================
1.
-BufFileDeleteFileSet(FileSet *fileset, const char *name)
+BufFileDeleteFileSet(FileSet *fileset, const char *name, bool missing_ok)
 {
  char segment_name[MAXPGPATH];
  int segment = 0;
@@ -358,7 +369,7 @@ BufFileDeleteFileSet(FileSet *fileset, const char *name)
  for (;;)
  {
  FileSetSegmentName(segment_name, name, segment);
- if (!FileSetDelete(fileset, segment_name, true))
+ if (!FileSetDelete(fileset, segment_name, !missing_ok))

I don't think the usage of missing_ok is correct here. If you see
FileSetDelete->PathNameDeleteTemporaryFile, it already tolerates that
the file doesn't exist but gives an error only when it is unable to
link. So, with this missing_ok users (say like worker.c) won't even
get errors when they are not able to remove files whereas I think the
need for the patch is to not get an error when the file doesn't exist.
I think you don't need to change anything in the way we invoke
FileSetDelete.

2.
-static HTAB *xidhash = NULL;
+static FileSet *stream_fileset = NULL;

Can we keep this in LogicalRepWorker and initialize it accordingly?

3.
+ /* Open the subxact file, if it does not exist, create it. */
+ fd = BufFileOpenFileSet(stream_fileset, path, O_RDWR, true);
+ if (fd == NULL)
+ fd = BufFileCreateFileSet(stream_fileset, path);

I think retaining the existing comment: "Create the subxact file if it
not already created, otherwise open the existing file." seems better
here.

4.
  /*
- * If there is no subtransaction then nothing to do, but if already have
- * subxact file then delete that.
+ * If there are no subtransactions, there is nothing to be done, but if
+ * subxacts already exist, delete it.
  */

How about changing the above comment to something like: "Delete the
subxacts file, if exists"?

5. Can we slightly change the commit message as:
Optimize fileset usage in apply worker.

Use one fileset for the entire worker lifetime instead of using
separate filesets for each streaming transaction. Now, the
changes/subxacts files for every streaming transaction will be created
under the same fileset and the files will be deleted after the
transaction is completed.

This patch extends the BufFileOpenFileSet and BufFileDeleteFileSet
APIs to allow users to specify whether to give an error on missing
files.

-- 
With Regards,
Amit Kapila.



On Tue, Aug 31, 2021 at 3:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Aug 27, 2021 at 12:04 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
>
> Few comments on v6-0002*
> =========================
> 1.
> -BufFileDeleteFileSet(FileSet *fileset, const char *name)
> +BufFileDeleteFileSet(FileSet *fileset, const char *name, bool missing_ok)
>  {
>   char segment_name[MAXPGPATH];
>   int segment = 0;
> @@ -358,7 +369,7 @@ BufFileDeleteFileSet(FileSet *fileset, const char *name)
>   for (;;)
>   {
>   FileSetSegmentName(segment_name, name, segment);
> - if (!FileSetDelete(fileset, segment_name, true))
> + if (!FileSetDelete(fileset, segment_name, !missing_ok))
>
> I don't think the usage of missing_ok is correct here. If you see
> FileSetDelete->PathNameDeleteTemporaryFile, it already tolerates that
> the file doesn't exist but gives an error only when it is unable to
> link. So, with this missing_ok users (say like worker.c) won't even
> get errors when they are not able to remove files whereas I think the
> need for the patch is to not get an error when the file doesn't exist.
> I think you don't need to change anything in the way we invoke
> FileSetDelete.

Right, fixed.

> 2.
> -static HTAB *xidhash = NULL;
> +static FileSet *stream_fileset = NULL;
>
> Can we keep this in LogicalRepWorker and initialize it accordingly?

Done

> 3.
> + /* Open the subxact file, if it does not exist, create it. */
> + fd = BufFileOpenFileSet(stream_fileset, path, O_RDWR, true);
> + if (fd == NULL)
> + fd = BufFileCreateFileSet(stream_fileset, path);
>
> I think retaining the existing comment: "Create the subxact file if it
> not already created, otherwise open the existing file." seems better
> here.

Done

> 4.
>   /*
> - * If there is no subtransaction then nothing to do, but if already have
> - * subxact file then delete that.
> + * If there are no subtransactions, there is nothing to be done, but if
> + * subxacts already exist, delete it.
>   */
>
> How about changing the above comment to something like: "Delete the
> subxacts file, if exists"?

Done

> 5. Can we slightly change the commit message as:
> Optimize fileset usage in apply worker.
>
> Use one fileset for the entire worker lifetime instead of using
> separate filesets for each streaming transaction. Now, the
> changes/subxacts files for every streaming transaction will be created
> under the same fileset and the files will be deleted after the
> transaction is completed.
>
> This patch extends the BufFileOpenFileSet and BufFileDeleteFileSet
> APIs to allow users to specify whether to give an error on missing
> files.

Done


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachment
On Wed, Sep 1, 2021 at 1:53 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>

The latest patch looks good to me. I have made some changes in the
comments, see attached. I am planning to push this tomorrow unless you
or others have any comments on it.

-- 
With Regards,
Amit Kapila.

Attachment
On Wed, Sep 1, 2021 at 5:23 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Sep 1, 2021 at 1:53 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>

The latest patch looks good to me. I have made some changes in the
comments, see attached. I am planning to push this tomorrow unless you
or others have any comments on it.
 
These comments changes look good to me. 

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Wed, Sep 1, 2021 at 5:33 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Wed, Sep 1, 2021 at 5:23 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> On Wed, Sep 1, 2021 at 1:53 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>> >
>>
>> The latest patch looks good to me. I have made some changes in the
>> comments, see attached. I am planning to push this tomorrow unless you
>> or others have any comments on it.
>
>
> These comments changes look good to me.
>

Pushed.

-- 
With Regards,
Amit Kapila.



Hi,

Thank for all the work on this topic - I'd somehow accidentally marked this
thread as read when coming back from vacation...

Greetings,

Andres Freund