Re: Separate out FileSet from SharedFileSet (was Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o) - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Separate out FileSet from SharedFileSet (was Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o)
Date
Msg-id CAA4eK1JUvDMqdiOUKn5G1QemBxwHKaEE2au2vxs5LfgnwxdN5g@mail.gmail.com
Whole thread Raw
In response to Re: Separate out FileSet from SharedFileSet (was Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o)  (Dilip Kumar <dilipbalaut@gmail.com>)
Responses Re: Separate out FileSet from SharedFileSet (was Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o)  (Dilip Kumar <dilipbalaut@gmail.com>)
List pgsql-hackers
On Mon, Aug 23, 2021 at 3:13 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Mon, Aug 23, 2021 at 1:43 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> Note: merge comments from multiple mails
>
> > > I think we should handle that in worker.c itself, by adding a
> > > before_dsm_detach function before_shmem_exit right?
> > >
> >
> > Yeah, I thought of handling it in worker.c similar to what you've in 0002 patch.
> >
>
> I have done handling in worker.c
>
> On Mon, Aug 23, 2021 at 9:11 AM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> >
> > On Sat, Aug 21, 2021 8:38 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > 1)
> > +       TempTablespacePath(tempdirpath, tablespace);
> > +       snprintf(path, MAXPGPATH, "%s/%s%lu.%u.sharedfileset",
> > +                        tempdirpath, PG_TEMP_FILE_PREFIX,
> > +                        (unsigned long) fileset->creator_pid, fileset->number);
> >
> > do we need to use different filename for shared and un-shared fileset ?
>
> I was also thinking about the same, does it make sense to name it just
> ""%s/%s%lu.%u.fileset"?
>

I think it is reasonable to use .fileset as proposed by you.

Few other comments:
=================
1.
+ /*
+ * Register before-shmem-exit hook to ensure filesets are dropped while we
+ * can still report stats for underlying temporary files.
+ */
+ before_shmem_exit(worker_cleanup, (Datum) 0);
+

Do we really need to register a new callback here? Won't the existing
logical replication worker exit routine (logicalrep_worker_onexit) be
sufficient for this patch's purpose?

2.
- SharedFileSet *fileset; /* space for segment files if shared */
- const char *name; /* name of this BufFile if shared */
+ FileSet    *fileset; /* space for fileset for fileset based file */
+ const char *name; /* name of this BufFile */

The comments for the above two variables can be written as (a) space
for fileset based segment files, (b) name of fileset based BufFile.

3.
 /*
- * Create a new segment file backing a shared BufFile.
+ * Create a new segment file backing a fileset BufFile.
  */
 static File
-MakeNewSharedSegment(BufFile *buffile, int segment)
+MakeNewSegment(BufFile *buffile, int segment)

I think it is better to name this function as MakeNewFileSetSegment.
You can slightly change the comment as: "Create a new segment file
backing a fileset based BufFile."

4.
/*
  * It is possible that there are files left over from before a crash
- * restart with the same name.  In order for BufFileOpenShared() not to
+ * restart with the same name.  In order for BufFileOpen() not to
  * get confused about how many segments there are, we'll unlink the next

Typo. /BufFileOpen/BufFileOpenFileSet

5.
 static void
-SharedSegmentName(char *name, const char *buffile_name, int segment)
+SegmentName(char *name, const char *buffile_name, int segment)

Can we name this as FileSetSegmentName?

6.
  *
- * The naming scheme for shared BufFiles is left up to the calling code.  The
+ * The naming scheme for fileset BufFiles is left up to the calling code.

Isn't it better to say "... fileset based BufFiles .."?

7.
+ * FileSets provide a temporary namespace (think directory) so that files can
+ * be discovered by name

A full stop is missing at the end of the statement.

8.
+ *
+ * The callers are expected to explicitly remove such files by using
+ * SharedFileSetDelete/ SharedFileSetDeleteAll.
+ *
+ * Files will be distributed over the tablespaces configured in
+ * temp_tablespaces.
+ *
+ * Under the covers the set is one or more directories which will eventually
+ * be deleted.
+ */
+void
+FileSetInit(FileSet *fileset)

Is there a need to mention 'Shared' in API names (SharedFileSetDelete/
SharedFileSetDeleteAll) in the comments? Also, there doesn't seem to
be a need for extra space before *DeleteAll API in comments.

9.
 * Files will be distributed over the tablespaces configured in
 * temp_tablespaces.
 *
 * Under the covers the set is one or more directories which will eventually
 * be deleted.
 */
void
SharedFileSetInit(SharedFileSet *fileset, dsm_segment *seg)

I think we can remove the part of the above comment where it says
"Files will be distributed over ..." as that is already mentioned atop
FileSetInit.

-- 
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Pavel Stehule
Date:
Subject: Re: [PROPOSAL] new diagnostic items for the dynamic sql
Next
From: Andrey Lepikhov
Date:
Subject: Representation of SubPlan testexpr in EXPLAIN