Thread: BUG #15105: OpenTransientFile() should be paired withCloseTransientFile() rather than close()

The following bug has been logged on the website:

Bug reference:      15105
Logged by:          Pan Bian
Email address:      bianpan2016@163.com
PostgreSQL version: 10.3
Operating system:   Linux
Description:

File: src/backend/storage/ipc/dsm_impl.c
Function: dsm_impl_mmap()

Details: The handler opened with OpenTransientFile() should be closed with
CloseTransientFile(). However, in function dsm_impl_mmap(), on a certain
path, the return value of OpenTransientFile() (at line 885) is passed to
close() (at line 926). It is better to use CloseTransientFile() here, as
done at line 909.

For your convenience, I paste the related source code as follows:

 845 static bool
 846 dsm_impl_mmap(dsm_op op, dsm_handle handle, Size request_size,
 847               void **impl_private, void **mapped_address, Size
*mapped_size,
 848               int elevel)
 849 {
         ...
 883     /* Create new segment or open an existing one for attach or resize.
*/
 884     flags = O_RDWR | (op == DSM_OP_CREATE ? O_CREAT | O_EXCL : 0);
 885     if ((fd = OpenTransientFile(name, flags, 0600)) == -1)
 886     {
 887         if (errno != EEXIST)
 888             ereport(elevel,
 889                     (errcode_for_dynamic_shared_memory(),
 890                      errmsg("could not open shared memory segment
\"%s\": %m",
 891                             name)));
 892         return false;
 893     }
 894 
 895     /*
 896      * If we're attaching the segment, determine the current size; if
we are
 897      * creating or resizing the segment, set the size to the requested
value.
 898      */
 899     if (op == DSM_OP_ATTACH)
 900     {
 901         struct stat st;
 902 
 903         if (fstat(fd, &st) != 0)
 904         {
 905             int         save_errno;
 906 
 907             /* Back out what's already been done. */
 908             save_errno = errno;
 909             CloseTransientFile(fd);
 910             errno = save_errno;
 911 
 912             ereport(elevel,
 913                     (errcode_for_dynamic_shared_memory(),
 914                      errmsg("could not stat shared memory segment
\"%s\": %m",
 915                             name)));
 916             return false;
 917         }
 918         request_size = st.st_size;
 919     }
 920     else if (*mapped_size > request_size && ftruncate(fd,
request_size))
 921     {
 922         int         save_errno;
 923 
 924         /* Back out what's already been done. */
 925         save_errno = errno;
 926         close(fd);
 927         if (op == DSM_OP_CREATE)
 928             unlink(name);
 929         errno = save_errno;
 930 
 931         ereport(elevel,
 932                 (errcode_for_dynamic_shared_memory(),
 933                  errmsg("could not resize shared memory segment \"%s\"
to %zu bytes: %m",
 934                         name, request_size)));
 935         return false;
 936     }
         ...
1038     *mapped_address = address;
1039     *mapped_size = request_size;
1040     CloseTransientFile(fd);
1041 
1042     return true;
1043 }

Thanks!


On Fri, Mar 09, 2018 at 03:29:25AM +0000, PG Bug reporting form wrote:
> Details: The handler opened with OpenTransientFile() should be closed with
> CloseTransientFile(). However, in function dsm_impl_mmap(), on a certain
> path, the return value of OpenTransientFile() (at line 885) is passed to
> close() (at line 926). It is better to use CloseTransientFile() here, as
> done at line 909.

Good catch!  This is visibly a copy-paste error coming from
dsm_impl_posix().  As a patch it gives the attached.  I am adding also
Robert in CC for awareness.
--
Michael

Attachment
Michael Paquier <michael@paquier.xyz> writes:
> On Fri, Mar 09, 2018 at 03:29:25AM +0000, PG Bug reporting form wrote:
>> Details: The handler opened with OpenTransientFile() should be closed with
>> CloseTransientFile(). However, in function dsm_impl_mmap(), on a certain
>> path, the return value of OpenTransientFile() (at line 885) is passed to
>> close() (at line 926). It is better to use CloseTransientFile() here, as
>> done at line 909.

> Good catch!  This is visibly a copy-paste error coming from
> dsm_impl_posix().  As a patch it gives the attached.  I am adding also
> Robert in CC for awareness.

Presumably, this would have been found sooner if AtEOXact_Files acted
like most other end-of-xact cleanup functions and whined about leaked
resources (in the commit case).  I wonder why it doesn't.

            regards, tom lane


I wrote:
> Michael Paquier <michael@paquier.xyz> writes:
>> On Fri, Mar 09, 2018 at 03:29:25AM +0000, PG Bug reporting form wrote:
>>> Details: The handler opened with OpenTransientFile() should be closed with
>>> CloseTransientFile(). However, in function dsm_impl_mmap(), on a certain
>>> path, the return value of OpenTransientFile() (at line 885) is passed to
>>> close() (at line 926). It is better to use CloseTransientFile() here, as
>>> done at line 909.

>> Good catch!  This is visibly a copy-paste error coming from
>> dsm_impl_posix().  As a patch it gives the attached.  I am adding also
>> Robert in CC for awareness.

Now that the commitfest crunch is over, I've checked and pushed this.

> Presumably, this would have been found sooner if AtEOXact_Files acted
> like most other end-of-xact cleanup functions and whined about leaked
> resources (in the commit case).  I wonder why it doesn't.

On closer inspection, such a cross-check wouldn't have helped find this
particular mistake, because it was in an error exit path that's likely
never been exercised by anybody, and certainly isn't hit in the regression
tests.  Still, I think it'd be a good thing to add, and hence propose the
attached patch.  (I've verified that the warning doesn't fire in "make
check-world".)

            regards, tom lane

diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 948733c..a2d5981 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -2123,7 +2123,7 @@ CommitTransaction(void)
     AtEOXact_on_commit_actions(true);
     AtEOXact_Namespace(true, is_parallel_worker);
     AtEOXact_SMgr();
-    AtEOXact_Files();
+    AtEOXact_Files(true);
     AtEOXact_ComboCid();
     AtEOXact_HashTables(true);
     AtEOXact_PgStat(true);
@@ -2401,7 +2401,7 @@ PrepareTransaction(void)
     AtEOXact_on_commit_actions(true);
     AtEOXact_Namespace(true, false);
     AtEOXact_SMgr();
-    AtEOXact_Files();
+    AtEOXact_Files(true);
     AtEOXact_ComboCid();
     AtEOXact_HashTables(true);
     /* don't call AtEOXact_PgStat here; we fixed pgstat state above */
@@ -2603,7 +2603,7 @@ AbortTransaction(void)
         AtEOXact_on_commit_actions(false);
         AtEOXact_Namespace(false, is_parallel_worker);
         AtEOXact_SMgr();
-        AtEOXact_Files();
+        AtEOXact_Files(false);
         AtEOXact_ComboCid();
         AtEOXact_HashTables(false);
         AtEOXact_PgStat(false);
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 3b90b16..02e6d81 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -531,7 +531,7 @@ AutoVacLauncherMain(int argc, char *argv[])
         }
         AtEOXact_Buffers(false);
         AtEOXact_SMgr();
-        AtEOXact_Files();
+        AtEOXact_Files(false);
         AtEOXact_HashTables(false);

         /*
diff --git a/src/backend/postmaster/bgwriter.c b/src/backend/postmaster/bgwriter.c
index b7813d7..d5ce685 100644
--- a/src/backend/postmaster/bgwriter.c
+++ b/src/backend/postmaster/bgwriter.c
@@ -198,7 +198,7 @@ BackgroundWriterMain(void)
         /* we needn't bother with the other ResourceOwnerRelease phases */
         AtEOXact_Buffers(false);
         AtEOXact_SMgr();
-        AtEOXact_Files();
+        AtEOXact_Files(false);
         AtEOXact_HashTables(false);

         /*
diff --git a/src/backend/postmaster/checkpointer.c b/src/backend/postmaster/checkpointer.c
index 4b452e7..0950ada 100644
--- a/src/backend/postmaster/checkpointer.c
+++ b/src/backend/postmaster/checkpointer.c
@@ -282,7 +282,7 @@ CheckpointerMain(void)
         /* we needn't bother with the other ResourceOwnerRelease phases */
         AtEOXact_Buffers(false);
         AtEOXact_SMgr();
-        AtEOXact_Files();
+        AtEOXact_Files(false);
         AtEOXact_HashTables(false);

         /* Warn any waiting backends that the checkpoint failed. */
diff --git a/src/backend/postmaster/walwriter.c b/src/backend/postmaster/walwriter.c
index 4fa3a3b..688d5b5 100644
--- a/src/backend/postmaster/walwriter.c
+++ b/src/backend/postmaster/walwriter.c
@@ -179,7 +179,7 @@ WalWriterMain(void)
         /* we needn't bother with the other ResourceOwnerRelease phases */
         AtEOXact_Buffers(false);
         AtEOXact_SMgr();
-        AtEOXact_Files();
+        AtEOXact_Files(false);
         AtEOXact_HashTables(false);

         /*
diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c
index f772dfe..07e1e48 100644
--- a/src/backend/storage/file/fd.c
+++ b/src/backend/storage/file/fd.c
@@ -314,7 +314,7 @@ static bool reserveAllocatedDesc(void);
 static int    FreeDesc(AllocateDesc *desc);

 static void AtProcExit_Files(int code, Datum arg);
-static void CleanupTempFiles(bool isProcExit);
+static void CleanupTempFiles(bool isCommit, bool isProcExit);
 static void RemovePgTempFilesInDir(const char *tmpdirname, bool missing_ok,
                        bool unlink_all);
 static void RemovePgTempRelationFiles(const char *tsdirname);
@@ -2902,17 +2902,19 @@ AtEOSubXact_Files(bool isCommit, SubTransactionId mySubid,
 /*
  * AtEOXact_Files
  *
- * This routine is called during transaction commit or abort (it doesn't
- * particularly care which).  All still-open per-transaction temporary file
- * VFDs are closed, which also causes the underlying files to be deleted
- * (although they should've been closed already by the ResourceOwner
- * cleanup). Furthermore, all "allocated" stdio files are closed. We also
- * forget any transaction-local temp tablespace list.
+ * This routine is called during transaction commit or abort.  All still-open
+ * per-transaction temporary file VFDs are closed, which also causes the
+ * underlying files to be deleted (although they should've been closed already
+ * by the ResourceOwner cleanup). Furthermore, all "allocated" stdio files are
+ * closed. We also forget any transaction-local temp tablespace list.
+ *
+ * The isCommit flag is used only to decide whether to emit warnings about
+ * unclosed files.
  */
 void
-AtEOXact_Files(void)
+AtEOXact_Files(bool isCommit)
 {
-    CleanupTempFiles(false);
+    CleanupTempFiles(isCommit, false);
     tempTableSpaces = NULL;
     numTempTableSpaces = -1;
 }
@@ -2926,12 +2928,15 @@ AtEOXact_Files(void)
 static void
 AtProcExit_Files(int code, Datum arg)
 {
-    CleanupTempFiles(true);
+    CleanupTempFiles(false, true);
 }

 /*
  * Close temporary files and delete their underlying files.
  *
+ * isCommit: if true, this is normal transaction commit, and we don't
+ * expect any remaining files; warn if there are some.
+ *
  * isProcExit: if true, this is being called as the backend process is
  * exiting. If that's the case, we should remove all temporary files; if
  * that's not the case, we are being called for transaction commit/abort
@@ -2939,7 +2944,7 @@ AtProcExit_Files(int code, Datum arg)
  * also clean up "allocated" stdio files, dirs and fds.
  */
 static void
-CleanupTempFiles(bool isProcExit)
+CleanupTempFiles(bool isCommit, bool isProcExit)
 {
     Index        i;

@@ -2979,6 +2984,11 @@ CleanupTempFiles(bool isProcExit)
         have_xact_temporary_files = false;
     }

+    /* Complain if any allocated files remain open at commit. */
+    if (isCommit && numAllocatedDescs > 0)
+        elog(WARNING, "%d temporary files and directories not closed at end-of-transaction",
+             numAllocatedDescs);
+
     /* Clean up "allocated" stdio files, dirs and fds. */
     while (numAllocatedDescs > 0)
         FreeDesc(&allocatedDescs[0]);
diff --git a/src/include/storage/fd.h b/src/include/storage/fd.h
index 484339b..548a832 100644
--- a/src/include/storage/fd.h
+++ b/src/include/storage/fd.h
@@ -123,7 +123,7 @@ extern void SetTempTablespaces(Oid *tableSpaces, int numSpaces);
 extern bool TempTablespacesAreSet(void);
 extern int    GetTempTablespaces(Oid *tableSpaces, int numSpaces);
 extern Oid    GetNextTempTableSpace(void);
-extern void AtEOXact_Files(void);
+extern void AtEOXact_Files(bool isCommit);
 extern void AtEOSubXact_Files(bool isCommit, SubTransactionId mySubid,
                   SubTransactionId parentSubid);
 extern void RemovePgTempFiles(void);

On Tue, Apr 10, 2018 at 06:40:43PM -0400, Tom Lane wrote:
> On closer inspection, such a cross-check wouldn't have helped find this
> particular mistake, because it was in an error exit path that's likely
> never been exercised by anybody, and certainly isn't hit in the regression
> tests.  Still, I think it'd be a good thing to add, and hence propose the
> attached patch.  (I've verified that the warning doesn't fire in "make
> check-world".)

Good idea.  +1 for your proposed patch.
--
Michael

Attachment