[PATCH] Fix fd leak in pg_dump compression backends when dup()+fdopen() fails - Mailing list pgsql-hackers

From Jianghua Yang
Subject [PATCH] Fix fd leak in pg_dump compression backends when dup()+fdopen() fails
Date
Msg-id 62bbe34d-2315-4b42-b768-56d901aa83e1@gmail.com
Whole thread Raw
Responses Re: [PATCH] Fix fd leak in pg_dump compression backends when dup()+fdopen() fails
List pgsql-hackers
Hi,

   I found a file descriptor leak in the pg_dump compression backends
   (gzip, lz4, zstd, and the uncompressed "none" backend), introduced
   in commit e9960732a9 ("Introduce a generic pg_dump compression API",
   PostgreSQL 16).

   == The Bug ==

   All four compression open functions use this pattern when an existing
   file descriptor is passed in:

       if (fd >= 0)
           fp = fdopen(dup(fd), mode);   /* or gzdopen() */

       if (fp == NULL)
           return false;                 /* dup'd fd is leaked here */

   The problem is that dup(fd) and fdopen()/gzdopen() are two separate
   steps, and their failure modes must be handled independently:

   1. If dup() fails (returns -1), no fd is created -- this case was
      not checked at all in the original code.

   2. If dup() succeeds but fdopen()/gzdopen() fails (e.g., due to a
      failed malloc(3) for the FILE structure), POSIX explicitly states:

        "If the fdopen() function fails, the file descriptor is
         not closed."
        -- POSIX.1-2017, fdopen() specification

      The duplicated fd therefore remains open with no owner, leaking
      until the process exits.

   == When Can dup() Fail? ==

   The most realistic trigger for dup() returning EMFILE is parallel
   pg_dump (pg_dump -j N) against a large database.  Each worker opens
   multiple file descriptors for tables, indexes, TOC files, and
   compression streams simultaneously.  On systems with a low per-process
   fd limit (e.g., ulimit -n 1024), or when dumping a schema with a very
   large number of objects, the process fd count can approach the limit.
   At that point, dup() fails with EMFILE and the subsequent NULL-check
   on fp returns false without any cleanup.

   While fdopen() failure (EMFILE/OOM) is less common, it is equally
   incorrect to ignore -- and it is precisely the case that POSIX calls
   out as the caller's responsibility to close.

   == The Fix ==

   Save the result of dup() in a local variable.  Check it immediately.
   If fdopen()/gzdopen() subsequently fails, explicitly close the
   duplicated fd before returning false.

       if (fd >= 0)
       {
           int dup_fd = dup(fd);

           if (dup_fd < 0)
               return false;
           fp = fdopen(dup_fd, mode);
           if (fp == NULL)
           {
               close(dup_fd);   /* POSIX: fdopen does not close on 
failure */
               return false;
           }
       }
       else
       {
           fp = fopen(path, mode);
           if (fp == NULL)
               return false;
       }

   The zstd fix additionally ensures that the previously allocated
   zstdcs structure is freed on all new failure paths.

   == Affected Versions ==

   PostgreSQL 16 and 17, and current master.
   The compress_* files were introduced in commit e9960732a9 (Feb 2023).

   == Patch ==

   Patch attached.  It applies cleanly to current master (as of
   commit 18bcdb75d15).

   Regards,
   Jianghua Yang
Attachment

pgsql-hackers by date:

Previous
From: Junwang Zhao
Date:
Subject: Re: Eliminating SPI / SQL from some RI triggers - take 3
Next
From: Tom Lane
Date:
Subject: Re: remove bits* types