Re: Bump soft open file limit (RLIMIT_NOFILE) to hard limit on startup - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Bump soft open file limit (RLIMIT_NOFILE) to hard limit on startup
Date
Msg-id 9004aa84-4ea3-414f-9268-101510f41a29@iki.fi
Whole thread Raw
In response to Re: Bump soft open file limit (RLIMIT_NOFILE) to hard limit on startup  ("Jelte Fennema-Nio" <postgres@jeltef.nl>)
List pgsql-hackers
On 18/03/2025 01:08, Jelte Fennema-Nio wrote:
> On Mon Feb 24, 2025 at 12:01 PM CET, Jelte Fennema-Nio wrote:
>> Right after pressing send I realized I could remove two more useless
>> lines...
> 
> Rebased patchset attached (trivial conflict against pg_noreturn
> changes).

v7-0001-Adds-a-helper-for-places-that-shell-out-to-system.patch:

> /*
>  * A custom wrapper around the system() function that calls the necessary
>  * functions pre/post-fork.
>  */
> int
> System(const char *command, uint32 wait_event_info)
> {
>     int            rc;
> 
>     fflush(NULL);
>     pgstat_report_wait_start(wait_event_info);
> 
>     if (wait_event_info == WAIT_EVENT_RESTORE_COMMAND)
>     {
>         /*
>          * Set in_restore_command to tell the signal handler that we should
>          * exit right away on SIGTERM. This is done for the duration of the
>          * system() call because there isn't a good way to break out while it
>          * is executing.  Since we might call proc_exit() in a signal handler,
>          * it is best to put any additional logic outside of this section
>          * where in_restore_command is set to true.
>          */
>         in_restore_command = true;
> 
>         /*
>          * Also check if we had already received the signal, so that we don't
>          * miss a shutdown request received just before this.
>          */
>         if (shutdown_requested)
>             proc_exit(1);
>     }
> 
>     rc = system(command);
> 
>     if (wait_event_info == WAIT_EVENT_RESTORE_COMMAND)
>         in_restore_command = false;
> 
>     pgstat_report_wait_end();
>     return rc;
> }

Let's move that 'in_restore_command' business to the caller. It's weird 
modularity for the function to implicitly behave differently for some 
callers. And 'wait_event_info' should only affect pgstat reporting, not 
actual behavior.

I don't feel good about the function name. How about pg_system() or 
something? postmaster/startup.c also seems like a weird place for it; 
not sure where it belongs but probably not there. Maybe next to 
OpenPipeStream() in fd.c, or move both to a new file.

v7-0002-Bump-postmaster-soft-open-file-limit-RLIMIT_NOFIL.patch:

> @@ -274,6 +275,7 @@ System(const char *command, uint32 wait_event_info)
>  {
>      int            rc;
>  
> +    RestoreOriginalOpenFileLimit();
>      fflush(NULL);
>      pgstat_report_wait_start(wait_event_info);
>  
> @@ -303,6 +305,7 @@ System(const char *command, uint32 wait_event_info)
>          in_restore_command = false;
>  
>      pgstat_report_wait_end();
> +    RestoreCustomOpenFileLimit();
>      return rc;
>  }
>  

Looks a bit funny that both functions are called Restore<something>(). 
Not sure what to suggest instead though. Maybe RaiseOpenFileLimit() and 
LowerOpenFileLimit().

> @@ -2724,6 +2873,19 @@ OpenPipeStream(const char *command, const char *mode)
>      ReleaseLruFiles();
>  
>  TryAgain:
> +
> +    /*
> +     * It would be great if we could call RestoreOriginalOpenFileLimit here,
> +     * but since popen() also opens a file in the current process (this side
> +     * of the pipe) we cannot do so safely. Because we might already have many
> +     * more files open than the original limit.
> +     *
> +     * The only way to address this would be implementing a custom popen()
> +     * that calls RestoreOriginalOpenFileLimit only around the actual fork
> +     * call, but that seems too much effort to handle the corner case where
> +     * this external command uses both select() and tries to open more files
> +     * than select() allows for.
> +     */
>      fflush(NULL);
>      pqsignal(SIGPIPE, SIG_DFL);
>      errno = 0;

What would it take to re-implement popen() with fork+exec? Genuine 
question; I have a feeling it might be complicated to do correctly on 
all platforms, but I don't know.

-- 
Heikki Linnakangas
Neon (https://neon.tech)




pgsql-hackers by date:

Previous
From: Marcos Pegoraro
Date:
Subject: Re: Exponential notation bug
Next
From: Masahiko Sawada
Date:
Subject: Re: pg_recvlogical cannot create slots with failover=true