Re: Bump soft open file limit (RLIMIT_NOFILE) to hard limit on startup - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Bump soft open file limit (RLIMIT_NOFILE) to hard limit on startup
Msg-id xvd2cyrtd4wk42ugweydxfcy3bwtaymu4gqmky5fpfcu6xia4m@qbgeq23yncch
In response to Re: Bump soft open file limit (RLIMIT_NOFILE) to hard limit on startup  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Bump soft open file limit (RLIMIT_NOFILE) to hard limit on startup
List pgsql-hackers

Hi,

On 2025-02-11 16:18:37 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > And when using something like io_uring for AIO, it'd allow to
> > max_files_per_process in addition to the files requires for the io_uring
> > instances.
>
> Not following?  Surely we'd not be configuring that so early in
> postmaster start?

The issue is that, with io_uring, we need to create one FD for each possible
child process, so that one backend can wait for completions for IO issued by
another backend [1]. Those io_uring instances need to be created in
postmaster, so they're visible to each backend. Obviously that makes it much
easier to run into an unadjusted soft RLIMIT_NOFILE, particularly if
max_connections is set to a higher value.
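
To make the limit in question concrete, here is a minimal sketch of raising
the soft RLIMIT_NOFILE to the hard limit with the standard getrlimit() /
setrlimit() calls. The helper name raise_nofile_soft_limit() is hypothetical
and this is not code from the patch under discussion, just an illustration of
the mechanism:

```c
#include <sys/resource.h>

/*
 * Hypothetical helper (not from the patch): raise the soft RLIMIT_NOFILE
 * to the current hard limit.  Returns 0 on success, -1 on failure.
 */
static int
raise_nofile_soft_limit(void)
{
    struct rlimit rlim;

    /* Read the current soft (rlim_cur) and hard (rlim_max) limits. */
    if (getrlimit(RLIMIT_NOFILE, &rlim) != 0)
        return -1;

    /* Nothing to do if the soft limit already equals the hard limit. */
    if (rlim.rlim_cur == rlim.rlim_max)
        return 0;

    /* An unprivileged process may raise its soft limit up to the hard one. */
    rlim.rlim_cur = rlim.rlim_max;
    if (setrlimit(RLIMIT_NOFILE, &rlim) != 0)
        return -1;

    return 0;
}
```

A real implementation would likely need more care than this. For example, as
far as I know some platforms reject a soft limit of RLIM_INFINITY for
RLIMIT_NOFILE, so the value may have to be capped, and whatever is done has to
happen before the first code that opens a large number of FDs.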

In the current version of the AIO patchset, the creation of those io_uring
instances does happen as part of a shmem init callback, as the io_uring
creation also sets up queues visible in shmem. And shmem init callbacks are
currently happening *before* postmaster's set_max_safe_fds() call:


    /*
     * Set up shared memory and semaphores.
     *
     * Note: if using SysV shmem and/or semas, each postmaster startup will
     * normally choose the same IPC keys.  This helps ensure that we will
     * clean up dead IPC objects if the postmaster crashes and is restarted.
     */
    CreateSharedMemoryAndSemaphores();

    /*
     * Estimate number of openable files.  This must happen after setting up
     * semaphores, because on some platforms semaphores count as open files.
     */
    set_max_safe_fds();


So the issue would actually be that we're currently doing set_max_safe_fds()
too late, not too early :/

Greetings,

Andres Freund

[1] Initially I tried to avoid that, by sharing a smaller number of io_uring
    instances across backends. Making that work was a fair bit of code *and*
    was considerably slower, due to now needing a lock around submission of
    IOs. Moving to one io_uring instance per backend fairly dramatically
    simplified the code while also speeding it up.
