Re: too-many-open-files log file entries when vacuuming under Solaris - Mailing list pgsql-general

From Tom Lane
Subject Re: too-many-open-files log file entries when vacuuming under Solaris
Date
Msg-id 18754.1394050610@sss.pgh.pa.us
In response to too-many-open-files log file entries when vacuuming under Solaris  ("Raschick, Hartmut" <Hartmut.Raschick@keymile.com>)
Responses Re: too-many-open-files log file entries when vacuuming under Solaris  ("Raschick, Hartmut" <Hartmut.Raschick@keymile.com>)
List pgsql-general
"Raschick, Hartmut" <Hartmut.Raschick@keymile.com> writes:
> recently we have seen a lot of occurrences of "out of file descriptors:
> Too many open files; release and retry" in our postgres log files, every
> night when a "vacuum full analyze" is run.  After some digging into the
> code we found that postgres potentially tries to open as many as a
> pre-determined maximum number of file descriptors when vacuuming. That
> number is the lesser of the one from the configuration file
> (max_files_per_process) and the one determined at start-up by
> "src/backend/storage/file/fd.c::count_usable_fds()". Under Solaris now,
> it would seem, finding out that number via dup(0) is not sufficient, as
> the actual number of interest might be/is the number of usable stream
> file descriptors (up until Solaris 10, at least). Also, closing the least
> recently used file descriptor might therefore not solve a temporary
> problem (as something below 256 is needed). Now, this can be fixed by
> setting/leaving the descriptor limit at 256 or changing the
> postgresql.conf setting accordingly. Still, the function for determining
> the max number is not working as intended under Solaris, it would
> appear. One might try using fopen() instead of dup() or have a different
> handling for stream and normal file descriptors (including moving
> standard file descriptors to above 255 to leave room for stream
> ones). Maybe though, all this is not worth the effort; then it might
> perhaps be a good idea to mention the limitations/specialties in the
> platform specific notes (e.g. have u/limit at 256 maximum).

TBH this sounds like unfounded speculation.  AFAIK a Postgres backend will
not open anything but regular files after its initial startup.  I'm not
sure what a "stream" is on Solaris, but guessing that it refers to pipes
or sockets, I don't think we have a problem with an OS restriction that
those be below FD 256.  In any case, if we did, it would presumably show
up as errors not release-and-retry events.

Our usual experience is that you get release-and-retry log messages when
the OS is up against the system-wide open-file limit rather than the
per-process limit (ie, the underlying error code is ENFILE not EMFILE).
I don't know exactly how Solaris strerror() spells those codes so it's
difficult to tell from your reported log message which case is happening.
If it is the system-wide limit that's at issue, then of course the dup(0)
loop isn't likely to find it, and adjusting max_files_per_process (or
maybe better, reducing max_connections) is the expected solution.

            regards, tom lane

