On 6/21/19 9:45 AM, Tom Lane wrote:
> David Steele <david@pgmasters.net> writes:
>> While investigating "Too many open files" errors reported in our
>> parallel restore_command I noticed that the restore_command can inherit
>> quite a lot of fds from the recovery process. This limits the number of
>> fds available in the restore_command depending on the setting of system
>> nofile and Postgres max_files_per_process.
>
> Hm. Presumably you could hit the same issue with things like COPY FROM
> PROGRAM. And the only reason the archiver doesn't hit it is it never
> opens many files to begin with.

Yes. The archiver process is fine because it only has ~8 fds open.

>> I was wondering if we should consider closing these fds before calling
>> restore_command? It seems like we could do this by forking first or by
>> setting FD_CLOEXEC using fcntl() or O_CLOEXEC on open() where available.
>
> +1 for using O_CLOEXEC on machines that have it. I don't think I want to
> jump through hoops for machines that don't have it --- POSIX has required
> it for some time, so there should be few machines in that category.

Another possible issue is that a child process that inherits all these
fds might accidentally write to them, which would be bad. I know the
child process can maliciously open and trash files if it wants to, but
it doesn't seem like we should allow that to happen unintentionally.

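To make the two options concrete, here is a rough sketch, not a patch.
The helper names set_cloexec() and open_cloexec() are made up for this
example, and where the flag would actually be applied in Postgres is
left open. It uses O_CLOEXEC on open() where the platform defines it
and falls back to fcntl() with FD_CLOEXEC otherwise:

```c
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: mark an already-open fd close-on-exec, so that it
 * is not inherited by programs started via exec() (which is what system()
 * and popen() end up doing).
 */
static int
set_cloexec(int fd)
{
	int			flags = fcntl(fd, F_GETFD);

	if (flags < 0)
		return -1;
	return fcntl(fd, F_SETFD, flags | FD_CLOEXEC);
}

/*
 * Hypothetical helper: open a file with close-on-exec set.
 */
static int
open_cloexec(const char *path, int flags, mode_t mode)
{
#ifdef O_CLOEXEC
	/* The flag is set as part of open() itself. */
	return open(path, flags | O_CLOEXEC, mode);
#else
	/* Fallback: set FD_CLOEXEC in a second step after open(). */
	int			fd = open(path, flags, mode);

	if (fd >= 0 && set_cloexec(fd) < 0)
	{
		close(fd);
		return -1;
	}
	return fd;
#endif
}
```

Either way the fd stays usable in the recovery process itself. It is
only closed in the child at exec() time, so the restore_command would
start without it.
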
Regards,
--
-David
david@pgmasters.net