Thread: File descriptors inherited by restore_command

File descriptors inherited by restore_command

From
David Steele
Date:
Hackers,

While investigating "Too many open files" errors reported in our
parallel restore_command I noticed that the restore_command can inherit
quite a lot of fds from the recovery process.  This limits the number of
fds available in the restore_command depending on the setting of system
nofile and Postgres max_files_per_process.

I was wondering if we should consider closing these fds before calling
restore_command?  It seems like we could do this by forking first or by
setting FD_CLOEXEC using fcntl() or O_CLOEXEC on open() where available.

Thoughts on this?  Is this something we want to change or should I just
recommend that users set nofile and max_files_per_process appropriately?

Regards,
-- 
-David
david@pgmasters.net



Re: File descriptors inherited by restore_command

From
Tom Lane
Date:
David Steele <david@pgmasters.net> writes:
> While investigating "Too many open files" errors reported in our
> parallel restore_command I noticed that the restore_command can inherit
> quite a lot of fds from the recovery process.  This limits the number of
> fds available in the restore_command depending on the setting of system
> nofile and Postgres max_files_per_process.

Hm.  Presumably you could hit the same issue with things like COPY FROM
PROGRAM.  And the only reason the archiver doesn't hit it is it never
opens many files to begin with.

> I was wondering if we should consider closing these fds before calling
> restore_command?  It seems like we could do this by forking first or by
> setting FD_CLOEXEC using fcntl() or O_CLOEXEC on open() where available.

+1 for using O_CLOEXEC on machines that have it.  I don't think I want to
jump through hoops for machines that don't have it --- POSIX has required
it for some time, so there should be few machines in that category.

            regards, tom lane



Re: File descriptors inherited by restore_command

From
David Steele
Date:
On 6/21/19 9:45 AM, Tom Lane wrote:
> David Steele <david@pgmasters.net> writes:
>> While investigating "Too many open files" errors reported in our
>> parallel restore_command I noticed that the restore_command can inherit
>> quite a lot of fds from the recovery process.  This limits the number of
>> fds available in the restore_command depending on the setting of system
>> nofile and Postgres max_files_per_process.
> 
> Hm.  Presumably you could hit the same issue with things like COPY FROM
> PROGRAM.  And the only reason the archiver doesn't hit it is it never
> opens many files to begin with.

Yes.  The archiver process is fine because it has ~8 fds open.

>> I was wondering if we should consider closing these fds before calling
>> restore_command?  It seems like we could do this by forking first or by
>> setting FD_CLOEXEC using fcntl() or O_CLOEXEC on open() where available.
> 
> +1 for using O_CLOEXEC on machines that have it.  I don't think I want to
> jump through hoops for machines that don't have it --- POSIX has required
> it for some time, so there should be few machines in that category.

Another possible issue is that if we allow a child process to inherit
all these fds it might accidentally write to them, which would be bad.
I know the child process can go and maliciously open and trash files if
it wants, but it doesn't seem like we should allow it to happen
unintentionally.

Regards,
-- 
-David
david@pgmasters.net



Re: File descriptors inherited by restore_command

From
Tom Lane
Date:
David Steele <david@pgmasters.net> writes:
> On 6/21/19 9:45 AM, Tom Lane wrote:
>> +1 for using O_CLOEXEC on machines that have it.  I don't think I want to
>> jump through hoops for machines that don't have it --- POSIX has required
>> it for some time, so there should be few machines in that category.

> Another possible issue is that if we allow a child process to inherit
> all these fds it might accidentally write to them, which would be bad.
> I know the child process can go and maliciously open and trash files if
> it wants, but it doesn't seem like we should allow it to happen
> unintentionally.

True.  But I don't want to think of this as a security issue, because
then it becomes a security bug to forget O_CLOEXEC anywhere in the
backend, and that is a standard we cannot meet.  (Even if we could
hold to it for the core code, stuff like libperl and libpython can't
be relied on to play ball.)  In practice, as long as we use O_CLOEXEC
for files opened by fd.c, that would eliminate the actual too-many-fds
hazard.  I don't object to desultorily looking around for other places
where we might want to add it, but personally I'd be satisfied with a
patch that CLOEXEC-ifies fd.c.

            regards, tom lane



Re: File descriptors inherited by restore_command

From
Tom Lane
Date:
I wrote:
> In practice, as long as we use O_CLOEXEC
> for files opened by fd.c, that would eliminate the actual too-many-fds
> hazard.  I don't object to desultorily looking around for other places
> where we might want to add it, but personally I'd be satisfied with a
> patch that CLOEXEC-ifies fd.c.

Actually, even that much coverage might be exciting.  Be sure to test
patch with EXEC_BACKEND to see if it causes zapping of any files the
postmaster needs to pass down to backends.

            regards, tom lane



Re: File descriptors inherited by restore_command

From
Stephen Frost
Date:
Greetings,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> David Steele <david@pgmasters.net> writes:
> > On 6/21/19 9:45 AM, Tom Lane wrote:
> >> +1 for using O_CLOEXEC on machines that have it.  I don't think I want to
> >> jump through hoops for machines that don't have it --- POSIX has required
> >> it for some time, so there should be few machines in that category.
>
> > Another possible issue is that if we allow a child process to inherit
> > all these fds it might accidentally write to them, which would be bad.
> > I know the child process can go and maliciously open and trash files if
> > it wants, but it doesn't seem like we should allow it to happen
> > unintentionally.
>
> True.  But I don't want to think of this as a security issue, because
> then it becomes a security bug to forget O_CLOEXEC anywhere in the
> backend, and that is a standard we cannot meet.  (Even if we could
> hold to it for the core code, stuff like libperl and libpython can't
> be relied on to play ball.)  In practice, as long as we use O_CLOEXEC
> for files opened by fd.c, that would eliminate the actual too-many-fds
> hazard.  I don't object to desultorily looking around for other places
> where we might want to add it, but personally I'd be satisfied with a
> patch that CLOEXEC-ifies fd.c.

Agreed, it's not a security issue, and also agreed that we should
probably get it done with fd.c right off, and then if someone wants to
think about other places where it might be good to do then more power to
them and it seems like we'd be happy to accept such patches.

Thanks,

Stephen

Attachment

Re: File descriptors inherited by restore_command

From
David Steele
Date:
On 6/21/19 10:26 AM, Stephen Frost wrote:
>>
>>> Another possible issue is that if we allow a child process to inherit
>>> all these fds it might accidentally write to them, which would be bad.
>>> I know the child process can go and maliciously open and trash files if
>>> it wants, but it doesn't seem like we should allow it to happen
>>> unintentionally.
>>
>> True.  But I don't want to think of this as a security issue, because
>> then it becomes a security bug to forget O_CLOEXEC anywhere in the
>> backend, and that is a standard we cannot meet.  (Even if we could
>> hold to it for the core code, stuff like libperl and libpython can't
>> be relied on to play ball.)  In practice, as long as we use O_CLOEXEC
>> for files opened by fd.c, that would eliminate the actual too-many-fds
>> hazard.  I don't object to desultorily looking around for other places
>> where we might want to add it, but personally I'd be satisfied with a
>> patch that CLOEXEC-ifies fd.c.
>
> Agreed, it's not a security issue, and also agreed that we should
> probably get it done with fd.c right off, and then if someone wants to
> think about other places where it might be good to do then more power to
> them and it seems like we'd be happy to accept such patches.

I agree this is not a security issue and I wasn't intending to present
it that way, but in general the more fds closed the better.

I'll work up a patch for fd.c which is the obvious win and we can work
from there if it makes sense.  I'll be sure to test EXEC_BACKEND on
Linux but I don't think it will matter on Windows.  cfbot may feel
differently, though.

Regards,
--
-David
david@pgmasters.net


Attachment