Thread: File descriptors inherited by restore_command
Hackers,

While investigating "Too many open files" errors reported in our parallel
restore_command I noticed that the restore_command can inherit quite a lot
of fds from the recovery process. This limits the number of fds available
in the restore_command depending on the setting of system nofile and
Postgres max_files_per_process.

I was wondering if we should consider closing these fds before calling
restore_command? It seems like we could do this by forking first, or by
setting FD_CLOEXEC using fcntl(), or O_CLOEXEC on open() where available.

Thoughts on this? Is this something we want to change, or should I just
recommend that users set nofile and max_files_per_process appropriately?

Regards,
--
-David
david@pgmasters.net
David Steele <david@pgmasters.net> writes:
> While investigating "Too many open files" errors reported in our
> parallel restore_command I noticed that the restore_command can inherit
> quite a lot of fds from the recovery process. This limits the number of
> fds available in the restore_command depending on the setting of system
> nofile and Postgres max_files_per_process.

Hm. Presumably you could hit the same issue with things like COPY FROM
PROGRAM. And the only reason the archiver doesn't hit it is it never
opens many files to begin with.

> I was wondering if we should consider closing these fds before calling
> restore_command? It seems like we could do this by forking first or by
> setting FD_CLOEXEC using fcntl() or O_CLOEXEC on open() where available.

+1 for using O_CLOEXEC on machines that have it. I don't think I want to
jump through hoops for machines that don't have it --- POSIX has required
it for some time, so there should be few machines in that category.

			regards, tom lane
On 6/21/19 9:45 AM, Tom Lane wrote:
> David Steele <david@pgmasters.net> writes:
>> While investigating "Too many open files" errors reported in our
>> parallel restore_command I noticed that the restore_command can inherit
>> quite a lot of fds from the recovery process. This limits the number of
>> fds available in the restore_command depending on the setting of system
>> nofile and Postgres max_files_per_process.
>
> Hm. Presumably you could hit the same issue with things like COPY FROM
> PROGRAM. And the only reason the archiver doesn't hit it is it never
> opens many files to begin with.

Yes. The archiver process is fine because it has only ~8 fds open.

>> I was wondering if we should consider closing these fds before calling
>> restore_command? It seems like we could do this by forking first or by
>> setting FD_CLOEXEC using fcntl() or O_CLOEXEC on open() where available.
>
> +1 for using O_CLOEXEC on machines that have it. I don't think I want to
> jump through hoops for machines that don't have it --- POSIX has required
> it for some time, so there should be few machines in that category.

Another possible issue is that if we allow a child process to inherit all
these fds it might accidentally write to them, which would be bad. I know
the child process can go and maliciously open and trash files if it wants,
but it doesn't seem like we should allow it to happen unintentionally.

Regards,
--
-David
david@pgmasters.net
David Steele <david@pgmasters.net> writes:
> On 6/21/19 9:45 AM, Tom Lane wrote:
>> +1 for using O_CLOEXEC on machines that have it. I don't think I want to
>> jump through hoops for machines that don't have it --- POSIX has required
>> it for some time, so there should be few machines in that category.

> Another possible issue is that if we allow a child process to inherit
> all these fds it might accidentally write to them, which would be bad.
> I know the child process can go and maliciously open and trash files if
> it wants, but it doesn't seem like we should allow it to happen
> unintentionally.

True. But I don't want to think of this as a security issue, because
then it becomes a security bug to forget O_CLOEXEC anywhere in the
backend, and that is a standard we cannot meet. (Even if we could hold
to it for the core code, stuff like libperl and libpython can't be
relied on to play ball.) In practice, as long as we use O_CLOEXEC for
files opened by fd.c, that would eliminate the actual too-many-fds
hazard. I don't object to desultorily looking around for other places
where we might want to add it, but personally I'd be satisfied with a
patch that CLOEXEC-ifies fd.c.

			regards, tom lane
I wrote:
> In practice, as long as we use O_CLOEXEC for files opened by fd.c, that
> would eliminate the actual too-many-fds hazard. I don't object to
> desultorily looking around for other places where we might want to add
> it, but personally I'd be satisfied with a patch that CLOEXEC-ifies fd.c.

Actually, even that much coverage might be exciting. Be sure to test the
patch with EXEC_BACKEND to see if it causes zapping of any files the
postmaster needs to pass down to backends.

			regards, tom lane
Greetings,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> David Steele <david@pgmasters.net> writes:
> > On 6/21/19 9:45 AM, Tom Lane wrote:
> >> +1 for using O_CLOEXEC on machines that have it. I don't think I want to
> >> jump through hoops for machines that don't have it --- POSIX has required
> >> it for some time, so there should be few machines in that category.
>
> > Another possible issue is that if we allow a child process to inherit
> > all these fds it might accidentally write to them, which would be bad.
> > I know the child process can go and maliciously open and trash files if
> > it wants, but it doesn't seem like we should allow it to happen
> > unintentionally.
>
> True. But I don't want to think of this as a security issue, because
> then it becomes a security bug to forget O_CLOEXEC anywhere in the
> backend, and that is a standard we cannot meet. (Even if we could
> hold to it for the core code, stuff like libperl and libpython can't
> be relied on to play ball.) In practice, as long as we use O_CLOEXEC
> for files opened by fd.c, that would eliminate the actual too-many-fds
> hazard. I don't object to desultorily looking around for other places
> where we might want to add it, but personally I'd be satisfied with a
> patch that CLOEXEC-ifies fd.c.

Agreed, it's not a security issue, and also agreed that we should get
fd.c done right off. If someone then wants to think about other places
where it might be good to do, more power to them; it seems like we'd be
happy to accept such patches.

Thanks,

Stephen
On 6/21/19 10:26 AM, Stephen Frost wrote:
>>> Another possible issue is that if we allow a child process to inherit
>>> all these fds it might accidentally write to them, which would be bad.
>>> I know the child process can go and maliciously open and trash files if
>>> it wants, but it doesn't seem like we should allow it to happen
>>> unintentionally.
>>
>> True. But I don't want to think of this as a security issue, because
>> then it becomes a security bug to forget O_CLOEXEC anywhere in the
>> backend, and that is a standard we cannot meet. (Even if we could
>> hold to it for the core code, stuff like libperl and libpython can't
>> be relied on to play ball.) In practice, as long as we use O_CLOEXEC
>> for files opened by fd.c, that would eliminate the actual too-many-fds
>> hazard. I don't object to desultorily looking around for other places
>> where we might want to add it, but personally I'd be satisfied with a
>> patch that CLOEXEC-ifies fd.c.
>
> Agreed, it's not a security issue, and also agreed that we should
> probably get it done with fd.c right off, and then if someone wants to
> think about other places where it might be good to do then more power to
> them and it seems like we'd be happy to accept such patches.

I agree this is not a security issue, and I wasn't intending to present
it that way, but in general the more fds closed the better.

I'll work up a patch for fd.c, which is the obvious win, and we can go
from there if it makes sense. I'll be sure to test EXEC_BACKEND on Linux,
but I don't think it will matter on Windows. cfbot may feel differently,
though.

Regards,
--
-David
david@pgmasters.net