Thread: Number of open files
Hi,

I am having problems with the number of open files on Red Hat 6.1. The
value of /proc/sys/fs/file-max is 4096 (the default), but this value is
reached with about 50 ODBC connections. Increasing the file-max value
would only temporarily improve matters, because in the long term I
expect to have 500+ active connections. How come there are so many open
files per connection? Is there any way to decrease the number of open
files, so that I don't have to increase file-max to immense proportions?

Thanks,
Mark.

"Mark Alliban" <MarkA@idnltd.com> writes:
> I am having problems with the number of open files on Red Hat 6.1. The value
> of /proc/sys/fs/file-max is 4096 (the default), but this value is reached
> with about 50 ODBC connections. Increasing the file-max value would only
> temporarily improve matters, because in the long term I expect to have 500+
> active connections. How come there are so many open files per connection?
> Is there any way to decrease the number of open files, so that I don't have
> to increase file-max to immense proportions?

You can hack the routine pg_nofile() in src/backend/storage/file/fd.c
to return some smaller number than it's returning now, but I really
wouldn't advise reducing it below thirty or so. You'll still need to
increase file-max.

        regards, tom lane

> "Mark Alliban" <MarkA@idnltd.com> writes:
> > I am having problems with the number of open files on Red Hat 6.1. The value
> > of /proc/sys/fs/file-max is 4096 (the default), but this value is reached
> > with about 50 ODBC connections. Increasing the file-max value would only
> > temporarily improve matters, because in the long term I expect to have 500+
> > active connections. How come there are so many open files per connection?
> > Is there any way to decrease the number of open files, so that I don't have
> > to increase file-max to immense proportions?
>
> You can hack the routine pg_nofile() in src/backend/storage/file/fd.c
> to return some smaller number than it's returning now, but I really
> wouldn't advise reducing it below thirty or so. You'll still need to
> increase file-max.
>
>         regards, tom lane

I have increased file-max to 16000. However after about 24 hours of
running, pgsql crashed and errors in the log showed that the system had
run out of memory. I do not have the exact error message, as I was in a
hurry to get the system up and running again (it is a live production
system). The system has 512MB memory and there were 47 ODBC sessions in
progress, so I cannot believe that the system *really* ran out of
memory. I start postmaster with -B 2048 -N 500, if that is relevant.

Also backends seem to hang around for about a minute after I close the
ODBC connections. Is this normal?

Thanks,
Mark.

"Mark Alliban" <MarkA@idnltd.com> writes:
> I have increased file-max to 16000. However after about 24 hours of running,
> pgsql crashed and errors in the log showed that the system had run out of
> memory. I do not have the exact error message, as I was in a hurry to get
> the system up and running again (it is a live production system). The system
> has 512MB memory and there were 47 ODBC sessions in progress, so I cannot
> believe that the system *really* ran out of memory.

Oh, I could believe that, depending on what your ODBC clients were
doing. 10 meg of working store per backend is not out of line for
complex queries. Have you tried watching with 'top' to see what a
typical backend process size actually is for your workload?

Also, the amount of RAM isn't necessarily the limiting factor here;
what you should have told us is how much swap space you have ...

> Also backends seem to hang around for about a minute after I close the ODBC
> connections. Is this normal?

Seems odd to me too.

        regards, tom lane

> "Mark Alliban" <MarkA@idnltd.com> writes:
> > I have increased file-max to 16000. However after about 24 hours of running,
> > pgsql crashed and errors in the log showed that the system had run out of
> > memory. I do not have the exact error message, as I was in a hurry to get
> > the system up and running again (it is a live production system). The system
> > has 512MB memory and there were 47 ODBC sessions in progress, so I cannot
> > believe that the system *really* ran out of memory.
>
> Oh, I could believe that, depending on what your ODBC clients were
> doing. 10 meg of working store per backend is not out of line for
> complex queries. Have you tried watching with 'top' to see what a
> typical backend process size actually is for your workload?
>
> Also, the amount of RAM isn't necessarily the limiting factor here;
> what you should have told us is how much swap space you have ...

530MB of swap. top reports that the backends use around 17-19MB on
average. Are you saying then, that if I have 500 concurrent queries, I
will need 8GB of swap space? Is there any way to limit the amount of
memory a backend can use, and if there is, would it be a very bad idea
to do it?

Thanks,
Mark.

"Mark Alliban" <MarkA@idnltd.com> writes:
> 530MB of swap. top reports that the backends use around 17-19MB on average.
> Are you saying then, that if I have 500 concurrent queries, I will need 8GB
> of swap space?

Something like that. You weren't expecting to support 500 concurrent
queries on toy iron, I hope.

> Is there any way to limit the amount of memory a backend can
> use, and if there is, would it be a very bad idea to do it?

Can't think of anything very productive offhand. We have gotten rid of
some memory-leak problems in 7.1, so you may find that the next release
will not need so much memory. Or perhaps you can rewrite your queries
to not need so much --- can you determine which queries bloat the
backend to that size?

        regards, tom lane

I wrote:
> "Mark Alliban" <MarkA@idnltd.com> writes:
>> Also backends seem to hang around for about a minute after I close
>> the ODBC connections. Is this normal?

> Seems odd to me too.

How reproducible is that behavior --- does it happen for all
connections, or only a few? Is the time before disconnection
consistent? I've just noticed and repaired a bug that might explain
this misbehavior, but only if the behavior is not as consistent as you
imply.

The bug I found is that if a connection request is completed and a
backend is forked off while other connection request(s) are in
progress, the new child backend has an open file descriptor for the
other client connection(s) as well as its own. It will never touch the
other clients' descriptors, but simply having them might affect the
kernel's behavior. In particular, if another client performs its tasks
and exits without sending a disconnect message --- which ODBC doesn't
send --- then I think the backend spawned for the other client wouldn't
be told the connection is closed until the first one exits and its
descriptor for the connection is discarded.

You could check into this possibility with a tool like lsof: when you
see a backend hanging around with no client, look to see if any other
backends have open file descriptors for the same connection.

        regards, tom lane
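The effect described above can be seen in miniature outside PostgreSQL.
The toy program below is not PostgreSQL code; a pipe stands in for the
client connection and the five-second sleep is arbitrary. It only
demonstrates the general kernel rule being relied on: a reader is told
the channel is closed once every descriptor referring to the writing
end, including copies inherited across fork(), has been closed.

/*
 * Toy illustration (not PostgreSQL source): an extra process that
 * merely inherits a descriptor can delay end-of-file for the process
 * that is actually waiting on the "connection".
 */
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

int
main(void)
{
    int fd[2];

    if (pipe(fd) != 0)
    {
        perror("pipe");
        return 1;
    }

    /* Stand-in for the extra backend that inherited another client's
     * descriptor at fork time and never touches it. */
    pid_t straggler = fork();
    if (straggler < 0)
    {
        perror("fork");
        return 1;
    }
    if (straggler == 0)
    {
        close(fd[0]);
        sleep(5);               /* just holds the write end open */
        _exit(0);
    }

    /* Stand-in for the client going away: the "real" holder of the
     * write end closes it immediately. */
    close(fd[1]);

    /* Stand-in for the backend waiting on its client: read() does not
     * return 0 (EOF) until the straggler's copy is closed too. */
    char    buf[1];
    time_t  start = time(NULL);
    long    n = (long) read(fd[0], buf, sizeof(buf));

    printf("read returned %ld after about %ld seconds\n",
           n, (long) (time(NULL) - start));

    waitpid(straggler, NULL, 0);
    return 0;
}

Run it and the final printf reports end-of-file only after roughly five
seconds, when the straggler exits; the lsof check suggested above is
the analogous way to see which processes still hold a descriptor for a
supposedly closed connection.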