Thread: [7.0.2] spinlock problems reported earlier ...
Earlier this week, I reported getting core dumps with the following bt:

(gdb) where
#0  0x18271d90 in kill () from /usr/lib/libc.so.4
#1  0x182b2e09 in abort () from /usr/lib/libc.so.4
#2  0x80ee847 in s_lock_stuck (lock=0x20048065 "\001", file=0x816723c "spin.c", line=127) at s_lock.c:51
#3  0x80ee8c3 in s_lock (lock=0x20048065 "\001", file=0x816723c "spin.c", line=127) at s_lock.c:80
#4  0x80f1580 in SpinAcquire (lockid=7) at spin.c:127
#5  0x80f3903 in LockRelease (lockmethod=1, locktag=0xbfbfe968, lockmode=1) at lock.c:1044

I've been monitoring 'open files' on that machine, and after raising them
to 8192, saw it hit "Open Files Peak: 8179" this morning and once more
have a dead database ...

Tom, you stated "That sure looks like you'd better tweak your kernel
settings ... but offhand I don't see how it could lead to "stuck spinlock"
errors.", so I'm wondering if maybe there is a bug, in that it should be
handling running out of FDs better?

I just raised mine to 32k so that it *hopefully* never happens again, if I
hit *that* many open files I'll be surprised ...

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org
The Hermit Hacker <scrappy@hub.org> writes:
> I've been monitoring 'open files' on that machine, and after raising them
> to 8192, saw it hit "Open Files Peak: 8179" this morning and once more
> have a dead database ...

> Tom, you stated "That sure looks like you'd better tweak your kernel
> settings ... but offhand I don't see how it could lead to "stuck spinlock"
> errors.", so I'm wondering if maybe there is a bug, in that it should be
> handling running out of FDs better?

Ah-hah, now that I get to see the log file before it vanished, I have a
theory about how no FDs leads to stuck spinlock.  The postmaster's own log
has

postmaster: StreamConnection: accept: Too many open files in system
postmaster: StreamConnection: accept: Too many open files in system
FATAL 1:  ReleaseLruFile: No open files available to be closed
FATAL: s_lock(20048065) at spin.c:127, stuck spinlock. Aborting.
FATAL: s_lock(20048065) at spin.c:127, stuck spinlock. Aborting.
(more of same)

while the backend log has a bunch of

IpcSemaphoreLock: semop failed (Identifier removed) id=524288
IpcSemaphoreLock: semop failed (Identifier removed) id=524288
IpcSemaphoreLock: semop failed (Identifier removed) id=524288
IpcSemaphoreLock: semop failed (Identifier removed) id=524288

*followed by* the spinlock gripes.  Here's my theory:

1. Postmaster gets a connection, tries to read pg_hba.conf, which it does
   via AllocateFile().  On EMFILE failure that calls ReleaseLruFile, which
   elog()'s because in the postmaster environment there are not going to be
   any open virtual FDs to close.

2. elog() inside the postmaster causes the postmaster to shut down.  Which
   it does faithfully, including cleaning up after itself, which includes
   removing the semaphores it owns.

3. Backends start falling over with semaphore-operation failures.  This is
   treated as a system-restart event (backend does proc_exit(255)) but
   there's no postmaster to kill the other backends and start a new cycle
   of life.

4. At least one dying backend leaves the lock manager's spinlock locked
   (which it should not), so by and by we start to see stuck-spinlock
   gripes from backends that haven't yet tried to do a semop.  But that's
   pretty far down the cause-and-effect chain.

It looks to me like we have several things we want to do here.

1. ReleaseLruFile() should not immediately elog() but should return a
   failure code instead, allowing AllocateFile() to return NULL, which the
   postmaster can handle more gracefully than it does an elog().

2. ProcReleaseSpins() ought to be done by proc_exit().  Someone was lazy
   and hard-coded it into elog() instead.

3. I think the real problem here is that the backends are opening too damn
   many files.  IIRC, FreeBSD is one of the platforms where
   sysconf(_SC_OPEN_MAX) will return a large number, which means that fd.c
   will have no useful limit on the number of open files it eats up.
   Increasing your kernel NFILES setting will just allow Postgres to eat up
   more FDs, and eventually (if you allow enough backends to run) you'll be
   up against it again.  Even if we manage to make Postgres itself fairly
   bulletproof against EMFILE failures, much of the rest of your system
   will be kayoed when PG is eating up every available kernel FD, so that
   is not the path to true happiness.  (You might care to use lsof or some
   such to see just how many open files you have per backend.  I bet it's
   a lot.)

Hmm, this is interesting: on HPUX, man sysconf(2) says that
sysconf(_SC_OPEN_MAX) returns the max number of open files per process ---
which is what fd.c assumes it means.  But I see that on your FreeBSD box,
the sysconf man page defines it as

     _SC_OPEN_MAX
          The maximum number of open files per user id.

which suggests that *on that platform* we need to divide by MAXBACKENDS.
Does anyone know of a more portable way to determine the appropriate
number of open files per backend?

Otherwise, we'll have to put some kind of a-priori sanity check on what we
will believe from sysconf().  I don't much care for the idea of putting a
hard-wired limit on max files per backend, but that might be the
quick-and-dirty answer.  Another possibility is to add a postmaster
parameter "max open files for whole installation", which we'd then divide
by MAXBACKENDS to determine max files per backend, rather than trying to
discover a safe value on-the-fly.

In any case, I think we want something quick and dirty for a 7.0.*
back-patch.  Maybe just limiting what we believe from sysconf() to 100 or
so would be OK for a patch.

			regards, tom lane
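For illustration, here is roughly what fix #1 above might look like:
ReleaseLruFile() reporting failure instead of elog()'ing, so that
AllocateFile() can hand back NULL and let the postmaster cope.  This is a
hedged sketch only; the stand-in variable nfile, the helper CloseLruFile(),
and the loop body are assumptions for illustration, not the actual fd.c
source.

#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Stand-ins for fd.c internals, just to make the sketch self-contained. */
static int	nfile = 0;			/* number of open virtual FDs */
static void CloseLruFile(void) { /* close the least-recently-used VFD */ }

/* Fix #1: report failure instead of elog()'ing when nothing can be closed. */
static bool
ReleaseLruFile(void)
{
	if (nfile <= 0)
		return false;			/* no open VFDs we could close */
	CloseLruFile();
	return true;
}

/*
 * ... which lets AllocateFile() return NULL on EMFILE/ENFILE, so the
 * postmaster can handle the failure instead of shutting down.
 */
FILE *
AllocateFile(const char *name, const char *mode)
{
	for (;;)
	{
		FILE	   *file = fopen(name, mode);

		if (file != NULL)
			return file;
		if (errno != EMFILE && errno != ENFILE)
			return NULL;
		if (!ReleaseLruFile())
			return NULL;		/* caller decides what to do */
	}
}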
On Sun, 27 Aug 2000, Tom Lane wrote:

> Hmm, this is interesting: on HPUX, man sysconf(2) says that
> sysconf(_SC_OPEN_MAX) returns the max number of open files per process
> --- which is what fd.c assumes it means.  But I see that on your FreeBSD
> box, the sysconf man page defines it as
>
>      _SC_OPEN_MAX
>           The maximum number of open files per user id.
>
> which suggests that *on that platform* we need to divide by MAXBACKENDS.
> Does anyone know of a more portable way to determine the appropriate
> number of open files per backend?

Okay, I just checked out Solaris 8/x86, and it confirms what HP/ux thinks:

     _SC_OPEN_MAX        OPEN_MAX       Max open files per process

I'm curious as to whether FreeBSD is the only one that doesn't follow this
"convention"?  I'm CCing the FreeBSD Hackers mailing list to see if someone
there might be able to shed some light on this ...

My first thought, personally, would be to throw in some sort of:

#ifdef __FreeBSD__
	max_files_per_backend = sysconf(_SC_OPEN_MAX) / num_of_backends;
#else
	max_files_per_backend = sysconf(_SC_OPEN_MAX);
#endif
The Hermit Hacker <scrappy@hub.org> writes:
> Okay, I just checked out Solaris 8/x86, and it confirms what HP/ux thinks:
>      _SC_OPEN_MAX        OPEN_MAX       Max open files per process

> I'm curious as to whether FreeBSD is the only one that doesn't follow
> this "convention"?

I've also confirmed that SunOS 4.1.4 (about as old-line BSD as it gets
these days) says _SC_OPEN_MAX is max per process.

Furthermore, I notice that FreeBSD's description of sysctl(3) refers to
a max-files-per-process kernel parameter, but no max-files-per-userid
parameter.  Perhaps the entry in the FreeBSD sysconf(2) man page is
merely a typo?

If so, I still consider that FreeBSD returns an unreasonably large
fraction of the kernel FD table size as the number of files one process
is allowed to open.

			regards, tom lane
The Hermit Hacker <scrappy@hub.org> writes:
> Okay, I just checked out Solaris 8/x86, and it confirms what HP/ux thinks:
>      _SC_OPEN_MAX        OPEN_MAX       Max open files per process

> I'm curious as to whether FreeBSD is the only one that doesn't follow
> this "convention"?

From part of the NetBSD manpage for sysconf(3):

     DESCRIPTION
          This interface is defined by IEEE Std 1003.1-1988 (``POSIX'').
          A far more complete interface is available using sysctl(3).

          _SC_OPEN_MAX
               The maximum number of open files per user id.

          _SC_STREAM_MAX
               The minimum maximum number of streams that a process may
               have open at any one time.

     BUGS
          The value for _SC_STREAM_MAX is a minimum maximum, and required
          to be the same as ANSI C's FOPEN_MAX, so the returned value is
          a ridiculously small and misleading number.

     STANDARDS
          The sysconf() function conforms to IEEE Std 1003.1-1990
          (``POSIX'').

     HISTORY
          The sysconf function first appeared in 4.4BSD.

This suggests that _SC_STREAM_MAX might be a better value to use.  On one
of my NetBSD boxes I have the following:

     _SC_OPEN_MAX:   64
     _SC_STREAM_MAX: 20

In any case, if this really follows the POSIX standard, perhaps PostgreSQL
code should assume these semantics and work around other cases that don't
follow the standard (instead of work around the POSIX cases).

Cheers,
Brook
Brook Milligan <brook@biology.nmsu.edu> writes:
> In any case, if this really follows the POSIX standard, perhaps
> PostgreSQL code should assume these semantics and work around other
> cases that don't follow the standard (instead of work around the POSIX
> cases).

HP asserts that *they* follow the POSIX standard, and in this case I'm
more inclined to believe them than the *BSD camp.  A per-process limit on
open files has existed in most Unices I've heard of; I had never heard of
a per-userid limit until yesterday.  (And I'm not yet convinced that
that's actually what *BSD implements; are we sure it's not just a typo in
the man page?)

64 or so for _SC_OPEN_MAX is not really what I'm worried about anyway.
IIRC, we've heard reports that some platforms return values in the
thousands, ie, essentially telling each process it can have the whole
kernel FD table, and it's that behavior that I'm speculating is causing
Marc's problem.

Marc, could you check what is returned by sysconf(_SC_OPEN_MAX) on your
box?  And/or check to see how many files each backend is actually
holding open?

			regards, tom lane
Re: Too many open files (was Re: spinlock problems reported earlier)
From: The Hermit Hacker
On Mon, 28 Aug 2000, Tom Lane wrote:

> Brook Milligan <brook@biology.nmsu.edu> writes:
> > In any case, if this really follows the POSIX standard, perhaps
> > PostgreSQL code should assume these semantics and work around other
> > cases that don't follow the standard (instead of work around the POSIX
> > cases).
>
> HP asserts that *they* follow the POSIX standard, and in this case I'm
> more inclined to believe them than the *BSD camp.  A per-process limit on
> open files has existed in most Unices I've heard of; I had never heard of
> a per-userid limit until yesterday.  (And I'm not yet convinced that
> that's actually what *BSD implements; are we sure it's not just a typo in
> the man page?)
>
> 64 or so for _SC_OPEN_MAX is not really what I'm worried about anyway.
> IIRC, we've heard reports that some platforms return values in the
> thousands, ie, essentially telling each process it can have the whole
> kernel FD table, and it's that behavior that I'm speculating is causing
> Marc's problem.
>
> Marc, could you check what is returned by sysconf(_SC_OPEN_MAX) on your
> box?  And/or check to see how many files each backend is actually
> holding open?

> ./t
4136
> sysctl kern.maxfiles
kern.maxfiles: 32768
> cat t.c
#include <stdio.h>
#include <unistd.h>

main()
{
	printf("%ld\n", sysconf(_SC_OPEN_MAX));
}

okay, slightly difficult since they come and go, but using the database
that is used for the search engine, with just a psql session:

pgsql# lsof -p 85333
COMMAND    PID  USER  FD  TYPE     DEVICE   SIZE/OFF   NODE NAME
postgres 85333 pgsql cwd  VDIR  13,131088       3072   7936 /pgsql/data2/udmsearch
postgres 85333 pgsql rtd  VDIR  13,131072        512      2 /
postgres 85333 pgsql txt  VREG  13,131084    4651486 103175 /pgsql/bin/postgres
postgres 85333 pgsql txt  VREG  13,131076      77648 212924 /usr/libexec/ld-elf.so.1
postgres 85333 pgsql txt  VREG  13,131076      11860  56504 /usr/lib/libdescrypt.so.2
postgres 85333 pgsql txt  VREG  13,131076     120736  56525 /usr/lib/libm.so.2
postgres 85333 pgsql txt  VREG  13,131076      34336  56677 /usr/lib/libutil.so.3
postgres 85333 pgsql txt  VREG  13,131076     154128  57068 /usr/lib/libreadline.so.4
postgres 85333 pgsql txt  VREG  13,131076     270100  56532 /usr/lib/libncurses.so.5
postgres 85333 pgsql txt  VREG  13,131076     570064  56679 /usr/lib/libc.so.4
postgres 85333 pgsql   0r VCHR        2,2        0t0   7967 /dev/null
postgres 85333 pgsql   1w VREG  13,131084        995 762037 /pgsql/logs/postmaster.5432.61308
postgres 85333 pgsql   2w VREG  13,131084  316488878 762038 /pgsql/logs/5432.61308
postgres 85333 pgsql   3r VREG  13,131088       1752   8011 /pgsql/data2/udmsearch/pg_internal.init
postgres 85333 pgsql   4u VREG  13,131084   22757376  15922 /pgsql/data/pg_log
postgres 85333 pgsql   5u unix 0xd46a3300        0t0        ->0xd469a540
postgres 85333 pgsql   6u VREG  13,131084       8192  15874 /pgsql/data/pg_variable
postgres 85333 pgsql   7u VREG  13,131088      16384   7982 /pgsql/data2/udmsearch/pg_class
postgres 85333 pgsql   8u VREG  13,131088      32768   7980 /pgsql/data2/udmsearch/pg_class_relname_index
postgres 85333 pgsql   9u VREG  13,131088      81920   7985 /pgsql/data2/udmsearch/pg_attribute
postgres 85333 pgsql  10u VREG  13,131088      65536   7983 /pgsql/data2/udmsearch/pg_attribute_relid_attnum_index
postgres 85333 pgsql  11u VREG  13,131088       8192   7945 /pgsql/data2/udmsearch/pg_trigger
postgres 85333 pgsql  12u VREG  13,131088       8192   7993 /pgsql/data2/udmsearch/pg_am
postgres 85333 pgsql  13u VREG  13,131088      16384   7977 /pgsql/data2/udmsearch/pg_index
postgres 85333 pgsql  14u VREG  13,131088       8192   7988 /pgsql/data2/udmsearch/pg_amproc
postgres 85333 pgsql  15u VREG  13,131088      16384   7991 /pgsql/data2/udmsearch/pg_amop
postgres 85333 pgsql  16u VREG  13,131088      73728   7961 /pgsql/data2/udmsearch/pg_operator
postgres 85333 pgsql  17u VREG  13,131088      16384   7976 /pgsql/data2/udmsearch/pg_index_indexrelid_index
postgres 85333 pgsql  18u VREG  13,131088      32768   7960 /pgsql/data2/udmsearch/pg_operator_oid_index
postgres 85333 pgsql  19u VREG  13,131088      16384   7976 /pgsql/data2/udmsearch/pg_index_indexrelid_index
postgres 85333 pgsql  20u VREG  13,131088      16384   7942 /pgsql/data2/udmsearch/pg_trigger_tgrelid_index
postgres 85333 pgsql  21u VREG  13,131084       8192  15921 /pgsql/data/pg_shadow
postgres 85333 pgsql  22u VREG  13,131084       8192  15918 /pgsql/data/pg_database
postgres 85333 pgsql  23u VREG  13,131088       8192   7952 /pgsql/data2/udmsearch/pg_rewrite
postgres 85333 pgsql  24u VREG  13,131088      16384   7941 /pgsql/data2/udmsearch/pg_type
postgres 85333 pgsql  25u VREG  13,131088      16384   7940 /pgsql/data2/udmsearch/pg_type_oid_index
postgres 85333 pgsql  26u VREG  13,131088          0   7938 /pgsql/data2/udmsearch/pg_user
postgres 85333 pgsql  27u VREG  13,131088     188416   7984 /pgsql/data2/udmsearch/pg_attribute_relid_attnam_index
postgres 85333 pgsql  28u VREG  13,131088      65536   7959 /pgsql/data2/udmsearch/pg_operator_oprname_l_r_k_index
postgres 85333 pgsql  29u VREG  13,131088      16384   7981 /pgsql/data2/udmsearch/pg_class_oid_index
postgres 85333 pgsql  30u VREG  13,131088      40960   7948 /pgsql/data2/udmsearch/pg_statistic
postgres 85333 pgsql  31u VREG  13,131088      32768   7947 /pgsql/data2/udmsearch/pg_statistic_relid_att_index
postgres 85333 pgsql  32u VREG  13,131088     212992   7958 /pgsql/data2/udmsearch/pg_proc
postgres 85333 pgsql  33u VREG  13,131088      49152   7957 /pgsql/data2/udmsearch/pg_proc_oid_index

when running a vacuum on the database, the only changes appear to be
adding (and removing when done) those tables that are currently being
vacuumed ... so, it appears, ~48 or so files opened ...
The Hermit Hacker <scrappy@hub.org> writes:
>> cat t.c
> #include <stdio.h>
> #include <unistd.h>

> main()
> {
> 	printf("%ld\n", sysconf(_SC_OPEN_MAX));
> }

>> ./t
> 4136

Yup, there's our problem.  Each backend will feel entitled to open up to
about 4100 files, assuming it manages to hit that many distinct tables/
indexes during its run.  You probably haven't got that many, but even
several hundred files times a couple dozen backends would start pushing
your (previous) kernel FD limit.

So, at least on FreeBSD, we can't trust sysconf(_SC_OPEN_MAX) to tell us
the number we need.

An explicit parameter to the postmaster, setting the installation-wide
open file count (with default maybe about 50 * MaxBackends) is starting
to look like a good answer to me.  Comments?

			regards, tom lane
Re: Too many open files (was Re: spinlock problems reported earlier)
From: The Hermit Hacker
On Mon, 28 Aug 2000, Tom Lane wrote:

> The Hermit Hacker <scrappy@hub.org> writes:
> >> cat t.c
> > #include <stdio.h>
> > #include <unistd.h>
> >
> > main()
> > {
> > 	printf("%ld\n", sysconf(_SC_OPEN_MAX));
> > }
> >
> >> ./t
> > 4136
>
> Yup, there's our problem.  Each backend will feel entitled to open up to
> about 4100 files, assuming it manages to hit that many distinct tables/
> indexes during its run.  You probably haven't got that many, but even
> several hundred files times a couple dozen backends would start pushing
> your (previous) kernel FD limit.
>
> So, at least on FreeBSD, we can't trust sysconf(_SC_OPEN_MAX) to tell us
> the number we need.
>
> An explicit parameter to the postmaster, setting the installation-wide
> open file count (with default maybe about 50 * MaxBackends) is starting
> to look like a good answer to me.  Comments?

Okay, if I understand correctly, this would just result in more I/O as far
as having to close off "unused files" once that 50 limit is reached?

Would it be installation-wide, or per-process?  Ie. if I have 100 as
maxbackends, and set it to 1000, could one backend suck up all 1000, or
would each max out at 10?  (note. I'm running with 192 backends right now,
and have actually pushed it to run 188 simultaneously *grin*) ...
The Hermit Hacker <scrappy@hub.org> writes:
>> An explicit parameter to the postmaster, setting the installation-wide
>> open file count (with default maybe about 50 * MaxBackends) is starting
>> to look like a good answer to me.  Comments?

> Okay, if I understand correctly, this would just result in more I/O as far
> as having to close off "unused files" once that 50 limit is reached?

Right, the cost is extra close() and open() kernel calls to release FDs
temporarily.

> Would it be installation-wide, or per-process?  Ie. if I have 100 as
> maxbackends, and set it to 1000, could one backend suck up all 1000, or
> would each max out at 10?

The only straightforward implementation is to take the parameter, divide
by MaxBackends, and allow each backend to have no more than that many
files open.  Any sort of dynamic allocation would require inter-backend
communication, which is probably more trouble than it's worth to avoid a
few kernel calls.

> (note. I'm running with 192 backends right now,
> and have actually pushed it to run 188 simultaneously *grin*) ...

Lessee, 8192 FDs / 192 backends = 42 per backend.  No wonder you were
running out.

			regards, tom lane
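In concrete terms, the division Tom describes amounts to something like
the sketch below.  The function name and the floor of 10 are illustrative
assumptions, not the actual implementation.

/* Hypothetical helper: split an installation-wide FD budget evenly. */
static int
files_per_backend(int installation_fd_limit, int max_backends)
{
	int		per_backend = installation_fd_limit / max_backends;

	/* each backend still needs a handful of FDs to get anything done */
	if (per_backend < 10)
		per_backend = 10;
	return per_backend;
}

Plugging in Marc's numbers, files_per_backend(8192, 192) comes out to 42,
which is the arithmetic Tom does above.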
Re: Too many open files (was Re: spinlock problems reported earlier)
From: The Hermit Hacker
On Mon, 28 Aug 2000, Tom Lane wrote:

> The only straightforward implementation is to take the parameter,
> divide by MaxBackends, and allow each backend to have no more than
> that many files open.  Any sort of dynamic allocation would require
> inter-backend communication, which is probably more trouble than it's
> worth to avoid a few kernel calls.

Agreed, just wanted to make sure ... sounds great to me ...

> > (note. I'm running with 192 backends right now,
> > and have actually pushed it to run 188 simultaneously *grin*) ...
>
> Lessee, 8192 FDs / 192 backends = 42 per backend.  No wonder you were
> running out.

*grin*  I up'd it to 32k ... so far it's max'd out at around 7175 used ...
Department of Things that Fell Through the Cracks:

Back in August we had concluded that it is a bad idea to trust
"sysconf(_SC_OPEN_MAX)" as an indicator of how many files each backend
can safely open.  FreeBSD was reported to return 4136, and I have since
noticed that LinuxPPC returns 1024.  Both of those are unreasonably large
fractions of the actual kernel file table size.  A few dozen backends
opening hundreds of files apiece will fill the kernel file table on most
Unix platforms.

I'm not sure why this didn't get dealt with, but I think it's a "must
fix" kind of problem for 7.1.  The dbadmin has *got* to be able to limit
Postgres' appetite for open file descriptors.

I propose we add a new configuration parameter, MAX_FILES_PER_PROCESS,
with a default value of about 100.  A new backend would set its max-files
setting to the smaller of this parameter or sysconf(_SC_OPEN_MAX).

An alternative approach would be to make the parameter be total open
files across the whole installation, and divide it by MaxBackends to
arrive at the per-backend limit.  However, it'd be much harder to pick a
reasonable default value if we did it that way.

Comments?

			regards, tom lane
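Roughly, the proposed per-backend computation is the smaller of the new
parameter and whatever sysconf() reports.  The names below
(compute_backend_max_files and the lowercased variable) are made up for
illustration; this is a sketch of the proposal, not the code that
eventually shipped.

#include <unistd.h>

static int	max_files_per_process = 100;	/* proposed new configuration parameter */

static int
compute_backend_max_files(void)
{
	long		sc = sysconf(_SC_OPEN_MAX);
	long		limit = max_files_per_process;

	/* believe sysconf() only when it is more conservative than our own cap */
	if (sc > 0 && sc < limit)
		limit = sc;
	return (int) limit;
}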
Tom Lane wrote:
> Department of Things that Fell Through the Cracks:
>
> Back in August we had concluded that it is a bad idea to trust
> "sysconf(_SC_OPEN_MAX)" as an indicator of how many files each backend
> can safely open.  FreeBSD was reported to return 4136, and I have
> since noticed that LinuxPPC returns 1024.  Both of those are
> unreasonably large fractions of the actual kernel file table size.
> A few dozen backends opening hundreds of files apiece will fill the
> kernel file table on most Unix platforms.
>
> I'm not sure why this didn't get dealt with, but I think it's a "must
> fix" kind of problem for 7.1.  The dbadmin has *got* to be able to
> limit Postgres' appetite for open file descriptors.
>
> I propose we add a new configuration parameter, MAX_FILES_PER_PROCESS,
> with a default value of about 100.  A new backend would set its
> max-files setting to the smaller of this parameter or
> sysconf(_SC_OPEN_MAX).
>
> An alternative approach would be to make the parameter be total open files
> across the whole installation, and divide it by MaxBackends to arrive at
> the per-backend limit.  However, it'd be much harder to pick a reasonable
> default value if we did it that way.
>
> Comments?

On Linux, at least, the 1024 file limit is a per-process limit; the
system-wide limit defaults to 4096 and can be easily changed by

	echo 16384 > /proc/sys/fs/file-max

(16384 is arbitrary and can be much larger.)

I am all for having the ability to tune behavior over the system-reported
values, but I think it should be an option which defaults to the previous
behavior.

--
http://www.mohawksoft.com
Re: Too many open files (was Re: spinlock problems reported earlier)
From: Peter Eisentraut
Tom Lane writes:

> I'm not sure why this didn't get dealt with, but I think it's a "must
> fix" kind of problem for 7.1.  The dbadmin has *got* to be able to
> limit Postgres' appetite for open file descriptors.

Use ulimit.

> I propose we add a new configuration parameter, MAX_FILES_PER_PROCESS,
> with a default value of about 100.  A new backend would set its
> max-files setting to the smaller of this parameter or
> sysconf(_SC_OPEN_MAX).

I think this is an unreasonable interference with the customary operating
system interfaces (e.g., ulimit).  The last thing I want to hear is
"Postgres is slow and it only opens 100 files per process even though I
<did something> to allow 32 million."

--
Peter Eisentraut      peter_e@gmx.net       http://yi.org/peter-e/
Peter Eisentraut <peter_e@gmx.net> writes:
> Tom Lane writes:
>> I'm not sure why this didn't get dealt with, but I think it's a "must
>> fix" kind of problem for 7.1.  The dbadmin has *got* to be able to
>> limit Postgres' appetite for open file descriptors.

> Use ulimit.

Even if ulimit exists and is able to control that parameter on a given
platform (highly unportable assumptions), it's not really a workable
answer.  fd.c has to stop short of using up all of the actual nfile
limit, or else stuff like the dynamic loader is likely to fail.

> I think this is an unreasonable interference with the customary operating
> system interfaces (e.g., ulimit).  The last thing I want to hear is
> "Postgres is slow and it only opens 100 files per process even though I
> <did something> to allow 32 million."

(1) A dbadmin who hasn't read the run-time configuration doc page (that
you did such a nice job with) is going to have lots of performance issues
besides this one.

(2) The last thing *I* want to hear is stories of a default Postgres
installation causing system-wide instability.  But if we don't insert an
open-files limit that's tighter than the "customary operating system
limit", that's exactly the situation we have, at least on several popular
platforms.

			regards, tom lane
Re: Too many open files (was Re: spinlock problems reported earlier)
From: Peter Eisentraut
Maybe a setting that controls the total number of files that postmaster
plus backends can allocate among them would be useful.  If you have a per
backend setting then that sort of assumes lots of clients with relatively
little usage.  Which is probably true in many cases, but not in all.

--
Peter Eisentraut      peter_e@gmx.net       http://yi.org/peter-e/
Peter Eisentraut <peter_e@gmx.net> writes:
> Maybe a setting that controls the total number of files that postmaster
> plus backends can allocate among them would be useful.

That'd be nice if we could do it, but I don't see any inexpensive way to
get one backend to release an open FD when another one needs one.  So,
divvying up the limit on an N-per-backend basis seems like the most
workable approach.

			regards, tom lane
> (1) A dbadmin who hasn't read the run-time configuration doc page (that
> you did such a nice job with) is going to have lots of performance
> issues besides this one.
>
> (2) The last thing *I* want to hear is stories of a default Postgres
> installation causing system-wide instability.  But if we don't insert
> an open-files limit that's tighter than the "customary operating system
> limit", that's exactly the situation we have, at least on several
> popular platforms.

IMHO, let's remember we keep a cache of file descriptors open for
performance.  How many files do we really need open in the cache?  I
can't imagine any performance reason to have hundreds of open file
descriptors cached.  A file open is not that big a deal.

Just because the OS says we can open 1000 files doesn't mean we should
open them just to keep a nice cache.  We are keeping them open just for
performance reasons, not because we actually need them to get work done.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
Re: Too many open files (was Re: spinlock problems reported earlier)
From: Alfred Perlstein
* Tom Lane <tgl@sss.pgh.pa.us> [001223 14:16] wrote:
> Department of Things that Fell Through the Cracks:
>
> Back in August we had concluded that it is a bad idea to trust
> "sysconf(_SC_OPEN_MAX)" as an indicator of how many files each backend
> can safely open.  FreeBSD was reported to return 4136, and I have
> since noticed that LinuxPPC returns 1024.  Both of those are
> unreasonably large fractions of the actual kernel file table size.
> A few dozen backends opening hundreds of files apiece will fill the
> kernel file table on most Unix platforms.

getdtablesize(2) on BSD should tell you the per-process limit.  sysconf
on FreeBSD shouldn't lie to you.  getdtablesize should take into account
limits in place.

Later versions of FreeBSD have a sysctl 'kern.openfiles' which can be
checked to see if the system is approaching the systemwide limit.

--
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."
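For anyone who wants to check what Alfred describes, a small FreeBSD-only
test along these lines should work.  It assumes the kern.openfiles sysctl
is present, which per his note is only true on later FreeBSD versions.

#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/sysctl.h>

int
main(void)
{
	int		openfiles;
	size_t	len = sizeof(openfiles);

	/* per-process descriptor limit, taking resource limits into account */
	printf("getdtablesize() = %d\n", getdtablesize());

	/* descriptors currently open system-wide (later FreeBSD only) */
	if (sysctlbyname("kern.openfiles", &openfiles, &len, NULL, 0) == 0)
		printf("kern.openfiles  = %d\n", openfiles);
	else
		perror("sysctlbyname(kern.openfiles)");

	return 0;
}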
> Department of Things that Fell Through the Cracks:
>
> Back in August we had concluded that it is a bad idea to trust
> "sysconf(_SC_OPEN_MAX)" as an indicator of how many files each backend
> can safely open.  FreeBSD was reported to return 4136, and I have
> since noticed that LinuxPPC returns 1024.  Both of those are
> unreasonably large fractions of the actual kernel file table size.
> A few dozen backends opening hundreds of files apiece will fill the
> kernel file table on most Unix platforms.
>
> I'm not sure why this didn't get dealt with, but I think it's a "must
> fix" kind of problem for 7.1.  The dbadmin has *got* to be able to
> limit Postgres' appetite for open file descriptors.
>
> I propose we add a new configuration parameter, MAX_FILES_PER_PROCESS,
> with a default value of about 100.  A new backend would set its
> max-files setting to the smaller of this parameter or
> sysconf(_SC_OPEN_MAX).

Seems a nice idea.  We have heard lots of problem reports caused by
running out of the file table.

However, it would be even nicer if it could be configurable at runtime
(at postmaster startup time), like the -N option.  Maybe
MAX_FILES_PER_PROCESS can be a hard limit?

--
Tatsuo Ishii
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
>> I propose we add a new configuration parameter, MAX_FILES_PER_PROCESS,
>> with a default value of about 100.  A new backend would set its
>> max-files setting to the smaller of this parameter or
>> sysconf(_SC_OPEN_MAX).

> Seems a nice idea.  We have heard lots of problem reports caused by
> running out of the file table.

> However, it would be even nicer if it could be configurable at runtime
> (at postmaster startup time), like the -N option.

Yes, what I meant was a GUC parameter named MAX_FILES_PER_PROCESS.
You could set it via postmaster.opts or a postmaster command-line switch.

			regards, tom lane