[7.0.2] spinlock problems reported earlier ...

From: The Hermit Hacker
Earlier this week, I reported getting core dumps with the following bt:

(gdb) where
#0  0x18271d90 in kill () from /usr/lib/libc.so.4
#1  0x182b2e09 in abort () from /usr/lib/libc.so.4
#2  0x80ee847 in s_lock_stuck (lock=0x20048065 "\001", file=0x816723c "spin.c", line=127) at s_lock.c:51
#3  0x80ee8c3 in s_lock (lock=0x20048065 "\001", file=0x816723c "spin.c", line=127) at s_lock.c:80
#4  0x80f1580 in SpinAcquire (lockid=7) at spin.c:127
#5  0x80f3903 in LockRelease (lockmethod=1, locktag=0xbfbfe968, lockmode=1) at lock.c:1044

I've been monitoring 'open files' on that machine, and after raising them
to 8192, saw it hit "Open Files Peak: 8179" this morning and once more
have a dead database ...

Tom, you stated "That sure looks like you'd better tweak your kernel
settings ... but offhand I don't see how it could lead to "stuck spinlock"
errors.", so I'm wondering if maybe there is a bug, in that it should be
handling running out of FDs better?

I just raised mine to 32k so that it *hopefully* never happens again, if I
hit *that* many open files I'll be surprised ...



Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org 
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 




Too many open files (was Re: spinlock problems reported earlier)

From: Tom Lane
The Hermit Hacker <scrappy@hub.org> writes:
> I've been monitoring 'open files' on that machine, and after raising them
> to 8192, saw it hit "Open Files Peak: 8179" this morning and once more
> have a dead database ...

> Tom, you stated "That sure looks like you'd better tweak your kernel
> settings ... but offhand I don't see how it could lead to "stuck spinlock"
> errors.", so I'm wondering if maybe there is a bug, in that it should be
> handling running out of FDs better?

Ah-hah, now that I get to see the log file before it vanished, I have
a theory about how no FDs leads to stuck spinlock.  The postmaster's own
log has

postmaster: StreamConnection: accept: Too many open files in system
postmaster: StreamConnection: accept: Too many open files in system
FATAL 1:  ReleaseLruFile: No open files available to be closed

FATAL: s_lock(20048065) at spin.c:127, stuck spinlock. Aborting.

FATAL: s_lock(20048065) at spin.c:127, stuck spinlock. Aborting.

(more of same)

while the backend log has a bunch of

IpcSemaphoreLock: semop failed (Identifier removed) id=524288
IpcSemaphoreLock: semop failed (Identifier removed) id=524288
IpcSemaphoreLock: semop failed (Identifier removed) id=524288
IpcSemaphoreLock: semop failed (Identifier removed) id=524288

*followed by* the spinlock gripes.

Here's my theory:

1. Postmaster gets a connection, tries to read pg_hba.conf, which it
does via AllocateFile().  On EMFILE failure that calls ReleaseLruFile,
which elog()'s because in the postmaster environment there are not
going to be any open virtual FDs to close.

2. elog() inside the postmaster causes the postmaster to shut down.
Which it does faithfully, including cleaning up after itself, which
includes removing the semaphores it owns.

3. Backends start falling over with semaphore-operation failures.
This is treated as a system-restart event (backend does proc_exit(255))
but there's no postmaster to kill the other backends and start a new
cycle of life.

4. At least one dying backend leaves the lock manager's spinlock locked
(which it should not), so by and by we start to see stuck-spinlock
gripes from backends that haven't yet tried to do a semop.  But that's
pretty far down the cause-and-effect chain.

It looks to me like we have several things we want to do here.

1. ReleaseLruFile() should not immediately elog() but should return
a failure code instead, allowing AllocateFile() to return NULL, which
the postmaster can handle more gracefully than it does an elog().
(See the sketch below, after point 3.)

2. ProcReleaseSpins() ought to be done by proc_exit().  Someone was lazy
and hard-coded it into elog() instead.

3. I think the real problem here is that the backends are opening too
damn many files.  IIRC, FreeBSD is one of the platforms where
sysconf(_SC_OPEN_MAX) will return a large number, which means that fd.c
will have no useful limit on the number of open files it eats up.
Increasing your kernel NFILES setting will just allow Postgres to eat
up more FDs, and eventually (if you allow enough backends to run)
you'll be up against it again.  Even if we manage to make Postgres
itself fairly bulletproof against EMFILE failures, much of the rest
of your system will be kayoed when PG is eating up every available
kernel FD, so that is not the path to true happiness.
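
Back to point 1, roughly what I have in mind (just a sketch; the real
fd.c signatures and bookkeeping may differ):

#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

static int nfile = 0;            /* number of open virtual FDs (assumed) */

static bool
ReleaseLruFile(void)
{
    if (nfile <= 0)
        return false;            /* nothing to close; let the caller cope */
    /* ... close the least-recently-used virtual FD, nfile--, ... */
    return true;
}

FILE *
AllocateFile(const char *name, const char *mode)
{
    FILE       *file;

    while ((file = fopen(name, mode)) == NULL)
    {
        if (errno != EMFILE && errno != ENFILE)
            break;               /* genuine failure, not FD exhaustion */
        if (!ReleaseLruFile())
            break;               /* no FDs to free: hand back NULL instead
                                  * of elog(), so the postmaster can just
                                  * refuse the connection and keep going */
    }
    return file;
}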

(You might care to use lsof or some such to see just how many open
files you have per backend.  I bet it's a lot.)

Hmm, this is interesting: on HPUX, man sysconf(2) says that
sysconf(_SC_OPEN_MAX) returns the max number of open files per process
--- which is what fd.c assumes it means.  But I see that on your FreeBSD
box, the sysconf man page defines it as
    _SC_OPEN_MAX            The maximum number of open files per user id.

which suggests that *on that platform* we need to divide by MAXBACKENDS.
Does anyone know of a more portable way to determine the appropriate
number of open files per backend?

Otherwise, we'll have to put some kind of a-priori sanity check on
what we will believe from sysconf().  I don't much care for the idea of
putting a hard-wired limit on max files per backend, but that might be
the quick-and-dirty answer.

Another possibility is to add a postmaster parameter "max open files
for whole installation", which we'd then divide by MAXBACKENDS to
determine max files per backend, rather than trying to discover a
safe value on-the-fly.

In any case, I think we want something quick and dirty for a 7.0.*
back-patch.  Maybe just limiting what we believe from sysconf() to
100 or so would be OK for a patch.
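
I.e. something as simple as (sketch; the helper name is invented):

#include <unistd.h>

#define SYSCONF_SANITY_LIMIT 100     /* don't believe more than this */

static int
max_files_per_backend(void)
{
    long        sys_max = sysconf(_SC_OPEN_MAX);

    if (sys_max < 0 || sys_max > SYSCONF_SANITY_LIMIT)
        sys_max = SYSCONF_SANITY_LIMIT;  /* failed, or implausibly large */
    return (int) sys_max;
}
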
        regards, tom lane


Re: Too many open files (was Re: spinlock problems reported earlier)

From: The Hermit Hacker
On Sun, 27 Aug 2000, Tom Lane wrote:

> Hmm, this is interesting: on HPUX, man sysconf(2) says that
> sysconf(_SC_OPEN_MAX) returns the max number of open files per process
> --- which is what fd.c assumes it means.  But I see that on your FreeBSD
> box, the sysconf man page defines it as
> 
>      _SC_OPEN_MAX
>              The maximum number of open files per user id.
> 
> which suggests that *on that platform* we need to divide by MAXBACKENDS.
> Does anyone know of a more portable way to determine the appropriate
> number of open files per backend?

Okay, I just checked out Solaris 8/x86, and it confirms what HP/ux thinks:
    _SC_OPEN_MAX            OPEN_MAX                   Max open files per
                                                       process

I'm curious as to whether FreeBSD is the only one that doesn't follow this
"convention"?   I'm CCing the FreeBSD Hackers mailing list to see if
someone there might be able to shed some light on this ... my first
thought, personally, would be to throw in some sort of:

#ifdef __FreeBSD__
    max_files_per_backend = sysconf(_SC_OPEN_MAX) / num_of_backends;
#else
    max_files_per_backend = sysconf(_SC_OPEN_MAX);
#endif




Re: Too many open files (was Re: spinlock problems reported earlier)

From: Tom Lane
The Hermit Hacker <scrappy@hub.org> writes:
> Okay, I just checked out Solaris 8/x86, and it confirms what HP/ux thinks:
>      _SC_OPEN_MAX            OPEN_MAX                   Max open files per
>                                                         process
> I'm curious as to whether FreeBSD is the only one that doesn't follow this
> "convention"?

I've also confirmed that SunOS 4.1.4 (about as old-line BSD as it gets
these days) says _SC_OPEN_MAX is max per process.  Furthermore,
I notice that FreeBSD's description of sysctl(3) refers to a
max-files-per-process kernel parameter, but no max-files-per-userid
parameter.  Perhaps the entry in the FreeBSD sysconf(2) man page is
merely a typo?

If so, I still consider that FreeBSD returns an unreasonably large
fraction of the kernel FD table size as the number of files one
process is allowed to open.
        regards, tom lane


Re: Re: Too many open files (was Re: spinlock problems reported earlier)

From: Brook Milligan
The Hermit Hacker <scrappy@hub.org> writes:
> Okay, I just checked out Solaris 8/x86, and it confirms what HP/ux thinks:
>      _SC_OPEN_MAX            OPEN_MAX                   Max open files per
>                                                         process
> I'm curious as to whether FreeBSD is the only one that doesn't follow this
> "convention"?

From part of the NetBSD manpage for sysconf(3):

DESCRIPTION
     This interface is defined by IEEE Std 1003.1-1988 (``POSIX'').  A far
     more complete interface is available using sysctl(3).

     _SC_OPEN_MAX            The maximum number of open files per user id.

     _SC_STREAM_MAX          The minimum maximum number of streams that a
                             process may have open at any one time.

BUGS
     The value for _SC_STREAM_MAX is a minimum maximum, and required to be
     the same as ANSI C's FOPEN_MAX, so the returned value is a ridiculously
     small and misleading number.

STANDARDS
     The sysconf() function conforms to IEEE Std 1003.1-1990 (``POSIX'').

HISTORY
     The sysconf function first appeared in 4.4BSD.

This suggests that _SC_STREAM_MAX might be a better value to use.  On
one of my NetBSD boxes I have the following:

_SC_OPEN_MAX:  64
_SC_STREAM_MAX:  20

In any case, if this really follows the POSIX standard, perhaps
PostgreSQL code should assume these semantics and work around other
cases that don't follow the standard (instead of working around the POSIX
cases).

Cheers,
Brook


Re: Re: Too many open files (was Re: spinlock problems reported earlier)

From: Tom Lane
Brook Milligan <brook@biology.nmsu.edu> writes:
> In any case, if this really follows the POSIX standard, perhaps
> PostgreSQL code should assume these semantics and work around other
> cases that don't follow the standard (instead of working around the POSIX
> cases).

HP asserts that *they* follow the POSIX standard, and in this case
I'm more inclined to believe them than the *BSD camp.  A per-process
limit on open files has existed in most Unices I've heard of; I had
never heard of a per-userid limit until yesterday.  (And I'm not yet
convinced that that's actually what *BSD implements; are we sure it's
not just a typo in the man page?)

64 or so for _SC_OPEN_MAX is not really what I'm worried about anyway.
IIRC, we've heard reports that some platforms return values in the
thousands, ie, essentially telling each process it can have the whole
kernel FD table, and it's that behavior that I'm speculating is causing
Marc's problem.

Marc, could you check what is returned by sysconf(_SC_OPEN_MAX) on your
box?  And/or check to see how many files each backend is actually
holding open?
        regards, tom lane


Re: Re: Too many open files (was Re: spinlock problems reported earlier)

From: The Hermit Hacker
On Mon, 28 Aug 2000, Tom Lane wrote:

> Brook Milligan <brook@biology.nmsu.edu> writes:
> > In any case, if this really follows the POSIX standard, perhaps
> > PostgreSQL code should assume these semantics and work around other
> > cases that don't follow the standard (instead of working around the POSIX
> > cases).
> 
> HP asserts that *they* follow the POSIX standard, and in this case
> I'm more inclined to believe them than the *BSD camp.  A per-process
> limit on open files has existed in most Unices I've heard of; I had
> never heard of a per-userid limit until yesterday.  (And I'm not yet
> convinced that that's actually what *BSD implements; are we sure it's
> not just a typo in the man page?)
> 
> 64 or so for _SC_OPEN_MAX is not really what I'm worried about anyway.
> IIRC, we've heard reports that some platforms return values in the
> thousands, ie, essentially telling each process it can have the whole
> kernel FD table, and it's that behavior that I'm speculating is causing
> Marc's problem.
> 
> Marc, could you check what is returned by sysconf(_SC_OPEN_MAX) on your
> box?  And/or check to see how many files each backend is actually
> holding open?

> ./t
4136


> sysctl kern.maxfiles
kern.maxfiles: 32768


> cat t.c
#include <stdio.h>
#include <unistd.h>

main()
{
  printf("%ld\n", sysconf(_SC_OPEN_MAX));
}

okay, slightly difficult since they come and go, but using the database
that is used for the search engine, with just a psql session:

pgsql# lsof -p 85333
COMMAND    PID  USER   FD   TYPE     DEVICE  SIZE/OFF   NODE NAME
postgres 85333 pgsql  cwd   VDIR  13,131088      3072   7936 /pgsql/data2/udmsearch
postgres 85333 pgsql  rtd   VDIR  13,131072       512      2 /
postgres 85333 pgsql  txt   VREG  13,131084   4651486 103175 /pgsql/bin/postgres
postgres 85333 pgsql  txt   VREG  13,131076     77648 212924 /usr/libexec/ld-elf.so.1
postgres 85333 pgsql  txt   VREG  13,131076     11860  56504 /usr/lib/libdescrypt.so.2
postgres 85333 pgsql  txt   VREG  13,131076    120736  56525 /usr/lib/libm.so.2
postgres 85333 pgsql  txt   VREG  13,131076     34336  56677 /usr/lib/libutil.so.3
postgres 85333 pgsql  txt   VREG  13,131076    154128  57068 /usr/lib/libreadline.so.4
postgres 85333 pgsql  txt   VREG  13,131076    270100  56532 /usr/lib/libncurses.so.5
postgres 85333 pgsql  txt   VREG  13,131076    570064  56679 /usr/lib/libc.so.4
postgres 85333 pgsql    0r  VCHR        2,2       0t0   7967 /dev/null
postgres 85333 pgsql    1w  VREG  13,131084       995 762037 /pgsql/logs/postmaster.5432.61308
postgres 85333 pgsql    2w  VREG  13,131084 316488878 762038 /pgsql/logs/5432.61308
postgres 85333 pgsql    3r  VREG  13,131088      1752   8011 /pgsql/data2/udmsearch/pg_internal.init
postgres 85333 pgsql    4u  VREG  13,131084  22757376  15922 /pgsql/data/pg_log
postgres 85333 pgsql    5u  unix 0xd46a3300       0t0        ->0xd469a540
postgres 85333 pgsql    6u  VREG  13,131084      8192  15874 /pgsql/data/pg_variable
postgres 85333 pgsql    7u  VREG  13,131088     16384   7982 /pgsql/data2/udmsearch/pg_class
postgres 85333 pgsql    8u  VREG  13,131088     32768   7980 /pgsql/data2/udmsearch/pg_class_relname_index
postgres 85333 pgsql    9u  VREG  13,131088     81920   7985 /pgsql/data2/udmsearch/pg_attribute
postgres 85333 pgsql   10u  VREG  13,131088     65536   7983 /pgsql/data2/udmsearch/pg_attribute_relid_attnum_index
postgres 85333 pgsql   11u  VREG  13,131088      8192   7945 /pgsql/data2/udmsearch/pg_trigger
postgres 85333 pgsql   12u  VREG  13,131088      8192   7993 /pgsql/data2/udmsearch/pg_am
postgres 85333 pgsql   13u  VREG  13,131088     16384   7977 /pgsql/data2/udmsearch/pg_index
postgres 85333 pgsql   14u  VREG  13,131088      8192   7988 /pgsql/data2/udmsearch/pg_amproc
postgres 85333 pgsql   15u  VREG  13,131088     16384   7991 /pgsql/data2/udmsearch/pg_amop
postgres 85333 pgsql   16u  VREG  13,131088     73728   7961 /pgsql/data2/udmsearch/pg_operator
postgres 85333 pgsql   17u  VREG  13,131088     16384   7976 /pgsql/data2/udmsearch/pg_index_indexrelid_index
postgres 85333 pgsql   18u  VREG  13,131088     32768   7960 /pgsql/data2/udmsearch/pg_operator_oid_index
postgres 85333 pgsql   19u  VREG  13,131088     16384   7976 /pgsql/data2/udmsearch/pg_index_indexrelid_index
postgres 85333 pgsql   20u  VREG  13,131088     16384   7942 /pgsql/data2/udmsearch/pg_trigger_tgrelid_index
postgres 85333 pgsql   21u  VREG  13,131084      8192  15921 /pgsql/data/pg_shadow
postgres 85333 pgsql   22u  VREG  13,131084      8192  15918 /pgsql/data/pg_database
postgres 85333 pgsql   23u  VREG  13,131088      8192   7952 /pgsql/data2/udmsearch/pg_rewrite
postgres 85333 pgsql   24u  VREG  13,131088     16384   7941 /pgsql/data2/udmsearch/pg_type
postgres 85333 pgsql   25u  VREG  13,131088     16384   7940 /pgsql/data2/udmsearch/pg_type_oid_index
postgres 85333 pgsql   26u  VREG  13,131088         0   7938 /pgsql/data2/udmsearch/pg_user
postgres 85333 pgsql   27u  VREG  13,131088    188416   7984 /pgsql/data2/udmsearch/pg_attribute_relid_attnam_index
postgres 85333 pgsql   28u  VREG  13,131088     65536   7959 /pgsql/data2/udmsearch/pg_operator_oprname_l_r_k_index
postgres 85333 pgsql   29u  VREG  13,131088     16384   7981 /pgsql/data2/udmsearch/pg_class_oid_index
postgres 85333 pgsql   30u  VREG  13,131088     40960   7948 /pgsql/data2/udmsearch/pg_statistic
postgres 85333 pgsql   31u  VREG  13,131088     32768   7947 /pgsql/data2/udmsearch/pg_statistic_relid_att_index
postgres 85333 pgsql   32u  VREG  13,131088    212992   7958 /pgsql/data2/udmsearch/pg_proc
postgres 85333 pgsql   33u  VREG  13,131088     49152   7957 /pgsql/data2/udmsearch/pg_proc_oid_index


when running a vacuum on the database, the only changes appear to be
adding (and removing when done) the tables that are currently being
vacuumed ... so, it appears, roughly 48 files open ...




Re: Re: Too many open files (was Re: spinlock problems reported earlier)

From: Tom Lane
The Hermit Hacker <scrappy@hub.org> writes:
>> cat t.c
> #include <stdio.h>
> #include <unistd.h>

> main()
> {
>   printf("%ld\n", sysconf(_SC_OPEN_MAX));
> }

>> ./t
> 4136

Yup, there's our problem.  Each backend will feel entitled to open up to
about 4100 files, assuming it manages to hit that many distinct tables/
indexes during its run.  You probably haven't got that many, but even
several hundred files times a couple dozen backends would start pushing
your (previous) kernel FD limit.

So, at least on FreeBSD, we can't trust sysconf(_SC_OPEN_MAX) to tell us
the number we need.

An explicit parameter to the postmaster, setting the installation-wide
open file count (with default maybe about 50 * MaxBackends) is starting
to look like a good answer to me.  Comments?
        regards, tom lane


Re: Re: Too many open files (was Re: spinlock problems reported earlier)

From: The Hermit Hacker
On Mon, 28 Aug 2000, Tom Lane wrote:

> The Hermit Hacker <scrappy@hub.org> writes:
> >> cat t.c
> > #include <stdio.h>
> > #include <unistd.h>
> 
> > main()
> > {
> >   printf("%ld\n", sysconf(_SC_OPEN_MAX));
> > }
> 
> >> ./t
> > 4136
> 
> Yup, there's our problem.  Each backend will feel entitled to open up to
> about 4100 files, assuming it manages to hit that many distinct tables/
> indexes during its run.  You probably haven't got that many, but even
> several hundred files times a couple dozen backends would start pushing
> your (previous) kernel FD limit.
> 
> So, at least on FreeBSD, we can't trust sysconf(_SC_OPEN_MAX) to tell us
> the number we need.
> 
> An explicit parameter to the postmaster, setting the installation-wide
> open file count (with default maybe about 50 * MaxBackends) is starting
> to look like a good answer to me.  Comments?

Okay, if I understand correctly, this would just result in more I/O as far
as having to close off "unused files" once that 50 limit is reached?

Would it be installation-wide, or per-process?  Ie. if I have 100 as
maxbackends, and set it to 1000, could one backend suck up all 1000, or
would each max out at 10?  (note. I'm running with 192 backends right now,
and have actually pushed it to run 188 simultaneously *grin*) ...





Re: Re: Too many open files (was Re: spinlock problems reported earlier)

From: Tom Lane
The Hermit Hacker <scrappy@hub.org> writes:
>> An explicit parameter to the postmaster, setting the installation-wide
>> open file count (with default maybe about 50 * MaxBackends) is starting
>> to look like a good answer to me.  Comments?

> Okay, if I understand correctly, this would just result in more I/O as far
> as having to close off "unused files" once that 50 limit is reached?

Right, the cost is extra close() and open() kernel calls to release FDs
temporarily.

> Would it be installation-wide, or per-process?  Ie. if I have 100 as
> maxbackends, and set it to 1000, could one backend suck up all 1000, or
> would each max out at 10?

The only straightforward implementation is to take the parameter, divide
by MaxBackends, and allow each backend to have no more than that many
files open.  Any sort of dynamic allocation would require inter-backend
communication, which is probably more trouble than it's worth to avoid
a few kernel calls.
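
In other words (sketch; names invented):

static int
per_backend_file_limit(int total_open_files, int max_backends)
{
    int         limit = total_open_files / max_backends;

    /* e.g. your case: 1000 total / 100 backends = 10 per backend */
    return (limit > 0) ? limit : 1;
}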

> (note. I'm running with 192 backends right now,
> and have actually pushed it to run 188 simultaneously *grin*) ...

Lessee, 8192 FDs / 192 backends = 42 per backend.  No wonder you were
running out.
        regards, tom lane


Re: Re: Too many open files (was Re: spinlock problems reported earlier)

From: The Hermit Hacker
On Mon, 28 Aug 2000, Tom Lane wrote:

> The only straightforward implementation is to take the parameter,
> divide by MaxBackends, and allow each backend to have no more than
> that many files open.  Any sort of dynamic allocation would require
> inter-backend communication, which is probably more trouble than it's
> worth to avoid a few kernel calls.

agreed, just wanted to make sure ... sounds great to me ...

> > (note. I'm running with 192 backends right now,
> > and have actually pushed it to run 188 simultaneously *grin*) ...
> 
> Lessee, 8192 FDs / 192 backends = 42 per backend.  No wonder you were
> running out.

*grin*  I up'd it to 32k ... so far it's max'd out at around 7175 used ...




Re: Re: Too many open files (was Re: spinlock problems reported earlier)

From: Tom Lane
Department of Things that Fell Through the Cracks:

Back in August we had concluded that it is a bad idea to trust
"sysconf(_SC_OPEN_MAX)" as an indicator of how many files each backend
can safely open.  FreeBSD was reported to return 4136, and I have
since noticed that LinuxPPC returns 1024.  Both of those are
unreasonably large fractions of the actual kernel file table size.
A few dozen backends opening hundreds of files apiece will fill the
kernel file table on most Unix platforms.

I'm not sure why this didn't get dealt with, but I think it's a "must
fix" kind of problem for 7.1.  The dbadmin has *got* to be able to
limit Postgres' appetite for open file descriptors.

I propose we add a new configuration parameter, MAX_FILES_PER_PROCESS,
with a default value of about 100.  A new backend would set its
max-files setting to the smaller of this parameter or
sysconf(_SC_OPEN_MAX).
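
That is, roughly (sketch; the GUC plumbing is omitted):

#include <unistd.h>

static int  max_files_per_process = 100;    /* proposed default */

static int
backend_max_files(void)
{
    long        sys_max = sysconf(_SC_OPEN_MAX);
    int         max_files = max_files_per_process;

    /* take the smaller of the parameter and what sysconf() reports */
    if (sys_max >= 0 && sys_max < max_files)
        max_files = (int) sys_max;
    return max_files;
}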

An alternative approach would be to make the parameter be total open files
across the whole installation, and divide it by MaxBackends to arrive at
the per-backend limit.  However, it'd be much harder to pick a reasonable
default value if we did it that way.

Comments?
        regards, tom lane


Re: Re: Too many open files (was Re: spinlock problems reported earlier)

Tom Lane wrote:
> 
> Department of Things that Fell Through the Cracks:
> 
> Back in August we had concluded that it is a bad idea to trust
> "sysconf(_SC_OPEN_MAX)" as an indicator of how many files each backend
> can safely open.  FreeBSD was reported to return 4136, and I have
> since noticed that LinuxPPC returns 1024.  Both of those are
> unreasonably large fractions of the actual kernel file table size.
> A few dozen backends opening hundreds of files apiece will fill the
> kernel file table on most Unix platforms.
> 
> I'm not sure why this didn't get dealt with, but I think it's a "must
> fix" kind of problem for 7.1.  The dbadmin has *got* to be able to
> limit Postgres' appetite for open file descriptors.
> 
> I propose we add a new configuration parameter, MAX_FILES_PER_PROCESS,
> with a default value of about 100.  A new backend would set its
> max-files setting to the smaller of this parameter or
> sysconf(_SC_OPEN_MAX).
> 
> An alternative approach would be to make the parameter be total open files
> across the whole installation, and divide it by MaxBackends to arrive at
> the per-backend limit.  However, it'd be much harder to pick a reasonable
> default value if we did it that way.
> 
> Comments?

On Linux, at least, the 1024-file limit is a per-process limit; the
system-wide limit defaults to 4096 and can easily be changed by

echo 16384 > /proc/sys/fs/file-max

(16384 is arbitrary and can be much larger)

I am all for having the ability to tune behavior over the system
reported values, but I think it should be an option which defaults to
the previous behavior.

-- 
http://www.mohawksoft.com


Re: Re: Too many open files (was Re: spinlock problems reported earlier)

From: Peter Eisentraut
Tom Lane writes:

> I'm not sure why this didn't get dealt with, but I think it's a "must
> fix" kind of problem for 7.1.  The dbadmin has *got* to be able to
> limit Postgres' appetite for open file descriptors.

Use ulimit.

> I propose we add a new configuration parameter, MAX_FILES_PER_PROCESS,
> with a default value of about 100.  A new backend would set its
> max-files setting to the smaller of this parameter or
> sysconf(_SC_OPEN_MAX).

I think this is an unreasonable interference with the customary operating
system interfaces (e.g., ulimit).  The last thing I want to hear is
"Postgres is slow and it only opens 100 files per process even though I
<did something> to allow 32 million."

-- 
Peter Eisentraut      peter_e@gmx.net       http://yi.org/peter-e/



Re: Re: Too many open files (was Re: spinlock problems reported earlier)

From: Tom Lane
Peter Eisentraut <peter_e@gmx.net> writes:
> Tom Lane writes:
>> I'm not sure why this didn't get dealt with, but I think it's a "must
>> fix" kind of problem for 7.1.  The dbadmin has *got* to be able to
>> limit Postgres' appetite for open file descriptors.

> Use ulimit.

Even if ulimit exists and is able to control that parameter on a given
platform (highly unportable assumptions), it's not really a workable
answer.  fd.c has to stop short of using up all of the actual nfile
limit, or else stuff like the dynamic loader is likely to fail.
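
(Sketch of "stopping short"; the reserve count here is illustrative:)

#include <unistd.h>

#define RESERVED_FDS 10     /* slop for the dynamic loader, sockets, etc. */

static int
fds_safely_usable(void)
{
    long        limit = sysconf(_SC_OPEN_MAX);

    if (limit < 0)
        limit = 64;         /* assumed fallback when sysconf() fails */
    return (limit > RESERVED_FDS) ? (int) (limit - RESERVED_FDS) : 1;
}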

> I think this is an unreasonable interference with the customary operating
> system interfaces (e.g., ulimit).  The last thing I want to hear is
> "Postgres is slow and it only opens 100 files per process even though I
> <did something> to allow 32 million."

(1) A dbadmin who hasn't read the run-time configuration doc page (that
you did such a nice job with) is going to have lots of performance
issues besides this one.

(2) The last thing *I* want to hear is stories of a default Postgres
installation causing system-wide instability.  But if we don't insert
an open-files limit that's tighter than the "customary operating system
limit", that's exactly the situation we have, at least on several
popular platforms.
        regards, tom lane


Re: Re: Too many open files (was Re: spinlock problems reported earlier)

From: Peter Eisentraut
Maybe a setting that controls the total number of files that postmaster
plus backends can allocate among them would be useful.  If you have a per
backend setting then that sort of assumes lots of clients with relatively
little usage.  Which is probably true in many cases, but not in all.

-- 
Peter Eisentraut      peter_e@gmx.net       http://yi.org/peter-e/



Re: Re: Too many open files (was Re: spinlock problems reported earlier)

From: Tom Lane
Peter Eisentraut <peter_e@gmx.net> writes:
> Maybe a setting that controls the total number of files that postmaster
> plus backends can allocate among them would be useful.

That'd be nice if we could do it, but I don't see any inexpensive way
to get one backend to release an open FD when another one needs one.
So, divvying up the limit on an N-per-backend basis seems like the
most workable approach.
        regards, tom lane


Re: Re: Too many open files (was Re: spinlock problems reported earlier)

From: Bruce Momjian
> (1) A dbadmin who hasn't read the run-time configuration doc page (that
> you did such a nice job with) is going to have lots of performance
> issues besides this one.
> 
> (2) The last thing *I* want to hear is stories of a default Postgres
> installation causing system-wide instability.  But if we don't insert
> an open-files limit that's tighter than the "customary operating system
> limit", that's exactly the situation we have, at least on several
> popular platforms.

IMHO, let's remember we keep a cache of file descriptors open for
performance.  How many files do we really need open in the cache?  I can't
imagine any performance reason to have hundreds of open file descriptors
cached.  A file open is not that big a deal.

Just because the OS says we can open 1000 files doesn't mean we should
open them just to keep a nice cache.

We are keeping them open just for performance reasons, not because we
actually need them to get work done.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


Re: Re: Too many open files (was Re: spinlock problems reported earlier)

From: Alfred Perlstein
* Tom Lane <tgl@sss.pgh.pa.us> [001223 14:16] wrote:
> Department of Things that Fell Through the Cracks:
> 
> Back in August we had concluded that it is a bad idea to trust
> "sysconf(_SC_OPEN_MAX)" as an indicator of how many files each backend
> can safely open.  FreeBSD was reported to return 4136, and I have
> since noticed that LinuxPPC returns 1024.  Both of those are
> unreasonably large fractions of the actual kernel file table size.
> A few dozen backends opening hundreds of files apiece will fill the
> kernel file table on most Unix platforms.

getdtablesize(2) on BSD should tell you the per-process limit.
sysconf on FreeBSD shouldn't lie to you.

getdtablesize should take into account limits in place.

Later versions of FreeBSD have a sysctl 'kern.openfiles' which
can be checked to see if the system is approaching the system-wide
limit.
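
For instance (sketch):

#include <stdio.h>
#include <unistd.h>

int
main(void)
{
    /* getdtablesize(2) reflects the resource limits currently in
       place; compare it with what sysconf(_SC_OPEN_MAX) claims. */
    printf("getdtablesize():        %d\n", getdtablesize());
    printf("sysconf(_SC_OPEN_MAX):  %ld\n", sysconf(_SC_OPEN_MAX));
    return 0;
}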

-- 
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."


Re: Re: Too many open files (was Re: spinlock problems reported earlier)

From: Tatsuo Ishii
> Department of Things that Fell Through the Cracks:
> 
> Back in August we had concluded that it is a bad idea to trust
> "sysconf(_SC_OPEN_MAX)" as an indicator of how many files each backend
> can safely open.  FreeBSD was reported to return 4136, and I have
> since noticed that LinuxPPC returns 1024.  Both of those are
> unreasonably large fractions of the actual kernel file table size.
> A few dozen backends opening hundreds of files apiece will fill the
> kernel file table on most Unix platforms.
> 
> I'm not sure why this didn't get dealt with, but I think it's a "must
> fix" kind of problem for 7.1.  The dbadmin has *got* to be able to
> limit Postgres' appetite for open file descriptors.
> 
> I propose we add a new configuration parameter, MAX_FILES_PER_PROCESS,
> with a default value of about 100.  A new backend would set its
> max-files setting to the smaller of this parameter or
> sysconf(_SC_OPEN_MAX).

Seems like a nice idea.  We have heard lots of problem reports caused by
running out of the file table.

However, it would be even nicer if it could be configurable at runtime
(at postmaster startup time) like the -N option.  Maybe
MAX_FILES_PER_PROCESS could be a hard limit?
--
Tatsuo Ishii


Re: Re: Too many open files (was Re: spinlock problems reported earlier)

From: Tom Lane
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
>> I propose we add a new configuration parameter, MAX_FILES_PER_PROCESS,
>> with a default value of about 100.  A new backend would set its
>> max-files setting to the smaller of this parameter or
>> sysconf(_SC_OPEN_MAX).

> Seems like a nice idea.  We have heard lots of problem reports caused by
> running out of the file table.

> However, it would be even nicer if it could be configurable at runtime
> (at postmaster startup time) like the -N option.

Yes, what I meant was a GUC parameter named MAX_FILES_PER_PROCESS.
You could set it via postmaster.opts or postmaster command line switch.
        regards, tom lane