Re: Re[4]: WTF is going on with PG_VERSION? - Mailing list pgsql-general

From Tom Lane
Subject Re: Re[4]: WTF is going on with PG_VERSION?
Date
Msg-id 10879.969394384@sss.pgh.pa.us
In response to Re[4]: WTF is going on with PG_VERSION?  ("Alexey V. Borzov" <borz_off@rdw.ru>)
Responses Re[6]: WTF is going on with PG_VERSION?
List pgsql-general
"Alexey V. Borzov" <borz_off@rdw.ru> writes:
> Nope, that's not the problem. I just checked and every DB has its own
> PG_VERSION. Besides, _all_ of the databases are accessed on a regular
> basis (I'm speaking of a website), but the crashes occur only once in
> a while (like, once a week)...

Does anything else get flaky on that system at the same times?

I'm wondering if you could be running out of kernel filetable slots,
so that the open of PG_VERSION is failing with ENFILE.  (This would be
the trouble spot just because it's the first file a new backend tries
to open, and being a new backend it has no possible recovery tactic
like closing other files.  Once a backend is up and running it can
usually survive ENFILE open failures by closing off other files.)
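
To illustrate that recovery tactic, here is a rough sketch, not the
actual backend code. open_with_retry() and the release_one callback are
hypothetical names made up for this example; the idea is only that a
process holding other open files can give one up and try again, which a
brand-new backend cannot do.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>

/*
 * Hypothetical callback: close one file the process can afford to lose.
 * Returns 1 if something was closed, 0 if there is nothing left to close.
 */
typedef int (*release_fd_fn) (void);

/*
 * Open a file, retrying after freeing a descriptor when the failure is
 * ENFILE (kernel file table full) or EMFILE (per-process limit reached).
 * Returns the descriptor, or -1 with errno set to the open() failure.
 */
int
open_with_retry(const char *path, int flags, release_fd_fn release_one)
{
    for (;;)
    {
        int         fd = open(path, flags);
        int         save_errno;

        if (fd >= 0)
            return fd;

        /* Any other failure is not something closing files will fix. */
        if (errno != ENFILE && errno != EMFILE)
            return -1;

        /* The callback may clobber errno, so remember the real cause. */
        save_errno = errno;
        if (release_one == NULL || !release_one())
        {
            errno = save_errno;
            return -1;          /* nothing to close: the new-backend case */
        }
    }
}
```

A freshly started process has nothing to release, so the first ENFILE it
hits is fatal, which matches a failure on the very first open.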

Being out of kernel filetable slots would usually cause a lot of
unrelated programs to start having troubles too, though, so I'd think
you'd be seeing other symptoms as well.

If that's it, the solution is either to alter your kernel parameters to
increase NFILE, or to reduce the allowed number of concurrent backends,
or both.

If you're not sure, you could modify the failing code (it's in
ValidatePgVersion() in src/utils/version.c) to print the kernel errno
value as part of the error message.
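
Something along these lines would do, as a minimal standalone sketch.
This is not the real ValidatePgVersion() code; open_reporting_errno() is
a hypothetical helper, and in the backend you would put the errno text
into the existing error message rather than into a caller-supplied
buffer.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to open a file read-only and, on failure,
 * format a message that includes the kernel errno value and its text.
 * Returns 0 on success, -1 on failure (errno is preserved).
 */
int
open_reporting_errno(const char *path, char *msg, size_t msglen)
{
    int         fd = open(path, O_RDONLY);

    if (fd < 0)
    {
        /* Save errno first; later library calls may overwrite it. */
        int         save_errno = errno;

        snprintf(msg, msglen, "could not open \"%s\": %s (errno %d)",
                 path, strerror(save_errno), save_errno);
        errno = save_errno;
        return -1;
    }

    close(fd);
    if (msglen > 0)
        msg[0] = '\0';
    return 0;
}
```

If the message then shows ENFILE, the file-table theory is confirmed;
ENOENT or EACCES would point somewhere else entirely.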

            regards, tom lane
