Thread: WTF is going on with PG_VERSION?
Greetings.

The problem is: from time to time, PostgreSQL seems to crash. Inspection of the logs revealed the following:

Сен 18 15:53:06 arbat logger: FATAL 1: File '/var/lib/pgsql/PG_VERSION' does not exist or no read permission.

Well, '/var/lib/pgsql/PG_VERSION' does exist, it has read permission for user 'postgres' (I made it world readable, in fact, after I discovered this).

Now, two questions:
1) Who the hell needs to read this file?
2) Why can't he do it?

--
Yours, Alexey V. Borzov
On Mon, 18 Sep 2000, Alexey V. Borzov wrote:

> Greetings.
>
> The problem is: from time to time, PostgreSQL seems to crash.
> Inspection of the logs revealed the following:
>
> Сен 18 15:53:06 arbat logger: FATAL 1: File '/var/lib/pgsql/PG_VERSION' does not exist or no read permission.
>
> Well, '/var/lib/pgsql/PG_VERSION' does exist, it has read permission
> for user 'postgres' (I made it world readable, in fact, after I
> discovered this).
>
> Now, two questions:
> 1) Who the hell needs to read this file?
> 2) Why can't he do it?

What version of PostgreSQL are you running?
Greetings.

Monday, September 18, 2000, 10:38:37 PM, you wrote:

>> Сен 18 15:53:06 arbat logger: FATAL 1: File '/var/lib/pgsql/PG_VERSION' does not exist or no read permission.
>>
>> Well, '/var/lib/pgsql/PG_VERSION' does exist, it has read permission
>> for user 'postgres' (I made it world readable, in fact, after I
>> discovered this).
>>
>> Now, two questions:
>> 1) Who the hell needs to read this file?
>> 2) Why can't he do it?

THH> What version of PostgreSQL are you running?

I forgot the most important part...
PostgreSQL 7.0.2
And it runs on Linux 2.2.17 SMP (the box has two Intel Pentium II processors)

--
Yours, Alexey V. Borzov
Maybe it was moved for Postgres v7 (I'm still using 6.5.3 because it works and I'm too lazy to upgrade. :-) but in older versions the PG_VERSION file was in the data directory (i.e., /usr/local/pgsql/data/).

Try checking what you are using for a data dir ("locate pg_database" should tell you what dir it is) and move PG_VERSION in there.

Of course, your data dir could be /var/lib/pgsql; I dunno what evils RedHat does to the default Postgres install path.

At 01:49 AM 9/19/00, Alexey V. Borzov wrote:
>Greetings.
>
>Monday, September 18, 2000, 10:38:37 PM, you wrote:
>
> >> Сен 18 15:53:06 arbat logger: FATAL 1: File '/var/lib/pgsql/PG_VERSION' does not exist or no read permission.
> >>
> >> Well, '/var/lib/pgsql/PG_VERSION' does exist, it has read permission
> >> for user 'postgres' (I made it world readable, in fact, after I
> >> discovered this).
> >>
> >> Now, two questions:
> >> 1) Who the hell needs to read this file?
> >> 2) Why can't he do it?
>
>THH> What version of PostgreSQL are you running?
>
>I forgot the most important part...
>PostgreSQL 7.0.2
>And it runs on Linux 2.2.17 SMP (The box has two Intel Pentiums II)
>
>--
>Yours, Alexey V. Borzov
"Alexey V. Borzov" <borz_off@rdw.ru> writes: > Greetings. > Monday, September 18, 2000, 10:38:37 PM, you wrote: >>> ��� 18 15:53:06 arbat logger: FATAL 1: File '/var/lib/pgsql/PG_VERSION' does not exist or no read permission. >>> >>> Well, '/var/lib/pgsql/PG_VERSION' does exist, it has read permission >>> for user 'postgres' (I made it world readable, in fact, after I >>> discovered this). There is also supposed to be a PG_VERSION file in each database subdirectory. For example, on my setup: $ find /opt/postgres -name 'PG_VERSION' /opt/postgres/data/base/template1/PG_VERSION /opt/postgres/data/base/tree/PG_VERSION /opt/postgres/data/base/play/PG_VERSION /opt/postgres/data/PG_VERSION If you accidentally deleted one of these per-database PG_VERSION files then future connects to that database would fail with the above message. To recover (assuming that was your only mistake), copy the top-level PG_VERSION into the subdirectory. regards, tom lane
Hello Tom,

Tuesday, September 19, 2000, 8:24:01 PM, you wrote:

TL> There is also supposed to be a PG_VERSION file in each database
TL> subdirectory.

TL> If you accidentally deleted one of these per-database PG_VERSION files
TL> then future connects to that database would fail with the above
TL> message. To recover (assuming that was your only mistake), copy the
TL> top-level PG_VERSION into the subdirectory.

Nope, that's not the problem. I just checked and every DB has its own PG_VERSION. Besides, _all_ of the databases are accessed on a regular basis (I'm speaking of a website), but the crashes occur only once in a while (like, once a week)...

--
Yours, Alexey V. Borzov
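The check described above (one PG_VERSION at the top of the data directory plus one in every database subdirectory) can be scripted. What follows is only a minimal Python sketch, not anything shipped with PostgreSQL: it assumes the layout shown in Tom's listing, where each database is a directory under base/, and the function name missing_pg_version is made up for illustration.

```python
import os


def missing_pg_version(data_dir):
    """Return the directories that lack a readable PG_VERSION file.

    Assumes the layout from the listing in this thread: one PG_VERSION in
    data_dir itself and one in each database directory under data_dir/base.
    """
    expected = [data_dir]
    base = os.path.join(data_dir, "base")
    if os.path.isdir(base):
        for name in sorted(os.listdir(base)):
            path = os.path.join(base, name)
            if os.path.isdir(path):
                expected.append(path)
    # os.access checks readability for the user running this script,
    # so run it as the same user the server runs as (e.g. postgres).
    return [d for d in expected
            if not os.access(os.path.join(d, "PG_VERSION"), os.R_OK)]
```

An empty list means every expected PG_VERSION is present and readable, which is the situation Alexey reports. In that case the failure has to come from somewhere else.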
"Alexey V. Borzov" <borz_off@rdw.ru> writes: > Nope, that's not the problem. I just checked and every DB has its own > PG_VERSION. Besides, _all_ of the databases are accessed on regular > basis (I'm speaking of a website), but the crashes occur only once in > a while (like, once a week)... Does anything else get flaky on that system at the same times? I'm wondering if you could be running out of kernel filetable slots, so that the open of PG_VERSION is failing with ENFILE. (This would be the trouble spot just because it's the first file a new backend tries to open, and being a new backend it has no possible recovery tactic like closing other files. Once a backend is up and running it can usually survive ENFILE open failures by closing off other files.) Being out of kernel filetable slots would usually cause a lot of unrelated programs to start having troubles too, though, so I'd think you'd be seeing other symptoms as well. If that's it, the solution is either to alter your kernel parameters to increase NFILE, or to reduce the allowed number of concurrent backends, or both. If you're not sure, you could modify the failing code (it's in ValidatePgVersion() in src/utils/version.c) to print the kernel errno value as part of the error message. regards, tom lane
Hi guys,

Where can I get a compiled version of the latest JDBC driver? The one I have (downloaded from FTP a few days ago) gives errors when using DatabaseMetaData, which seem to have been fixed in CVS ages ago.

Ideas?

Mike
>I'm wondering if you could be running out of kernel filetable slots,
>so that the open of PG_VERSION is failing with ENFILE. (This would be

An interesting slashdot thread (that's saying a lot, since I despise the place ;) yesterday mentioned putting cached stuff in RAM drives. Would that alleviate the problem, if one could load just the PG_VERSION files in there, or would it still need to allocate a slot when it was trying to read them? I don't think it's a real solution, merely curious :)

Rob Nelson
rdnelson@co.centre.pa.us
On Wed, 20 Sep 2000, Robert D. Nelson wrote:

> >I'm wondering if you could be running out of kernel filetable slots,
> >so that the open of PG_VERSION is failing with ENFILE. (This would be
>
> An interesting slashdot thread (that's saying alot, since I despise the
> place ;) yesterday mentioned putting cached stuff in RAM drives. Would that
> alleviate the problem, if one could load just the PG_VERSION's in there, or
> would it still need to allocate it when it was trying to read it? Don't
> think it's a real solution, merely curious :)

It still needs a file descriptor for each file opened ... for my system, the default was something like 4k file descriptors, which I was blowing away on a regular basis. I ended up finally settling on 32k file descriptors and the problem hasn't resurfaced ...

I wasn't getting a PG_VERSION file problem, but that appears to be the direction Tom is thinking right now ...
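To see how close a machine is to the limits discussed here, the current numbers can be read directly. This is a hedged Python sketch under two assumptions that are not from the thread: that the box is a reasonably current Linux exposing /proc/sys/fs/file-nr (three fields: allocated handles, free handles, system-wide maximum), and that Python's standard resource module is available, which is the case on Unix only. The function names are illustrative.

```python
import resource


def parse_file_nr(text):
    """Parse the three fields of /proc/sys/fs/file-nr: allocated, free, max."""
    allocated, free, maximum = (int(field) for field in text.split()[:3])
    return {"allocated": allocated, "free": free, "max": maximum}


def system_file_table():
    """Return the system-wide file table counters, or None if /proc lacks them."""
    try:
        with open("/proc/sys/fs/file-nr") as handle:
            return parse_file_nr(handle.read())
    except OSError:
        return None


def process_fd_limit():
    """Return the (soft, hard) descriptor limit for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)
```

The two limits fail differently: exhausting the system-wide table makes open() fail with ENFILE, which is the case Tom suspects, while hitting the per-process limit gives EMFILE. Watching the allocated count approach the maximum around the time of a crash would support the filetable theory.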
Greetings, Tom!

At 20.09.2000, 10:41, you wrote:

TL> "Alexey V. Borzov" <borz_off@rdw.ru> writes:
>> Nope, that's not the problem. I just checked and every DB has its own
>> PG_VERSION. Besides, _all_ of the databases are accessed on regular
>> basis (I'm speaking of a website), but the crashes occur only once in
>> a while (like, once a week)...
TL> I'm wondering if you could be running out of kernel filetable slots,
TL> so that the open of PG_VERSION is failing with ENFILE. (This would be
TL> the trouble spot just because it's the first file a new backend tries
TL> to open, and being a new backend it has no possible recovery tactic
TL> like closing other files. Once a backend is up and running it can
TL> usually survive ENFILE open failures by closing off other files.)

This MIGHT be the problem. I'm not sure, as it wasn't me who compiled the kernel for the box, but I'll look into it...

Well, last question then: I wasn't too specific, but the problem with this crash is that not ONE SINGLE backend fails, but ALL OF THEM AT ONCE: someone comes running to me and shouts 'our site is down!', and when I log in and type 'ps eax | grep postgres' there are no postgres processes in memory... Which is strange, as I connect to Postgres from PHP and use `persistent` connections, so the backends which are in memory should have already read their PG_VERSIONs...
Is that what should happen with an ENFILE failure?

TL> If that's it, the solution is either to alter your kernel parameters to
TL> increase NFILE, or to reduce the allowed number of concurrent backends,
TL> or both.

Guess we should increase file slots, as reducing the number of backends is definitely NOT an option.

--
Yours, Alexey V. Borzov
On Thu, 21 Sep 2000, Alexey Borzov wrote:

> Greetings, Tom!
>
> At 20.09.2000, 10:41, you wrote:
>
> TL> "Alexey V. Borzov" <borz_off@rdw.ru> writes:
> >> Nope, that's not the problem. I just checked and every DB has its own
> >> PG_VERSION. Besides, _all_ of the databases are accessed on regular
> >> basis (I'm speaking of a website), but the crashes occur only once in
> >> a while (like, once a week)...
> TL> I'm wondering if you could be running out of kernel filetable slots,
> TL> so that the open of PG_VERSION is failing with ENFILE. (This would be
> TL> the trouble spot just because it's the first file a new backend tries
> TL> to open, and being a new backend it has no possible recovery tactic
> TL> like closing other files. Once a backend is up and running it can
> TL> usually survive ENFILE open failures by closing off other files.)
>
> This MIGHT be problem. I'm not sure, as it wasn't me who compiled
> the kernel for the box, but I'll look into it...
>
> Well, last question then: I wasn't too specific, but the problem
> with this crash is that not ONE SINGLE backend fails, but ALL OF
> THEM AT ONCE: someone comes running to me and shouts 'our site is
> down!', when I login and type 'ps eax | grep postgres' there
> are no postgres processes in memory... Which is strange, as I
> connect to Postgres from PHP, and use `persistent` connections, so
> the backends which are in memory should have already read their
> PG_VERSIONs...
> Is it as it should be with ENFILE failure?

That is as it was when we were hitting it ... we are actually running a db on 4 separate ports, and we would see one db being down and the rest running happily along ... as soon as one db goes for that last slot and can't find it, that one would completely shut down, as it's the 'parent process' that appears to be the one going for it ...
Well, thanks to everybody who helped!

It was indeed the problem with opening files: the limit was set to 1024 with more than 100 possible backends...

Well, I suppose it wouldn't hurt to change the error message in future versions of Postgres, 'cause right now it is somewhat... misleading... ;->

Greetings, The Hermit Hacker!

At 21.09.2000, 13:34, you wrote:

>> Well, last question then: I wasn't too specific, but the problem
>> with this crash is that not ONE SINGLE backend fails, but ALL OF
>> THEM AT ONCE: someone comes running to me and shouts 'our site is
>> down!', when I login and type 'ps eax | grep postgres' there
>> are no postgres processes in memory... Which is strange, as I
>> connect to Postgres from PHP, and use `persistent` connections, so
>> the backends which are in memory should have already read their
>> PG_VERSIONs...
>> Is it as it should be with ENFILE failure?

THH> that is as it was when we were hitting it ... we are actually running a db
THH> on 4 separate ports, and we would see one db being down and the rest
THH> running happily along ... as soon as one db goes for that last slot and
THH> can't find it, that one would completely shut down, as it's the 'parent
THH> process' that appears to be the one going for it ...

--
Yours, Alexey V. Borzov
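The numbers in this message make the failure easy to see with a little arithmetic: 1024 file slots shared by 100 backends leaves about 10 open files each. The Python sketch below only illustrates that budget. The per-backend figure of 20 open files in the usage note is a hypothetical value picked for the example, not a measurement from this thread, and the function names are made up.

```python
def descriptors_per_backend(limit, backends):
    """Return how many open files each backend may hold if the limit is shared evenly."""
    return limit // backends


def fits_in_limit(limit, backends, files_per_backend):
    """Return True if all backends together stay within the file limit."""
    return backends * files_per_backend <= limit
```

With the values reported here, descriptors_per_backend(1024, 100) gives 10. If each backend held, say, 20 files open (an assumed figure), fits_in_limit(1024, 100, 20) is False, while the 32k setting mentioned earlier in the thread leaves ample room: fits_in_limit(32768, 100, 20) is True.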
Alexey Borzov <borz_off@rdw.ru> writes:
> It was indeed the problem with opening files - the limit was set
> to 1024 with more than 100 possible backends...

> Well, I suppose it wouldn't hurt to change the error message in
> the future versions of Postgres, 'cause now it is somewhat...
> misleading... ;->

I've tried of late to make sure that all file-open calls in the backend will include the kernel errno string in their failure error messages. There may still be a few stragglers but most of 'em should report something about "too many open files" if they fail.

regards, tom lane
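To illustrate why carrying the errno string matters, here is a small Python sketch of an open call that reports the kernel's reason for failure instead of a fixed "does not exist or no read permission" text. It is an illustration of the idea only and is not the backend's C code; the function name describe_open_failure and the message format are invented for this example.

```python
import errno
import os


def describe_open_failure(path):
    """Try to open path read-only.

    Returns None on success, otherwise a message that includes the errno
    number, its symbolic name, and the kernel's description of the error.
    """
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError as exc:
        name = errno.errorcode.get(exc.errno, "unknown")
        return "cannot open '%s': %s (errno %d, %s)" % (
            path, os.strerror(exc.errno), exc.errno, name)
    os.close(fd)
    return None
```

With a message built this way, a missing file shows up as ENOENT, a permissions problem as EACCES, and a full kernel file table as ENFILE, so the three cases can no longer be confused with one another.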
Greetings,

Whenever a normal user (any account besides "postgres") tries to run psql they get the following message:

Connection to database 'dank' failed.
FATAL 1: SetUserId: user "chrisp" is not in "pg_shadow"

How do I fix this?
On Fri, Sep 22, 2000 at 10:36:53PM +0000, Chris wrote:
> Greetings,
>
> Whenever a normal user (any account besides "postgres") tries to
> run psql they get the following message:
>
> Connection to database 'dank' failed.
> FATAL 1: SetUserId: user "chrisp" is not in "pg_shadow"

You probably have not added any users to the database. For more info, check out:

http://www.postgresql.org/docs/user/app-createuser.htm

You may also want to create a database for each user to access. For more info on that, see this page:

http://www.postgresql.org/docs/user/app-createdb.htm

Finally, now is probably a good time to secure Postgres by setting up pg_hba.conf. It's documented here:

http://www.postgresql.org/docs/admin/client-authentication.htm

HTH,

Neil

--
Neil Conway <neilconway@home.com>
Get my GnuPG key from: http://klamath.dyndns.org/mykey.asc
Encrypted mail welcomed
Read this tutorial; it has everything you need to know about psql. The document is attached.

On Sat, 23 Sep 2000, Chris wrote:

> Greetings,
>
> Whenever a normal user (any account besides "postgres") tries to
> run psql they get the following message:
>
> Connection to database 'dank' failed.
> FATAL 1: SetUserId: user "chrisp" is not in "pg_shadow"
>
>
> How do I fix this?