Thread: Really odd corruption problem: cannot open pg_aggregate: No such file or directory
Really odd corruption problem: cannot open pg_aggregate: No such file or directory
From
Adam Haberlach
Date:
So, one of the many machines that I support seems to have developed
an incredibly odd and specific corruption that I've never seen before.

Whenever a query requiring an aggregate is attempted, it spits out:
    cannot open pg_aggregate: No such file or directory
and fails.

If I do:
    select * from pg_class where relname='pg_aggregate';
I see that the relation exists.

If I check the relfilenode in the data directory, that exists, and
seems to be an object file containing what should be the basic
aggregate functions.

version: PostgreSQL 7.2.3 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.2 20020903 (Red Hat Linux 8.0 3.2-7)

The system ran for a few weeks before anything odd happened, and
then suddenly this. Does anyone have any ideas? Now that I look at
the above string, I realize that the system /is/ an Athlon processor.
Does anyone know if there could be an issue between the i686 and
athlon optimizations?

--
Adam Haberlach         | "When your product is stolen by thieves, you
adam@mediariffic.com   |  have a police problem. When it is stolen by
http://mediariffic.com |  millions of honest customers, you have a
                       |  marketing problem." - George Gilder
Re: Really odd corruption problem: cannot open pg_aggregate: No such file or directory
From
Doug McNaught
Date:
Adam Haberlach <adam@newsnipple.com> writes:

> So, one of the many machines that I support seems to have developed
> an incredibly odd and specific corruption that I've never seen before.
>
> Whenever a query requiring an aggregate is attempted, it spits out:
> cannot open pg_aggregate: No such file or directory
> and fails.

Why not use 'strace' to see what file the backend is actually trying
to open?

-Doug
On Thu, 24 Jul 2003, Adam Haberlach wrote:

> So, one of the many machines that I support seems to have developed
> an incredibly odd and specific corruption that I've never seen before.
>
> Whenever a query requiring an aggregate is attempted, it spits out:
> cannot open pg_aggregate: No such file or directory
> and fails.
>
> If I do:
> select * from pg_class where relname='pg_aggregate';
> I see that the relation exists.
>
> If I check the relfilenode in the data directory, that exists, and
> seems to be an object file containing what should be the basic
> aggregate functions.
>
> version: PostgreSQL 7.2.3 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.2 20020903 (Red Hat Linux 8.0 3.2-7)
>
> The system ran for a few weeks before anything odd happened, and
> then suddenly this. Does anyone have any ideas? Now that I look at
> the above string, I realize that the system /is/ an Athlon processor.
> Does anyone know if there could be an issue between the i686 and
> athlon optimizations?

Test your memory and drive subsystem first. memtest86.com has a nice
tester for free, and on Linux, badblocks can do a decent job (not great,
just decent) of finding bad blocks.

PostgreSQL is good, but it can't make up for bad hardware.
Re: Really odd corruption problem: cannot open pg_aggregate: No such file or directory
From
Tom Lane
Date:
Adam Haberlach <adam@newsnipple.com> writes:

> Whenever a query requiring an aggregate is attempted, it spits out:
> cannot open pg_aggregate: No such file or directory
> and fails.

Weird. It would be useful to find out exactly what pathname it's trying
to open. strace'ing the backend might be the easiest way.

> Does anyone know if there could be an issue between the i686 and
> athlon optimizations?

Seems unlikely that it would manifest this way, if so. The error is
coming from a low-level routine that would also be used for opening
any other table ...

regards, tom lane
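[Editorial sketch] The strace suggestion above can be made concrete. Assuming a trace of the backend has been captured to text (for example with something like `strace -p <backend pid> -e trace=open`; newer systems may report `openat` instead), the small Python helper below pulls out the paths of failed open calls. The function name `failed_opens` and the regular expression are illustrative, not part of PostgreSQL or strace, and the first sample line is made up for contrast.

```python
import re

# Matches a failed open()/openat() line in strace output and captures
# the quoted path plus the errno name (e.g. ENOENT).
FAILED_OPEN = re.compile(
    r'\bopen(?:at)?\((?:[^",]+,\s*)?"([^"]*)"[^)]*\)\s*=\s*-1\s+(E[A-Z0-9]+)'
)

def failed_opens(strace_text):
    """Return (path, errno_name) pairs for every failed open in the text."""
    return FAILED_OPEN.findall(strace_text)

# First line is an invented successful call; the second is the failure
# reported later in this thread.
sample = '''\
open("/etc/passwd", O_RDONLY) = 3
open("/var/lib/pgsql/data/base/16556/16406", O_RDWR) = -1 ENOENT (No such file or directory)
'''

for path, err in failed_opens(sample):
    print(err, path)
```

Only the failing call is reported, which gives the exact pathname the backend tried to open.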
Re: Really odd corruption problem: cannot open pg_aggregate: No such file or directory
From
Adam Haberlach
Date:
On Thu, Jul 24, 2003 at 10:17:06AM -0700, Adam Haberlach wrote:

> So, one of the many machines that I support seems to have developed
> an incredibly odd and specific corruption that I've never seen before.
>
> Whenever a query requiring an aggregate is attempted, it spits out:
> cannot open pg_aggregate: No such file or directory
> and fails.
>
> If I do:
> select * from pg_class where relname='pg_aggregate';
> I see that the relation exists.
>
> If I check the relfilenode in the data directory, that exists, and
> seems to be an object file containing what should be the basic
> aggregate functions.
>
> version: PostgreSQL 7.2.3 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.2 20020903 (Red Hat Linux 8.0 3.2-7)
>
> The system ran for a few weeks before anything odd happened, and
> then suddenly this. Does anyone have any ideas? Now that I look at
> the above string, I realize that the system /is/ an Athlon processor.
> Does anyone know if there could be an issue between the i686 and

I'd like to thank everyone for the quick responses and the
suggestion to strace the postmaster.

    open("/var/lib/pgsql/data/base/16556/16406", O_RDWR) = -1 ENOENT (No such file or directory)

It looks like a file /was/ missing, and I had been looking in the wrong
place to verify that it was there (the template database).

I'm going to chalk this one up to bad hardware and hope it doesn't
happen again.

Thanks again...

--
Adam Haberlach         | "When your product is stolen by thieves, you
adam@mediariffic.com   |  have a police problem. When it is stolen by
http://mediariffic.com |  millions of honest customers, you have a
                       |  marketing problem." - George Gilder
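[Editorial sketch] The strace line shows the layout the backend uses: `<data directory>/base/<database OID>/<relfilenode>`. Checking the right database's directory, rather than the template's, is the step that was missed here. The Python sketch below builds that path and reports which databases lack the file. The helper names are hypothetical; the database OID can be read with `select oid, datname from pg_database;` and the relfilenode from pg_class. Using OID 1 for the template database in the usage lines is an assumption for illustration.

```python
from pathlib import Path

def relation_file(data_dir, db_oid, relfilenode):
    """Path of a relation's file: <data_dir>/base/<db_oid>/<relfilenode>."""
    return Path(data_dir) / "base" / str(db_oid) / str(relfilenode)

def databases_missing_file(data_dir, db_oids, relfilenode):
    """Return the database OIDs whose directory lacks the relation file."""
    return [
        oid for oid in db_oids
        if not relation_file(data_dir, oid, relfilenode).is_file()
    ]

# Values taken from the strace line in this thread.
print(relation_file("/var/lib/pgsql/data", 16556, 16406))
```

A call such as `databases_missing_file("/var/lib/pgsql/data", [1, 16556], 16406)` would then list the databases where the file is absent, so a copy that exists only under the template database is not mistaken for the one the failing backend needs.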
Re: Really odd corruption problem: cannot open pg_aggregate: No such file or directory
From
"Balaji Gadhiraju"
Date:
I too got this error. This happened with Postgres 7.2.3 and Linux 2.4.20 on a VIA processor. It happened not on just one box but on around a dozen boxes, so this may not be a hardware problem.

In our case, we create the table, use it, and then delete it. This activity happens very often, once a day. We run vacuum also. The problem happened on one such table: the entry for the table exists in pg_class, but the actual file is missing. Once it gets to this state, the table cannot be dropped.

Were there any bug fixes related to this in the later versions of Postgres? I searched Google for this error and found some cases, but not much information on why.

http://www.google.com/search?hl=en&ie=UTF-8&oe=UTF-8&q=%22RelationBuildDesc%3A+can%27t+open%22

Thanks,
balaji.

-----Original Message-----
From: Adam Haberlach [mailto:adam@newsnipple.com]
Sent: Thu 7/24/2003 11:07 AM
To: pgsql-hackers@postgresql.org
Cc:
Subject: Re: [HACKERS] Really odd corruption problem: cannot open pg_aggregate: No such file or directory

On Thu, Jul 24, 2003 at 10:17:06AM -0700, Adam Haberlach wrote:

> So, one of the many machines that I support seems to have developed
> an incredibly odd and specific corruption that I've never seen before.
>
> Whenever a query requiring an aggregate is attempted, it spits out:
> cannot open pg_aggregate: No such file or directory
> and fails.
>
> If I do:
> select * from pg_class where relname='pg_aggregate';
> I see that the relation exists.
>
> If I check the relfilenode in the data directory, that exists, and
> seems to be an object file containing what should be the basic
> aggregate functions.
>
> version: PostgreSQL 7.2.3 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.2 20020903 (Red Hat Linux 8.0 3.2-7)
>
> The system ran for a few weeks before anything odd happened, and
> then suddenly this. Does anyone have any ideas? Now that I look at
> the above string, I realize that the system /is/ an Athlon processor.
> Does anyone know if there could be an issue between the i686 and

I'd like to thank everyone for the quick responses and the
suggestion to strace the postmaster.

    open("/var/lib/pgsql/data/base/16556/16406", O_RDWR) = -1 ENOENT (No such file or directory)

It looks like a file /was/ missing, and I had been looking in the wrong
place to verify that it was there (the template database).

I'm going to chalk this one up to bad hardware and hope it doesn't
happen again.

Thanks again...

--
Adam Haberlach         | "When your product is stolen by thieves, you
adam@mediariffic.com   |  have a police problem. When it is stolen by
http://mediariffic.com |  millions of honest customers, you have a
                       |  marketing problem." - George Gilder