Thread: corrupt data

corrupt data

From
"Zeno R.R. Davatz"
Date:
Hi List

For some reason, the data in our PostgreSQL database has been corrupted. When we
try to connect to the database we get:

psql: ERROR:  _mdfd_getrelnfd: cannot open relation pg_type_oid_index:
No such file or directory

REINDEX does not work. We get the same error.

pgfsck gives us: wrong blocksize.

We are using postgresql 7.3.3-1 (debian).

Thanks for any hints and feedback.

Zeno

Re: corrupt data

From
"scott.marlowe"
Date:
On Fri, 11 Jul 2003, Zeno R.R. Davatz wrote:

> Hi List
>
> For some reason, the data in our PostgreSQL database has been corrupted. When we
> try to connect to the database we get:
>
> psql: ERROR:  _mdfd_getrelnfd: cannot open relation pg_type_oid_index:
> No such file or directory
>
> REINDEX does not work. We get the same error.
>
> pgfsck gives us: wrong blocksize.
>
> We are using postgresql 7.3.3-1 (debian).
>
> Thanks for any hints and feedback.

Sounds like a bad block on the hard drive maybe?

Maybe bad ram.

You should probably test the drive and memory for errors.

Can you back up parts of the database? I.e., recover most of it in place and then
restore the parts you can't from backups...


Re: corrupt data

From
Robert Treat
Date:
On Fri, 2003-07-11 at 01:58, Zeno R.R. Davatz wrote:
> Hi List
>
> For some reason, the data in our PostgreSQL database has been corrupted. When we
> try to connect to the database we get:
>
> psql: ERROR:  _mdfd_getrelnfd: cannot open relation pg_type_oid_index:
> No such file or directory
>
> REINDEX does not work. We get the same error.
>
> pgfsck gives us: wrong blocksize.
>
> We are using postgresql 7.3.3-1 (debian).
>

uh oh, sounds like you might have some flakey hardware.

try vacuumdb template1 and if that gets you in you might want to do a
pg_dump and then do some hardware testing.

Robert Treat
--
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL
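
As a rough illustration of the order suggested above (vacuum template1 first, and dump only if that gets you in), here is a minimal shell sketch. The database name, the dump file name, and the DRY_RUN switch are placeholders made up for this example and do not come from the thread. By default the script only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch only: DBNAME, DUMPFILE and DRY_RUN are hypothetical placeholders,
# not names taken from the thread.
DBNAME="${DBNAME:-mydb}"
DUMPFILE="${DUMPFILE:-mydb.sql}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands, 0 = actually run them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

rescue_dump() {
    # Step 1: check whether the server lets us in at all.
    run vacuumdb template1 || return 1
    # Step 2: only if step 1 succeeded, save whatever is still readable.
    run pg_dump -f "$DUMPFILE" "$DBNAME"
}

rescue_dump
```

Setting DRY_RUN=0 would run the commands for real. Doing the hardware testing only after the dump follows the order given in the message above.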


corrupt data - caused by lm-sensors?

From
"Zeno Davatz"
Date:
Robert Treat writes:

> On Fri, 2003-07-11 at 01:58, Zeno R.R. Davatz wrote:
>> Hi List
>>
>> For some reason, the data in our PostgreSQL database has been corrupted. When we
>> try to connect to the database we get:
>>
>> psql: ERROR:  _mdfd_getrelnfd: cannot open relation pg_type_oid_index:
>> No such file or directory
>>
>> REINDEX does not work. We get the same error.
>>
>> pgfsck gives us: wrong blocksize.
>>
>> We are using postgresql 7.3.3-1 (debian).
>>
>
> uh oh, sounds like you might have some flakey hardware.
>
> try vacuumdb template1 and if that gets you in you might want to do a
> pg_dump and then do some hardware testing.
Thanks for the hint. In the meantime we could get access to some data with
pgfsck 0.14 - great tool, great maintainer, very dedicated (Tom Lane says
pgfsck does not exist).

What I forgot to mention:
I installed the CVS version of lm-sensors and force-loaded some drivers. Then
sensors-detect gave me a segmentation fault. Do you think that could have
caused the data corruption?

Thanks for feedback.

Zeno

Re: corrupt data - caused by lm-sensors?

From
Will LaShell
Date:
On Sat, 2003-07-12 at 12:43, Zeno Davatz wrote:
> Robert Treat writes:
>
> > On Fri, 2003-07-11 at 01:58, Zeno R.R. Davatz wrote:
> >> Hi List
> >>
> >> For some reason, the data in our PostgreSQL database has been corrupted. When we
> >> try to connect to the database we get:
> >>
> >> psql: ERROR:  _mdfd_getrelnfd: cannot open relation pg_type_oid_index:
> >> No such file or directory
> >>
> >> REINDEX does not work. We get the same error.
> >>
> >> pgfsck gives us: wrong blocksize.
> >>
> >> We are using postgresql 7.3.3-1 (debian).
> >>
> >
> > uh oh, sounds like you might have some flakey hardware.
> >
> > try vacuumdb template1 and if that gets you in you might want to do a
> > pg_dump and then do some hardware testing.
> Thanks for the hint. In the meantime we could get access to some data with
> pgfsck 0.14 - great tool, great maintainer, very dedicated (Tom Lane says
> pgfsck does not exist).
>
> What I forgot to mention:
> I installed the CVS version of lm-sensors and force-loaded some drivers. Then
> sensors-detect gave me a segmentation fault. Do you think that could have
> caused the data corruption?

You did this on a production system? At any rate, depending on what /
how it segfaulted, it has a good chance of confirming what the others
have suggested: that you have some bad hardware on that machine. Note
that this shouldn't have caused the drive corruption unless the segfault
took out a piece of the kernel's disk I/O subsystem. Although I have to say
that using development versions of software on production machines is
not necessarily the safest practice one can engage in.

Sincerely,

Will LaShell

>
> Thanks for feedback.
>
> Zeno


Re: corrupt data - caused by lm-sensors?

From
"Zeno R.R. Davatz"
Date:
On 14 Jul 2003 14:27:33 -0700
Will LaShell <will@lashell.net> wrote:

> On Sat, 2003-07-12 at 12:43, Zeno Davatz wrote:
> > Robert Treat writes:
> >
> > > On Fri, 2003-07-11 at 01:58, Zeno R.R. Davatz wrote:
> > >> Hi List
> > >>
> > >> For some reason, the data in our PostgreSQL database has been corrupted. When we
> > >> try to connect to the database we get:
> > >>
> > >> psql: ERROR:  _mdfd_getrelnfd: cannot open relation pg_type_oid_index:
> > >> No such file or directory
> > >>
> > >> REINDEX does not work. We get the same error.
> > >>
> > >> pgfsck gives us: wrong blocksize.
> > >>
> > >> We are using postgresql 7.3.3-1 (debian).
> > >>
> > >
> > > uh oh, sounds like you might have some flakey hardware.
> > >
> > > try vacuumdb template1 and if that gets you in you might want to do a
> > > pg_dump and then do some hardware testing.
> > Thanks for the hint. In the meantime we could get access to some data with
> > pgfsck 0.14 - great tool, great maintainer, very dedicated (Tom Lane says
> > pgfsck does not exist).
> >
> > What I forgot to mention:
> > I installed the CVS version of lm-sensors and force-loaded some drivers. Then
> > sensors-detect gave me a segmentation fault. Do you think that could have
> > caused the data corruption?
>
> You did this on a production system? At any rate, depending on what /
> how it segfaulted, it has a good chance of confirming what the others
> have suggested: that you have some bad hardware on that machine. Note
> that this shouldn't have caused the drive corruption unless the segfault
> took out a piece of the kernel's disk I/O subsystem. Although I have to say
> that using development versions of software on production machines is
> not necessarily the safest practice one can engage in.

Thanks for the info. Do you know of any tool to check my hardware, especially the hard drives?

Thanks for feedback.

Zeno