Thread: After a crash all my tables got wiped, but still using disk space.

From: Samuel Abreu de Paula
Date: 07 January 2010, 16:33:25

Hello all, I'm having a problem with a crashed database and I'm looking
for help recovering any data written since my last backup.

Here is the situation. I found this error in the PostgreSQL logs 15 days ago:

 invalid memory alloc request size 4294967295

Then, 10 minutes later, this appeared in the logs:

:  processo do servidor (PID 23639) foi terminado pelo sinal 11
   -- server process (PID 23639) was terminated by signal 11
:  terminando quaisquer outros processos do servidor ativos
   -- terminating any other active server processes
*** glibc detected *** postgres: schema database IPADDRESS(19207) idle:
corrupted double-linked list: 0x0882a080 ***
======= Backtrace: =========
/lib/libc.so.6[0x969b67]
/lib/libc.so.6(__libc_malloc+0x7b)[0x96b3ab]
/lib/libc.so.6[0x92618c]
/lib/libc.so.6[0x924606]
/lib/libc.so.6[0x923fd5]
/lib/libc.so.6(dcgettext+0x43)[0x922f93]
postgres: schema database IPADDRESS(19207) idle(errhint+0x5c)[0x8233f7c]
postgres: schema database IPADDRESS(19207) idle(quickdie+0x6c)[0x81ba58c]
...

A few hours later the machine froze completely. It's an old machine and
it was a holiday, so the support guy just rebooted it and left.

Some services came back fine, but during the week errors began popping up.
Checking the logs, we found:

could not open "pg_clog/0202": File or directory not found.

One of the DB admins googled this error and found this answer in Brazilian Portuguese:

http://listas.postgresql.org.br/pipermail/pgbr-geral/2008-May/009199.html

In short, it says to create the files with:

 dd if=/dev/zero of=pgdata/pg_clog/0001 bs=512b count=1

and then run:

 pg_resetxlog -f -x 0x100000 -l 0x1,0x1,0x65
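
As a rough check on the numbers in those two commands, here is a minimal Python sketch of the pg_clog segment arithmetic. It assumes a default PostgreSQL build (8 kB pages, 2 status bits per transaction, 32 pages per segment file); the helper names clog_segment_name and segment_xid_range are hypothetical and exist only for this illustration.

```python
# Assumptions: default PostgreSQL build parameters for the commit log (pg_clog).
BLCKSZ = 8192            # bytes per page (default block size)
XACTS_PER_BYTE = 4       # 2 status bits per transaction
PAGES_PER_SEGMENT = 32   # pages stored in each pg_clog segment file

XACTS_PER_SEGMENT = BLCKSZ * XACTS_PER_BYTE * PAGES_PER_SEGMENT
SEGMENT_BYTES = BLCKSZ * PAGES_PER_SEGMENT


def clog_segment_name(xid):
    """Return the pg_clog file name (4 hex digits) that holds this XID."""
    return format(xid // XACTS_PER_SEGMENT, "04X")


def segment_xid_range(name):
    """Return the (first, last) XID covered by a pg_clog segment file."""
    first = int(name, 16) * XACTS_PER_SEGMENT
    return first, first + XACTS_PER_SEGMENT - 1


if __name__ == "__main__":
    # dd with bs=512b count=1 writes 512 * 512 bytes, i.e. one full segment.
    print(SEGMENT_BYTES == 512 * 512)
    # The XID given to pg_resetxlog -x falls in segment 0001.
    print(clog_segment_name(0x100000))
    # The segment named in the error message covers much higher XIDs.
    print([hex(x) for x in segment_xid_range("0202")])
```

Under these assumptions, the -x 0x100000 value is the first XID of segment 0001, while the missing file pg_clog/0202 covers XIDs starting at 0x20200000. That gap may be related to existing rows no longer being visible, though nothing in this thread confirms it.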

After doing that, the query select * from pg_tables returned an error on a
pg_toast index. After a few REINDEX TABLE runs it became possible to run
the query, but it only returns PostgreSQL's own tables, not my
tables.

Now, if I do a select on one of my large tables, it takes a good
amount of time to answer but returns no rows, and
if I try to run REINDEX TABLE mytable, I get the error:

ERROR:  could not create unique index
DETAIL:  Table contains duplicated values.

So now I'm here looking for help.

Does anyone know of anything I can do to try to access my data? I'm restoring my
last backup, but I want to try everything to get the most recent data before
giving up.

Thanks in advance.

PS: Please CC me on replies, as I'm not subscribed to the pgsql list.



Samuel Abreu de Paula
samuel@debian-ce.org
Mike Ditka  - "If God had wanted man to play soccer, he wouldn't have
given us arms."

Re: After a crash all my tables got wiped, but still using disk space.

From: Craig Ringer
Date: 08 January 2010, 00:00:39

On 8/01/2010 3:47 AM, Samuel Abreu de Paula wrote:
> Hello all, I'm having a problem with a crashed database and I'm looking
> for help recovering any data written since my last backup.
>
> Here is the situation. I found this error in the PostgreSQL logs 15 days ago:
>
>   invalid memory alloc request size 4294967295
>
> Then, 10 minutes later, this appeared in the logs:
>
> :  processo do servidor (PID 23639) foi terminado pelo sinal 11
>    -- server process (PID 23639) was terminated by signal 11
> :  terminando quaisquer outros processos do servidor ativos
>    -- terminating any other active server processes
> *** glibc detected *** postgres: schema database IPADDRESS(19207) idle:
> corrupted double-linked list: 0x0882a080 ***
> ======= Backtrace: =========
> /lib/libc.so.6[0x969b67]
> /lib/libc.so.6(__libc_malloc+0x7b)[0x96b3ab]
> /lib/libc.so.6[0x92618c]
> /lib/libc.so.6[0x924606]
> /lib/libc.so.6[0x923fd5]
> /lib/libc.so.6(dcgettext+0x43)[0x922f93]
> postgres: schema database IPADDRESS(19207) idle(errhint+0x5c)[0x8233f7c]
> postgres: schema database IPADDRESS(19207) idle(quickdie+0x6c)[0x81ba58c]
> ...
>
> A few hours later the machine froze completely. It's an old machine and
> it was a holiday, so the support guy just rebooted it and left.

Sounds like it might've hit memory or swap corruption issues. I had some
fun a while ago tracking down issues with a server that had a defective
disk that was scrambling the odd byte of swap when it was read back from
disk...

I wouldn't trust that hardware as far as I could throw it.

> Some services came back fine, but during the week errors began popping up.
> Checking the logs, we found:
>
> could not open "pg_clog/0202": File or directory not found.

After the reboot, did you check the file systems to make sure there was
no file system corruption or other damage?

Have you checked the disks to make sure they're OK? Does it have RAID
and if so are the arrays happy and fully in sync?

You're not running Pg with fsync=off, are you?
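
One way to answer the fsync question without a working server is to read the setting out of postgresql.conf directly. The following is only a minimal sketch: the function name fsync_setting is hypothetical, it ignores include directives and command-line overrides, and it accepts only the full spellings of boolean values. On a running server, SHOW fsync; is the simpler check.

```python
import re

TRUE_VALUES = {"on", "true", "yes", "1"}
FALSE_VALUES = {"off", "false", "no", "0"}

# Matches "fsync = value" or "fsync value", with optional single quotes.
FSYNC_LINE = re.compile(r"^fsync(?:\s*=\s*|\s+)'?(\w+)'?$", re.IGNORECASE)


def fsync_setting(conf_text, default=True):
    """Return the fsync value implied by postgresql.conf text.

    PostgreSQL defaults to fsync = on, and the last uncommented
    occurrence of a parameter in the file takes effect.
    """
    value = default
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments
        match = FSYNC_LINE.match(line)
        if not match:
            continue
        word = match.group(1).lower()
        if word in TRUE_VALUES:
            value = True
        elif word in FALSE_VALUES:
            value = False
    return value


if __name__ == "__main__":
    sample = "#fsync = on\nfsync = off   # faster, but unsafe after a crash\n"
    print("fsync enabled:", fsync_setting(sample))
```

To use it on a real installation, pass in the contents of the cluster's postgresql.conf; the path depends on how PostgreSQL was packaged, so it is left out here.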

> So now I'm here looking for help.
>
> Does anyone know of anything I can do to try to access my data? I'm restoring my
> last backup, but I want to try everything to get the most recent data before
> giving up.

I hope you're restoring your backup to a different machine with
trustworthy hardware.

I don't have any advice personally re recovery of the recent data. It
sounds like your catalogs may be pretty messed up, but I really don't
know what to do about that beyond restoring from backups.

--
Craig Ringer