Thread: database corruption

database corruption

From
"Jeff Brenton"
Date:

I’ve encountered some db corruption after restarting postgres on my database server running 8.2.4.  I think that postgres did not shut down cleanly.  Postgres started appropriately but crashed 45 minutes later.  I used pg_resetxlog after the crash to get the db to start again but it appears that the database is not running properly now.  When users try to access some of the tables in the db they get the error below;

 

ERROR:  index "testrun_log_pkey" contains unexpected zero page at block 3155408

HINT:  Please REINDEX it.}> <SQL environment diagnostic: no diagnostic record

SQL connection is null

SQL statement diagnostic: XX002 7 {Error while executing the query;

ERROR:  index "testrun_log_pkey" contains unexpected zero page at block 3155408

HINT:  Please REINDEX it.}

 

I’ve attempted to re-index the pkey listed but after an hour it fails with

 

REINDEX INDEX testrun_log_pkey;

 

ERROR:  could not write block 1832079 of temporary file: No space left on device

HINT:  Perhaps out of disk space?

 

There is currently 14GB free on the disk that postgres is installed on.  Does anyone know what I can do to get the db up and running again? 

 

/dev/amrd0s1a    3.9G    2.7G    898M    75%    /

/dev/amrd0s1e    115G     43G     63G    40%    /backup

/dev/amrd1s1d    133G    748M    121G     1%    /wal

/dev/amrd2s1d    663G    596G     14G    98%    /db

/dev/amrd0s1d    3.9G    184M    3.4G     5%    /var

 

Re: database corruption

From
Chris
Date:
I would imagine you would have better luck dropping the index and recreating.  But considering you're 98% full on that drive, it looks like you're about to have other problems...

On Wed, Apr 8, 2009 at 8:32 PM, Jeff Brenton <jbrenton@sandvine.com> wrote:

I’ve encountered some db corruption after restarting postgres on my database server running 8.2.4.  I think that postgres did not shut down cleanly.  Postgres started appropriately but crashed 45 minutes later.  I used pg_resetxlog after the crash to get the db to start again but it appears that the database is not running properly now.  When users try to access some of the tables in the db they get the error below;

 

ERROR:  index "testrun_log_pkey" contains unexpected zero page at block 3155408

HINT:  Please REINDEX it.}> <SQL environment diagnostic: no diagnostic record

SQL connection is null

SQL statement diagnostic: XX002 7 {Error while executing the query;

ERROR:  index "testrun_log_pkey" contains unexpected zero page at block 3155408

HINT:  Please REINDEX it.}

 

I’ve attempted to re-index the pkey listed but after an hour it fails with

 

REINDEX INDEX testrun_log_pkey;

 

ERROR:  could not write block 1832079 of temporary file: No space left on device

HINT:  Perhaps out of disk space?

 

There is currently 14GB free on the disk that postgres is installed on.  Does anyone know what I can do to get the db up and running again? 

 

/dev/amrd0s1a    3.9G    2.7G    898M    75%    /

/dev/amrd0s1e    115G     43G     63G    40%    /backup

/dev/amrd1s1d    133G    748M    121G     1%    /wal

/dev/amrd2s1d    663G    596G     14G    98%    /db

/dev/amrd0s1d    3.9G    184M    3.4G     5%    /var

 




--
Chris Spotts
rfusca@gmail.com

Re: database corruption

From
Adrian Klaver
Date:
On Wednesday 08 April 2009 6:32:06 pm Jeff Brenton wrote:
> I've encountered some db corruption after restarting postgres on my
> database server running 8.2.4.  I think that postgres did not shut down
> cleanly.  Postgres started appropriately but crashed 45 minutes later.
> I used pg_resetxlog after the crash to get the db to start again but it
> appears that the database is not running properly now.  When users try
> to access some of the tables in the db they get the error below;
>
>
>
> ERROR:  index "testrun_log_pkey" contains unexpected zero page at block
> 3155408
>
> HINT:  Please REINDEX it.}> <SQL environment diagnostic: no diagnostic
> record
>
> SQL connection is null
>
> SQL statement diagnostic: XX002 7 {Error while executing the query;
>
> ERROR:  index "testrun_log_pkey" contains unexpected zero page at block
> 3155408
>
> HINT:  Please REINDEX it.}
>
>
>
> I've attempted to re-index the pkey listed but after an hour it fails
> with
>
>
>
> REINDEX INDEX testrun_log_pkey;
>
>
>
> ERROR:  could not write block 1832079 of temporary file: No space left
> on device
>
> HINT:  Perhaps out of disk space?
>
>
>
> There is currently 14GB free on the disk that postgres is installed on.
> Does anyone know what I can do to get the db up and running again?

I guess the first question is, does the db have permissions(access) to all that
space?

>
>
>
> /dev/amrd0s1a    3.9G    2.7G    898M    75%    /
>
> /dev/amrd0s1e    115G     43G     63G    40%    /backup
>
> /dev/amrd1s1d    133G    748M    121G     1%    /wal
>
> /dev/amrd2s1d    663G    596G     14G    98%    /db
>
> /dev/amrd0s1d    3.9G    184M    3.4G     5%    /var



--
Adrian Klaver
aklaver@comcast.net

Re: database corruption

From
"Jeff Brenton"
Date:
There are no filesystem level content size restrictions that I am aware
of on this system.  The user pgsql should have full access to the
filesystems indicated except for the root filesystem.

Where is the temporary location?  I am searching around to see if I can
specify it anywhere in the config files but can't seem to find anything
which leads me to believe that its part of the postgres data directory.


-----Original Message-----
From: Adrian Klaver [mailto:aklaver@comcast.net]
Sent: Wednesday, April 08, 2009 10:10 PM
To: pgsql-general@postgresql.org
Cc: Jeff Brenton
Subject: Re: [GENERAL] database corruption

On Wednesday 08 April 2009 6:32:06 pm Jeff Brenton wrote:
> I've encountered some db corruption after restarting postgres on my
> database server running 8.2.4.  I think that postgres did not shut
down
> cleanly.  Postgres started appropriately but crashed 45 minutes later.
> I used pg_resetxlog after the crash to get the db to start again but
it
> appears that the database is not running properly now.  When users try
> to access some of the tables in the db they get the error below;
>
>
>
> ERROR:  index "testrun_log_pkey" contains unexpected zero page at
block
> 3155408
>
> HINT:  Please REINDEX it.}> <SQL environment diagnostic: no diagnostic
> record
>
> SQL connection is null
>
> SQL statement diagnostic: XX002 7 {Error while executing the query;
>
> ERROR:  index "testrun_log_pkey" contains unexpected zero page at
block
> 3155408
>
> HINT:  Please REINDEX it.}
>
>
>
> I've attempted to re-index the pkey listed but after an hour it fails
> with
>
>
>
> REINDEX INDEX testrun_log_pkey;
>
>
>
> ERROR:  could not write block 1832079 of temporary file: No space left
> on device
>
> HINT:  Perhaps out of disk space?
>
>
>
> There is currently 14GB free on the disk that postgres is installed
on.
> Does anyone know what I can do to get the db up and running again?

I guess the first question is, does the db have permissions(access) to
all that
space?

>
>
>
> /dev/amrd0s1a    3.9G    2.7G    898M    75%    /
>
> /dev/amrd0s1e    115G     43G     63G    40%    /backup
>
> /dev/amrd1s1d    133G    748M    121G     1%    /wal
>
> /dev/amrd2s1d    663G    596G     14G    98%    /db
>
> /dev/amrd0s1d    3.9G    184M    3.4G     5%    /var



--
Adrian Klaver
aklaver@comcast.net

Re: database corruption

From
"Jeff Brenton"
Date:

I was looking at dropping the index and recreating.  Part of the reason that I restarted postgres was to enable WAL archiving so that I can migrate to a larger filesystem next week.  I’ve got a system with a 1.3T array that I will be migrating to.  This DB has been neglected and I am trying to fix it but things are arguing with me.

 

Will dropping the index have any negative consequences if the indexed table has a primary key associated with it?  I think not but want to be certain.   

 


From: Chris [mailto:rfusca@gmail.com]
Sent: Wednesday, April 08, 2009 10:08 PM
To: Jeff Brenton
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] database corruption

 

I would imagine you would have better luck dropping the index and recreating.  But considering you're 98% full on that drive, it looks like you're about to have other problems...

 

On Wed, Apr 8, 2009 at 8:32 PM, Jeff Brenton <jbrenton@sandvine.com> wrote:

I’ve encountered some db corruption after restarting postgres on my database server running 8.2.4.  I think that postgres did not shut down cleanly.  Postgres started appropriately but crashed 45 minutes later.  I used pg_resetxlog after the crash to get the db to start again but it appears that the database is not running properly now.  When users try to access some of the tables in the db they get the error below;

 

ERROR:  index "testrun_log_pkey" contains unexpected zero page at block 3155408

HINT:  Please REINDEX it.}> <SQL environment diagnostic: no diagnostic record

SQL connection is null

SQL statement diagnostic: XX002 7 {Error while executing the query;

ERROR:  index "testrun_log_pkey" contains unexpected zero page at block 3155408

HINT:  Please REINDEX it.}

 

I’ve attempted to re-index the pkey listed but after an hour it fails with

 

REINDEX INDEX testrun_log_pkey;

 

ERROR:  could not write block 1832079 of temporary file: No space left on device

HINT:  Perhaps out of disk space?

 

There is currently 14GB free on the disk that postgres is installed on.  Does anyone know what I can do to get the db up and running again? 

 

/dev/amrd0s1a    3.9G    2.7G    898M    75%    /

/dev/amrd0s1e    115G     43G     63G    40%    /backup

/dev/amrd1s1d    133G    748M    121G     1%    /wal

/dev/amrd2s1d    663G    596G     14G    98%    /db

/dev/amrd0s1d    3.9G    184M    3.4G     5%    /var

 




--
Chris Spotts
rfusca@gmail.com

Re: database corruption

From
"Joshua D. Drake"
Date:
On Wed, 2009-04-08 at 22:14 -0400, Jeff Brenton wrote:
> There are no filesystem level content size restrictions that I am aware
> of on this system.  The user pgsql should have full access to the
> filesystems indicated except for the root filesystem.

Inodes?

>
> Where is the temporary location?  I am searching around to see if I can
> specify it anywhere in the config files but can't seem to find anything
> which leads me to believe that its part of the postgres data directory.
>
>
> -----Original Message-----
> From: Adrian Klaver [mailto:aklaver@comcast.net]
> Sent: Wednesday, April 08, 2009 10:10 PM
> To: pgsql-general@postgresql.org
> Cc: Jeff Brenton
> Subject: Re: [GENERAL] database corruption
>
> On Wednesday 08 April 2009 6:32:06 pm Jeff Brenton wrote:
> > I've encountered some db corruption after restarting postgres on my
> > database server running 8.2.4.  I think that postgres did not shut
> down
> > cleanly.  Postgres started appropriately but crashed 45 minutes later.
> > I used pg_resetxlog after the crash to get the db to start again but
> it
> > appears that the database is not running properly now.  When users try
> > to access some of the tables in the db they get the error below;
> >
> >
> >
> > ERROR:  index "testrun_log_pkey" contains unexpected zero page at
> block
> > 3155408
> >
> > HINT:  Please REINDEX it.}> <SQL environment diagnostic: no diagnostic
> > record
> >
> > SQL connection is null
> >
> > SQL statement diagnostic: XX002 7 {Error while executing the query;
> >
> > ERROR:  index "testrun_log_pkey" contains unexpected zero page at
> block
> > 3155408
> >
> > HINT:  Please REINDEX it.}
> >
> >
> >
> > I've attempted to re-index the pkey listed but after an hour it fails
> > with
> >
> >
> >
> > REINDEX INDEX testrun_log_pkey;
> >
> >
> >
> > ERROR:  could not write block 1832079 of temporary file: No space left
> > on device
> >
> > HINT:  Perhaps out of disk space?
> >
> >
> >
> > There is currently 14GB free on the disk that postgres is installed
> on.
> > Does anyone know what I can do to get the db up and running again?
>
> I guess the first question is, does the db have permissions(access) to
> all that
> space?
>
> >
> >
> >
> > /dev/amrd0s1a    3.9G    2.7G    898M    75%    /
> >
> > /dev/amrd0s1e    115G     43G     63G    40%    /backup
> >
> > /dev/amrd1s1d    133G    748M    121G     1%    /wal
> >
> > /dev/amrd2s1d    663G    596G     14G    98%    /db
> >
> > /dev/amrd0s1d    3.9G    184M    3.4G     5%    /var
>
>
>
> --
> Adrian Klaver
> aklaver@comcast.net
>
--
PostgreSQL - XMPP: jdrake@jabber.postgresql.org
   Consulting, Development, Support, Training
   503-667-4564 - http://www.commandprompt.com/
   The PostgreSQL Company, serving since 1997


Re: database corruption

From
"Jeff Brenton"
Date:
There are 9 miilion inodes free on /db.  All other partitions have at
least 1/2 million free.

-----Original Message-----
From: Joshua D. Drake [mailto:jd@commandprompt.com]
Sent: Wednesday, April 08, 2009 10:26 PM
To: Jeff Brenton
Cc: Adrian Klaver; pgsql-general@postgresql.org
Subject: Re: [GENERAL] database corruption

On Wed, 2009-04-08 at 22:14 -0400, Jeff Brenton wrote:
> There are no filesystem level content size restrictions that I am
aware
> of on this system.  The user pgsql should have full access to the
> filesystems indicated except for the root filesystem.

Inodes?

>
> Where is the temporary location?  I am searching around to see if I
can
> specify it anywhere in the config files but can't seem to find
anything
> which leads me to believe that its part of the postgres data
directory.
>
>
> -----Original Message-----
> From: Adrian Klaver [mailto:aklaver@comcast.net]
> Sent: Wednesday, April 08, 2009 10:10 PM
> To: pgsql-general@postgresql.org
> Cc: Jeff Brenton
> Subject: Re: [GENERAL] database corruption
>
> On Wednesday 08 April 2009 6:32:06 pm Jeff Brenton wrote:
> > I've encountered some db corruption after restarting postgres on my
> > database server running 8.2.4.  I think that postgres did not shut
> down
> > cleanly.  Postgres started appropriately but crashed 45 minutes
later.
> > I used pg_resetxlog after the crash to get the db to start again but
> it
> > appears that the database is not running properly now.  When users
try
> > to access some of the tables in the db they get the error below;
> >
> >
> >
> > ERROR:  index "testrun_log_pkey" contains unexpected zero page at
> block
> > 3155408
> >
> > HINT:  Please REINDEX it.}> <SQL environment diagnostic: no
diagnostic
> > record
> >
> > SQL connection is null
> >
> > SQL statement diagnostic: XX002 7 {Error while executing the query;
> >
> > ERROR:  index "testrun_log_pkey" contains unexpected zero page at
> block
> > 3155408
> >
> > HINT:  Please REINDEX it.}
> >
> >
> >
> > I've attempted to re-index the pkey listed but after an hour it
fails
> > with
> >
> >
> >
> > REINDEX INDEX testrun_log_pkey;
> >
> >
> >
> > ERROR:  could not write block 1832079 of temporary file: No space
left
> > on device
> >
> > HINT:  Perhaps out of disk space?
> >
> >
> >
> > There is currently 14GB free on the disk that postgres is installed
> on.
> > Does anyone know what I can do to get the db up and running again?
>
> I guess the first question is, does the db have permissions(access) to
> all that
> space?
>
> >
> >
> >
> > /dev/amrd0s1a    3.9G    2.7G    898M    75%    /
> >
> > /dev/amrd0s1e    115G     43G     63G    40%    /backup
> >
> > /dev/amrd1s1d    133G    748M    121G     1%    /wal
> >
> > /dev/amrd2s1d    663G    596G     14G    98%    /db
> >
> > /dev/amrd0s1d    3.9G    184M    3.4G     5%    /var
>
>
>
> --
> Adrian Klaver
> aklaver@comcast.net
>
--
PostgreSQL - XMPP: jdrake@jabber.postgresql.org
   Consulting, Development, Support, Training
   503-667-4564 - http://www.commandprompt.com/
   The PostgreSQL Company, serving since 1997


Re: database corruption

From
Craig Ringer
Date:
Jeff Brenton wrote:

> I've attempted to re-index the pkey listed but after an hour it fails
> with
>
> REINDEX INDEX testrun_log_pkey;
>
> ERROR:  could not write block 1832079 of temporary file: No space left
> on device
>
> HINT:  Perhaps out of disk space?
>
> There is currently 14GB free on the disk that postgres is installed on.
> Does anyone know what I can do to get the db up and running again?

Is there 14GB free at the time PostgreSQL claims to run out of space? Try:

    watch -n 60 "df -m /db"

and see if the free space falls close to zero during your reindex attempt.

Personally, I'd just try to give Pg some room to breathe.

--
Craig Ringer

Re: database corruption

From
"Albe Laurenz *EXTERN*"
Date:
This thread is a top posting mess. I'll try to rearrange:

Jeff Brenton wrote:
>>>>> REINDEX INDEX testrun_log_pkey;
>>>>>
>>>>> ERROR:  could not write block 1832079 of temporary file: No space left
>>>>> on device
>>>>> HINT:  Perhaps out of disk space?
>>>>>
>>>>> There is currently 14GB free on the disk that postgres is installed on.
>>>>> Does anyone know what I can do to get the db up and running again?
[...]
>>>>> /dev/amrd2s1d    663G    596G     14G    98%    /db
>>>>
>>>> I guess the first question is, does the db have permissions(access) to
>>>> all that space?
>>>
>>> There are no filesystem level content size restrictions that I am aware
>>> of on this system.  The user pgsql should have full access to the
>>> filesystems indicated except for the root filesystem.
>>
>> Inodes?
>
> There are 9 miilion inodes free on /db.  All other partitions have at
> least 1/2 million free.

Assuming that this is ext3 on Linux, it could be space reserved for root.

What do you get if you run the following as root:

dumpe2fs /dev/amrd2s1d | grep 'Reserved block count'

Yours,
Laurenz Albe

Re: database corruption

From
Ivan Sergio Borgonovo
Date:
On Wed, 8 Apr 2009 22:14:38 -0400
"Jeff Brenton" <jbrenton@sandvine.com> wrote:

>
> There are no filesystem level content size restrictions that I am
> aware of on this system.  The user pgsql should have full access
> to the filesystems indicated except for the root filesystem.

finished inodes?
A lot of small files (even empty) somewhere?
It happened to me when I was running a spider using curl and the
spiders where not exiting properly when another process was killing
them...


--
Ivan Sergio Borgonovo
http://www.webthatworks.it