Thread: database corruption

database corruption

From

"Jeff Brenton"

Date:

09 April 2009, 01:44:27

I’ve encountered some db corruption after restarting postgres on my database server running 8.2.4. I think that postgres did not shut down cleanly. Postgres started appropriately but crashed 45 minutes later. I used pg_resetxlog after the crash to get the db to start again but it appears that the database is not running properly now. When users try to access some of the tables in the db they get the error below;

ERROR: index "testrun_log_pkey" contains unexpected zero page at block 3155408

HINT: Please REINDEX it.}> <SQL environment diagnostic: no diagnostic record

SQL connection is null

SQL statement diagnostic: XX002 7 {Error while executing the query;

ERROR: index "testrun_log_pkey" contains unexpected zero page at block 3155408

HINT: Please REINDEX it.}

I’ve attempted to re-index the pkey listed but after an hour it fails with

REINDEX INDEX testrun_log_pkey;

ERROR: could not write block 1832079 of temporary file: No space left on device

HINT: Perhaps out of disk space?

There is currently 14GB free on the disk that postgres is installed on. Does anyone know what I can do to get the db up and running again?

/dev/amrd0s1a 3.9G 2.7G 898M 75% /

/dev/amrd0s1e 115G 43G 63G 40% /backup

/dev/amrd1s1d 133G 748M 121G 1% /wal

/dev/amrd2s1d 663G 596G 14G 98% /db

/dev/amrd0s1d 3.9G 184M 3.4G 5% /var

Re: database corruption

From

Chris

Date:

09 April 2009, 02:07:37

I would imagine you would have better luck dropping the index and recreating. But considering you're 98% full on that drive, it looks like you're about to have other problems...

On Wed, Apr 8, 2009 at 8:32 PM, Jeff Brenton <jbrenton@sandvine.com> wrote:

I’ve encountered some db corruption after restarting postgres on my database server running 8.2.4. I think that postgres did not shut down cleanly. Postgres started appropriately but crashed 45 minutes later. I used pg_resetxlog after the crash to get the db to start again but it appears that the database is not running properly now. When users try to access some of the tables in the db they get the error below;

ERROR: index "testrun_log_pkey" contains unexpected zero page at block 3155408
HINT: Please REINDEX it.}> <SQL environment diagnostic: no diagnostic record
SQL connection is null
SQL statement diagnostic: XX002 7 {Error while executing the query;
ERROR: index "testrun_log_pkey" contains unexpected zero page at block 3155408
HINT: Please REINDEX it.}

I’ve attempted to re-index the pkey listed but after an hour it fails with

REINDEX INDEX testrun_log_pkey;

ERROR: could not write block 1832079 of temporary file: No space left on device
HINT: Perhaps out of disk space?

There is currently 14GB free on the disk that postgres is installed on. Does anyone know what I can do to get the db up and running again?

/dev/amrd0s1a    3.9G    2.7G    898M    75%    /
/dev/amrd0s1e    115G     43G     63G    40%    /backup
/dev/amrd1s1d    133G    748M    121G     1%    /wal
/dev/amrd2s1d    663G    596G     14G    98%    /db
/dev/amrd0s1d    3.9G    184M    3.4G     5%    /var

--
Chris Spotts
rfusca@gmail.com

Re: database corruption

From

Adrian Klaver

Date:

09 April 2009, 02:10:13

On Wednesday 08 April 2009 6:32:06 pm Jeff Brenton wrote:
> I've encountered some db corruption after restarting postgres on my
> database server running 8.2.4.  I think that postgres did not shut down
> cleanly.  Postgres started appropriately but crashed 45 minutes later.
> I used pg_resetxlog after the crash to get the db to start again but it
> appears that the database is not running properly now.  When users try
> to access some of the tables in the db they get the error below;
>
>
>
> ERROR:  index "testrun_log_pkey" contains unexpected zero page at block
> 3155408
>
> HINT:  Please REINDEX it.}> <SQL environment diagnostic: no diagnostic
> record
>
> SQL connection is null
>
> SQL statement diagnostic: XX002 7 {Error while executing the query;
>
> ERROR:  index "testrun_log_pkey" contains unexpected zero page at block
> 3155408
>
> HINT:  Please REINDEX it.}
>
>
>
> I've attempted to re-index the pkey listed but after an hour it fails
> with
>
>
>
> REINDEX INDEX testrun_log_pkey;
>
>
>
> ERROR:  could not write block 1832079 of temporary file: No space left
> on device
>
> HINT:  Perhaps out of disk space?
>
>
>
> There is currently 14GB free on the disk that postgres is installed on.
> Does anyone know what I can do to get the db up and running again?

I guess the first question is, does the db have permissions(access) to all that
space?

>
>
>
> /dev/amrd0s1a    3.9G    2.7G    898M    75%    /
>
> /dev/amrd0s1e    115G     43G     63G    40%    /backup
>
> /dev/amrd1s1d    133G    748M    121G     1%    /wal
>
> /dev/amrd2s1d    663G    596G     14G    98%    /db
>
> /dev/amrd0s1d    3.9G    184M    3.4G     5%    /var



--
Adrian Klaver
aklaver@comcast.net

Re: database corruption

From

"Jeff Brenton"

Date:

09 April 2009, 02:14:45

There are no filesystem level content size restrictions that I am aware
of on this system.  The user pgsql should have full access to the
filesystems indicated except for the root filesystem.

Where is the temporary location?  I am searching around to see if I can
specify it anywhere in the config files but can't seem to find anything
which leads me to believe that its part of the postgres data directory.


-----Original Message-----
From: Adrian Klaver [mailto:aklaver@comcast.net]
Sent: Wednesday, April 08, 2009 10:10 PM
To: pgsql-general@postgresql.org
Cc: Jeff Brenton
Subject: Re: [GENERAL] database corruption

On Wednesday 08 April 2009 6:32:06 pm Jeff Brenton wrote:
> I've encountered some db corruption after restarting postgres on my
> database server running 8.2.4.  I think that postgres did not shut
down
> cleanly.  Postgres started appropriately but crashed 45 minutes later.
> I used pg_resetxlog after the crash to get the db to start again but
it
> appears that the database is not running properly now.  When users try
> to access some of the tables in the db they get the error below;
>
>
>
> ERROR:  index "testrun_log_pkey" contains unexpected zero page at
block
> 3155408
>
> HINT:  Please REINDEX it.}> <SQL environment diagnostic: no diagnostic
> record
>
> SQL connection is null
>
> SQL statement diagnostic: XX002 7 {Error while executing the query;
>
> ERROR:  index "testrun_log_pkey" contains unexpected zero page at
block
> 3155408
>
> HINT:  Please REINDEX it.}
>
>
>
> I've attempted to re-index the pkey listed but after an hour it fails
> with
>
>
>
> REINDEX INDEX testrun_log_pkey;
>
>
>
> ERROR:  could not write block 1832079 of temporary file: No space left
> on device
>
> HINT:  Perhaps out of disk space?
>
>
>
> There is currently 14GB free on the disk that postgres is installed
on.
> Does anyone know what I can do to get the db up and running again?

I guess the first question is, does the db have permissions(access) to
all that
space?

>
>
>
> /dev/amrd0s1a    3.9G    2.7G    898M    75%    /
>
> /dev/amrd0s1e    115G     43G     63G    40%    /backup
>
> /dev/amrd1s1d    133G    748M    121G     1%    /wal
>
> /dev/amrd2s1d    663G    596G     14G    98%    /db
>
> /dev/amrd0s1d    3.9G    184M    3.4G     5%    /var



--
Adrian Klaver
aklaver@comcast.net

Re: database corruption

From

"Jeff Brenton"

Date:

09 April 2009, 02:19:25

I was looking at dropping the index and recreating. Part of the reason that I restarted postgres was to enable WAL archiving so that I can migrate to a larger filesystem next week. I’ve got a system with a 1.3T array that I will be migrating to. This DB has been neglected and I am trying to fix it but things are arguing with me.

Will dropping the index have any negative consequences if the indexed table has a primary key associated with it? I think not but want to be certain.

From: Chris [mailto:rfusca@gmail.com]
Sent: Wednesday, April 08, 2009 10:08 PM
To: Jeff Brenton
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] database corruption

I would imagine you would have better luck dropping the index and recreating. But considering you're 98% full on that drive, it looks like you're about to have other problems...

On Wed, Apr 8, 2009 at 8:32 PM, Jeff Brenton <jbrenton@sandvine.com> wrote:

ERROR: index "testrun_log_pkey" contains unexpected zero page at block 3155408

HINT: Please REINDEX it.}> <SQL environment diagnostic: no diagnostic record

SQL connection is null

SQL statement diagnostic: XX002 7 {Error while executing the query;

ERROR: index "testrun_log_pkey" contains unexpected zero page at block 3155408

HINT: Please REINDEX it.}

I’ve attempted to re-index the pkey listed but after an hour it fails with

REINDEX INDEX testrun_log_pkey;

ERROR: could not write block 1832079 of temporary file: No space left on device

HINT: Perhaps out of disk space?

There is currently 14GB free on the disk that postgres is installed on. Does anyone know what I can do to get the db up and running again?

/dev/amrd0s1a 3.9G 2.7G 898M 75% /

/dev/amrd0s1e 115G 43G 63G 40% /backup

/dev/amrd1s1d 133G 748M 121G 1% /wal

/dev/amrd2s1d 663G 596G 14G 98% /db

/dev/amrd0s1d 3.9G 184M 3.4G 5% /var

--
Chris Spotts
rfusca@gmail.com

Re: database corruption

From

"Joshua D. Drake"

Date:

09 April 2009, 02:25:51

On Wed, 2009-04-08 at 22:14 -0400, Jeff Brenton wrote:
> There are no filesystem level content size restrictions that I am aware
> of on this system.  The user pgsql should have full access to the
> filesystems indicated except for the root filesystem.

Inodes?

>
> Where is the temporary location?  I am searching around to see if I can
> specify it anywhere in the config files but can't seem to find anything
> which leads me to believe that its part of the postgres data directory.
>
>
> -----Original Message-----
> From: Adrian Klaver [mailto:aklaver@comcast.net]
> Sent: Wednesday, April 08, 2009 10:10 PM
> To: pgsql-general@postgresql.org
> Cc: Jeff Brenton
> Subject: Re: [GENERAL] database corruption
>
> On Wednesday 08 April 2009 6:32:06 pm Jeff Brenton wrote:
> > I've encountered some db corruption after restarting postgres on my
> > database server running 8.2.4.  I think that postgres did not shut
> down
> > cleanly.  Postgres started appropriately but crashed 45 minutes later.
> > I used pg_resetxlog after the crash to get the db to start again but
> it
> > appears that the database is not running properly now.  When users try
> > to access some of the tables in the db they get the error below;
> >
> >
> >
> > ERROR:  index "testrun_log_pkey" contains unexpected zero page at
> block
> > 3155408
> >
> > HINT:  Please REINDEX it.}> <SQL environment diagnostic: no diagnostic
> > record
> >
> > SQL connection is null
> >
> > SQL statement diagnostic: XX002 7 {Error while executing the query;
> >
> > ERROR:  index "testrun_log_pkey" contains unexpected zero page at
> block
> > 3155408
> >
> > HINT:  Please REINDEX it.}
> >
> >
> >
> > I've attempted to re-index the pkey listed but after an hour it fails
> > with
> >
> >
> >
> > REINDEX INDEX testrun_log_pkey;
> >
> >
> >
> > ERROR:  could not write block 1832079 of temporary file: No space left
> > on device
> >
> > HINT:  Perhaps out of disk space?
> >
> >
> >
> > There is currently 14GB free on the disk that postgres is installed
> on.
> > Does anyone know what I can do to get the db up and running again?
>
> I guess the first question is, does the db have permissions(access) to
> all that
> space?
>
> >
> >
> >
> > /dev/amrd0s1a    3.9G    2.7G    898M    75%    /
> >
> > /dev/amrd0s1e    115G     43G     63G    40%    /backup
> >
> > /dev/amrd1s1d    133G    748M    121G     1%    /wal
> >
> > /dev/amrd2s1d    663G    596G     14G    98%    /db
> >
> > /dev/amrd0s1d    3.9G    184M    3.4G     5%    /var
>
>
>
> --
> Adrian Klaver
> aklaver@comcast.net
>
--
PostgreSQL - XMPP: jdrake@jabber.postgresql.org
   Consulting, Development, Support, Training
   503-667-4564 - http://www.commandprompt.com/
   The PostgreSQL Company, serving since 1997

Re: database corruption

From

"Jeff Brenton"

Date:

09 April 2009, 02:29:00

There are 9 miilion inodes free on /db.  All other partitions have at
least 1/2 million free.

-----Original Message-----
From: Joshua D. Drake [mailto:jd@commandprompt.com]
Sent: Wednesday, April 08, 2009 10:26 PM
To: Jeff Brenton
Cc: Adrian Klaver; pgsql-general@postgresql.org
Subject: Re: [GENERAL] database corruption

On Wed, 2009-04-08 at 22:14 -0400, Jeff Brenton wrote:
> There are no filesystem level content size restrictions that I am
aware
> of on this system.  The user pgsql should have full access to the
> filesystems indicated except for the root filesystem.

Inodes?

>
> Where is the temporary location?  I am searching around to see if I
can
> specify it anywhere in the config files but can't seem to find
anything
> which leads me to believe that its part of the postgres data
directory.
>
>
> -----Original Message-----
> From: Adrian Klaver [mailto:aklaver@comcast.net]
> Sent: Wednesday, April 08, 2009 10:10 PM
> To: pgsql-general@postgresql.org
> Cc: Jeff Brenton
> Subject: Re: [GENERAL] database corruption
>
> On Wednesday 08 April 2009 6:32:06 pm Jeff Brenton wrote:
> > I've encountered some db corruption after restarting postgres on my
> > database server running 8.2.4.  I think that postgres did not shut
> down
> > cleanly.  Postgres started appropriately but crashed 45 minutes
later.
> > I used pg_resetxlog after the crash to get the db to start again but
> it
> > appears that the database is not running properly now.  When users
try
> > to access some of the tables in the db they get the error below;
> >
> >
> >
> > ERROR:  index "testrun_log_pkey" contains unexpected zero page at
> block
> > 3155408
> >
> > HINT:  Please REINDEX it.}> <SQL environment diagnostic: no
diagnostic
> > record
> >
> > SQL connection is null
> >
> > SQL statement diagnostic: XX002 7 {Error while executing the query;
> >
> > ERROR:  index "testrun_log_pkey" contains unexpected zero page at
> block
> > 3155408
> >
> > HINT:  Please REINDEX it.}
> >
> >
> >
> > I've attempted to re-index the pkey listed but after an hour it
fails
> > with
> >
> >
> >
> > REINDEX INDEX testrun_log_pkey;
> >
> >
> >
> > ERROR:  could not write block 1832079 of temporary file: No space
left
> > on device
> >
> > HINT:  Perhaps out of disk space?
> >
> >
> >
> > There is currently 14GB free on the disk that postgres is installed
> on.
> > Does anyone know what I can do to get the db up and running again?
>
> I guess the first question is, does the db have permissions(access) to
> all that
> space?
>
> >
> >
> >
> > /dev/amrd0s1a    3.9G    2.7G    898M    75%    /
> >
> > /dev/amrd0s1e    115G     43G     63G    40%    /backup
> >
> > /dev/amrd1s1d    133G    748M    121G     1%    /wal
> >
> > /dev/amrd2s1d    663G    596G     14G    98%    /db
> >
> > /dev/amrd0s1d    3.9G    184M    3.4G     5%    /var
>
>
>
> --
> Adrian Klaver
> aklaver@comcast.net
>
--
PostgreSQL - XMPP: jdrake@jabber.postgresql.org
   Consulting, Development, Support, Training
   503-667-4564 - http://www.commandprompt.com/
   The PostgreSQL Company, serving since 1997

Re: database corruption

From

Craig Ringer

Date:

09 April 2009, 04:54:08

Jeff Brenton wrote:

> I've attempted to re-index the pkey listed but after an hour it fails
> with
>
> REINDEX INDEX testrun_log_pkey;
>
> ERROR:  could not write block 1832079 of temporary file: No space left
> on device
>
> HINT:  Perhaps out of disk space?
>
> There is currently 14GB free on the disk that postgres is installed on.
> Does anyone know what I can do to get the db up and running again?

Is there 14GB free at the time PostgreSQL claims to run out of space? Try:

    watch -n 60 "df -m /db"

and see if the free space falls close to zero during your reindex attempt.

Personally, I'd just try to give Pg some room to breathe.

--
Craig Ringer

Re: database corruption

From

"Albe Laurenz *EXTERN*"

Date:

09 April 2009, 06:09:08

This thread is a top posting mess. I'll try to rearrange:

Jeff Brenton wrote:
>>>>> REINDEX INDEX testrun_log_pkey;
>>>>>
>>>>> ERROR:  could not write block 1832079 of temporary file: No space left
>>>>> on device
>>>>> HINT:  Perhaps out of disk space?
>>>>>
>>>>> There is currently 14GB free on the disk that postgres is installed on.
>>>>> Does anyone know what I can do to get the db up and running again?
[...]
>>>>> /dev/amrd2s1d    663G    596G     14G    98%    /db
>>>>
>>>> I guess the first question is, does the db have permissions(access) to
>>>> all that space?
>>>
>>> There are no filesystem level content size restrictions that I am aware
>>> of on this system.  The user pgsql should have full access to the
>>> filesystems indicated except for the root filesystem.
>>
>> Inodes?
>
> There are 9 miilion inodes free on /db.  All other partitions have at
> least 1/2 million free.

Assuming that this is ext3 on Linux, it could be space reserved for root.

What do you get if you run the following as root:

dumpe2fs /dev/amrd2s1d | grep 'Reserved block count'

Yours,
Laurenz Albe

Re: database corruption

From

Ivan Sergio Borgonovo

Date:

09 April 2009, 07:44:47

On Wed, 8 Apr 2009 22:14:38 -0400
"Jeff Brenton" <jbrenton@sandvine.com> wrote:

>
> There are no filesystem level content size restrictions that I am
> aware of on this system.  The user pgsql should have full access
> to the filesystems indicated except for the root filesystem.

finished inodes?
A lot of small files (even empty) somewhere?
It happened to me when I was running a spider using curl and the
spiders where not exiting properly when another process was killing
them...

--
Ivan Sergio Borgonovo
http://www.webthatworks.it