Thread: Request to share information regarding postgresql pg_xlog file.

Request to share information regarding postgresql pg_xlog file.

From
Yogesh Sharma
Date:

Dear Team,

Thanks for your support and suggestions.

We are using the PostgreSQL RPM below.

postgresql-8.1.18-2.1

In our system, the error below occurs very frequently.

CONTEXT:  writing block 53 of relation 1663/16385/280951

ERROR:  could not open relation 1663/16385/280951: No such file or directory

Because of this, the size of pg_xlog keeps increasing.

Could you please share the possible cause of this issue and how we can resolve it?

Also, we tried to stop PostgreSQL, but it could not stop and timed out after 60 seconds.

Please confirm the meaning of the message below in the PostgreSQL logs.

FATAL:  terminating connection due to administrator command


Regards,

Yogesh

Re: Request to share information regarding postgresql pg_xlog file.

From
John R Pierce
Date:
On 9/14/2016 10:09 PM, Yogesh Sharma wrote:

> Thanks for your support and suggestion.
> We are using below postgresql rpm.
> postgresql-8.1.18-2.1

That's not the full RPM name, that's just the version.

8.1 has been obsolete and unsupported for about 6 years now. 8.1.18 was released in 2009, the final 8.1.23 release was in 2010, after which it was dropped.

Current releases are 9.1 (soon to be obsoleted), 9.2, 9.3, 9.4, and 9.5, with 9.6 in release candidate state.

> CONTEXT:  writing block 53 of relation 1663/16385/280951
> ERROR:  could not open relation 1663/16385/280951: No such file or directory

These errors suggest disk file corruption. This can occur from unreliable storage, undetected memory errors, and other such things.




-- 
john r pierce, recycling bits in santa cruz

Re: Request to share information regarding postgresql pg_xlog file.

From
Yogesh Sharma
Date:

Dear John and all,

>8.1 has been obsolete and unsupported for about 6 years now. 8.1.18 was released in 2009, the final 8.1.23 release was in 2010, after which it was dropped.

Yes, we understand your point, but we require some information related to this RPM.

>These errors suggest disk file corruption, this can occur from unreliable storage, undetected memory errors, and other such things.

How can we verify what the actual problem in the system is?

Also, please share some information about the following. We tried to stop PostgreSQL, but it could not stop and timed out after 60 seconds. Please confirm the meaning of the message below in the PostgreSQL logs.

FATAL:  terminating connection due to administrator command

 

 

Regards,

Yogesh

 


Re: Request to share information regarding postgresql pg_xlog file.

From
Rob Sargent
Date:

On Sep 15, 2016, at 1:20 AM, Yogesh Sharma <Yogesh1.Sharma@nectechnologies.in> wrote:

> Dear John and all,
>
>> 8.1 has been obsolete and unsupported for about 6 years now. 8.1.18 was released in 2009, the final 8.1.23 release was in 2010, after which it was dropped.
> Yes, we understood your point.
> But we require some information related to this rpm.
>
>> These errors suggest disk file corruption, this can occur from unreliable storage, undetected memory errors, and other such things.
> How we can verify what is actual problem in system?
>
> Also please share some information related to below.
> we tried to stop the postgresql but it couldn’t stop and timout after 60 sec.
> please confirm below message in postgre logs.
> FATAL:  terminating connection due to administrator command
>
> Regards,
> Yogesh
 
What operating system is this running on? John is most likely correct: the disk is not healthy. How you deal with that depends on your OS.

Are you looking for the RPM for that version? Or do you have some other reason for asking about the RPM rather than about the Postgres version?

This list requests that you “bottom post”, i.e. add your comments at the bottom, not the top. (I don’t like it, but that’s the protocol here.)


Re: Request to share information regarding postgresql pg_xlog file.

From
John R Pierce
Date:
On 9/15/2016 12:25 AM, Yogesh Sharma wrote:

> Dear John,
>
> Thanks for your support.
>
> Please find below name of rpm.
>
> RPMS/postgresql-8.1.18-2.1.x86_64.rpm
> RPMS/postgresql-devel-8.1.18-2.1.x86_64.rpm
> RPMS/postgresql-libs-8.1.18-2.1.x86_64.rpm
> RPMS/postgresql-python-8.1.18-2.1.x86_64.rpm
> RPMS/postgresql-server-8.1.18-2.1.x86_64.rpm
>
> We are using redhat Enterprise Linux 5.8.

OK, those RPMs were built and packaged by Red Hat, I believe. If you have a RHEL support contract, you should be able to get help from them. If you don't, you really shouldn't be running RHEL, as there are no updates available without one.

As for how to determine what's wrong with your file system: I don't know how you'd narrow that down, but Postgres expected a file to be there, and it wasn't.

What file system are you using for the volume containing the Postgres data directory? With RHEL 5, you were pretty much limited to ext3, I guess. It would probably be a good idea to unmount the volume and fsck it. Also check your system logs for any disk I/O errors.

Is this storage on a RAID controller, or using software RAID, or just a simple file system on a single disk, or what? Desktop/consumer disk drives are notorious for lying about writeback caching, telling the software the data is written when it's still in a cache on the drive. If the power fails before the data actually gets written to disk with one of these, you can lose stuff.
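
For the system-log check, something like the following Python sketch could be a starting point. The pattern list and the /var/log/messages default are assumptions (typical kernel messages and log location on RHEL-like systems), not something confirmed in this thread, and the list is certainly not exhaustive.

```python
import re

# Assumed fragments of kernel messages that usually point at storage trouble.
# Illustrative only; extend for your controller or driver.
DISK_ERROR_PATTERNS = [
    r"I/O error",
    r"EXT3-fs error",
    r"medium error",
]
_DISK_ERROR_RE = re.compile("|".join(DISK_ERROR_PATTERNS), re.IGNORECASE)


def find_disk_errors(lines):
    """Return the log lines that match any of the assumed error patterns."""
    return [line.rstrip("\n") for line in lines if _DISK_ERROR_RE.search(line)]


def scan_log_file(path="/var/log/messages"):
    """Scan one syslog file; the default path is an assumption."""
    with open(path, errors="replace") as handle:
        return find_disk_errors(handle)
```

An empty result only means none of these patterns appeared in that one file; rotated logs and dmesg output would need the same treatment.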


-- 
john r pierce, recycling bits in santa cruz

Re: Request to share information regarding postgresql pg_xlog file.

From
John R Pierce
Date:
On 9/15/2016 12:53 AM, John R Pierce wrote:

> ok, those RPM's were built and packaged by Redhat, I believe.   If you have a RHEL support contract, you should be able to get help from them.  If you don't, you really shouldn't be running RHEL as there's no updates available without one.

Wait, the RHEL-supplied RPMs have .el5. in the name, like...

postgresql-8.1.23-10.el5_10.x86_64.rpm

(from the CentOS 5.11 repository)

So I'm not sure /where/ your RPMs came from.
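
To make the naming difference concrete, here is a small Python sketch that splits an RPM file name into name, version, release and architecture and looks for the `.el5` tag in the release field. The helper names are made up for illustration, and it assumes the usual `name-version-release.arch.rpm` layout; querying the installed package with rpm itself would be more authoritative.

```python
import os


def parse_rpm_filename(filename):
    """Split 'name-version-release.arch.rpm' into its four parts."""
    base = os.path.basename(filename)
    if not base.endswith(".rpm"):
        raise ValueError("not an RPM file name: %r" % filename)
    stem = base[: -len(".rpm")]
    # The architecture is the last dot-separated field.
    stem, arch = stem.rsplit(".", 1)
    # Version and release are the last two dash-separated fields;
    # the package name itself may contain dashes.
    name, version, release = stem.rsplit("-", 2)
    return {"name": name, "version": version, "release": release, "arch": arch}


def has_el5_dist_tag(filename):
    """True when the release field carries a RHEL/CentOS 5 style '.el5' tag."""
    return ".el5" in parse_rpm_filename(filename)["release"]
```

With the file names from this thread, the CentOS package has release `10.el5_10`, while the packages in question have release `2.1` with no dist tag.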


-- 
john r pierce, recycling bits in santa cruz

Re: Request to share information regarding postgresql pg_xlog file.

From
Yogesh Sharma
Date:

Dear John,

Thanks for your support.

>as far as how do you determine whats wrong with your file system?

I ran fsck and a hardware check using SMART disk info. No issue was found with the disk or the filesystem.

>what file system are you using for the volume containing the postgres data directory ?  with RHEL5, you were pretty much limited to EXT3, I guess ?

The file system is ext3, mounted in sync mode (rw,sync,dirsync,noatime).

>It would probably be a good idea to unmount the volume and fsck it.  also check your system logs for any disk IO errors.

I did that. fsck ran fine and reported no issues.

>is this storage on a raid controller, or using software raid, or just a simple file system on a single disk, or what?

We are using a simple file system which is mirrored at the block level.

>desktop/consumer disk drives are notorious for lying about writeback caching, telling the software the data is written when its still in a cache on the drive... if the power fails before the data actually gets written to disk with one of these, you can lose stuff.

The sync mode of the file system ensures that data is flushed to disk as soon as a write system call is issued on the file system.

Thanks,

Yogesh

 


Re: Request to share information regarding postgresql pg_xlog file.

From
Tom Lane
Date:
Yogesh Sharma <Yogesh1.Sharma@nectechnologies.in> writes:
> We are using below postgresql rpm.
> postgresql-8.1.18-2.1

As already noted, that version is *way* obsolete, and full of known bugs.
It's irresponsible to be storing data you care about in such a version.
Having said that ...

> In our system, below error is found and occurring is very frequent.
> CONTEXT:  writing block 53 of relation 1663/16385/280951
> ERROR:  could not open relation 1663/16385/280951: No such file or directory

Evidently, the bgwriter is trying to flush out a dirty buffer belonging to
a table that isn't there on-disk.  I'm not sure I believe the other
respondents suggesting that the filesystem lost the file, especially not
if you're only seeing complaints about this one block in this one
relation.  You could check by seeing whether any pg_class rows have
relfilenode 280951 in whichever DB has OID 16385.  If not, then this
is just a minor bug that somehow a dirty buffer didn't get flushed before
its table was dropped.
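
The catalog check described above could be spelled out roughly as follows. This is only a sketch: it reads the identifier in the error message as tablespace OID / database OID / relfilenode and builds the two queries to run by hand in psql. The function names are made up for illustration.

```python
def parse_relation_triple(triple):
    """Split 'tablespace/database/relfilenode' into three integers."""
    spc_oid, db_oid, relfilenode = (int(part) for part in triple.split("/"))
    return spc_oid, db_oid, relfilenode


def catalog_checks(triple):
    """Build the two queries that locate the database and the relation."""
    _spc_oid, db_oid, relfilenode = parse_relation_triple(triple)
    # Run this one in any database to learn which database has that OID.
    find_database = "SELECT datname FROM pg_database WHERE oid = %d;" % db_oid
    # Run this one while connected to the database found above.
    find_relation = (
        "SELECT relname, relkind FROM pg_class WHERE relfilenode = %d;" % relfilenode
    )
    return find_database, find_relation


for query in catalog_checks("1663/16385/280951"):
    print(query)
```

If the second query returns no rows in that database, no current table or index uses that relfilenode, which fits the dropped-table explanation.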

While you don't really care about the data in that buffer in such a case,
the bgwriter doesn't know that.  The trick is to get past that and
complete a checkpoint.  You could try just touch-ing the missing file so
that there's something for the bgwriter to write the data into.
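
A hedged sketch of the touch step in Python: it assumes tablespace OID 1663 is the default tablespace, whose files live under base/<database OID>/<relfilenode> inside the data directory, and it refuses to guess for anything else. The data directory path in the usage note is an assumption; use your own PGDATA and run this as the OS user that owns the data directory so the new file has the right ownership.

```python
import os

PG_DEFAULT_TABLESPACE_OID = 1663  # pg_default


def relation_file_path(pgdata, triple):
    """Map 'tablespace/database/relfilenode' to a file under PGDATA."""
    spc_oid, db_oid, relfilenode = (int(part) for part in triple.split("/"))
    if spc_oid != PG_DEFAULT_TABLESPACE_OID:
        # Other tablespaces live elsewhere; this sketch does not handle them.
        raise ValueError("unsupported tablespace OID: %d" % spc_oid)
    return os.path.join(pgdata, "base", str(db_oid), str(relfilenode))


def touch_if_missing(path):
    """Create an empty file only when nothing exists at path; never truncate."""
    if os.path.exists(path):
        return False
    with open(path, "a"):
        pass
    return True
```

For example, `touch_if_missing(relation_file_path("/var/lib/pgsql/data", "1663/16385/280951"))` would create the empty file for the relation named in the error, where `/var/lib/pgsql/data` is only a guess at the data directory.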

If that doesn't work, TBH, I'd suggest kill -9'ing the bgwriter and
letting the thing recover from WAL.  Given that you've built up a
whole lot of WAL since the last successful checkpoint, that will
take quite a while, so it's a last resort ... but it ought to work.
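
If it does come to that last resort, the first step is picking the right PID. The Python sketch below parses `ps -eo pid,args` style output and looks for a process whose title starts with "postgres: writer process"; that title is an assumption about how this old version labels its bgwriter, so check your own ps output before trusting it. The kill helper is separate and deliberately not called anywhere.

```python
import os
import signal

# Assumed process title of the bgwriter; verify against your own ps output.
WRITER_TITLE = "postgres: writer process"


def find_bgwriter_pid(ps_output):
    """Return the PID of the single writer process in 'ps -eo pid,args' output."""
    pids = []
    for line in ps_output.splitlines():
        fields = line.strip().split(None, 1)
        if len(fields) == 2 and fields[0].isdigit() and fields[1].startswith(WRITER_TITLE):
            pids.append(int(fields[0]))
    if len(pids) != 1:
        # Zero or several matches means the assumption above does not hold here.
        raise LookupError("expected exactly one writer process, found %d" % len(pids))
    return pids[0]


def kill_bgwriter(pid):
    """Send SIGKILL to the given PID; only for the last-resort case."""
    os.kill(pid, signal.SIGKILL)
```

Refusing to proceed unless exactly one process matches is intentional: with several clusters on one host, or a different process title, a human should decide which PID is meant.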

            regards, tom lane