Thread: wal seams to be corrupted


From
Domen Šetar
Date:

Hi admins,

 

The number of WAL files on my PostgreSQL server is rising, because it seems that one WAL file is corrupted. PostgreSQL is running normally. I see this in the PostgreSQL log file:

 

2024-07-19 07:44:12 CEST [2205]: [32288-1] user=,db=,app=,client= DETAIL:  The failed archive command was: test ! -f /var/lib/pgsql/ArchiveDir/000000010000044E0000009D && cp pg_wal/000000010000044E0000009D /var/lib/pgsql/ArchiveDir/000000010000044E0000009D

 

Usually it helped if I deleted the WAL file in the ArchiveDir directory, but not this time. The WAL file is copied again from pg_wal to the ArchiveDir directory and the error message continues.

What can I do to solve this problem? Is pg_resetwal the solution for this problem? If it is, how do I use it?

 

Best regards!


Domen Šetar
Computer Systems Support
IZUM – Institute of Information Science | Prešernova ulica 17 | 2000 Maribor | Slovenia
T: +386 2 25 20 339 | M: +386 41 676 342 | www.izum.si | domen.setar@izum.si

 

 


Re: wal seams to be corrupted

From
Kashif Zeeshan
Date:
Hi Domen

On Fri, Jul 19, 2024 at 10:57 AM Domen Šetar <domen.setar@izum.si> wrote:


> What can I do to solve this problem? Is pg_resetwal the solution for this problem? If it is, how do I use it?

Yes, you should use pg_resetwal. It clears the write-ahead log (WAL) and optionally resets some other control information stored in the pg_control file. This utility is sometimes needed if these files have become corrupted. It should be used only as a last resort, when the server will not start due to such corruption.

You can find the help from the following link

Regards
Kashif Zeeshan

 


Re: wal seams to be corrupted

From
Muhammad Ikram
Date:
Hi Domen Setar,

If your database is running normally, then back up your database before attempting any solution.

Regards,
Ikram


 

 



--
Muhammad Ikram


Re: wal seams to be corrupted

From
"David G. Johnston"
Date:
On Thursday, July 18, 2024, Domen Šetar <domen.setar@izum.si> wrote:


> What can I do to solve this problem? Is pg_resetwal the solution for this problem? If it is, how do I use it?



Without knowing why the archive command failed it is impossible to say.  But archiving doesn’t impact the server producing the WAL so messing with it isn’t a useful approach.  Writing a better archive command is where you should expend your efforts.

If the WAL file is corrupt, which you’ve not shown, but the server is running, doing a full checkpoint and then a physical backup that doesn’t require the problematic WAL would let you not care about it, since you would not need it for recovery.

David J.
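
As a rough illustration of what a more defensive archive command could look like, here is a minimal sketch. It is not taken from this thread: the script name archive_wal.sh, its install path, and the ARCHIVE_DIR variable are assumptions, and the default directory is simply the one shown in the log line above. It relies only on the %p (path) and %f (file name) placeholders that PostgreSQL substitutes into archive_command, reports the reason for a failure on stderr so that it shows up in the server log, and copies to a temporary name first so that a failed copy does not leave a partial file under the final name.

```shell
#!/bin/sh
# Hypothetical helper, e.g.
#   archive_command = '/usr/local/bin/archive_wal.sh %p %f'
# (script name and install path are assumptions).
ARCHIVE_DIR="${ARCHIVE_DIR:-/var/lib/pgsql/ArchiveDir}"

archive_wal() {
    src="$1"                    # %p: path of the WAL segment to archive
    name="$2"                   # %f: file name of the WAL segment
    dest="$ARCHIVE_DIR/$name"

    if [ ! -d "$ARCHIVE_DIR" ]; then
        echo "archive_wal: $ARCHIVE_DIR does not exist" >&2
        return 1
    fi
    if [ -f "$dest" ]; then
        echo "archive_wal: $dest already exists, refusing to overwrite" >&2
        return 1
    fi

    # Copy under a temporary name so that a failed copy (for example
    # because the disk is full) never leaves a partial file at $dest.
    tmp="$dest.tmp.$$"
    if ! cp "$src" "$tmp"; then
        echo "archive_wal: copying $src to $tmp failed" >&2
        rm -f "$tmp"
        return 1
    fi
    mv "$tmp" "$dest"
}

# Run only when called with two arguments, as PostgreSQL would call it.
# The exit status is non-zero whenever archiving did not succeed.
if [ "$#" -eq 2 ]; then
    archive_wal "$1" "$2"
fi
```

A non-zero exit status tells the server that the segment was not archived, so it keeps the file in pg_wal and retries later; the messages on stderr are what to look for in the log.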

RE: wal seams to be corrupted

From
Domen Šetar
Date:

I didn't claim that the WAL file is corrupted. I am just saying that the archiver fails to copy it successfully, so I presume that something must be wrong with the WAL file.

And I need to fix it because this problem stops WAL files from moving from pg_wal to the archive directory.

 


Re: wal seams to be corrupted

From
"David G. Johnston"
Date:


On Thursday, July 18, 2024, Kashif Zeeshan <kashi.zeeshan@gmail.com> wrote:

> Yes you should use the pg_resetwal

Please ignore this advice at this time.  The combined inexperience here culminating in using that command is likely to do more harm than good.  Figure out why the archiving is failing then make decisions.  Maybe you’ll get some admin help with that task for free here, otherwise consider hiring someone with DBA experience to improve your setup.

David J.
 

RE: wal seams to be corrupted

From
Domen Šetar
Date:

Thanks for the answer.

What about stopping the PostgreSQL server that is the primary in replication and promoting another server?

 


RE: wal seams to be corrupted

From
Domen Šetar
Date:

Hi,

 

I think that possibly the best solution will be to stop PostgreSQL on the problem server (which is the replication master), promote the secondary, replicate data from the promoted secondary back to the problem server, and make it the replication master again. That way I'll get rid of the problematic WAL file.

 


Re: wal seams to be corrupted

From
Kashif Zeeshan
Date:
Hi


This is the standard way, but it will require a lot of time on your end as well as downtime. I think it's better to find the cause of the failure first; it's possible that you can fix the issue with less time and effort. That said, the solution you suggested is the safest way.

 


RE: wal seams to be corrupted

From
Domen Šetar
Date:

Thank you Kashif.

I’ll try to find the cause of the problem. If I fail, I’ll do it with the replica.

 


Re: wal seams to be corrupted

From
Laurenz Albe
Date:
On Fri, 2024-07-19 at 05:57 +0000, Domen Šetar wrote:
> What can I do to solve this problem? Is pg_resetwal the solution for this problem? If it is, how do I use it?

Don't listen to any advice to run "pg_resetwal".
Only consider switching to the standby if your primary crashes because the disk is full.

You need to determine the cause of the problem.

1. All error messages from "archive_command" end up in the log file.
   Search for those; they may help you determine the cause.

2. Is there a file /var/lib/pgsql/ArchiveDir/000000010000044E0000009D ?
   If yes, delete it, and the problem should be solved.

3. If there is no such file, it must be the "cp" command that is
   failing.  In that case, you should definitely see an error message
   about that in the log file.  Likely causes:

   - the permissions are not right (try by running the "cp" command as
     user "postgres" manually)

   - the target directory does not exist

   - the target directory is full

Yours,
Laurenz Albe
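
The checks in the list above can be sketched as a small shell function. This is only an illustration under stated assumptions: the function name diagnose_archive and its messages are made up for this example, the 16384 kB default threshold assumes the default 16 MB WAL segment size, and it should be run as the operating system user that runs the server (typically "postgres") so that the permission check is meaningful.

```shell
#!/bin/sh
# Hypothetical helper: print the first obvious reason why archiving a
# segment into a directory would fail. Run as the server's OS user.
diagnose_archive() {
    dir="$1"                  # archive directory
    seg="$2"                  # WAL segment file name
    min_kb="${3:-16384}"      # assumed space needed: one 16 MB segment

    if [ ! -d "$dir" ]; then
        echo "target directory does not exist"
        return 0
    fi
    if [ -f "$dir/$seg" ]; then
        # This is what makes "test ! -f ..." in the archive command fail.
        echo "segment already present in archive"
        return 0
    fi
    if [ ! -w "$dir" ]; then
        echo "target directory is not writable"
        return 0
    fi

    # Available space in kB on the file system holding the directory
    # (column 4 of the POSIX output format of df).
    avail=$(df -Pk "$dir" | awk 'NR==2 { print $4 }')
    if [ -n "$avail" ] && [ "$avail" -lt "$min_kb" ]; then
        echo "target file system is full or nearly full"
        return 0
    fi
    echo "no obvious problem"
}
```

For the case in this thread the call would look like: diagnose_archive /var/lib/pgsql/ArchiveDir 000000010000044E0000009D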



RE: wal seams to be corrupted

From
Domen Šetar
Date:

Thank you admins for helping me.

The problem was stupid and I'm a little bit ashamed.

The archive disk was full and I didn't notice it.

I made some space on it and everything is OK now.

 


Re: wal seams to be corrupted

From
Kashif Zeeshan
Date:



Happy that you have figured out the issue and fixed it.

 


Re: wal seams to be corrupted

From
"Dischner, Anton"
Date:

Hi,

 

Glad you found the problem. I had it in mind but didn’t dare to say it 😉

Maybe you want to install monitoring software like check_mk or Icinga. We use both, along with uptime_kuma, to get notified if services tend to fail or have failed.

BTW: check_mk and the  PostgreSQL is also a very nice tool.

 

Best regards,

 

Anton

 


RE: wal seams to be corrupted

From
Domen Šetar
Date:

Yes. Sometimes we don't see the obvious.

I noticed now that I don't have disk checks for this host in Icinga and I'm adding some now. 😉
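
For readers without a ready-made disk check, a minimal sketch of one follows. It is not the check used in this thread: the function names and the 80/90 percent thresholds are assumptions chosen for illustration. It follows the common monitoring-plugin convention of returning 0 for OK, 1 for WARNING and 2 for CRITICAL.

```shell
#!/bin/sh
# Map a usage percentage to a plugin-style state.
# Thresholds are illustrative defaults, not recommendations.
disk_state() {
    used="$1"
    warn="${2:-80}"
    crit="${3:-90}"
    if [ "$used" -ge "$crit" ]; then
        echo "CRITICAL"
        return 2
    elif [ "$used" -ge "$warn" ]; then
        echo "WARNING"
        return 1
    fi
    echo "OK"
    return 0
}

# Percentage of used space on the file system that holds a path
# (column 5 of the POSIX output format of df, without the % sign).
used_pct() {
    df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}
```

A check for the archive directory from this thread could then be run as: disk_state "$(used_pct /var/lib/pgsql/ArchiveDir)"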

 


Re: wal seams to be corrupted

From
Muhammad Waqas
Date:
This is not corruption, I guess.

Please check the availability of the WAL file at the OS level.
