Re: BUG #18025: Probably we need to change behaviour of the checkpoint failures in PG - Mailing list pgsql-bugs

From	Laurenz Albe
Subject	Re: BUG #18025: Probably we need to change behaviour of the checkpoint failures in PG
Date	July 17, 2023 07:53:32
Msg-id	ca4cdccd0b085cbbf10577be4cb05eae4c35a141.camel@cybertec.at
In response to	BUG #18025: Probably we need to change behaviour of the checkpoint failures in PG (PG Bug reporting form <noreply@postgresql.org>)
Responses	Re: BUG #18025: Probably we need to change behaviour of the checkpoint failures in PG
List	pgsql-bugs

On Mon, 2023-07-17 at 05:03 +0000, PG Bug reporting form wrote:
> I have just witnessed the data loss scenario.
>
> Scenario is like, there was checkpoint operation failures going on the DB
> server since last 8 hours which means no successful checkpoint happened in
> the DB server since last 8 hours. Then DB server went into the crash mode
> due to the exhausted disk space and did not came up as part of crash
> recovery.

Mistake #1: you did not monitor disk space.
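For illustration only, a minimal sketch of the kind of disk space check that would have caught this early. The function names, the thresholds and the idea of pointing it at the filesystem that holds "pg_wal" are assumptions made for this example, not something PostgreSQL ships; a real setup would more likely use an existing monitoring system.

```python
import shutil


def classify(used, total, warn_fraction=0.80, crit_fraction=0.90):
    """Map a used/total byte count to 'ok', 'warning' or 'critical'.

    The default thresholds are arbitrary example values.
    """
    fraction = used / total
    if fraction >= crit_fraction:
        return "critical"
    if fraction >= warn_fraction:
        return "warning"
    return "ok"


def disk_status(path, warn_fraction=0.80, crit_fraction=0.90):
    """Classify the usage of the filesystem that contains `path`.

    `path` would typically be the pg_wal directory (hypothetical choice);
    shutil.disk_usage reports figures for the whole filesystem.
    """
    usage = shutil.disk_usage(path)
    return classify(usage.used, usage.total, warn_fraction, crit_fraction)
```

Run periodically, for example from cron, against the data directory and the "pg_wal" location, and alert on anything other than "ok".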

> Actually the victim had moved few WALs from the pg_wal to other
> location and reimporting those wal on original location also did not solved
> the problem.

Mistake #2: manually messing with the database directory.

> DB server was not able to find out the valid checkpoint record.
> The victim was not having the backup which he could use that backup to
> recover the data with the help of available archived WALs.

Mistake #0: no backup.

> So , the victim
> had only one option left in his hand that is pg_resetwal.  We have tried
> every possible solution but did not worked so we did not left with more
> choices other than pg_Resetwal

Mistake #3: running pg_resetwal.

"We have tried every possible solution" sounds a bit like "we tried all the
haphazard things that came to our mind".

Sorry, this is not a bug, this is pilot error.

If PostgreSQL crashes because "pg_wal" runs out of disk space, you increase
the disk space, start PostgreSQL and let it complete crash recovery.  It is
as simple as that.

Yours,
Laurenz Albe
