Thread: ERROR: found xmin from before relfrozenxid; MultiXactid does no longer exist -- apparent wraparound

Hi,

We have a postgres 10.7 database which reports a number of issues on user-created tables as well as system tables. Most errors are one of the following:
-- ERROR:  found xmin 1888159934 from before relfrozenxid 1998177448
-- ERROR:  MultiXactId 613819197 does no longer exist -- apparent wraparound
-- ERROR:  could not access status of transaction 1927393975
   DETAIL:  Could not open file "pg_xact/072E": No such file or directory.

I have tried several of the workarounds suggested online and in the web discussion groups:
1. vacuumdb of the entire database fails with the "found xmin from before relfrozenxid" error
2. pg_dump fails with the same error
3. SELECT sql on the affected tables fails with the error. So I cannot save the table, drop it and re-create it.
4. Removed the "global/pg_internal.init" file and re-started the cluster. Still the same errors.

The database is up and running and most of the tables are accessible. But any kind of SQL on the 4 or 5 affected tables throws the error.

Is there a way to repairing the corruption in this database?
Postgres Version 10.7 on Linux(Ubuntu).

Thanks.

Alanoly Andrews
(alanolya@invera.com)

This e-mail may be privileged and/or confidential, and the sender does not waive any related rights and obligations. Any distribution, use or copying of this e-mail or the information it contains by other than an intended recipient is unauthorized. If you received this e-mail in error, please advise me (by return e-mail or otherwise) immediately.

Ce courriel est confidentiel et protégé. L'expéditeur ne renonce pas aux droits et obligations qui s'y rapportent. Toute diffusion, utilisation ou copie de ce message ou des renseignements qu'il contient par une personne autre que le (les) destinataire(s) désigné(s) est interdite. Si vous recevez ce courriel par erreur, veuillez m'en aviser immédiatement, par retour de courriel ou par un autre moyen.'.

On Thu, 2024-05-30 at 14:58 +0000, Alanoly Andrews wrote:
> We have a postgres 10.7 database which reports a number of issues on user-created
> tables as well as system tables. Most errors are one of the following:
> -- ERROR:  found xmin 1888159934 from before relfrozenxid 1998177448
> -- ERROR:  MultiXactId 613819197 does no longer exist -- apparent wraparound
> -- ERROR:  could not access status of transaction 1927393975
>    DETAIL:  Could not open file "pg_xact/072E": No such file or directory.
>
> Is there a way to repairing the corruption in this database?
> Postgres Version 10.7 on Linux(Ubuntu).

Perhaps, but you should hire an expert if the data are important for you.

Yours,
Laurenz Albe



On Fri, May 31, 2024, 09:29 Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Thu, 2024-05-30 at 14:58 +0000, Alanoly Andrews wrote:
> We have a postgres 10.7 database which reports a number of issues on user-created
> tables as well as system tables. Most errors are one of the following:
> -- ERROR:  found xmin 1888159934 from before relfrozenxid 1998177448
> -- ERROR:  MultiXactId 613819197 does no longer exist -- apparent wraparound
> -- ERROR:  could not access status of transaction 1927393975
>    DETAIL:  Could not open file "pg_xact/072E": No such file or directory.
>
> Is there a way to repairing the corruption in this database?
> Postgres Version 10.7 on Linux(Ubuntu).

Perhaps, but you should hire an expert if the data are important for you.

Also, while it's too late now, this could be the result of a bug in the version you are using that was subsequently repaired in 10.15:

Prevent possible data loss from concurrent truncations of SLRU logs (Noah Misch)

This rare problem would manifest in later “apparent wraparound” or “could not access status of transaction” errors.

This is why it's important to keep up-to-date, but even the latest minor 10.x release is out of date as support was dropped back in 2022.

If you manage to get this up and running again, I strongly recommend upgrading to the latest major and minor release (16.3 at the time of writing).

Before you try doing anything though, create a physical backup of your database as situations like this tend to require invasive action that could potentially make the situation even worse.

Also, did this problem only happen in the last day or two? How frequently do you take backups? If you have a backup from just before this issue starting showing itself, and you can afford losing data changes that have occured since the backup, you may find it far easier and quicker to resort to using that backup. Of course, you would need to prove to yourself that the backup was safe by running a VACUUM FREEZE on each database in that backup before starting to use it.  If that runs without issue, you're probably in the clear.

Best of luck.

Thom
Thanks, Thom.
I understand from your response that there is really no way to repair the current damage. 

Yes, we do take daily backups and we have, in fact, restored the database cluster to a point in time before the corruption, suffering some loss of data in the process. I'm now working with the snapshot of the corrupted database (on a different box) to see if there is something that can be done to repair the damage and avoid such a scenario in future.  Yes, and I know that upgrading the Postgres version is the stock answer for situations like this. The upgrade is in the works.

But I was still interested in what the postgres gurus/programmers/hackers had to say about this event.

Regards.
Alanoly.

From: Thom Brown <thom@linux.com>
Sent: May 31, 2024 6:14 AM
To: Laurenz Albe <laurenz.albe@cybertec.at>
Cc: Alanoly Andrews <alanolya@invera.com>; pgsql-general@lists.postgresql.org <pgsql-general@lists.postgresql.org>
Subject: Re: ERROR: found xmin from before relfrozenxid; MultiXactid does no longer exist -- apparent wraparound
 
You don't often get email from thom@linux.com. Learn why this is important

[Email External/Externe] Caution opening links or attachments/attention lors de l'ouverture de liens ou de pièces jointes.

On Fri, May 31, 2024, 09:29 Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Thu, 2024-05-30 at 14:58 +0000, Alanoly Andrews wrote:
> We have a postgres 10.7 database which reports a number of issues on user-created
> tables as well as system tables. Most errors are one of the following:
> -- ERROR:  found xmin 1888159934 from before relfrozenxid 1998177448
> -- ERROR:  MultiXactId 613819197 does no longer exist -- apparent wraparound
> -- ERROR:  could not access status of transaction 1927393975
>    DETAIL:  Could not open file "pg_xact/072E": No such file or directory.
>
> Is there a way to repairing the corruption in this database?
> Postgres Version 10.7 on Linux(Ubuntu).

Perhaps, but you should hire an expert if the data are important for you.

Also, while it's too late now, this could be the result of a bug in the version you are using that was subsequently repaired in 10.15:

Prevent possible data loss from concurrent truncations of SLRU logs (Noah Misch)

This rare problem would manifest in later “apparent wraparound” or “could not access status of transaction” errors.

This is why it's important to keep up-to-date, but even the latest minor 10.x release is out of date as support was dropped back in 2022.

If you manage to get this up and running again, I strongly recommend upgrading to the latest major and minor release (16.3 at the time of writing).

Before you try doing anything though, create a physical backup of your database as situations like this tend to require invasive action that could potentially make the situation even worse.

Also, did this problem only happen in the last day or two? How frequently do you take backups? If you have a backup from just before this issue starting showing itself, and you can afford losing data changes that have occured since the backup, you may find it far easier and quicker to resort to using that backup. Of course, you would need to prove to yourself that the backup was safe by running a VACUUM FREEZE on each database in that backup before starting to use it.  If that runs without issue, you're probably in the clear.

Best of luck.

Thom

This e-mail may be privileged and/or confidential, and the sender does not waive any related rights and obligations. Any distribution, use or copying of this e-mail or the information it contains by other than an intended recipient is unauthorized. If you received this e-mail in error, please advise me (by return e-mail or otherwise) immediately.

Ce courriel est confidentiel et protégé. L'expéditeur ne renonce pas aux droits et obligations qui s'y rapportent. Toute diffusion, utilisation ou copie de ce message ou des renseignements qu'il contient par une personne autre que le (les) destinataire(s) désigné(s) est interdite. Si vous recevez ce courriel par erreur, veuillez m'en aviser immédiatement, par retour de courriel ou par un autre moyen.'.

On Fri, May 31, 2024 at 1:25 PM Alanoly Andrews <alanolya@invera.com> wrote:
Yes, and I know that upgrading the Postgres version is the stock answer for situations like this. The upgrade is in the works.

Patching was the solution.  It takes five minutes.
Here's how I did it (since our RHEL systems are blocked from the Internet, and I had to manually d/l the relevant RPMs):
$ sudo -iu postgres pg_ctl stop -wt9999 -mfast
$ sudo yum install PG96.24_RHEL6/*rpm
$ sudo -iu postgres pg_ctl start -wt9999

You'll have a bit of effort finding the PG10 repository, since it's EOL, but it can be found.