ERROR: found xmin from before relfrozenxid; MultiXactid does no longer exist -- apparent wraparound

Started by Alanoly Andrews | almost 2 years ago | 5 messages | general
#1 Alanoly Andrews
alanolya@invera.com

Hi,

We have a postgres 10.7 database which reports a number of issues on user-created tables as well as system tables. Most errors are one of the following:
-- ERROR: found xmin 1888159934 from before relfrozenxid 1998177448
-- ERROR: MultiXactId 613819197 does no longer exist -- apparent wraparound
-- ERROR: could not access status of transaction 1927393975
DETAIL: Could not open file "pg_xact/072E": No such file or directory.
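The numbers in these messages can be checked by hand. The following is a minimal sketch, assuming a default build (8192-byte pages, 2 status bits per transaction, 32 pages per SLRU segment) and normal, non-special transaction IDs. The function names `xid_to_segment` and `xid_precedes` are made up for illustration. It shows which `pg_xact` file holds the status of a given xid, and how two xids compare under 32-bit modular arithmetic.

```shell
#!/usr/bin/env bash
# Assumption: default build constants. 8192 bytes per page * 4 xids per byte
# * 32 pages per segment = 1048576 xids per pg_xact segment file.
XACTS_PER_SEGMENT=$(( 8192 * 4 * 32 ))

# Print the pg_xact segment file name (4 hex digits) covering xid $1.
xid_to_segment() {
    printf '%04X\n' $(( $1 / XACTS_PER_SEGMENT ))
}

# Succeed if xid $1 logically precedes xid $2 when the difference is
# interpreted as a signed 32-bit number (normal xids only).
xid_precedes() {
    local diff=$(( ($1 - $2) & 0xFFFFFFFF ))
    [ "$diff" -ge 2147483648 ]
}

# The transaction from the "could not access status" error:
xid_to_segment 1927393975    # prints 072E

# The pair from the "found xmin ... from before relfrozenxid" error:
if xid_precedes 1888159934 1998177448; then
    echo "xmin 1888159934 is older than relfrozenxid 1998177448"
fi
```

The computed segment name matches the missing file in the DETAIL line, so the row refers to a transaction whose commit-status segment is no longer on disk.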

I have tried several of the workarounds suggested online and in the web discussion groups:
1. vacuumdb of the entire database fails with the "found xmin from before relfrozenxid" error
2. pg_dump fails with the same error
3. SELECT sql on the affected tables fails with the error. So I cannot save the table, drop it and re-create it.
4. Removed the "global/pg_internal.init" file and re-started the cluster. Still the same errors.

The database is up and running and most of the tables are accessible. But any kind of SQL on the 4 or 5 affected tables throws the error.

Is there a way to repair the corruption in this database?
Postgres Version 10.7 on Linux(Ubuntu).

Thanks.

Alanoly Andrews
(alanolya@invera.com)

This e-mail may be privileged and/or confidential, and the sender does not waive any related rights and obligations. Any distribution, use or copying of this e-mail or the information it contains by other than an intended recipient is unauthorized. If you received this e-mail in error, please advise me (by return e-mail or otherwise) immediately.


#2 Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Alanoly Andrews (#1)
Re: ERROR: found xmin from before relfrozenxid; MultiXactid does no longer exist -- apparent wraparound

On Thu, 2024-05-30 at 14:58 +0000, Alanoly Andrews wrote:

We have a postgres 10.7 database which reports a number of issues on user-created
tables as well as system tables. Most errors are one of the following:
-- ERROR:  found xmin 1888159934 from before relfrozenxid 1998177448
-- ERROR:  MultiXactId 613819197 does no longer exist -- apparent wraparound
-- ERROR:  could not access status of transaction 1927393975
   DETAIL:  Could not open file "pg_xact/072E": No such file or directory.

Is there a way to repair the corruption in this database?
Postgres Version 10.7 on Linux(Ubuntu).

Perhaps, but you should hire an expert if the data are important for you.

Yours,
Laurenz Albe

#3 Thom Brown
thom@linux.com
In reply to: Laurenz Albe (#2)
Re: ERROR: found xmin from before relfrozenxid; MultiXactid does no longer exist -- apparent wraparound

On Fri, May 31, 2024, 09:29 Laurenz Albe <laurenz.albe@cybertec.at> wrote:

On Thu, 2024-05-30 at 14:58 +0000, Alanoly Andrews wrote:

We have a postgres 10.7 database which reports a number of issues on user-created
tables as well as system tables. Most errors are one of the following:
-- ERROR: found xmin 1888159934 from before relfrozenxid 1998177448
-- ERROR: MultiXactId 613819197 does no longer exist -- apparent wraparound
-- ERROR: could not access status of transaction 1927393975
DETAIL: Could not open file "pg_xact/072E": No such file or directory.

Is there a way to repair the corruption in this database?
Postgres Version 10.7 on Linux(Ubuntu).

Perhaps, but you should hire an expert if the data are important for you.

Also, while it's too late now, this could be the result of a bug in the
version you are using that was subsequently repaired in 10.15:

Prevent possible data loss from concurrent truncations of SLRU logs (Noah
Misch)

This rare problem would manifest in later “apparent wraparound” or “could
not access status of transaction” errors.

This is why it's important to keep up-to-date, but even the latest minor
10.x release is out of date as support was dropped back in 2022.

If you manage to get this up and running again, I strongly recommend
upgrading to the latest major and minor release (16.3 at the time of
writing).

Before you try doing anything though, create a physical backup of your
database as situations like this tend to require invasive action that could
potentially make the situation even worse.
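One way to take such a physical backup is a cold copy of the data directory with the server stopped. The sketch below rests on assumptions: the paths are hypothetical, the function names are invented for illustration, and on Debian/Ubuntu packaging the cluster may need to be stopped with the distribution's own wrapper rather than plain `pg_ctl`. Tablespaces or a WAL directory that live outside the data directory are not covered and would need to be copied separately.

```shell
#!/usr/bin/env bash
# Hypothetical locations; adjust both to the real system.
PGDATA="${PGDATA:-/var/lib/postgresql/10/main}"
BACKUP_DIR="${BACKUP_DIR:-/mnt/backup}"

# Build a timestamped archive name so repeated runs do not overwrite each other.
backup_archive_name() {
    printf '%s/pgdata-%s.tar.gz\n' "$BACKUP_DIR" "$(date +%Y%m%d-%H%M%S)"
}

# Stop the server, then archive the whole data directory as-is.
cold_backup() {
    local archive
    archive="$(backup_archive_name)"
    mkdir -p "$BACKUP_DIR" || return 1
    # Refuse to archive if the server could not be stopped cleanly.
    pg_ctl stop -D "$PGDATA" -w -m fast || return 1
    tar -czf "$archive" -C "$(dirname "$PGDATA")" "$(basename "$PGDATA")" || return 1
    echo "$archive"
}

# Nothing runs until cold_backup is called explicitly.
```

Calling `cold_backup` prints the path of the archive it wrote. Any repair experiment can then start from a restored copy instead of the only surviving files.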

Also, did this problem only happen in the last day or two? How frequently
do you take backups? If you have a backup from just before this issue
started showing itself, and you can afford to lose the data changes that have
occurred since the backup, you may find it far easier and quicker to resort
to using that backup. Of course, you would need to prove to yourself that
the backup was safe by running a VACUUM FREEZE on each database in that
backup before starting to use it. If that runs without issue, you're
probably in the clear.
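The VACUUM FREEZE check on a restored backup could be scripted roughly as follows. This is a sketch, not a vetted procedure: the function name `verify_restored_cluster` and the `DRY_RUN` switch are invented here, and connection settings are assumed to come from the usual `PGHOST`, `PGPORT` and `PGUSER` environment variables. `vacuumdb --all --freeze` runs VACUUM FREEZE in every database that accepts connections.

```shell
#!/usr/bin/env bash
# Run VACUUM FREEZE across all databases of the restored cluster.
# With DRY_RUN=1 the command is only printed, not executed.
verify_restored_cluster() {
    local cmd="vacuumdb --all --freeze"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$cmd"
        return 0
    fi
    if $cmd; then
        echo "VACUUM FREEZE completed on all databases"
    else
        echo "VACUUM FREEZE failed; treat this backup as suspect" >&2
        return 1
    fi
}

# Example: show what would be run without touching the server.
DRY_RUN=1 verify_restored_cluster
```

A non-zero exit status here means at least one database hit an error, in which case an older backup is worth testing the same way.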

Best of luck.

Thom

#4 Alanoly Andrews
alanolya@invera.com
In reply to: Thom Brown (#3)
Re: ERROR: found xmin from before relfrozenxid; MultiXactid does no longer exist -- apparent wraparound

Thanks, Thom.
I understand from your response that there is really no way to repair the current damage.

Yes, we do take daily backups and we have, in fact, restored the database cluster to a point in time before the corruption, suffering some loss of data in the process. I'm now working with the snapshot of the corrupted database (on a different box) to see if there is something that can be done to repair the damage and avoid such a scenario in future. Yes, and I know that upgrading the Postgres version is the stock answer for situations like this. The upgrade is in the works.

But I was still interested in what the postgres gurus/programmers/hackers had to say about this event.

Regards.
Alanoly.

#5 Ron
ronljohnsonjr@gmail.com
In reply to: Alanoly Andrews (#4)
Re: ERROR: found xmin from before relfrozenxid; MultiXactid does no longer exist -- apparent wraparound

On Fri, May 31, 2024 at 1:25 PM Alanoly Andrews <alanolya@invera.com> wrote:

Yes, and I know that upgrading the Postgres version is the stock answer
for situations like this. The upgrade is in the works.

*Patching* was the solution. It takes *five minutes*.
Here's how I did it (since our RHEL systems are blocked from the Internet,
and I had to manually download the relevant RPMs):
$ sudo -iu postgres pg_ctl stop -wt9999 -mfast
$ sudo yum install PG96.24_RHEL6/*rpm
$ sudo -iu postgres pg_ctl start -wt9999

You'll have a bit of effort finding the PG10 repository, since it's EOL,
but it can be found.
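After patching, a quick check can confirm that the running server is at or past the release that carries the SLRU truncation fix mentioned earlier in the thread (10.15). This is a sketch that only makes sense within the 10.x branch. The function name `needs_patch` is made up for illustration, and it relies on `server_version_num` encoding 10.15 as 100015.

```shell
#!/usr/bin/env bash
# server_version_num for PostgreSQL 10.15 (major * 10000 + minor).
FIXED_VERSION_NUM=100015

# Succeed if the given server_version_num is older than 10.15.
needs_patch() {
    [ "$1" -lt "$FIXED_VERSION_NUM" ]
}

# Example with the version from this thread: 10.7 reports 100007.
if needs_patch 100007; then
    echo "server_version_num 100007 predates 10.15; minor update still needed"
fi

# Against a live server (assumes psql can connect with default settings):
#   needs_patch "$(psql -Atc 'SHOW server_version_num')"
```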