uncommitted xmin 3100586 from before xid cutoff 10339367 needs to be frozen

Started by Daulat, about 2 years ago, 6 messages, general
#1 Daulat
daulat.dba@gmail.com

Hi All,

We recently started seeing an error “ERROR: uncommitted xmin 3100586
from before xid cutoff 10339367 needs to be frozen” on our user tables.
I’m unable to run ‘vacuum’, ‘vacuum freeze’ or ‘vacuum full’ on Postgres
14.4 running in a Windows environment.

Error:

This comes first: ERROR: uncommitted xmin 3100586 from before xid cutoff
10339367 needs to be frozen
CONTEXT: while scanning block 1403 offset 8 of relation
"pg_catalog.pg_attribute"

Thanks,
Daulat

#2 Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Daulat (#1)
Re: uncommitted xmin 3100586 from before xid cutoff 10339367 needs to be frozen

On Fri, 2024-03-22 at 10:56 +0530, Daulat wrote:

> We recently started seeing an error “ERROR:  uncommitted xmin 3100586
> from before xid cutoff 10339367 needs to be frozen” on our user tables.
> I’m unable to run ‘vacuum’, ‘vacuum freeze’ or ‘vacuum full’ on Postgres 14.4 running in a Windows environment.
>
> Error:
>
> This comes first: ERROR:  uncommitted xmin 3100586 from before xid cutoff 10339367 needs to be frozen
> CONTEXT:  while scanning block 1403 offset 8 of relation "pg_catalog.pg_attribute"

Update to 14.latest; perhaps that data corruption was caused by a bug that
is already fixed.

Upgrading won't get rid of the error though (I think).
The easy way is to dump the database and restore it to a new database.
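
For context on the error message itself: VACUUM raises it when it finds a row whose inserting transaction (xmin) is older than the freeze cutoff but is not recorded as committed, which should not happen in a healthy cluster. A minimal sketch of the wraparound-aware XID comparison behind "from before xid cutoff" follows; the function name is illustrative, and this mirrors the logic rather than quoting the server source.

```python
FIRST_NORMAL_XID = 3  # XIDs 0, 1 and 2 are reserved special values


def xid_precedes(a: int, b: int) -> bool:
    """Return True if 32-bit transaction ID a is logically older than b."""
    if a < FIRST_NORMAL_XID or b < FIRST_NORMAL_XID:
        # Special XIDs are compared as plain unsigned integers.
        return a < b
    # Normal XIDs live on a circle of size 2^32: a precedes b when the
    # signed 32-bit difference (a - b) is negative.
    diff = (a - b) & 0xFFFFFFFF
    return diff >= 0x80000000


if __name__ == "__main__":
    xmin, cutoff = 3100586, 10339367  # values from the error message
    print(xid_precedes(xmin, cutoff))
```

With the values from the error, xmin precedes the cutoff, so the row must be frozen, and freezing is refused because the transaction does not appear to have committed.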

Yours,
Laurenz Albe

#3 Daulat
daulat.dba@gmail.com
In reply to: Laurenz Albe (#2)
Re: uncommitted xmin 3100586 from before xid cutoff 10339367 needs to be frozen

We are unable to take a backup of our database. The backup attempt also
fails, with this error:

psql: error: connection to server at "localhost" (::1), port 5014 failed:
FATAL: pg_attribute catalog is missing 1 attribute(s) for relation OID 2662

Thanks.


#4 Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Daulat (#3)
Re: uncommitted xmin 3100586 from before xid cutoff 10339367 needs to be frozen

On Fri, 2024-03-22 at 13:41 +0530, Daulat wrote:

> We are unable to take the backup of our database. While taking backup we are getting the same error.
>
> psql: error: connection to server at "localhost" (::1), port 5014 failed: FATAL:  pg_attribute catalog is missing 1 attribute(s) for relation OID 2662

Then you have severe data corruption. Relation OID 2662 is the index
"pg_class_oid_index", and corrupted metadata make recovery difficult.

If I were you, I would seek professional help.
But first, stop working with this database immediately.
Stop the server and take a backup of all the files in the data
directory.
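
A minimal sketch of that file-level copy, assuming the server has already been stopped. The function name and paths are hypothetical; the check for postmaster.pid is only a rough guard (the file can be left behind after a crash), and tablespaces stored outside the data directory would need to be copied separately.

```python
import shutil
from pathlib import Path


def copy_data_directory(data_dir, backup_dir):
    """Copy a stopped cluster's data directory to backup_dir."""
    data_dir = Path(data_dir)
    backup_dir = Path(backup_dir)
    # A running server keeps postmaster.pid in the data directory.
    if (data_dir / "postmaster.pid").exists():
        raise RuntimeError("postmaster.pid found: stop the server first")
    if backup_dir.exists():
        raise FileExistsError(f"{backup_dir} already exists")
    # symlinks=True copies links as links instead of following them.
    shutil.copytree(data_dir, backup_dir, symlinks=True)
    return backup_dir
```

Any recovery attempt should then be made on a copy, so the original files stay untouched.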

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com

#5 Vijaykumar Jain
vijaykumarjain.github@gmail.com
In reply to: Laurenz Albe (#4)
Re: uncommitted xmin 3100586 from before xid cutoff 10339367 needs to be frozen

On Fri, 22 Mar 2024 at 15:39, Laurenz Albe <laurenz.albe@cybertec.at> wrote:

> On Fri, 2024-03-22 at 13:41 +0530, Daulat wrote:
>
> > We are unable to take the backup of our database. While taking backup we
> > are getting the same error.
> >
> > psql: error: connection to server at "localhost" (::1), port 5014
> > failed: FATAL: pg_attribute catalog is missing 1 attribute(s) for relation
> > OID 2662
>
> Then you got severe data corruption. This is the index
> "pg_class_oid_index",
> and corrupted metadata make recovery difficult.
>
> If I were you, I would seek professional help.
> But first, stop working with this database immediately.
> Stop the server and take a backup of all the files in the data
> directory.

Is it an option that the OP has a replica running to which the corruption
has not propagated, so that the OP can fail over to it or take a backup
from it?
I could simulate corruption of pg_catalog on my local VM using dd, but of
course that is hardware-level corruption that did not propagate to the
replica, so I could fail over and take a backup from the replica just fine.
PS: If the bug propagates to the replica, or causes corruption on the
replica too, then I don't know the solution. If you can log in and get the
OIDs of the objects (and have the column types available externally), then
you can run pg_filedump and copy the data out.
First contact with the pg_filedump - Highgo Software Inc.
<https://www.highgo.ca/2021/07/14/first-contact-with-the-pg_filedump/>

I tried an example, but I had a lot of information available for that:
corruption demo for blogs. (github.com)
<https://gist.github.com/cabecada/8024d98024559e9fc97ccfcb5324c09f>
(If you don't understand this, then ignore it.)
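
As a rough sketch of the pg_filedump route, the snippet below locates the heap files of one table in the default tablespace and builds a pg_filedump command line for them. The function names and the example column types are hypothetical, and the `-D` option (decode tuples using a comma-separated type list) should be checked against `pg_filedump -h` for the installed version. It assumes the database OID and the table's relfilenode are already known.

```python
from pathlib import Path


def relation_segments(data_dir, db_oid, relfilenode):
    """List a relation's main-fork files (1 GB segments) in segment order."""
    db_dir = Path(data_dir) / "base" / str(db_oid)
    name = str(relfilenode)
    found = []
    for path in db_dir.iterdir():
        if path.name == name:
            found.append((0, path))  # first segment has no suffix
        elif path.name.startswith(name + "."):
            suffix = path.name[len(name) + 1:]
            if suffix.isdigit():  # later segments: <relfilenode>.1, .2, ...
                found.append((int(suffix), path))
    return [path for _, path in sorted(found)]


def pg_filedump_command(segment, column_types):
    """Build (but do not run) a pg_filedump call that decodes tuples."""
    return ["pg_filedump", "-D", ",".join(column_types), str(segment)]
```

Files such as `<relfilenode>_fsm` and `<relfilenode>_vm` are skipped on purpose, since they hold the free space map and visibility map rather than table rows.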

Vijay

LinkedIn - Vijaykumar Jain <https://www.linkedin.com/in/vijaykumarjain/>

#6 Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Vijaykumar Jain (#5)
Re: uncommitted xmin 3100586 from before xid cutoff 10339367 needs to be frozen

On Fri, 2024-03-22 at 16:07 +0530, Vijaykumar Jain wrote:

> On Fri, 22 Mar 2024 at 15:39, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
>
> > On Fri, 2024-03-22 at 13:41 +0530, Daulat wrote:
> >
> > > We are unable to take the backup of our database. While taking backup we are getting the same error.
> > >
> > > psql: error: connection to server at "localhost" (::1), port 5014 failed: FATAL:  pg_attribute catalog is missing 1 attribute(s) for relation OID 2662
> >
> > Then you got severe data corruption.  This is the index "pg_class_oid_index",
> > and corrupted metadata make recovery difficult.
> >
> > If I were you, I would seek professional help.
> > But first, stop working with this database immediately.
> > Stop the server and take a backup of all the files in the data
> > directory.
>
> Do we have an option that op has a replica running and the bug has not propagated to the replica

No. Feel free to check on the standby; it should have the same problem.

Yours,
Laurenz Albe