MultiXactId error after upgrade to 9.3.4

Started by Stephen Frostabout 12 years ago40 messageshackers
Jump to latest
#1Stephen Frost
sfrost@snowman.net

Greetings,

Looks like we might not be entirely out of the woods yet regarding
MultiXactId's. After doing an upgrade from 9.2.6 to 9.3.4, we saw the
following:

ERROR: MultiXactId 6849409 has not been created yet -- apparent wraparound

The table contents can be select'd out and match the pre-upgrade
backup, but any attempt to VACUUM / VACUUM FULL / CLUSTER fails,
unsurprisingly.

I've just started looking into this, but here's a bit more data-

The *NEW* (9.3.4) cluster's pg_multixact files:

pg_multixact/members/

-rw------- 1 postgres postgres 8192 Mar 30 02:33 0000

pg_multixact/offsets/

-rw------- 1 postgres postgres 8192 Mar 28 01:11 0000
-rw------- 1 postgres postgres 122880 Mar 30 02:33 0018

The *OLD* (9.2.6) cluster's pg_multixact files:

pg_multixact/members/

-rw-rw-r-- 1 postgres postgres 188416 Mar 30 03:07 0044

pg_multixact/offsets/

-rw-rw-r-- 1 postgres postgres 114688 Mar 30 03:07 0018

txid_current - 40297693
datfrozenxid - 654

autovacuum_freeze_max_age, vacuum_freeze_min_age,
vacuum_freeze_table_age, multixact GUCs are commented out / default
values.

The *NEW* (9.3.4) control data shows:

pg_control version number: 937
Catalog version number: 201306121

Latest checkpoint's NextXID: 0/40297704
Latest checkpoint's NextOID: 53773598

Latest checkpoint's NextMultiXactId: 1601771
Latest checkpoint's NextMultiOffset: 1105

Latest checkpoint's oldestXID: 654
Latest checkpoint's oldestXID's DB: 12036
Latest checkpoint's oldestActiveXID: 40297704
Latest checkpoint's oldestMultiXid: 1601462
Latest checkpoint's oldestMulti's DB: 0

The *OLD* (9.2.6) control data had:

pg_control version number: 922
Catalog version number: 201204301

Latest checkpoint's NextXID: 0/40195831
Latest checkpoint's NextOID: 53757891

Latest checkpoint's NextMultiXactId: 1601462
Latest checkpoint's NextMultiOffset: 4503112

Latest checkpoint's oldestXID: 654
Latest checkpoint's oldestXID's DB: 12870
Latest checkpoint's oldestActiveXID: 0

(It doesn't report the oldestMulti info under 9.2.6).

The pg_upgrade reported:

Setting oldest multixact ID on new cluster

While testing, I discovered that this didn't appear to happen with a
(earlier) upgrade to 9.3.2. Don't know if it's indicative of
anything.

Here is what pageinspace shows for the record:

-[ RECORD 1 ]--------
lp | 45
lp_off | 5896
lp_flags | 1
lp_len | 50
t_xmin | 9663920
t_xmax | 6849409
t_field3 | 26930
t_ctid | (0,45)
t_infomask2 | 5
t_infomask | 6528
t_hoff | 24
t_bits |
t_oid |

Which shows xmax as 6849409 and HEAP_XMAX_IS_MULTI is set. This is
the only tuple in the table which has HEAP_XMAX_IS_MULTI and the xmax
for all of the other tuples is quite a bit higher (though all are
visible currently).

Thoughts?

Thanks,

Stephen

#2Stephen Frost
sfrost@snowman.net
In reply to: Stephen Frost (#1)
Re: MultiXactId error after upgrade to 9.3.4

All,

* Stephen Frost (sfrost@snowman.net) wrote:

Looks like we might not be entirely out of the woods yet regarding
MultiXactId's. After doing an upgrade from 9.2.6 to 9.3.4, we saw the
following:

ERROR: MultiXactId 6849409 has not been created yet -- apparent wraparound

While trying to get the production system back in order, I was able to
simply do:

select * from table for update;

Which happily updated the xmax for all of the rows- evidently without
any care that the MultiXactId in one of the tuples was considered
invalid (by at least some parts of the code).

I have the pre-upgrade database and can upgrade/rollback/etc that pretty
easily. Note that the table contents weren't changed during the
upgrade, of course, and so the 9.2.6 instance has HEAP_XMAX_IS_MULTI set
while t_xmax is 6849409 for the tuple in question- even though
pg_controldata reports NextMultiXactId as 1601462 (and it seems very
unlikely that there's been a wraparound on that in this database..).

Perhaps something screwed up xmax/HEAP_XMAX_IS_MULTI under 9.2 and the
9.3 instance now detects that something is wrong? Or is this a case
which was previously allowed and it's just in 9.3 that we don't like it?
Hard for me to see why that would be the case, but this really feels
like HEAP_XMAX_IS_MULTI was incorrectly set on the old cluster and the
xmax in the table was actually a regular xid.. That would have come
from 9.2 though.

Thanks,

Stephen

#3Stephen Frost
sfrost@snowman.net
In reply to: Stephen Frost (#2)
Re: MultiXactId error after upgrade to 9.3.4

* Stephen Frost (sfrost@snowman.net) wrote:

I have the pre-upgrade database and can upgrade/rollback/etc that pretty
easily. Note that the table contents weren't changed during the
upgrade, of course, and so the 9.2.6 instance has HEAP_XMAX_IS_MULTI set
while t_xmax is 6849409 for the tuple in question- even though
pg_controldata reports NextMultiXactId as 1601462 (and it seems very
unlikely that there's been a wraparound on that in this database..).

Further review leads me to notice that both HEAP_XMAX_IS_MULTI and
HEAP_XMAX_INVALID are set:

t_infomask | 6528

6528 decimal -> 0x1980

0001 1001 1000 0000

Which gives us:

0000 0000 1000 0000 - HEAP_XMAX_LOCK_ONLY
0000 0001 0000 0000 - HEAP_XMIN_COMMITTED
0000 1000 0000 0000 - HEAP_XMAX_INVALID
0001 0000 0000 0000 - HEAP_XMAX_IS_MULTI

Which shows that both HEAP_XMAX_INVALID and HEAP_XMAX_IS_MULTI are set.
Of some interest is that HEAP_XMAX_LOCK_ONLY is also set..

Perhaps something screwed up xmax/HEAP_XMAX_IS_MULTI under 9.2 and the
9.3 instance now detects that something is wrong? Or is this a case
which was previously allowed and it's just in 9.3 that we don't like it?

The 'improve concurrency of FK locking' patch included a change to
UpdateXmaxHintBits():

- * [...] Hence callers should look
- * only at XMAX_INVALID.

...

+ * Hence callers should look only at XMAX_INVALID.
+ *
+ * Note this is not allowed for tuples whose xmax is a multixact.

[...]

+ Assert(!(tuple->t_infomask & HEAP_XMAX_IS_MULTI));

What isn't clear to me is if this restriction was supposed to always be
there and something pre-9.3 screwed this up, or if this is a *new*
restriction on what's allowed, in which case it's an on-disk format
change that needs to be accounted for.

One other thing to mention is that this system was originally a 9.0
system and the last update to this tuple that we believe happened was
when it was on 9.0, prior to the 9.2 upgrade (which happened about a
year ago), so it's possible the issue is from the 9.0 era.

Thanks,

Stephen

#4Andres Freund
andres@anarazel.de
In reply to: Stephen Frost (#1)
Re: MultiXactId error after upgrade to 9.3.4

Hi,

On 2014-03-30 00:00:30 -0400, Stephen Frost wrote:

Greetings,

Looks like we might not be entirely out of the woods yet regarding
MultiXactId's. After doing an upgrade from 9.2.6 to 9.3.4, we saw the
following:

ERROR: MultiXactId 6849409 has not been created yet -- apparent wraparound

The table contents can be select'd out and match the pre-upgrade
backup, but any attempt to VACUUM / VACUUM FULL / CLUSTER fails,
unsurprisingly.

Without having looked at the code, IIRC this looks like some place
misses passing allow_old=true where it's actually required. Any chance
you can get a backtrace for the error message? I know you said somewhere
below that you'd worked around the problem, but maybe you have a copy of
the database somewhere?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Stephen Frost (#3)
Re: MultiXactId error after upgrade to 9.3.4

Stephen Frost wrote:

* Stephen Frost (sfrost@snowman.net) wrote:

I have the pre-upgrade database and can upgrade/rollback/etc that pretty
easily. Note that the table contents weren't changed during the
upgrade, of course, and so the 9.2.6 instance has HEAP_XMAX_IS_MULTI set
while t_xmax is 6849409 for the tuple in question- even though
pg_controldata reports NextMultiXactId as 1601462 (and it seems very
unlikely that there's been a wraparound on that in this database..).

Further review leads me to notice that both HEAP_XMAX_IS_MULTI and
HEAP_XMAX_INVALID are set:

t_infomask | 6528

6528 decimal -> 0x1980

0001 1001 1000 0000

Which gives us:

0000 0000 1000 0000 - HEAP_XMAX_LOCK_ONLY
0000 0001 0000 0000 - HEAP_XMIN_COMMITTED
0000 1000 0000 0000 - HEAP_XMAX_INVALID
0001 0000 0000 0000 - HEAP_XMAX_IS_MULTI

Which shows that both HEAP_XMAX_INVALID and HEAP_XMAX_IS_MULTI are set.

My conclusion here is that some part of the code is failing to examine
XMAX_INVALID before looking at the value stored in xmax itself. There
ought to be a short-circuit. Fortunately, this bug should be pretty
harmless.

.. and after looking, I'm fairly sure the bug is in
heap_tuple_needs_freeze.

--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Andres Freund
andres@anarazel.de
In reply to: Alvaro Herrera (#5)
Re: MultiXactId error after upgrade to 9.3.4

On 2014-03-31 08:54:53 -0300, Alvaro Herrera wrote:

My conclusion here is that some part of the code is failing to examine
XMAX_INVALID before looking at the value stored in xmax itself. There
ought to be a short-circuit. Fortunately, this bug should be pretty
harmless.

.. and after looking, I'm fairly sure the bug is in
heap_tuple_needs_freeze.

heap_tuple_needs_freeze() isn't *allowed* to look at
XMAX_INVALID. Otherwise it could miss freezing something still visible
on a standby or after an eventual crash.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Andres Freund (#6)
Re: MultiXactId error after upgrade to 9.3.4

Andres Freund wrote:

On 2014-03-31 08:54:53 -0300, Alvaro Herrera wrote:

My conclusion here is that some part of the code is failing to examine
XMAX_INVALID before looking at the value stored in xmax itself. There
ought to be a short-circuit. Fortunately, this bug should be pretty
harmless.

.. and after looking, I'm fairly sure the bug is in
heap_tuple_needs_freeze.

heap_tuple_needs_freeze() isn't *allowed* to look at
XMAX_INVALID. Otherwise it could miss freezing something still visible
on a standby or after an eventual crash.

Ah, you're right. It even says so on the comment at the top (no
caffeine yet.) But what it's doing is still buggy, per this report, so
we need to do *something* ...

--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#8Andres Freund
andres@anarazel.de
In reply to: Alvaro Herrera (#7)
Re: MultiXactId error after upgrade to 9.3.4

On 2014-03-31 09:19:12 -0300, Alvaro Herrera wrote:

Andres Freund wrote:

On 2014-03-31 08:54:53 -0300, Alvaro Herrera wrote:

My conclusion here is that some part of the code is failing to examine
XMAX_INVALID before looking at the value stored in xmax itself. There
ought to be a short-circuit. Fortunately, this bug should be pretty
harmless.

.. and after looking, I'm fairly sure the bug is in
heap_tuple_needs_freeze.

heap_tuple_needs_freeze() isn't *allowed* to look at
XMAX_INVALID. Otherwise it could miss freezing something still visible
on a standby or after an eventual crash.

Ah, you're right. It even says so on the comment at the top (no
caffeine yet.) But what it's doing is still buggy, per this report, so
we need to do *something* ...

Are you sure needs_freeze() is the problem here?

IIRC it already does some checks for allow_old? Why is the check for
that not sufficient?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Andres Freund (#8)
Re: MultiXactId error after upgrade to 9.3.4

Andres Freund wrote:

On 2014-03-31 09:19:12 -0300, Alvaro Herrera wrote:

Andres Freund wrote:

On 2014-03-31 08:54:53 -0300, Alvaro Herrera wrote:

My conclusion here is that some part of the code is failing to examine
XMAX_INVALID before looking at the value stored in xmax itself. There
ought to be a short-circuit. Fortunately, this bug should be pretty
harmless.

.. and after looking, I'm fairly sure the bug is in
heap_tuple_needs_freeze.

heap_tuple_needs_freeze() isn't *allowed* to look at
XMAX_INVALID. Otherwise it could miss freezing something still visible
on a standby or after an eventual crash.

Ah, you're right. It even says so on the comment at the top (no
caffeine yet.) But what it's doing is still buggy, per this report, so
we need to do *something* ...

Are you sure needs_freeze() is the problem here?

IIRC it already does some checks for allow_old? Why is the check for
that not sufficient?

GetMultiXactIdMembers has this:

if (MultiXactIdPrecedes(multi, oldestMXact))
{
ereport(allow_old ? DEBUG1 : ERROR,
(errcode(ERRCODE_INTERNAL_ERROR),
errmsg("MultiXactId %u does no longer exist -- apparent wraparound",
multi)));
return -1;
}

if (!MultiXactIdPrecedes(multi, nextMXact))
ereport(ERROR,
(errcode(ERRCODE_INTERNAL_ERROR),
errmsg("MultiXactId %u has not been created yet -- apparent wraparound",
multi)));

I guess I wasn't expecting that too-old values would last longer than a
full wraparound cycle. Maybe the right fix is just to have the second
check also conditional on allow_old.

Anyway, it's not clear to me why this database has a multixact value of
6 million when the next multixact value is barely above one million.
Stephen said a wraparound here is not likely.

--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Stephen Frost
sfrost@snowman.net
In reply to: Alvaro Herrera (#9)
Re: MultiXactId error after upgrade to 9.3.4

* Alvaro Herrera (alvherre@2ndquadrant.com) wrote:

I guess I wasn't expecting that too-old values would last longer than a
full wraparound cycle. Maybe the right fix is just to have the second
check also conditional on allow_old.

I don't believe this was a wraparound case.

Anyway, it's not clear to me why this database has a multixact value of
6 million when the next multixact value is barely above one million.
Stephen said a wraparound here is not likely.

I don't think the xmax value is a multixact at all- I think it's
actually a regular xid, but everyone is expected to ignore it because
XMAX_IS_INVALID, yet somehow the MULTIXACT bit was also set and the new
code expects to be able to look at the xmax field even though it's
marked as invalid..

I'm going through the upgrade process again from 9.2 and will get a
stack trace.

Thanks,

Stephen

#11Andres Freund
andres@anarazel.de
In reply to: Stephen Frost (#10)
Re: MultiXactId error after upgrade to 9.3.4

On 2014-03-31 09:09:08 -0400, Stephen Frost wrote:

* Alvaro Herrera (alvherre@2ndquadrant.com) wrote:

I guess I wasn't expecting that too-old values would last longer than a
full wraparound cycle. Maybe the right fix is just to have the second
check also conditional on allow_old.

I don't believe this was a wraparound case.

Could you show pg_controldata from the old cluster?

Anyway, it's not clear to me why this database has a multixact value of
6 million when the next multixact value is barely above one million.
Stephen said a wraparound here is not likely.

I don't think the xmax value is a multixact at all- I think it's
actually a regular xid, but everyone is expected to ignore it because
XMAX_IS_INVALID, yet somehow the MULTIXACT bit was also set and the new
code expects to be able to look at the xmax field even though it's
marked as invalid..

XMAX_IS_INVALID was never allowed to be ignored for vacuum AFAICS. So I
don't think this is a valid explanation.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12Stephen Frost
sfrost@snowman.net
In reply to: Andres Freund (#4)
Re: MultiXactId error after upgrade to 9.3.4

Andres,

* Andres Freund (andres@2ndquadrant.com) wrote:

Without having looked at the code, IIRC this looks like some place
misses passing allow_old=true where it's actually required. Any chance
you can get a backtrace for the error message? I know you said somewhere
below that you'd worked around the problem, but maybe you have a copy of
the database somewhere?

#0 GetMultiXactIdMembers (allow_old=<optimized out>, members=0x7fff78200ca0, multi=6849409) at /tmp/buildd/postgresql-9.3-9.3.4/build/../src/backend/access/transam/multixact.c:1130
#1 GetMultiXactIdMembers (multi=6849409, members=0x7fff78200ca0, allow_old=<optimized out>) at /tmp/buildd/postgresql-9.3-9.3.4/build/../src/backend/access/transam/multixact.c:1056
#2 0x00007f46e9e707f8 in FreezeMultiXactId (flags=<synthetic pointer>, cutoff_multi=<optimized out>, cutoff_xid=4285178915, t_infomask=<optimized out>, multi=6849409)
at /tmp/buildd/postgresql-9.3-9.3.4/build/../src/backend/access/heap/heapam.c:5355
#3 heap_prepare_freeze_tuple (tuple=0x7f46e2d9a328, cutoff_xid=4285178915, cutoff_multi=<optimized out>, frz=0x7f46ec4d31b0) at /tmp/buildd/postgresql-9.3-9.3.4/build/../src/backend/access/heap/heapam.c:5593
#4 0x00007f46e9f72aca in lazy_scan_heap (scan_all=0 '\000', nindexes=2, Irel=<optimized out>, vacrelstats=<optimized out>, onerel=0x7f46e9d65ca0) at /tmp/buildd/postgresql-9.3-9.3.4/build/../src/backend/commands/vacuumlazy.c:899
#5 lazy_vacuum_rel (onerel=0x7f46e9d65ca0, vacstmt=<optimized out>, bstrategy=<optimized out>) at /tmp/buildd/postgresql-9.3-9.3.4/build/../src/backend/commands/vacuumlazy.c:236
#6 0x00007f46e9f70755 in vacuum_rel (relid=288284, vacstmt=0x7f46ec59f780, do_toast=1 '\001', for_wraparound=0 '\000') at /tmp/buildd/postgresql-9.3-9.3.4/build/../src/backend/commands/vacuum.c:1205
#7 0x00007f46e9f71445 in vacuum (vacstmt=0x7f46ec59f780, relid=<optimized out>, do_toast=1 '\001', bstrategy=<optimized out>, for_wraparound=0 '\000', isTopLevel=<optimized out>)
at /tmp/buildd/postgresql-9.3-9.3.4/build/../src/backend/commands/vacuum.c:234
#8 0x00007f46ea05e773 in standard_ProcessUtility (parsetree=0x7f46ec59f780, queryString=0x7f46ec59ec90 "vacuum common.wave_supplemental;", context=<optimized out>, params=0x0, dest=<optimized out>, completionTag=0x7fff782016e0 "")
at /tmp/buildd/postgresql-9.3-9.3.4/build/../src/backend/tcop/utility.c:639
#9 0x00007f46ea05b5a3 in PortalRunUtility (portal=0x7f46ec4f82e0, utilityStmt=0x7f46ec59f780, isTopLevel=1 '\001', dest=0x7f46ec59fae0, completionTag=0x7fff782016e0 "")
at /tmp/buildd/postgresql-9.3-9.3.4/build/../src/backend/tcop/pquery.c:1187
#10 0x00007f46ea05c1d6 in PortalRunMulti (portal=0x7f46ec4f82e0, isTopLevel=1 '\001', dest=0x7f46ec59fae0, altdest=0x7f46ec59fae0, completionTag=0x7fff782016e0 "")
at /tmp/buildd/postgresql-9.3-9.3.4/build/../src/backend/tcop/pquery.c:1318
#11 0x00007f46ea05ce6d in PortalRun (portal=0x7f46ec4f82e0, count=9223372036854775807, isTopLevel=1 '\001', dest=0x7f46ec59fae0, altdest=0x7f46ec59fae0, completionTag=0x7fff782016e0 "")
at /tmp/buildd/postgresql-9.3-9.3.4/build/../src/backend/tcop/pquery.c:816
#12 0x00007f46ea05908c in exec_simple_query (query_string=0x7f46ec59ec90 "vacuum common.wave_supplemental;") at /tmp/buildd/postgresql-9.3-9.3.4/build/../src/backend/tcop/postgres.c:1048
#13 PostgresMain (argc=<optimized out>, argv=<optimized out>, dbname=0x7f46ec4b9010 "mine", username=<optimized out>) at /tmp/buildd/postgresql-9.3-9.3.4/build/../src/backend/tcop/postgres.c:3997
#14 0x00007f46ea01487d in BackendRun (port=0x7f46ec4fc2c0) at /tmp/buildd/postgresql-9.3-9.3.4/build/../src/backend/postmaster/postmaster.c:3996
#15 BackendStartup (port=0x7f46ec4fc2c0) at /tmp/buildd/postgresql-9.3-9.3.4/build/../src/backend/postmaster/postmaster.c:3685
#16 ServerLoop () at /tmp/buildd/postgresql-9.3-9.3.4/build/../src/backend/postmaster/postmaster.c:1586
#17 PostmasterMain (argc=<optimized out>, argv=<optimized out>) at /tmp/buildd/postgresql-9.3-9.3.4/build/../src/backend/postmaster/postmaster.c:1253
#18 0x00007f46e9e4950c in main (argc=5, argv=0x7f46ec4b81f0) at /tmp/buildd/postgresql-9.3-9.3.4/build/../src/backend/main/main.c:206

Looks like your idea that is has to do w/ freezeing is accurate...

Thanks,

Stephen

#13Stephen Frost
sfrost@snowman.net
In reply to: Andres Freund (#11)
Re: MultiXactId error after upgrade to 9.3.4

* Andres Freund (andres@2ndquadrant.com) wrote:

On 2014-03-31 09:09:08 -0400, Stephen Frost wrote:

* Alvaro Herrera (alvherre@2ndquadrant.com) wrote:

I guess I wasn't expecting that too-old values would last longer than a
full wraparound cycle. Maybe the right fix is just to have the second
check also conditional on allow_old.

I don't believe this was a wraparound case.

Could you show pg_controldata from the old cluster?

Per the original email-

The *OLD* (9.2.6) control data had:

pg_control version number: 922
Catalog version number: 201204301

Latest checkpoint's NextXID: 0/40195831
Latest checkpoint's NextOID: 53757891

Latest checkpoint's NextMultiXactId: 1601462
Latest checkpoint's NextMultiOffset: 4503112

Latest checkpoint's oldestXID: 654
Latest checkpoint's oldestXID's DB: 12870
Latest checkpoint's oldestActiveXID: 0

(It doesn't report the oldestMulti info under 9.2.6).

Was there something else you were looking for?

I don't think the xmax value is a multixact at all- I think it's
actually a regular xid, but everyone is expected to ignore it because
XMAX_IS_INVALID, yet somehow the MULTIXACT bit was also set and the new
code expects to be able to look at the xmax field even though it's
marked as invalid..

XMAX_IS_INVALID was never allowed to be ignored for vacuum AFAICS. So I
don't think this is a valid explanation.

The old 9.2 cluster certainly had no issue w/ vacuum'ing this
table/tuple. Unfortunately, I can't have the 9.2 debug packages
installed concurrently w/ the 9.3 debug packages, so it's a bit awkward
to go back and forth between the two. Anything else of interest while I
have the 9.3 instance running under gdb? Sent the requested backtrace
in another email.

Thanks,

Stephen

#14Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Stephen Frost (#3)
Re: MultiXactId error after upgrade to 9.3.4

Stephen Frost wrote:

Further review leads me to notice that both HEAP_XMAX_IS_MULTI and
HEAP_XMAX_INVALID are set:

t_infomask | 6528

6528 decimal -> 0x1980

0001 1001 1000 0000

Which gives us:

0000 0000 1000 0000 - HEAP_XMAX_LOCK_ONLY
0000 0001 0000 0000 - HEAP_XMIN_COMMITTED
0000 1000 0000 0000 - HEAP_XMAX_INVALID
0001 0000 0000 0000 - HEAP_XMAX_IS_MULTI

Which shows that both HEAP_XMAX_INVALID and HEAP_XMAX_IS_MULTI are set.
Of some interest is that HEAP_XMAX_LOCK_ONLY is also set..

This combination seems reasonable. This tuple had two FOR SHARE
lockers, so it was marked HEAP_XMAX_SHARED_LOCK|HEAP_XMAX_IS_MULTI
(0x1080). Then those lockers finished, and somebody else checked the
tuple with a tqual.c routine (say HeapTupleSatisfiesUpdate), which saw
the lockers were gone and marked it as HEAP_XMAX_INVALID (0x800),
without removing the Xmax value and without removing the other bits.

This is all per spec, so we must cope.

--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Andres Freund (#6)
Re: MultiXactId error after upgrade to 9.3.4

Andres Freund wrote:

On 2014-03-31 08:54:53 -0300, Alvaro Herrera wrote:

My conclusion here is that some part of the code is failing to examine
XMAX_INVALID before looking at the value stored in xmax itself. There
ought to be a short-circuit. Fortunately, this bug should be pretty
harmless.

.. and after looking, I'm fairly sure the bug is in
heap_tuple_needs_freeze.

heap_tuple_needs_freeze() isn't *allowed* to look at
XMAX_INVALID. Otherwise it could miss freezing something still visible
on a standby or after an eventual crash.

I think this rule is wrong. I think the rule ought to be something like
"if the XMAX_INVALID bit is set, then reset whatever is there if there
is something; if the bit is not set, proceed as today". Otherwise we
risk reading garbage, which is what is happening in this case.

--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Alvaro Herrera (#15)
Re: MultiXactId error after upgrade to 9.3.4

Alvaro Herrera wrote:

Andres Freund wrote:

On 2014-03-31 08:54:53 -0300, Alvaro Herrera wrote:

My conclusion here is that some part of the code is failing to examine
XMAX_INVALID before looking at the value stored in xmax itself. There
ought to be a short-circuit. Fortunately, this bug should be pretty
harmless.

.. and after looking, I'm fairly sure the bug is in
heap_tuple_needs_freeze.

heap_tuple_needs_freeze() isn't *allowed* to look at
XMAX_INVALID. Otherwise it could miss freezing something still visible
on a standby or after an eventual crash.

I think this rule is wrong. I think the rule ought to be something like
"if the XMAX_INVALID bit is set, then reset whatever is there if there
is something; if the bit is not set, proceed as today". Otherwise we
risk reading garbage, which is what is happening in this case.

Andres asks on IM: How come there is garbage there in the first place?
I have to admit I have no idea.

--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17Stephen Frost
sfrost@snowman.net
In reply to: Alvaro Herrera (#16)
Re: MultiXactId error after upgrade to 9.3.4

* Alvaro Herrera (alvherre@2ndquadrant.com) wrote:

I think this rule is wrong. I think the rule ought to be something like
"if the XMAX_INVALID bit is set, then reset whatever is there if there
is something; if the bit is not set, proceed as today". Otherwise we
risk reading garbage, which is what is happening in this case.

Andres asks on IM: How come there is garbage there in the first place?
I have to admit I have no idea.

I haven't got any great explanation for that either. I continue to feel
that it's much more likely that it's an xid than a multixid. Is it
possible that it was stamped with a real xmax through some code path
which ignored the IS_MULTI flag? This could have been from as far back
as 9.0-era. On this over-7TB database, only this one tuple had the
issue. I have another set of databases which add up to ~20TB that I'm
currently testing an upgrade from 9.2 to 9.3 on and will certainly let
everyone know if I run into a similar situation there.

On our smallest DB (which we upgraded first) which is much more OLTP
instead of OLAP, we didn't run into this.

This is all on physical gear and we've seen no indications that there
has been any corruption. Hard to rule it out completely, but it seems
pretty unlikely.

Thanks,

Stephen

#18Bruce Momjian
bruce@momjian.us
In reply to: Stephen Frost (#12)
Re: MultiXactId error after upgrade to 9.3.4

On Mon, Mar 31, 2014 at 09:36:03AM -0400, Stephen Frost wrote:

Andres,

* Andres Freund (andres@2ndquadrant.com) wrote:

Without having looked at the code, IIRC this looks like some place
misses passing allow_old=true where it's actually required. Any chance
you can get a backtrace for the error message? I know you said somewhere
below that you'd worked around the problem, but maybe you have a copy of
the database somewhere?

Looks like your idea that is has to do w/ freezeing is accurate...

Where are we on this?

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#19Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Andres Freund (#6)
Re: MultiXactId error after upgrade to 9.3.4

Andres Freund wrote:

On 2014-03-31 08:54:53 -0300, Alvaro Herrera wrote:

My conclusion here is that some part of the code is failing to examine
XMAX_INVALID before looking at the value stored in xmax itself. There
ought to be a short-circuit. Fortunately, this bug should be pretty
harmless.

.. and after looking, I'm fairly sure the bug is in
heap_tuple_needs_freeze.

heap_tuple_needs_freeze() isn't *allowed* to look at
XMAX_INVALID. Otherwise it could miss freezing something still visible
on a standby or after an eventual crash.

I think what we should do here is that if we see that XMAX_INVALID is
set, we just reset everything to zero without checking the multixact
contents. Something like the attached (warning: hand-edited, line
numbers might be bogus)

I still don't know under what circumstances this situation could arise.
This seems most strange to me. I would wonder about this to be just
papering over a different bug elsewhere, except that we know this tuple
comes from a pg_upgraded table and so I think the only real solution is
to cope.

--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

no-freeze-invalid-multi.patchtext/x-diff; charset=us-asciiDownload+6-1
#20Bruce Momjian
bruce@momjian.us
In reply to: Alvaro Herrera (#19)
Re: MultiXactId error after upgrade to 9.3.4

On Wed, Apr 23, 2014 at 03:01:02PM -0300, Alvaro Herrera wrote:

Andres Freund wrote:

On 2014-03-31 08:54:53 -0300, Alvaro Herrera wrote:

My conclusion here is that some part of the code is failing to examine
XMAX_INVALID before looking at the value stored in xmax itself. There
ought to be a short-circuit. Fortunately, this bug should be pretty
harmless.

.. and after looking, I'm fairly sure the bug is in
heap_tuple_needs_freeze.

heap_tuple_needs_freeze() isn't *allowed* to look at
XMAX_INVALID. Otherwise it could miss freezing something still visible
on a standby or after an eventual crash.

I think what we should do here is that if we see that XMAX_INVALID is
set, we just reset everything to zero without checking the multixact
contents. Something like the attached (warning: hand-edited, line
numbers might be bogus)

I still don't know under what circumstances this situation could arise.
This seems most strange to me. I would wonder about this to be just
papering over a different bug elsewhere, except that we know this tuple
comes from a pg_upgraded table and so I think the only real solution is
to cope.

Shouldn't we log something at least if we are unsure of the cause?

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#21Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Bruce Momjian (#20)
#22Bruce Momjian
bruce@momjian.us
In reply to: Alvaro Herrera (#21)
#23Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Bruce Momjian (#22)
#24Andres Freund
andres@anarazel.de
In reply to: Alvaro Herrera (#23)
#25Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Andres Freund (#24)
#26Andres Freund
andres@anarazel.de
In reply to: Alvaro Herrera (#25)
#27Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Stephen Frost (#1)
#28Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Alvaro Herrera (#27)
#29Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Andrew Gierth (#28)
#30Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Alvaro Herrera (#29)
#31Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Alvaro Herrera (#29)
#32Robert Haas
robertmhaas@gmail.com
In reply to: Andrew Gierth (#28)
#33Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Robert Haas (#32)
#34Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Andrew Gierth (#31)
#35Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Alvaro Herrera (#29)
#36Robert Haas
robertmhaas@gmail.com
In reply to: Andrew Gierth (#33)
#37Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Robert Haas (#36)
#38Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Alvaro Herrera (#37)
#39Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Robert Haas (#36)
#40Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Alvaro Herrera (#39)