tuptoaster.c must not use SnapshotAny

Started by Tom Laneover 24 years ago34 messageshackers

tgl@sss.pgh.pa.us

over 24 years ago

The tuple toaster currently fetches toast-table rows using SnapshotAny.
This is quite uncool, because it will in fact find *any* row. Including
dead rows resulting from an aborted (or, in 7.2, still-in-progress)
VACUUM.

I believe this is the explanation for a problem report I'm currently
chasing from Joshua Drake:

digivision2=# select * from change_log;
ERROR: chunk 1 for toast value 3388774 appears multiple times

(there are some thousands of rows with similar problems)

I suggest SnapshotSelf instead. Comments anyone?

regards, tom lane

Tom Lane

tgl@sss.pgh.pa.us

over 24 years ago

In reply to: Tom Lane (#1)

Re: tuptoaster.c must *not* use SnapshotAny

"Command Prompt, Inc." <pgsql-hackers@commandprompt.com> writes:

Josh Drake here: Although I know that development has pretty much ceased
on the 7.1 series. This bug, is potentially very serious for those who
have the problem.

Well, based on our experiments last night, recovery is pretty simple:
vacuum the affected table. So it doesn't qualify as a critical problem
in my eyes, although surely it's a "must fix" for 7.2.

The last time this came up (that I can find) is in August of 2001, so the
bug is not frequent that we are aware of. However, as it appears the fix
is simple should we apply a 7.1.3.1 or 7.1.4 that fixes this issue?

I don't think anyone has the energy to put out a 7.1.4. We seem to have
enough trouble getting 7.2 out the door.

regards, tom lane

Import Notes

Reply to msg id not found: Pine.LNX.4.30.0201160944480.30961-100000@commandprompt.comReference msg id not found: Pine.LNX.4.30.0201160944480.30961-100000@commandprompt.com | Resolved by subject fallback

Zeugswetter Andreas SB SD

ZeugswetterA@spardat.at

over 24 years ago

In reply to: Tom Lane (#2)

Re: tuptoaster.c must *not* use SnapshotAny

The tuple toaster currently fetches toast-table rows using SnapshotAny.
This is quite uncool, because it will in fact find *any* row.

Well, doesn't it need to find *any* row, since the accesses to heap and toast
are not "atomic" (not interlocked) ?

Including dead rows resulting from an aborted

But then also the heap tuple is aborted, and thus a normal select will not
select that toast anyway.

(or, in 7.2, still-in-progress)
VACUUM.

I thought new vacuum did not move tuples ? Why should it then produce
two rows ? And VACUUM FULL locks the table so ...

I think SnapshotSelf could have concurrency problems (not see a toast
that someone else committed between our heap read and toast read)
IIRC, at the time at least SnapshotAny was chosen on purpose.

Might it be a problem, that VACUUM sets xids to FrozenXid, and we always
consider that one Valid, even if that vacuum is still in progress ?

I can only see a problem with a partway through vacuum full, that aborted.

Andreas

PS: short desc between Any and Self, anybody ?

Import Notes

Resolved by subject fallback

Tom Lane

tgl@sss.pgh.pa.us

over 24 years ago

In reply to: Zeugswetter Andreas SB SD (#3)

Re: tuptoaster.c must *not* use SnapshotAny

"Zeugswetter Andreas SB SD" <ZeugswetterA@spardat.at> writes:

The tuple toaster currently fetches toast-table rows using SnapshotAny.
This is quite uncool, because it will in fact find *any* row.

Well, doesn't it need to find *any* row, since the accesses to heap and toast
are not "atomic" (not interlocked) ?

No; it needs to *not* find dead rows resulting from a failed vacuum.
You are right that for the most part TOAST relies on time qualification
of the main-table row, and no doubt that was why Jan thought he could
code it like this; but vacuuming the TOAST table is not an operation
that affects the main table.

Including dead rows resulting from an aborted

But then also the heap tuple is aborted, and thus a normal select will not
select that toast anyway.

Wrong; see above. We are talking about reshuffling rows in the TOAST
table that belong (presumably) to a live row in the main table. During
the VACUUM there will be more than one copy of such rows.

(or, in 7.2, still-in-progress)
VACUUM.

I thought new vacuum did not move tuples ? Why should it then produce
two rows ? And VACUUM FULL locks the table so ...

You're right, lazy VACUUM doesn't create this issue. My mistake.
But VACUUM FULL does.

I can only see a problem with a partway through vacuum full, that aborted.

Exactly.

regards, tom lane

Zeugswetter Andreas SB SD

ZeugswetterA@spardat.at

over 24 years ago

In reply to: Tom Lane (#4)

Re: tuptoaster.c must *not* use SnapshotAny

You're right, lazy VACUUM doesn't create this issue. My mistake.
But VACUUM FULL does.

I can only see a problem with a partway through vacuum

full, that aborted.

Exactly.

Can you try give me a hint, why an aborted VACUUM FULL will not allways be
a problem (also for other operations) until you run another VACUUM FULL
that succeeds ?

How do we know, that a (newly) FrozenXid tuple does not still have
a (visible) duplicate ?

Andreas

Import Notes

Resolved by subject fallback

Command Prompt, Inc.

pgsql-hackers@commandprompt.com

over 24 years ago

In reply to: Tom Lane (#1)

Re: tuptoaster.c must *not* use SnapshotAny

Josh Drake here: Although I know that development has pretty much ceased
on the 7.1 series. This bug, is potentially very serious for those who
have the problem.

The last time this came up (that I can find) is in August of 2001, so the
bug is not frequent that we are aware of. However, as it appears the fix
is simple should we apply a 7.1.3.1 or 7.1.4 that fixes this issue?

On Tue, 15 Jan 2002, Tom Lane wrote:

The tuple toaster currently fetches toast-table rows using SnapshotAny.
This is quite uncool, because it will in fact find *any* row. Including
dead rows resulting from an aborted (or, in 7.2, still-in-progress)
VACUUM.

I believe this is the explanation for a problem report I'm currently
chasing from Joshua Drake:

digivision2=# select * from change_log;
ERROR: chunk 1 for toast value 3388774 appears multiple times

(there are some thousands of rows with similar problems)

I suggest SnapshotSelf instead. Comments anyone?

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

--
by way of pgsql-hackers@commandprompt.com

Tom Lane

tgl@sss.pgh.pa.us

over 24 years ago

In reply to: Zeugswetter Andreas SB SD (#5)

Re: tuptoaster.c must *not* use SnapshotAny

"Zeugswetter Andreas SB SD" <ZeugswetterA@spardat.at> writes:

Can you try give me a hint, why an aborted VACUUM FULL will not allways be
a problem (also for other operations) until you run another VACUUM FULL
that succeeds ?

Because noplace else uses SnapshotAny, basically. The only really
legitimate use I can see for it is in Tatsuo's contrib/pgstattuple
hack: that wants to count both visible and invisible tuples.

How do we know, that a (newly) FrozenXid tuple does not still have
a (visible) duplicate ?

It's *not* visible, if you are applying any visibility checks whatever.
But SnapshotAny bypasses all visibility checking.

I forgot to answer your question about what these mean. See
src/backend/utils/time/tqual.c for the gory details, but if you
are willing to accept the summary comments in that file:

Any No visibility check whatever

Self Other committed xacts + effects of this xact up to
*and including* current command

Now Other committed xacts + effects of this xact up to
*but not including* current command

Dirty like Self, but can see effects of other in-progress
xacts too

Snapshot like Now, but specified other xacts are considered
uncommitted even if they've actually committed by now

There are also special visibility rules for Update and Vacuum, which
are not really different from the normal rules, they just return more
information than "visible" or "not visible". (Dirty returns some
side information too.)

Hmm. Now that I look at it, Self probably isn't quite right either;
if we are reading a main-table tuple that's committed dead but is
still visible according to our snapshot, we need to be able to see
its toast tuples too; but they're committed dead as well. Sigh.
I think we need a special visibility rule for TOAST, that only
checks for HEAP_MOVED_IN/HEAP_MOVED_OFF (the bits that take care
of VACUUM moves).

Comments?

regards, tom lane

Jan Wieck

JanWieck@Yahoo.com

over 24 years ago

In reply to: Tom Lane (#1)

Re: tuptoaster.c must *not* use SnapshotAny

Tom Lane wrote:

The tuple toaster currently fetches toast-table rows using SnapshotAny.
This is quite uncool, because it will in fact find *any* row. Including
dead rows resulting from an aborted (or, in 7.2, still-in-progress)
VACUUM.

Really? So what you're saying means that while vacuum is
running there's a window where one tuple can exist multiple
times and just the fact that other backends can see the
vacuum transaction alive prevents them from showing up? That
doesn't sound really good to me.

Jan

I believe this is the explanation for a problem report I'm currently
chasing from Joshua Drake:

digivision2=# select * from change_log;
ERROR: chunk 1 for toast value 3388774 appears multiple times

(there are some thousands of rows with similar problems)

I suggest SnapshotSelf instead. Comments anyone?

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #

_________________________________________________________
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com

Tom Lane

tgl@sss.pgh.pa.us

over 24 years ago

In reply to: Jan Wieck (#8)

Re: tuptoaster.c must *not* use SnapshotAny

Jan Wieck <janwieck@yahoo.com> writes:

Really? So what you're saying means that while vacuum is
running there's a window where one tuple can exist multiple
times

Moving a tuple in VACUUM isn't fundamentally different from an UPDATE.
You need a visibility rule to tell you which version of the tuple to
pay attention to. Right now, TOAST hasn't got one.

Per later discussion, Andreas is right that SnapshotSelf won't do.
I think we need a new tqual.c routine just for TOAST. (The alternative
would be for TOAST to know which snapshot was used to find the main-
table tuple, but that doesn't seem workable.)

regards, tom lane

#10

Bruce Momjian

bruce@momjian.us

over 24 years ago

In reply to: Tom Lane (#7)

Re: tuptoaster.c must *not* use SnapshotAny

I have added more comments to tqual.c to the stuff Tom just added. I
added visibility items to the top of each function comment.

---------------------------------------------------------------------------

Tom Lane wrote:

"Zeugswetter Andreas SB SD" <ZeugswetterA@spardat.at> writes:

Can you try give me a hint, why an aborted VACUUM FULL will not allways be
a problem (also for other operations) until you run another VACUUM FULL
that succeeds ?

Because noplace else uses SnapshotAny, basically. The only really
legitimate use I can see for it is in Tatsuo's contrib/pgstattuple
hack: that wants to count both visible and invisible tuples.

How do we know, that a (newly) FrozenXid tuple does not still have
a (visible) duplicate ?

It's *not* visible, if you are applying any visibility checks whatever.
But SnapshotAny bypasses all visibility checking.

I forgot to answer your question about what these mean. See
src/backend/utils/time/tqual.c for the gory details, but if you
are willing to accept the summary comments in that file:

Any No visibility check whatever

Self Other committed xacts + effects of this xact up to
*and including* current command

Now Other committed xacts + effects of this xact up to
*but not including* current command

Dirty like Self, but can see effects of other in-progress
xacts too

Snapshot like Now, but specified other xacts are considered
uncommitted even if they've actually committed by now

There are also special visibility rules for Update and Vacuum, which
are not really different from the normal rules, they just return more
information than "visible" or "not visible". (Dirty returns some
side information too.)

Hmm. Now that I look at it, Self probably isn't quite right either;
if we are reading a main-table tuple that's committed dead but is
still visible according to our snapshot, we need to be able to see
its toast tuples too; but they're committed dead as well. Sigh.
I think we need a special visibility rule for TOAST, that only
checks for HEAP_MOVED_IN/HEAP_MOVED_OFF (the bits that take care
of VACUUM moves).

Comments?

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

#11

Zeugswetter Andreas SB SD

ZeugswetterA@spardat.at

over 24 years ago

In reply to: Bruce Momjian (#10)

Re: tuptoaster.c must *not* use SnapshotAny

How do we know, that a (newly) FrozenXid tuple does not still have
a (visible) duplicate ?

It's *not* visible, if you are applying any visibility checks whatever.
But SnapshotAny bypasses all visibility checking.

I am concerned about the case where VACUUM FULL:
1. inserts heap tuple to new location using FrozenXid
2. updates original heap tuples's xmax

What if we crash/abort between step 1 and 2 but we used FrozenXid for 1.
Don't know if we actually do this, but imho we are only allowed
to use FrozenXid for an inplace vacuum operation.

Andreas

Import Notes

Resolved by subject fallback

#12

Tom Lane

tgl@sss.pgh.pa.us

over 24 years ago

In reply to: Zeugswetter Andreas SB SD (#11)

Re: tuptoaster.c must *not* use SnapshotAny

"Zeugswetter Andreas SB SD" <ZeugswetterA@spardat.at> writes:

I am concerned about the case where VACUUM FULL:
1. inserts heap tuple to new location using FrozenXid
2. updates original heap tuples's xmax

It doesn't matter whether it's FrozenXid or not. The tuple is not
visible if it's got the wrong setting of HEAP_MOVED_OFF/IN.

regards, tom lane

#13

Zeugswetter Andreas SB SD

ZeugswetterA@spardat.at

over 24 years ago

In reply to: Tom Lane (#12)

Re: tuptoaster.c must *not* use SnapshotAny

I am concerned about the case where VACUUM FULL:
1. inserts heap tuple to new location using FrozenXid
2. updates original heap tuples's xmax

It doesn't matter whether it's FrozenXid or not. The tuple is not
visible if it's got the wrong setting of HEAP_MOVED_OFF/IN.

But the FrozenXid tuple has HEAP_MOVED_IN and the original has
not yet been altered to HEAP_MOVED_OFF because of abort.
Is the HEAP_MOVED_IN tuple not visible ?

Andreas

Import Notes

Resolved by subject fallback

#14

Tom Lane

tgl@sss.pgh.pa.us

over 24 years ago

In reply to: Zeugswetter Andreas SB SD (#13)

Re: tuptoaster.c must *not* use SnapshotAny

"Zeugswetter Andreas SB SD" <ZeugswetterA@spardat.at> writes:

It doesn't matter whether it's FrozenXid or not. The tuple is not
visible if it's got the wrong setting of HEAP_MOVED_OFF/IN.

But the FrozenXid tuple has HEAP_MOVED_IN and the original has
not yet been altered to HEAP_MOVED_OFF because of abort.
Is the HEAP_MOVED_IN tuple not visible ?

Right. Actually it doesn't matter whether the old tuple has
HEAP_MOVED_OFF or not; it's still visible *until* the VACUUM commits.
The commit atomically switches us from the state where the unmoved
tuples are good to the state where the moved ones are good.

This is all exactly the same whether FrozenXid is involved or not.

regards, tom lane

#15

Jan Wieck

JanWieck@Yahoo.com

over 24 years ago

In reply to: Tom Lane (#14)

Re: tuptoaster.c must *not* use SnapshotAny

Tom Lane wrote:

"Zeugswetter Andreas SB SD" <ZeugswetterA@spardat.at> writes:

It doesn't matter whether it's FrozenXid or not. The tuple is not
visible if it's got the wrong setting of HEAP_MOVED_OFF/IN.

But the FrozenXid tuple has HEAP_MOVED_IN and the original has
not yet been altered to HEAP_MOVED_OFF because of abort.
Is the HEAP_MOVED_IN tuple not visible ?

Right. Actually it doesn't matter whether the old tuple has
HEAP_MOVED_OFF or not; it's still visible *until* the VACUUM commits.
The commit atomically switches us from the state where the unmoved
tuples are good to the state where the moved ones are good.

This is all exactly the same whether FrozenXid is involved or not.

Originally I added SnapshotAny only to solve the problem that
the lookup of the toast table happens independant from the
access to the main tuple, and that in fact the main tuple
could have been deleted by the own transaction in between.

Would changing SnapshotAny to honor HEAP_MOVED_* fix the
problem? If nothing else by now uses this snapshot, it could
be done in place.

Jan

_________________________________________________________
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com

#16

Tom Lane

tgl@sss.pgh.pa.us

over 24 years ago

In reply to: Jan Wieck (#15)

Re: tuptoaster.c must *not* use SnapshotAny

Jan Wieck <janwieck@yahoo.com> writes:

Originally I added SnapshotAny only to solve the problem that
the lookup of the toast table happens independant from the
access to the main tuple, and that in fact the main tuple
could have been deleted by the own transaction in between.

Would changing SnapshotAny to honor HEAP_MOVED_* fix the
problem? If nothing else by now uses this snapshot, it could
be done in place.

What I've done is add an additional snapshot class SnapshotToast
that behaves that way. There are a couple other uses of SnapshotAny
that I didn't think I wanted to mess with.

regards, tom lane

#17

Hiroshi Inoue

Inoue@tpf.co.jp

over 24 years ago

In reply to: Jan Wieck (#15)

Re: tuptoaster.c must *not* use SnapshotAny

Tom Lane wrote:

Jan Wieck <janwieck@yahoo.com> writes:

Originally I added SnapshotAny only to solve the problem that
the lookup of the toast table happens independant from the
access to the main tuple, and that in fact the main tuple
could have been deleted by the own transaction in between.

Would changing SnapshotAny to honor HEAP_MOVED_* fix the
problem? If nothing else by now uses this snapshot, it could
be done in place.

What I've done is add an additional snapshot class SnapshotToast
that behaves that way. There are a couple other uses of SnapshotAny
that I didn't think I wanted to mess with.

AFAIR SnapshotAny was introduced for Foreign Key stuff.
SnapshotToast doesn't seem a proper solution to me but
7.2 should be released ASAP anyway.

regards,
Hiroshi Inoue

#18

Tom Lane

tgl@sss.pgh.pa.us

over 24 years ago

In reply to: Hiroshi Inoue (#17)

Re: tuptoaster.c must *not* use SnapshotAny

Hiroshi Inoue <Inoue@tpf.co.jp> writes:

SnapshotToast doesn't seem a proper solution to me

Do you have a better idea in mind?

regards, tom lane

#19

Hiroshi Inoue

Inoue@tpf.co.jp

over 24 years ago

In reply to: Jan Wieck (#15)

Re: tuptoaster.c must *not* use SnapshotAny

Tom Lane wrote:

Hiroshi Inoue <Inoue@tpf.co.jp> writes:

SnapshotToast doesn't seem a proper solution to me

Do you have a better idea in mind?

The same snapshot for both the main and the toast table
seems better though I don't know how it's difficult to
change. For example is it possible to update a toast
chunk partially using SnapshotToast ?

regards,
Hiroshi Inoue

#20

Tom Lane

tgl@sss.pgh.pa.us

over 24 years ago

In reply to: Hiroshi Inoue (#19)

Re: tuptoaster.c must *not* use SnapshotAny

Hiroshi Inoue <Inoue@tpf.co.jp> writes:

The same snapshot for both the main and the toast table
seems better though I don't know how it's difficult to
change.

That would be cleaner but I don't see any way to do it. The toasted
value gets passed all over the system, potentially, before someone asks
to detoast it. At that point you really don't have a good way to know
which snapshot was used to fetch the main-table row.

For example is it possible to update a toast
chunk partially using SnapshotToast ?

As things stand (with either SnapshotToast or the old SnapshotAny way)
it is never possible to update an individual toast value, either
partially or wholly. All you can do is lay down a new toast value (with
a new identifying OID) and then delete the old one.

But I'm not sure that this is wrong, or fixable. Trying to update part
of a toasted value is very much like wanting to update part of an
existing row in-place, which we cannot possibly do. We have to lay down
a whole new row whenever any part of it is updated.

regards, tom lane

#21

Jan Wieck

JanWieck@Yahoo.com

over 24 years ago

In reply to: Tom Lane (#20)

#22

Hiroshi Inoue

Inoue@tpf.co.jp

over 24 years ago

In reply to: Jan Wieck (#15)

#23