Pruning never visible changes

Started by Simon Riggsover 3 years ago12 messageshackers
Jump to latest
#1Simon Riggs
simon@2ndQuadrant.com

A user asked me whether we prune never visible changes, such as

BEGIN;
INSERT...
UPDATE.. (same row)
COMMIT;

Once committed, the original insert is no longer visible to anyone, so
"ought to be able to be pruned", sayeth the user. And they also say
that changing the app is much harder, as ever.

After some thought, Yes, we can prune, but not in all cases - only if
the never visible tuple is at the root end of the update chain. The
only question is can that be done cheaply enough to bother with. The
answer in one specific case is Yes, in other cases No.

This patch adds a new test for this use case, and code to remove the
never visible row when the changes are made by the same xid.

(I'm pretty sure there used to be a test for this some years back and
I'm guessing it was removed because it isn't always possible to remove
the tuple, which this new patch honours.)

Please let me know what you think.

--
Simon Riggs http://www.EnterpriseDB.com/

Attachments:

never_visible.v1.patchapplication/octet-stream; name=never_visible.v1.patchDownload+78-7
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#1)
Re: Pruning never visible changes

Simon Riggs <simon.riggs@enterprisedb.com> writes:

A user asked me whether we prune never visible changes, such as
BEGIN;
INSERT...
UPDATE.. (same row)
COMMIT;

Didn't we just have this discussion in another thread? You cannot
do that, at least not without checking that the originating
transaction has no snapshots that could see the older row version.
I'm not sure whether or not snapmgr.c has enough information to
determine that, but in any case this formulation is surely
unsafe, because it isn't even checking whether that transaction is
our own, let alone asking snapmgr.c.

I'm dubious that a safe version would fire often enough to be
worth the cycles spent.

regards, tom lane

#3Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#2)
Re: Pruning never visible changes

On Fri, 16 Sept 2022 at 15:26, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Simon Riggs <simon.riggs@enterprisedb.com> writes:

A user asked me whether we prune never visible changes, such as
BEGIN;
INSERT...
UPDATE.. (same row)
COMMIT;

Didn't we just have this discussion in another thread?

Not that I was aware of, but it sounds like a different case anyway.

You cannot
do that, at least not without checking that the originating
transaction has no snapshots that could see the older row version.

I'm saying this is possible only AFTER the end of the originating
xact, so there are no issues with additional snapshots.

i.e. the never visible row has to be judged RECENTLY_DEAD before we even check.

--
Simon Riggs http://www.EnterpriseDB.com/

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#3)
Re: Pruning never visible changes

Simon Riggs <simon.riggs@enterprisedb.com> writes:

On Fri, 16 Sept 2022 at 15:26, Tom Lane <tgl@sss.pgh.pa.us> wrote:

You cannot
do that, at least not without checking that the originating
transaction has no snapshots that could see the older row version.

I'm saying this is possible only AFTER the end of the originating
xact, so there are no issues with additional snapshots.

I see, so the point is just that we can prune even if the originating
xact hasn't yet passed the global xmin horizon. I agree that's safe,
but will it fire often enough to be worth the trouble? Also, why
does it need to be restricted to certain shapes of HOT chains ---
that is, why can't we do exactly what we'd do if the xact *were*
past the xmin horizon?

regards, tom lane

#5Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Tom Lane (#2)
Re: Pruning never visible changes

On Fri, 2022-09-16 at 10:26 -0400, Tom Lane wrote:

Simon Riggs <simon.riggs@enterprisedb.com> writes:

A user asked me whether we prune never visible changes, such as
BEGIN;
INSERT...
UPDATE.. (same row)
COMMIT;

Didn't we just have this discussion in another thread?

For reference: that was
/messages/by-id/f6a491b32cb44bb5daaafec835364f7149348fa1.camel@cybertec.at

Yours,
Laurenz Albe

#6Simon Riggs
simon@2ndQuadrant.com
In reply to: Laurenz Albe (#5)
Re: Pruning never visible changes

On Fri, 16 Sept 2022 at 21:07, Laurenz Albe <laurenz.albe@cybertec.at> wrote:

On Fri, 2022-09-16 at 10:26 -0400, Tom Lane wrote:

Simon Riggs <simon.riggs@enterprisedb.com> writes:

A user asked me whether we prune never visible changes, such as
BEGIN;
INSERT...
UPDATE.. (same row)
COMMIT;

Didn't we just have this discussion in another thread?

For reference: that was
/messages/by-id/f6a491b32cb44bb5daaafec835364f7149348fa1.camel@cybertec.at

Thanks. I confirm I hadn't seen that, and indeed, I wrote the patch on
5 Sept before you posted.

--
Simon Riggs http://www.EnterpriseDB.com/

#7Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#4)
Re: Pruning never visible changes

On Fri, 16 Sept 2022 at 18:37, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Simon Riggs <simon.riggs@enterprisedb.com> writes:

On Fri, 16 Sept 2022 at 15:26, Tom Lane <tgl@sss.pgh.pa.us> wrote:

You cannot
do that, at least not without checking that the originating
transaction has no snapshots that could see the older row version.

I'm saying this is possible only AFTER the end of the originating
xact, so there are no issues with additional snapshots.

I see, so the point is just that we can prune even if the originating
xact hasn't yet passed the global xmin horizon. I agree that's safe,
but will it fire often enough to be worth the trouble?

It is an edge case with limited utility, I agree, which is why I
looked for a cheap test (xmin == xmax only).

This additional test is also valid, but too expensive to apply:
TransactionIdGetTopmostTranactionId(xmax) ==
TransactionIdGetTopmostTranactionId(xmin)

Also, why
does it need to be restricted to certain shapes of HOT chains ---
that is, why can't we do exactly what we'd do if the xact *were*
past the xmin horizon?

I thought it important to maintain the integrity of the HOT chain, in
that the xmax of one tuple version matches the xmin of the next. So
pruning only from the root of the chain allows us to maintain that
validity check.

I'm on the fence myself, which is why I didn't post it immediately I
had written it.

If not, it would be useful to add a note in comments to the code to
explain why we don't do this.

--
Simon Riggs http://www.EnterpriseDB.com/

#8Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#2)
Re: Pruning never visible changes

On Fri, 16 Sept 2022 at 10:27, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Simon Riggs <simon.riggs@enterprisedb.com> writes:

A user asked me whether we prune never visible changes, such as
BEGIN;
INSERT...
UPDATE.. (same row)
COMMIT;

Didn't we just have this discussion in another thread?

Well..... not "just" :)

commit 44e4bbf75d56e643b6afefd5cdcffccb68cce414
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: Fri Apr 29 16:29:42 2011 -0400

Remove special case for xmin == xmax in HeapTupleSatisfiesVacuum().

VACUUM was willing to remove a committed-dead tuple immediately if it was
deleted by the same transaction that inserted it. The idea is that such a
tuple could never have been visible to any other transaction, so we don't
need to keep it around to satisfy MVCC snapshots. However, there was
already an exception for tuples that are part of an update chain, and this
exception created a problem: we might remove TOAST tuples (which are never
part of an update chain) while their parent tuple stayed around (if it was
part of an update chain). This didn't pose a problem for most things,
since the parent tuple is indeed dead: no snapshot will ever consider it
visible. But MVCC-safe CLUSTER had a problem, since it will try to copy
RECENTLY_DEAD tuples to the new table. It then has to copy their TOAST
data too, and would fail if VACUUM had already removed the toast tuples.

Easiest fix is to get rid of the special case for xmin == xmax. This may
delay reclaiming dead space for a little bit in some cases, but it's by far
the most reliable way to fix the issue.

Per bug #5998 from Mark Reid. Back-patch to 8.3, which is the oldest
version with MVCC-safe CLUSTER.

#9Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Bruce Momjian (#8)
Re: Pruning never visible changes

On Mon, 19 Sept 2022 at 01:16, Greg Stark <stark@mit.edu> wrote:

On Fri, 16 Sept 2022 at 10:27, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Simon Riggs <simon.riggs@enterprisedb.com> writes:

A user asked me whether we prune never visible changes, such as
BEGIN;
INSERT...
UPDATE.. (same row)
COMMIT;

Didn't we just have this discussion in another thread?

Well..... not "just" :)

This recent thread [0]/messages/by-id/2031521.1663076724@sss.pgh.pa.us Subject: Re: Tuples inserted and deleted by the same transaction Date: 2022-09-13 14:13:44 mentioned the same, and I mentioned it in [1]/messages/by-id/CAEze2Whjnhg96Wt2-DxtBydhmMDmVm2WfWOX3aGcB2C2Hbry0Q@mail.gmail.com Subject: Re: pg14b1 stuck in lazy_scan_prune/heap_page_prune of pg_statistic Date: 2021-06-14 09:53:47 (in a thread about a PS comment)
too last year.

Kind regards,

Matthias van de Meent

[0]: /messages/by-id/2031521.1663076724@sss.pgh.pa.us Subject: Re: Tuples inserted and deleted by the same transaction Date: 2022-09-13 14:13:44
Subject: Re: Tuples inserted and deleted by the same transaction
Date: 2022-09-13 14:13:44

[1]: /messages/by-id/CAEze2Whjnhg96Wt2-DxtBydhmMDmVm2WfWOX3aGcB2C2Hbry0Q@mail.gmail.com Subject: Re: pg14b1 stuck in lazy_scan_prune/heap_page_prune of pg_statistic Date: 2021-06-14 09:53:47 (in a thread about a PS comment)
Subject: Re: pg14b1 stuck in lazy_scan_prune/heap_page_prune of pg_statistic
Date: 2021-06-14 09:53:47
(in a thread about a PS comment)

#10Simon Riggs
simon@2ndQuadrant.com
In reply to: Bruce Momjian (#8)
Re: Pruning never visible changes

On Mon, 19 Sept 2022 at 00:16, Greg Stark <stark@mit.edu> wrote:

On Fri, 16 Sept 2022 at 10:27, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Simon Riggs <simon.riggs@enterprisedb.com> writes:

A user asked me whether we prune never visible changes, such as
BEGIN;
INSERT...
UPDATE.. (same row)
COMMIT;

Didn't we just have this discussion in another thread?

Well..... not "just" :)

commit 44e4bbf75d56e643b6afefd5cdcffccb68cce414
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: Fri Apr 29 16:29:42 2011 -0400

Remove special case for xmin == xmax in HeapTupleSatisfiesVacuum().

VACUUM was willing to remove a committed-dead tuple immediately if it was
deleted by the same transaction that inserted it. The idea is that such a
tuple could never have been visible to any other transaction, so we don't
need to keep it around to satisfy MVCC snapshots. However, there was
already an exception for tuples that are part of an update chain, and this
exception created a problem: we might remove TOAST tuples (which are never
part of an update chain) while their parent tuple stayed around (if it was
part of an update chain). This didn't pose a problem for most things,
since the parent tuple is indeed dead: no snapshot will ever consider it
visible. But MVCC-safe CLUSTER had a problem, since it will try to copy
RECENTLY_DEAD tuples to the new table. It then has to copy their TOAST
data too, and would fail if VACUUM had already removed the toast tuples.

Easiest fix is to get rid of the special case for xmin == xmax. This may
delay reclaiming dead space for a little bit in some cases, but it's by far
the most reliable way to fix the issue.

Per bug #5998 from Mark Reid. Back-patch to 8.3, which is the oldest
version with MVCC-safe CLUSTER.

Good research Greg, thank you. Only took 10 years for me to notice it
was gone ;-)

--
Simon Riggs http://www.EnterpriseDB.com/

#11Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Simon Riggs (#10)
Re: Pruning never visible changes

On 2022-Sep-22, Simon Riggs wrote:

On Mon, 19 Sept 2022 at 00:16, Greg Stark <stark@mit.edu> wrote:

VACUUM was willing to remove a committed-dead tuple immediately if it was
deleted by the same transaction that inserted it. The idea is that such a
tuple could never have been visible to any other transaction, so we don't
need to keep it around to satisfy MVCC snapshots. However, there was
already an exception for tuples that are part of an update chain, and this
exception created a problem: we might remove TOAST tuples (which are never
part of an update chain) while their parent tuple stayed around (if it was
part of an update chain). This didn't pose a problem for most things,
since the parent tuple is indeed dead: no snapshot will ever consider it
visible. But MVCC-safe CLUSTER had a problem, since it will try to copy
RECENTLY_DEAD tuples to the new table. It then has to copy their TOAST
data too, and would fail if VACUUM had already removed the toast tuples.

Good research Greg, thank you. Only took 10 years for me to notice it
was gone ;-)

But this begs the question: is the proposed change safe, given that
ancient consideration? I don't think TOAST issues have been mentioned
in this thread so far, so I wonder if there is a test case that verifies
that this problem doesn't occur for some reason.

--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/

#12Simon Riggs
simon@2ndQuadrant.com
In reply to: Alvaro Herrera (#11)
Re: Pruning never visible changes

On Thu, 22 Sept 2022 at 15:16, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

On 2022-Sep-22, Simon Riggs wrote:

On Mon, 19 Sept 2022 at 00:16, Greg Stark <stark@mit.edu> wrote:

VACUUM was willing to remove a committed-dead tuple immediately if it was
deleted by the same transaction that inserted it. The idea is that such a
tuple could never have been visible to any other transaction, so we don't
need to keep it around to satisfy MVCC snapshots. However, there was
already an exception for tuples that are part of an update chain, and this
exception created a problem: we might remove TOAST tuples (which are never
part of an update chain) while their parent tuple stayed around (if it was
part of an update chain). This didn't pose a problem for most things,
since the parent tuple is indeed dead: no snapshot will ever consider it
visible. But MVCC-safe CLUSTER had a problem, since it will try to copy
RECENTLY_DEAD tuples to the new table. It then has to copy their TOAST
data too, and would fail if VACUUM had already removed the toast tuples.

Good research Greg, thank you. Only took 10 years for me to notice it
was gone ;-)

But this begs the question: is the proposed change safe, given that
ancient consideration? I don't think TOAST issues have been mentioned
in this thread so far, so I wonder if there is a test case that verifies
that this problem doesn't occur for some reason.

Oh, completely agreed.

I will submit a modified patch that adds a test case and just a
comment to explain why we can't remove such rows.

--
Simon Riggs http://www.EnterpriseDB.com/