When UPDATE a row in a table with BEFORE ROW UPDATE trigger, the XMAX of new tuple is set to current XID

Started by Charles Qi8 months ago7 messagesgeneral
Jump to latest
#1Charles Qi
qyqgpower@gmail.com

And if we do the updates in multiple subtransactions, multixact will be
created, which is not created when the BEFORE ROW UPDATE trigger is absent.

Is this behavior by design? If so, what is the purpose for the behavior?

Tested version:
PostgreSQL 14.18 (Ubuntu 14.18-0ubuntu0.22.04.1) on x86_64-pc-linux-gnu,
compiled by gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, 64-bit

The attached file reproduce.sql can be used to reproduce the behavior.

Attachments:

reproduce.sqlapplication/octet-stream; name=reproduce.sqlDownload
#2Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Charles Qi (#1)
Re: When UPDATE a row in a table with BEFORE ROW UPDATE trigger, the XMAX of new tuple is set to current XID

On 8/6/25 05:37, Charles Qi wrote:

And if we do the updates in multiple subtransactions, multixact will be
created, which is not created when the BEFORE ROW UPDATE trigger is absent.

Is this behavior by design? If so, what is the purpose for the behavior?

I would say this is by design. My reasoning is that the savepoints are
essentially rollback points and the state of the tuple would need to be
saved for each potential rollback. Hence a different transaction id for
each savepoint.

Tested version:
PostgreSQL 14.18 (Ubuntu 14.18-0ubuntu0.22.04.1) on x86_64-pc-linux-gnu,
compiled by gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, 64-bit

The attached file reproduce.sql can be used to reproduce the behavior.

--
Adrian Klaver
adrian.klaver@aklaver.com

#3Charles Qi
qyqgpower@gmail.com
In reply to: Adrian Klaver (#2)
Re: When UPDATE a row in a table with BEFORE ROW UPDATE trigger, the XMAX of new tuple is set to current XID

As I stated before, when the BEFORE ROW UPDATE trigger is absent, even
if we update the same row in multiple subtransactions inside one top
transaction, no multixact will be created.
Check the attached no_multi.sql for example.

Let me clarify the question, when the BEFORE ROW UPDATE trigger is presented
Q. Why do we need to set the XMAX of the new tuple to the current xid?
which risks piling up multixacts quickly in savepoint/exception block
scenarios.

Show quoted text

On Thu, Aug 7, 2025 at 2:22 AM Adrian Klaver <adrian.klaver@aklaver.com> wrote:

On 8/6/25 05:37, Charles Qi wrote:

And if we do the updates in multiple subtransactions, multixact will be
created, which is not created when the BEFORE ROW UPDATE trigger is absent.

Is this behavior by design? If so, what is the purpose for the behavior?

I would say this is by design. My reasoning is that the savepoints are
essentially rollback points and the state of the tuple would need to be
saved for each potential rollback. Hence a different transaction id for
each savepoint.

Tested version:
PostgreSQL 14.18 (Ubuntu 14.18-0ubuntu0.22.04.1) on x86_64-pc-linux-gnu,
compiled by gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, 64-bit

The attached file reproduce.sql can be used to reproduce the behavior.

--
Adrian Klaver
adrian.klaver@aklaver.com

Attachments:

no_multi.sqlapplication/octet-stream; name=no_multi.sqlDownload
#4Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Charles Qi (#3)
Re: When UPDATE a row in a table with BEFORE ROW UPDATE trigger, the XMAX of new tuple is set to current XID

On 8/7/25 18:20, Charles Qi wrote:

As I stated before, when the BEFORE ROW UPDATE trigger is absent, even
if we update the same row in multiple subtransactions inside one top
transaction, no multixact will be created.
Check the attached no_multi.sql for example.

Yeah, I saw that in your previous example. I would say it is evidence
that this due to the trigger/function combination, unfortunately an
exact answer to this is beyond my present knowledge. I will dig into
this when I get a chance, but in the meantime hopefully someone with
more expertise will provide the answer.

Let me clarify the question, when the BEFORE ROW UPDATE trigger is presented
Q. Why do we need to set the XMAX of the new tuple to the current xid?
which risks piling up multixacts quickly in savepoint/exception block
scenarios.

The information here:

https://www.postgresql.org/docs/current/routine-vacuuming.html#VACUUM-FOR-MULTIXACT-WRAPAROUND

shows that multixact system can handle fairly high levels of activity.

Do you think your normal operations are going to exceed the values shown
below?

Following the configuration links gets you:

vacuum_multixact_freeze_min_age (integer)

Specifies the cutoff age (in multixacts) that VACUUM should use to
decide whether to trigger freezing of pages with an older multixact ID.
The default is 5 million multixacts. Although users can set this value
anywhere from zero to one billion, VACUUM will silently limit the
effective value to half the value of
autovacuum_multixact_freeze_max_age, so that there is not an
unreasonably short time between forced autovacuums. For more information
see Section 24.1.5.1.

vacuum_multixact_freeze_table_age (integer)

VACUUM performs an aggressive scan if the table's
pg_class.relminmxid field has reached the age specified by this setting.
An aggressive scan differs from a regular VACUUM in that it visits every
page that might contain unfrozen XIDs or MXIDs, not just those that
might contain dead tuples. The default is 150 million multixacts.
Although users can set this value anywhere from zero to two billion,
VACUUM will silently limit the effective value to 95% of
autovacuum_multixact_freeze_max_age, so that a periodic manual VACUUM
has a chance to run before an anti-wraparound is launched for the table.
For more information see Section 24.1.5.1.

autovacuum_multixact_freeze_max_age (integer)

Specifies the maximum age (in multixacts) that a table's
pg_class.relminmxid field can attain before a VACUUM operation is forced
to prevent multixact ID wraparound within the table. Note that the
system will launch autovacuum processes to prevent wraparound even when
autovacuum is otherwise disabled.

Vacuuming multixacts also allows removal of old files from the
pg_multixact/members and pg_multixact/offsets subdirectories, which is
why the default is a relatively low 400 million multixacts. This
parameter can only be set at server start, but the setting can be
reduced for individual tables by changing table storage parameters. For
more information see Section 24.1.5.1.

On Thu, Aug 7, 2025 at 2:22 AM Adrian Klaver <adrian.klaver@aklaver.com> wrote:

On 8/6/25 05:37, Charles Qi wrote:

And if we do the updates in multiple subtransactions, multixact will be
created, which is not created when the BEFORE ROW UPDATE trigger is absent.

Is this behavior by design? If so, what is the purpose for the behavior?

I would say this is by design. My reasoning is that the savepoints are
essentially rollback points and the state of the tuple would need to be
saved for each potential rollback. Hence a different transaction id for
each savepoint.

Tested version:
PostgreSQL 14.18 (Ubuntu 14.18-0ubuntu0.22.04.1) on x86_64-pc-linux-gnu,
compiled by gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, 64-bit

The attached file reproduce.sql can be used to reproduce the behavior.

--
Adrian Klaver
adrian.klaver@aklaver.com

--
Adrian Klaver
adrian.klaver@aklaver.com

#5Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Charles Qi (#3)
Re: When UPDATE a row in a table with BEFORE ROW UPDATE trigger, the XMAX of new tuple is set to current XID

On Fri, 2025-08-08 at 09:20 +0800, Charles Qi wrote:

Let me clarify the question, when the BEFORE ROW UPDATE trigger is presented
Q. Why do we need to set the XMAX of the new tuple to the current xid?

Because the row gets locked, I'd say (without looking at your example).

which risks piling up multixacts quickly in savepoint/exception block
scenarios.

Why is that a problem for you?

Perhaps the trigger could use SELECT ... FOR ... to lock the row in the
strongest level your transaction needs. A multixact is only necessary
if a subtransaction needs to take a stronger lock on the row than what
was there before.

Yours,
Laurenz Albe

#6Charles Qi
qyqgpower@gmail.com
In reply to: Laurenz Albe (#5)
Re: When UPDATE a row in a table with BEFORE ROW UPDATE trigger, the XMAX of new tuple is set to current XID

On Mon, Aug 11, 2025 at 3:34 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:

On Fri, 2025-08-08 at 09:20 +0800, Charles Qi wrote:

Let me clarify the question, when the BEFORE ROW UPDATE trigger is presented
Q. Why do we need to set the XMAX of the new tuple to the current xid?

Because the row gets locked, I'd say (without looking at your example).

With or without the trigger, the row gets locked and unlocked while
the update is doing its thing.
The problem here is that HEAP_XMAX_KEYSHR_LOCK and XMAX are set with
the trigger even if the update transaction is finished, while both are
not set without the trigger.

which risks piling up multixacts quickly in savepoint/exception block
scenarios.

Why is that a problem for you?

Perhaps the trigger could use SELECT ... FOR ... to lock the row in the
strongest level your transaction needs. A multixact is only necessary
if a subtransaction needs to take a stronger lock on the row than what
was there before.

Yours,
Laurenz Albe

The piling up of multixacts are related to the performance topic,
which is not in the scope of this mail.

The trigger function in example is doing nothing but return new, the
row is actually locked by the trigger itself (trigger.c >
ExecBRUpdateTriggers > GetTupleForTrigger)
You mentioned a very important behavior:

A multixact is only necessary
if a subtransaction needs to take a stronger lock on the row than what
was there before.

We are doing two no key updates in example, and should not need a
stronger lock on the same row.

#7Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Charles Qi (#6)
Re: When UPDATE a row in a table with BEFORE ROW UPDATE trigger, the XMAX of new tuple is set to current XID

On Mon, 2025-08-11 at 11:34 +0800, Charles Qi wrote:

The trigger function in example is doing nothing but return new, the
row is actually locked by the trigger itself (trigger.c >
ExecBRUpdateTriggers > GetTupleForTrigger)

You mentioned a very important behavior:

A multixact is only necessary
if a subtransaction needs to take a stronger lock on the row than what
was there before.

We are doing two no key updates in example, and should not need a
stronger lock on the same row.

Still, you could explicitly lock the row in the trigger function with
a high enough lock level to avoid a multixact being created later on.

Yours,
Laurenz Albe