Optimize update of tables with generated columns

Started by Peter Eisentraut over 6 years ago · 4 messages · hackers
#1 Peter Eisentraut
peter_e@gmx.net

When updating a table row with generated columns, we only need to
recompute those generated columns whose base columns have changed in
this update and keep the rest unchanged. This can result in a
significant performance benefit (easy to reproduce for example with a
tsvector column). The required information was already kept in
RangeTblEntry.extraUpdatedCols; we just have to make use of it.
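As a rough illustration of the idea (a toy Python model, not the patch itself): each generated column records the base columns it reads, and an update recomputes only those generated columns whose base columns overlap the set of updated columns. All names here (`extra_updated_cols`, `apply_update`, the `defs` layout) are hypothetical and only loosely mirror what RangeTblEntry.extraUpdatedCols is described as holding above.

```python
# Toy model only: these names are hypothetical, not PostgreSQL APIs.
from typing import Any, Callable, Dict, FrozenSet, Set, Tuple

Row = Dict[str, Any]
# generated column name -> (base columns it reads, expression over the row)
GeneratedDefs = Dict[str, Tuple[FrozenSet[str], Callable[[Row], Any]]]


def extra_updated_cols(defs: GeneratedDefs, updated: Set[str]) -> Set[str]:
    """Return the generated columns whose base columns overlap `updated`."""
    return {name for name, (bases, _) in defs.items() if bases & updated}


def apply_update(old_row: Row, changes: Row, defs: GeneratedDefs) -> Tuple[Row, Set[str]]:
    """Apply `changes`, recomputing only the affected generated columns."""
    new_row = dict(old_row)
    new_row.update(changes)
    affected = extra_updated_cols(defs, set(changes))
    for name in affected:
        _, expr = defs[name]
        new_row[name] = expr(new_row)
    # Generated columns not in `affected` keep their old stored values.
    return new_row, affected


defs: GeneratedDefs = {
    # Stand-in for an expensive expression such as building a tsvector.
    "search": (frozenset({"title", "body"}),
               lambda r: sorted(set((r["title"] + " " + r["body"]).lower().split()))),
    "total": (frozenset({"price", "qty"}), lambda r: r["price"] * r["qty"]),
}
row = {"title": "Hello", "body": "big world", "price": 3, "qty": 2,
       "search": ["big", "hello", "world"], "total": 6}

# Only "qty" changes, so only "total" is recomputed; "search" is left alone.
new_row, recomputed = apply_update(row, {"qty": 5}, defs)
```

In this sketch the expensive "search" expression is skipped whenever neither "title" nor "body" is part of the update, which is where the saving would come from.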

A small problem is that right now ExecSimpleRelationUpdate() does not
populate extraUpdatedCols. That needs fixing first. This is also
related to the issue discussed in "logical replication does not fire
per-column triggers" [0]. I'll leave my patch here while that issue is
being resolved.

[0]: /messages/by-id/21673e2d-597c-6afe-637e-e8b10425b240@2ndquadrant.com

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

0001-Optimize-update-of-tables-with-generated-columns.patch (text/plain; charset=UTF-8), +38 -11

#2 Peter Eisentraut
peter_e@gmx.net
In reply to: Peter Eisentraut (#1)
Re: Optimize update of tables with generated columns

On 2019-12-21 07:47, Peter Eisentraut wrote:

> When updating a table row with generated columns, we only need to
> recompute those generated columns whose base columns have changed in
> this update and keep the rest unchanged. This can result in a
> significant performance benefit (easy to reproduce for example with a
> tsvector column). The required information was already kept in
> RangeTblEntry.extraUpdatedCols; we just have to make use of it.
>
> A small problem is that right now ExecSimpleRelationUpdate() does not
> populate extraUpdatedCols. That needs fixing first.

Here is an updated patch set that contains a fix for the issue above
(should be backpatched IMO) and the actual performance patch as before.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

v2-0001-Fill-in-extraUpdatedCols-in-logical-replication.patch (text/plain; charset=UTF-8), +17 -7
v2-0002-Optimize-update-of-tables-with-generated-columns.patch (text/plain; charset=UTF-8), +38 -11

#3 Pavel Stehule
pavel.stehule@gmail.com
In reply to: Peter Eisentraut (#2)
Re: Optimize update of tables with generated columns

On Thu, 13 Feb 2020 at 14:40, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

> On 2019-12-21 07:47, Peter Eisentraut wrote:
>
>> When updating a table row with generated columns, we only need to
>> recompute those generated columns whose base columns have changed in
>> this update and keep the rest unchanged. This can result in a
>> significant performance benefit (easy to reproduce for example with a
>> tsvector column). The required information was already kept in
>> RangeTblEntry.extraUpdatedCols; we just have to make use of it.
>>
>> A small problem is that right now ExecSimpleRelationUpdate() does not
>> populate extraUpdatedCols. That needs fixing first.
>
> Here is an updated patch set that contains a fix for the issue above
> (should be backpatched IMO) and the actual performance patch as before.

+1

I ran check-world without problems, and the changes in the patch make
sense to me.

Regards

Pavel

#4 Peter Eisentraut
peter_e@gmx.net
In reply to: Pavel Stehule (#3)
Re: Optimize update of tables with generated columns

On 2020-02-13 16:16, Pavel Stehule wrote:

> On Thu, 13 Feb 2020 at 14:40, Peter Eisentraut
> <peter.eisentraut@2ndquadrant.com> wrote:
>
>> On 2019-12-21 07:47, Peter Eisentraut wrote:
>>
>>> When updating a table row with generated columns, we only need to
>>> recompute those generated columns whose base columns have changed in
>>> this update and keep the rest unchanged. This can result in a
>>> significant performance benefit (easy to reproduce for example with a
>>> tsvector column). The required information was already kept in
>>> RangeTblEntry.extraUpdatedCols; we just have to make use of it.
>>>
>>> A small problem is that right now ExecSimpleRelationUpdate() does not
>>> populate extraUpdatedCols. That needs fixing first.
>>
>> Here is an updated patch set that contains a fix for the issue above
>> (should be backpatched IMO) and the actual performance patch as before.
>
> +1
>
> I ran check-world without problems, and the changes in the patch make
> sense to me.

Committed, thanks.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services