Why does ExecComputeStoredGenerated() form a heap tuple

Started by Andres Freundalmost 7 years ago11 messages
#1Andres Freund
andres@anarazel.de

Hi,

while rebasing the remaining tableam patches (luckily a pretty small set
now!), I had a few conflicts with ExecComputeStoredGenerated(). While
resolving I noticed:

oldtuple = ExecFetchSlotHeapTuple(slot, true, &should_free);
newtuple = heap_modify_tuple(oldtuple, tupdesc, values, nulls, replaces);
ExecForceStoreHeapTuple(newtuple, slot);
if (should_free)
heap_freetuple(oldtuple);

MemoryContextSwitchTo(oldContext);

First off, I'm not convinced this is correct:

ISTM you'd need at least an ExecMaterializeSlot() before the
MemoryContextSwitchTo() in ExecComputeStoredGenerated().

But what actually brought me to reply was that it seems like it'll cause
unnecessary slowdowns for !heap AMs. First, it'll form a heaptuple if
the slot isn't in that form, and then it'll cause a conversion by
storing a heap tuple even if the target doesn't use heap representation.

ISTM the above would be much more efficiently - even more efficient if
only heap is used - implemented as something roughly akin to:

slot_getallattrs(slot);
memcpy(values, slot->tts_values, ...);
memcpy(nulls, slot->tts_isnull, ...);

for (int i = 0; i < natts; i++)
{
if (TupleDescAttr(tupdesc, i)->attgenerated == ATTRIBUTE_GENERATED_STORED)
{
values[i] = ...
}
else
values[i] = datumCopy(...);
}

ExecClearTuple(slot);
memcpy(slot->tts_values, values, ...);
memcpy(slot->tts_isnull, nulls, ...);
ExecStoreVirtualTuple(slot);
ExecMaterializeSlot(slot);

that's not perfect, but more efficient than your version...

Greetings,

Andres Freund

#2Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#1)
Re: Why does ExecComputeStoredGenerated() form a heap tuple

Hi,

On 2019-03-30 19:57:44 -0700, Andres Freund wrote:

while rebasing the remaining tableam patches (luckily a pretty small set
now!), I had a few conflicts with ExecComputeStoredGenerated(). While
resolving I noticed:

oldtuple = ExecFetchSlotHeapTuple(slot, true, &should_free);
newtuple = heap_modify_tuple(oldtuple, tupdesc, values, nulls, replaces);
ExecForceStoreHeapTuple(newtuple, slot);
if (should_free)
heap_freetuple(oldtuple);

MemoryContextSwitchTo(oldContext);

First off, I'm not convinced this is correct:

ISTM you'd need at least an ExecMaterializeSlot() before the
MemoryContextSwitchTo() in ExecComputeStoredGenerated().

But what actually brought me to reply was that it seems like it'll cause
unnecessary slowdowns for !heap AMs. First, it'll form a heaptuple if
the slot isn't in that form, and then it'll cause a conversion by
storing a heap tuple even if the target doesn't use heap representation.

ISTM the above would be much more efficiently - even more efficient if
only heap is used - implemented as something roughly akin to:

slot_getallattrs(slot);
memcpy(values, slot->tts_values, ...);
memcpy(nulls, slot->tts_isnull, ...);

for (int i = 0; i < natts; i++)
{
if (TupleDescAttr(tupdesc, i)->attgenerated == ATTRIBUTE_GENERATED_STORED)
{
values[i] = ...
}
else
values[i] = datumCopy(...);
}

ExecClearTuple(slot);
memcpy(slot->tts_values, values, ...);
memcpy(slot->tts_isnull, nulls, ...);
ExecStoreVirtualTuple(slot);
ExecMaterializeSlot(slot);

that's not perfect, but more efficient than your version...

Also, have you actually benchmarked this code? ISTM that adding a
stored generated column would cause quite noticable slowdowns in the
COPY path based on this code.

Greetings,

Andres Freund

#3Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Andres Freund (#1)
Re: Why does ExecComputeStoredGenerated() form a heap tuple

On 2019-03-31 04:57, Andres Freund wrote:

while rebasing the remaining tableam patches (luckily a pretty small set
now!), I had a few conflicts with ExecComputeStoredGenerated(). While
resolving I noticed:

The core of that code was written a long time ago and perhaps hasn't
caught up with all the refactoring going on. I'll look through your
proposal and update the code.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#4Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Andres Freund (#2)
Re: Why does ExecComputeStoredGenerated() form a heap tuple

On 2019-03-31 05:00, Andres Freund wrote:

Also, have you actually benchmarked this code? ISTM that adding a
stored generated column would cause quite noticable slowdowns in the
COPY path based on this code.

Yes, it'll be slower than not having it, but it's much faster than the
equivalent trigger.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#5Andres Freund
andres@anarazel.de
In reply to: Peter Eisentraut (#4)
Re: Why does ExecComputeStoredGenerated() form a heap tuple

Hi,

On 2019-04-01 11:25:46 +0200, Peter Eisentraut wrote:

On 2019-03-31 05:00, Andres Freund wrote:

Also, have you actually benchmarked this code? ISTM that adding a
stored generated column would cause quite noticable slowdowns in the
COPY path based on this code.

Yes, it'll be slower than not having it, but it's much faster than the
equivalent trigger.

It at the moment is quite noticably slower than directly inserting the
generated column.

postgres[11993][1]=# CREATE TABLE foo_without_generated(id int, copy_of_int int);
CREATE TABLE
Time: 0.625 ms
postgres[11993][1]=# CREATE TABLE foo_with_generated(id int, copy_of_int int generated always as (id) stored);
CREATE TABLE
Time: 0.771 ms
postgres[11993][1]=# INSERT INTO foo_without_generated SELECT g.i, g.i FROM generate_series(1, 1000000) g(i);
INSERT 0 1000000
Time: 691.533 ms
postgres[11993][1]=# INSERT INTO foo_with_generated SELECT g.i FROM generate_series(1, 1000000) g(i);
INSERT 0 1000000
Time: 825.471 ms
postgres[11993][1]=# COPY foo_without_generated TO '/tmp/foo_without_generated';
COPY 1000000
Time: 194.051 ms
postgres[11993][1]=# COPY foo_with_generated TO '/tmp/foo_with_generated';
COPY 1000000
Time: 153.146 ms
postgres[11993][1]=# ;TRUNCATE foo_without_generated ;COPY foo_without_generated FROM '/tmp/foo_without_generated';
Time: 0.178 ms
TRUNCATE TABLE
Time: 8.456 ms
COPY 1000000
Time: 394.990 ms
postgres[11993][1]=# ;TRUNCATE foo_with_generated ;COPY foo_with_generated FROM '/tmp/foo_with_generated';
Time: 0.147 ms
TRUNCATE TABLE
Time: 8.043 ms
COPY 1000000
Time: 508.918 ms

From a quick profile that's indeed largely because
ExecComputeStoredGenerated() is really inefficient - and it seems
largely unnecessarily so. I think this should at least be roughly as
efficient as getting the additional data from the client.

Minor other point: I'm not a fan of defining more general infrastructure
like ExecComputedStoredGenerated() in nodeModifyTable.c - it's already
large and confusing, and it's not obvious that e.g. COPY would call into
it.

Greetings,

Andres Freund

#6Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Peter Eisentraut (#3)
1 attachment(s)
Re: Why does ExecComputeStoredGenerated() form a heap tuple

On 2019-04-01 11:23, Peter Eisentraut wrote:

On 2019-03-31 04:57, Andres Freund wrote:

while rebasing the remaining tableam patches (luckily a pretty small set
now!), I had a few conflicts with ExecComputeStoredGenerated(). While
resolving I noticed:

The core of that code was written a long time ago and perhaps hasn't
caught up with all the refactoring going on. I'll look through your
proposal and update the code.

The attached patch is based on your sketch. It's clearly better in the
long term not to rely on heap tuples here. But in testing this change
seems to make it slightly slower, certainly not a speedup as you were
apparently hoping for.

Test setup:

create table t0 (a int, b int);
insert into t0 select generate_series (1, 10000000); -- 10 million
\copy t0 (a) to 'test.dat';

-- for comparison, without generated column
truncate t0;
\copy t0 (a) from 'test.dat';

-- master
create table t1 (a int, b int generated always as (a * 2) stored);
truncate t1;
\copy t1 (a) from 'test.dat';

-- patched
truncate t1;
\copy t1 (a) from 'test.dat';

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

0001-Don-t-form-heap-tuple-in-ExecComputeStoredGenerated.patchtext/plain; charset=UTF-8; name=0001-Don-t-form-heap-tuple-in-ExecComputeStoredGenerated.patch; x-mac-creator=0; x-mac-type=0Download
From d87b94d555c1d863a65e738a6953a1c87b0af2b2 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Tue, 23 Apr 2019 10:18:39 +0200
Subject: [PATCH] Don't form heap tuple in ExecComputeStoredGenerated

---
 src/backend/executor/nodeModifyTable.c | 27 ++++++++++++--------------
 1 file changed, 12 insertions(+), 15 deletions(-)

diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 444c0c0574..216c3baa64 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -53,6 +53,7 @@
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "utils/builtins.h"
+#include "utils/datum.h"
 #include "utils/memutils.h"
 #include "utils/rel.h"
 
@@ -254,9 +255,6 @@ ExecComputeStoredGenerated(EState *estate, TupleTableSlot *slot)
 	MemoryContext oldContext;
 	Datum	   *values;
 	bool	   *nulls;
-	bool	   *replaces;
-	HeapTuple	oldtuple, newtuple;
-	bool		should_free;
 
 	Assert(tupdesc->constr && tupdesc->constr->has_generated_stored);
 
@@ -294,7 +292,10 @@ ExecComputeStoredGenerated(EState *estate, TupleTableSlot *slot)
 
 	values = palloc(sizeof(*values) * natts);
 	nulls = palloc(sizeof(*nulls) * natts);
-	replaces = palloc0(sizeof(*replaces) * natts);
+
+	slot_getallattrs(slot);
+	memcpy(values, slot->tts_values, sizeof(*values) * natts);
+	memcpy(nulls, slot->tts_isnull, sizeof(*nulls) * natts);
 
 	for (int i = 0; i < natts; i++)
 	{
@@ -311,20 +312,16 @@ ExecComputeStoredGenerated(EState *estate, TupleTableSlot *slot)
 
 			values[i] = val;
 			nulls[i] = isnull;
-			replaces[i] = true;
 		}
+		else
+			values[i] = datumCopy(values[i], TupleDescAttr(tupdesc, i)->attbyval, TupleDescAttr(tupdesc, i)->attlen);
 	}
 
-	oldtuple = ExecFetchSlotHeapTuple(slot, true, &should_free);
-	newtuple = heap_modify_tuple(oldtuple, tupdesc, values, nulls, replaces);
-	/*
-	 * The tuple will be freed by way of the memory context - the slot might
-	 * only be cleared after the context is reset, and we'd thus potentially
-	 * double free.
-	 */
-	ExecForceStoreHeapTuple(newtuple, slot, false);
-	if (should_free)
-		heap_freetuple(oldtuple);
+	ExecClearTuple(slot);
+	memcpy(slot->tts_values, values, sizeof(*values) * natts);
+	memcpy(slot->tts_isnull, nulls, sizeof(*nulls) * natts);
+	ExecStoreVirtualTuple(slot);
+	ExecMaterializeSlot(slot);
 
 	MemoryContextSwitchTo(oldContext);
 }
-- 
2.21.0

#7David Rowley
david.rowley@2ndquadrant.com
In reply to: Peter Eisentraut (#6)
Re: Why does ExecComputeStoredGenerated() form a heap tuple

On Tue, 23 Apr 2019 at 20:23, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

The attached patch is based on your sketch. It's clearly better in the
long term not to rely on heap tuples here. But in testing this change
seems to make it slightly slower, certainly not a speedup as you were
apparently hoping for.

Test setup:

create table t0 (a int, b int);
insert into t0 select generate_series (1, 10000000); -- 10 million
\copy t0 (a) to 'test.dat';

-- for comparison, without generated column
truncate t0;
\copy t0 (a) from 'test.dat';

-- master
create table t1 (a int, b int generated always as (a * 2) stored);
truncate t1;
\copy t1 (a) from 'test.dat';

-- patched
truncate t1;
\copy t1 (a) from 'test.dat';

I didn't do the exact same test, but if I use COPY instead of \copy,
then for me patched is faster.

Normal table:

postgres=# copy t0 (a) from '/home/drowley/test.dat';
COPY 10000000
Time: 5437.768 ms (00:05.438)
postgres=# truncate t0;
TRUNCATE TABLE
Time: 20.775 ms
postgres=# copy t0 (a) from '/home/drowley/test.dat';
COPY 10000000
Time: 5272.228 ms (00:05.272)

Master:

postgres=# copy t1 (a) from '/home/drowley/test.dat';
COPY 10000000
Time: 6570.031 ms (00:06.570)
postgres=# truncate t1;
TRUNCATE TABLE
Time: 17.813 ms
postgres=# copy t1 (a) from '/home/drowley/test.dat';
COPY 10000000
Time: 6486.253 ms (00:06.486)

Patched:

postgres=# copy t1 (a) from '/home/drowley/test.dat';
COPY 10000000
Time: 5359.338 ms (00:05.359)
postgres=# truncate table t1;
TRUNCATE TABLE
Time: 25.551 ms
postgres=# copy t1 (a) from '/home/drowley/test.dat';
COPY 10000000
Time: 5347.596 ms (00:05.348)

For the patch, I wonder if you need this line:

+ memcpy(values, slot->tts_values, sizeof(*values) * natts);

If you got rid of that and changed the datumCopy to use
slot->tts_values[i] instead.

Maybe it's also worth getting rid of the first memcpy for the null
array and just assign the element in the else clause.

It might also be cleaner to assign TupleDescAttr(tupdesc, i) to a
variable instead of using the macro 3 times. It'd make that datumCopy
line shorter too.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#8Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: David Rowley (#7)
1 attachment(s)
Re: Why does ExecComputeStoredGenerated() form a heap tuple

On 2019-04-24 00:26, David Rowley wrote:

I didn't do the exact same test, but if I use COPY instead of \copy,
then for me patched is faster.

OK, confirmed that way, too.

For the patch, I wonder if you need this line:

+ memcpy(values, slot->tts_values, sizeof(*values) * natts);

If you got rid of that and changed the datumCopy to use
slot->tts_values[i] instead.

done

Maybe it's also worth getting rid of the first memcpy for the null
array and just assign the element in the else clause.

Tried that, seems to be slower. So I left it as is.

It might also be cleaner to assign TupleDescAttr(tupdesc, i) to a
variable instead of using the macro 3 times. It'd make that datumCopy
line shorter too.

Also done.

Updated patch attached.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

v2-0001-Convert-ExecComputeStoredGenerated-to-use-tuple-s.patchtext/plain; charset=UTF-8; name=v2-0001-Convert-ExecComputeStoredGenerated-to-use-tuple-s.patch; x-mac-creator=0; x-mac-type=0Download
From 8617ec7576c88081f85e4498ab558e7c5aa909ec Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Wed, 15 May 2019 19:37:52 +0200
Subject: [PATCH v2] Convert ExecComputeStoredGenerated to use tuple slots

This code was still using the old style of forming a heap tuple rather
than using tuple slots.  This would be less efficient if a non-heap
access method was used.  And using tuple slots is actually quite a bit
faster when using heap as well.

Also add some test cases for generated columns with null values and
with varlena values.  This lack of coverage was discovered while
working on this patch.

Discussion: https://www.postgresql.org/message-id/flat/20190331025744.ugbsyks7czfcoksd%40alap3.anarazel.de
---
 src/backend/executor/nodeModifyTable.c  | 33 +++++++++++++------------
 src/test/regress/expected/generated.out | 31 ++++++++++++++++++++---
 src/test/regress/sql/generated.sql      | 10 ++++++--
 3 files changed, 52 insertions(+), 22 deletions(-)

diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index d545bbce8a..d3a0dece5a 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -53,6 +53,7 @@
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "utils/builtins.h"
+#include "utils/datum.h"
 #include "utils/memutils.h"
 #include "utils/rel.h"
 
@@ -254,9 +255,6 @@ ExecComputeStoredGenerated(EState *estate, TupleTableSlot *slot)
 	MemoryContext oldContext;
 	Datum	   *values;
 	bool	   *nulls;
-	bool	   *replaces;
-	HeapTuple	oldtuple, newtuple;
-	bool		should_free;
 
 	Assert(tupdesc->constr && tupdesc->constr->has_generated_stored);
 
@@ -294,11 +292,15 @@ ExecComputeStoredGenerated(EState *estate, TupleTableSlot *slot)
 
 	values = palloc(sizeof(*values) * natts);
 	nulls = palloc(sizeof(*nulls) * natts);
-	replaces = palloc0(sizeof(*replaces) * natts);
+
+	slot_getallattrs(slot);
+	memcpy(nulls, slot->tts_isnull, sizeof(*nulls) * natts);
 
 	for (int i = 0; i < natts; i++)
 	{
-		if (TupleDescAttr(tupdesc, i)->attgenerated == ATTRIBUTE_GENERATED_STORED)
+		Form_pg_attribute attr = TupleDescAttr(tupdesc, i);
+
+		if (attr->attgenerated == ATTRIBUTE_GENERATED_STORED)
 		{
 			ExprContext *econtext;
 			Datum		val;
@@ -311,20 +313,19 @@ ExecComputeStoredGenerated(EState *estate, TupleTableSlot *slot)
 
 			values[i] = val;
 			nulls[i] = isnull;
-			replaces[i] = true;
+		}
+		else
+		{
+			if (!nulls[i])
+				values[i] = datumCopy(slot->tts_values[i], attr->attbyval, attr->attlen);
 		}
 	}
 
-	oldtuple = ExecFetchSlotHeapTuple(slot, true, &should_free);
-	newtuple = heap_modify_tuple(oldtuple, tupdesc, values, nulls, replaces);
-	/*
-	 * The tuple will be freed by way of the memory context - the slot might
-	 * only be cleared after the context is reset, and we'd thus potentially
-	 * double free.
-	 */
-	ExecForceStoreHeapTuple(newtuple, slot, false);
-	if (should_free)
-		heap_freetuple(oldtuple);
+	ExecClearTuple(slot);
+	memcpy(slot->tts_values, values, sizeof(*values) * natts);
+	memcpy(slot->tts_isnull, nulls, sizeof(*nulls) * natts);
+	ExecStoreVirtualTuple(slot);
+	ExecMaterializeSlot(slot);
 
 	MemoryContextSwitchTo(oldContext);
 }
diff --git a/src/test/regress/expected/generated.out b/src/test/regress/expected/generated.out
index d4ed3f7ae1..f62c93f468 100644
--- a/src/test/regress/expected/generated.out
+++ b/src/test/regress/expected/generated.out
@@ -226,15 +226,16 @@ NOTICE:  merging multiple inherited definitions of column "b"
 ERROR:  inherited column "b" has a generation conflict
 DROP TABLE gtesty;
 -- test stored update
-CREATE TABLE gtest3 (a int PRIMARY KEY, b int GENERATED ALWAYS AS (a * 3) STORED);
-INSERT INTO gtest3 (a) VALUES (1), (2), (3);
+CREATE TABLE gtest3 (a int, b int GENERATED ALWAYS AS (a * 3) STORED);
+INSERT INTO gtest3 (a) VALUES (1), (2), (3), (NULL);
 SELECT * FROM gtest3 ORDER BY a;
  a | b 
 ---+---
  1 | 3
  2 | 6
  3 | 9
-(3 rows)
+   |  
+(4 rows)
 
 UPDATE gtest3 SET a = 22 WHERE a = 2;
 SELECT * FROM gtest3 ORDER BY a;
@@ -243,7 +244,29 @@ SELECT * FROM gtest3 ORDER BY a;
   1 |  3
   3 |  9
  22 | 66
-(3 rows)
+    |   
+(4 rows)
+
+CREATE TABLE gtest3a (a text, b text GENERATED ALWAYS AS (a || '+' || a) STORED);
+INSERT INTO gtest3a (a) VALUES ('a'), ('b'), ('c'), (NULL);
+SELECT * FROM gtest3a ORDER BY a;
+ a |  b  
+---+-----
+ a | a+a
+ b | b+b
+ c | c+c
+   | 
+(4 rows)
+
+UPDATE gtest3a SET a = 'bb' WHERE a = 'b';
+SELECT * FROM gtest3a ORDER BY a;
+ a  |   b   
+----+-------
+ a  | a+a
+ bb | bb+bb
+ c  | c+c
+    | 
+(4 rows)
 
 -- COPY
 TRUNCATE gtest1;
diff --git a/src/test/regress/sql/generated.sql b/src/test/regress/sql/generated.sql
index da11b8c9b8..6a56ae260f 100644
--- a/src/test/regress/sql/generated.sql
+++ b/src/test/regress/sql/generated.sql
@@ -95,12 +95,18 @@ CREATE TABLE gtest1_2 () INHERITS (gtest1, gtesty);  -- error
 DROP TABLE gtesty;
 
 -- test stored update
-CREATE TABLE gtest3 (a int PRIMARY KEY, b int GENERATED ALWAYS AS (a * 3) STORED);
-INSERT INTO gtest3 (a) VALUES (1), (2), (3);
+CREATE TABLE gtest3 (a int, b int GENERATED ALWAYS AS (a * 3) STORED);
+INSERT INTO gtest3 (a) VALUES (1), (2), (3), (NULL);
 SELECT * FROM gtest3 ORDER BY a;
 UPDATE gtest3 SET a = 22 WHERE a = 2;
 SELECT * FROM gtest3 ORDER BY a;
 
+CREATE TABLE gtest3a (a text, b text GENERATED ALWAYS AS (a || '+' || a) STORED);
+INSERT INTO gtest3a (a) VALUES ('a'), ('b'), ('c'), (NULL);
+SELECT * FROM gtest3a ORDER BY a;
+UPDATE gtest3a SET a = 'bb' WHERE a = 'b';
+SELECT * FROM gtest3a ORDER BY a;
+
 -- COPY
 TRUNCATE gtest1;
 INSERT INTO gtest1 (a) VALUES (1), (2);

base-commit: aa4b8c61d2cd57b53be03defb04d59b232a0e150
-- 
2.21.0

#9David Rowley
david.rowley@2ndquadrant.com
In reply to: Peter Eisentraut (#8)
Re: Why does ExecComputeStoredGenerated() form a heap tuple

On Thu, 16 May 2019 at 05:44, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

Updated patch attached.

This patch looks okay to me.

It's not for this patch, or probably for PG12, but it would be good if
we could avoid the formation of the Tuple until right before the new
tuple is inserted.

I see heap_form_tuple() is called 3 times for a single INSERT with:

create table t (a text, b text, c text generated always as (b || b) stored);

create or replace function t_trigger() returns trigger as $$
begin
NEW.b = UPPER(NEW.a);
RETURN NEW;
end;
$$ language plpgsql;

create trigger t_on_insert before insert on t for each row execute
function t_trigger();

insert into t (a) values('one');

and heap_deform_tuple() is called once for each additional
heap_form_tuple(). That's pretty wasteful :-(

Maybe Andres can explain if this is really required, or if it's just
something that's not well optimised yet.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#10Andres Freund
andres@anarazel.de
In reply to: David Rowley (#9)
Re: Why does ExecComputeStoredGenerated() form a heap tuple

Hi,

On 2019-05-20 14:23:34 +1200, David Rowley wrote:

It's not for this patch, or probably for PG12, but it would be good if
we could avoid the formation of the Tuple until right before the new
tuple is inserted.

I see heap_form_tuple() is called 3 times for a single INSERT with:

create table t (a text, b text, c text generated always as (b || b) stored);

create or replace function t_trigger() returns trigger as $$
begin
NEW.b = UPPER(NEW.a);
RETURN NEW;
end;
$$ language plpgsql;

create trigger t_on_insert before insert on t for each row execute
function t_trigger();

insert into t (a) values('one');

and heap_deform_tuple() is called once for each additional
heap_form_tuple(). That's pretty wasteful :-(

Maybe Andres can explain if this is really required, or if it's just
something that's not well optimised yet.

I think we can optimize this further, but it's not unexpected.

I see:

1) ExecCopySlot() call in in ExecModifyTable(). For INSERT SELECT the
input will be in a virtual slot. We might be able to have some
trickery to avoid this one in some case. Not sure how much it'd help
- I think we most of the time would just move the forming of the
tuple around - ExecInsert() wants to materialize the slot.

2) plpgsql form/deform due to updating a field. I don't see how we could
easily fix that. We'd have to invent a mechanism that allows plpgsql to pass
slots around. I guess it's possible you could make that work somehow?
But I think we'd also need to change the external trigger interface -
which currently specifies that the return type is a HeapTuple

3) ExecComputeStoredGenerated(). I suspect it's not particularly useful
to get rid of the heap_form_tuple (from with ExecMaterialize())
here. When actually inserting we'll have to actually form the tuple
anyway. But what I do wonder is whether it would make sense to move
the materialization outside of that function. If there's constraints,
or partitioning, we'll have to deform (parts of) the tuple, to access
the necessary columns.

Currently materializing an unmaterialized slot (i.e. making it
independent from anything but memory referenced by the slot) also means
that later accesses will need to deform again. I'm fairly sure we can
improve that for many cases (IIRC we don't need to that for virtual
slots, but that's irrelevant here).

I'm pretty sure we get rid of most of this, but it'll be some work. I'm
also not sure how important it is - for INSERT/UPDATE, in how many cases
is the bottleneck those copies, rather than other parts of query
execution? I suspect you can measure it for some INSERT ... SELECT type
cases - but probably the overhead of triggers and GENERATED is going to
be higher.

Greetings,

Andres Freund

#11Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: David Rowley (#9)
Re: Why does ExecComputeStoredGenerated() form a heap tuple

On 2019-05-20 04:23, David Rowley wrote:

On Thu, 16 May 2019 at 05:44, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

Updated patch attached.

This patch looks okay to me.

committed

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services