MAIN, Uncompressed?

Started by Simon Riggsover 8 years ago14 messages
#1Simon Riggs
simon@2ndquadrant.com

It looks like we need a new Column Storage option for MAIN, Uncompressed.

We have these Column Storage options
Plain - inline only, uncompressed
Main - inline until external as last resort, compressible
External - external, uncompressed
Extended - external, compressible

So there is no option for Main, but not compressible...

With reference to code... there seems to be no way to skip step 3

/* ----------
* Compress and/or save external until data fits into target length
*
* 1: Inline compress attributes with attstorage 'x', and store very
* large attributes with attstorage 'x' or 'e' external immediately
* 2: Store attributes with attstorage 'x' or 'e' external
* 3: Inline compress attributes with attstorage 'm'
* 4: Store attributes with attstorage 'm' external
* ----------
*/

Not sure what to call this new option? MAINU?

Objections?

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#1)
Re: MAIN, Uncompressed?

Simon Riggs <simon@2ndquadrant.com> writes:

It looks like we need a new Column Storage option for MAIN, Uncompressed.
We have these Column Storage options
Plain - inline only, uncompressed
Main - inline until external as last resort, compressible
External - external, uncompressed
Extended - external, compressible

So there is no option for Main, but not compressible...

Doesn't Plain serve the purpose?

In point of fact, though, "never inline and never compress" is not really
a useful option, as it can be more easily read as "fail on wide values".

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Simon Riggs
simon@2ndquadrant.com
In reply to: Tom Lane (#2)
Re: MAIN, Uncompressed?

On 25 August 2017 at 12:24, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Simon Riggs <simon@2ndquadrant.com> writes:

It looks like we need a new Column Storage option for MAIN, Uncompressed.
We have these Column Storage options
Plain - inline only, uncompressed
Main - inline until external as last resort, compressible
External - external, uncompressed
Extended - external, compressible

So there is no option for Main, but not compressible...

Doesn't Plain serve the purpose?

No, because that just gives an error if you try to insert a large column value.

In point of fact, though, "never inline and never compress" is not really
a useful option, as it can be more easily read as "fail on wide values".

Agreed, but that is not what I am proposing.

Main is roughly what is wanted, yet it always tries to compress. If
you already know that won't be useful it should be possible to turn
compression off.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#3)
Re: MAIN, Uncompressed?

Simon Riggs <simon@2ndquadrant.com> writes:

Main is roughly what is wanted, yet it always tries to compress. If
you already know that won't be useful it should be possible to turn
compression off.

If you know compression isn't useful, but you don't want to fail on
wide values, then "external" should serve the purpose.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Simon Riggs
simon@2ndquadrant.com
In reply to: Tom Lane (#4)
Re: MAIN, Uncompressed?

On 25 August 2017 at 13:21, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Simon Riggs <simon@2ndquadrant.com> writes:

Main is roughly what is wanted, yet it always tries to compress. If
you already know that won't be useful it should be possible to turn
compression off.

If you know compression isn't useful, but you don't want to fail on
wide values, then "external" should serve the purpose.

Well, almost. External toasts at 2048-ish bytes whereas Main toasts at
8160 bytes.

The rows are typically near 4kB long, so if marked External they would
always be toasted.

It's desirable to have the full row in the heap block, rather than
have to access heap-toastindex-toastblocks in all cases.

The data is also incompressible, so Main just wastes time on insert.

Hence, we have a missing option.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#5)
Re: MAIN, Uncompressed?

Simon Riggs <simon@2ndquadrant.com> writes:

On 25 August 2017 at 13:21, Tom Lane <tgl@sss.pgh.pa.us> wrote:

If you know compression isn't useful, but you don't want to fail on
wide values, then "external" should serve the purpose.

Well, almost. External toasts at 2048-ish bytes whereas Main toasts at
8160 bytes.
The rows are typically near 4kB long, so if marked External they would
always be toasted.
It's desirable to have the full row in the heap block, rather than
have to access heap-toastindex-toastblocks in all cases.
The data is also incompressible, so Main just wastes time on insert.
Hence, we have a missing option.

Maybe, but the use case seems mighty narrow.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Simon Riggs
simon@2ndquadrant.com
In reply to: Tom Lane (#6)
Re: MAIN, Uncompressed?

On 25 August 2017 at 14:08, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Simon Riggs <simon@2ndquadrant.com> writes:

On 25 August 2017 at 13:21, Tom Lane <tgl@sss.pgh.pa.us> wrote:

If you know compression isn't useful, but you don't want to fail on
wide values, then "external" should serve the purpose.

Well, almost. External toasts at 2048-ish bytes whereas Main toasts at
8160 bytes.
The rows are typically near 4kB long, so if marked External they would
always be toasted.
It's desirable to have the full row in the heap block, rather than
have to access heap-toastindex-toastblocks in all cases.
The data is also incompressible, so Main just wastes time on insert.
Hence, we have a missing option.

Maybe, but the use case seems mighty narrow.

JSON blobs between 2kB and 8160 bytes are very common.

String length is maybe a poisson distribution, definitely not uniform.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#8Greg Stark
stark@mit.edu
In reply to: Simon Riggs (#7)
Re: MAIN, Uncompressed?

On 25 August 2017 at 19:59, Simon Riggs <simon@2ndquadrant.com> wrote:

On 25 August 2017 at 14:08, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Maybe, but the use case seems mighty narrow.

JSON blobs between 2kB and 8160 bytes are very common.

String length is maybe a poisson distribution, definitely not uniform.

But JSON blobs should be highly compressible. Even jsonb will be quite
compressible.

That said I always found remembering the mapping from these names to
various behaviours to be quite hard to use. I would have found it far
more useful to have two separate properties I could set "compress" and
"external" or perhaps even more useful would be to set some kind of
guideline size threshold for each (and perhaps a second size compress
threshold and external threshold for the whole tuple).

I think this is a particularly old piece of code and we're lucky the
default heuristics have served well for all this time because I doubt
many people fiddle with these storage attributes. The time may have
come to come up with a better UI for the storage attributes because
people are doing new things (like json) and wanting more control over
this heuristic.

For what it's worth I think a good start would be to give people more
visibility into what the tuptoaster heuristic is actually doing to
their data and that will encourage people to give feedback about when
they're surprised and are frustrated by the existing UI.

--
greg

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Greg Stark (#8)
Re: MAIN, Uncompressed?

Greg Stark <stark@mit.edu> writes:

I think this is a particularly old piece of code and we're lucky the
default heuristics have served well for all this time because I doubt
many people fiddle with these storage attributes. The time may have
come to come up with a better UI for the storage attributes because
people are doing new things (like json) and wanting more control over
this heuristic.

Yeah, I could get behind a basic rethinking of the tuptoaster control
knobs. I'm just not in love with Simon's specific proposal, especially
not if we can't think of a better name for it than "MAINU".

For what it's worth I think a good start would be to give people more
visibility into what the tuptoaster heuristic is actually doing to
their data and that will encourage people to give feedback about when
they're surprised and are frustrated by the existing UI.

Hm, what might that look like exactly? More pgstattuple functionality
perhaps?

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Simon Riggs
simon@2ndquadrant.com
In reply to: Tom Lane (#9)
Re: MAIN, Uncompressed?

On 25 August 2017 at 20:53, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Greg Stark <stark@mit.edu> writes:

I think this is a particularly old piece of code and we're lucky the
default heuristics have served well for all this time because I doubt
many people fiddle with these storage attributes. The time may have
come to come up with a better UI for the storage attributes because
people are doing new things (like json) and wanting more control over
this heuristic.

Yeah, I could get behind a basic rethinking of the tuptoaster control
knobs. I'm just not in love with Simon's specific proposal, especially
not if we can't think of a better name for it than "MAINU".

Extended/External would be just fine if you could set the toast
target, so I think a better suggestion would be to make "toast_target"
a per-attribute option .

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11Mark Kirkwood
mark.kirkwood@catalyst.net.nz
In reply to: Simon Riggs (#10)
Re: MAIN, Uncompressed?

On 26/08/17 12:18, Simon Riggs wrote:

On 25 August 2017 at 20:53, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Greg Stark <stark@mit.edu> writes:

I think this is a particularly old piece of code and we're lucky the
default heuristics have served well for all this time because I doubt
many people fiddle with these storage attributes. The time may have
come to come up with a better UI for the storage attributes because
people are doing new things (like json) and wanting more control over
this heuristic.

Yeah, I could get behind a basic rethinking of the tuptoaster control
knobs. I'm just not in love with Simon's specific proposal, especially
not if we can't think of a better name for it than "MAINU".

Extended/External would be just fine if you could set the toast
target, so I think a better suggestion would be to make "toast_target"
a per-attribute option .

+1, have thought about this myself previously....thank you for bringing
it up!

regards

Mark

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12Simon Riggs
simon@2ndquadrant.com
In reply to: Mark Kirkwood (#11)
1 attachment(s)
Re: MAIN, Uncompressed?

On 26 August 2017 at 05:40, Mark Kirkwood <mark.kirkwood@catalyst.net.nz> wrote:

On 26/08/17 12:18, Simon Riggs wrote:

On 25 August 2017 at 20:53, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Greg Stark <stark@mit.edu> writes:

I think this is a particularly old piece of code and we're lucky the
default heuristics have served well for all this time because I doubt
many people fiddle with these storage attributes. The time may have
come to come up with a better UI for the storage attributes because
people are doing new things (like json) and wanting more control over
this heuristic.

Yeah, I could get behind a basic rethinking of the tuptoaster control
knobs. I'm just not in love with Simon's specific proposal, especially
not if we can't think of a better name for it than "MAINU".

Extended/External would be just fine if you could set the toast
target, so I think a better suggestion would be to make "toast_target"
a per-attribute option .

+1, have thought about this myself previously....thank you for bringing it
up!

OK, so table-level option for "toast_tuple_target", not attribute-level option

The attached patch and test shows this concept is useful and doesn't
affect existing data.

For 4x 4000 byte rows:
* by default we use 1 heap block and 3 toast blocks
* toast_tuple_target=4080 uses 2 heap blocks and 0 toast blocks

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

toast_tuple_target.v1.patchapplication/octet-stream; name=toast_tuple_target.v1.patchDownload
diff --git a/doc/src/sgml/ref/alter_table.sgml b/doc/src/sgml/ref/alter_table.sgml
index 69600321e6..4128c3358c 100644
--- a/doc/src/sgml/ref/alter_table.sgml
+++ b/doc/src/sgml/ref/alter_table.sgml
@@ -618,8 +618,8 @@ ALTER TABLE [ IF EXISTS ] <replaceable class="PARAMETER">name</replaceable>
 
      <para>
       <literal>SHARE UPDATE EXCLUSIVE</literal> lock will be taken for
-      fillfactor and autovacuum storage parameters, as well as the
-      following planner related parameters:
+      fillfactor, toast and autovacuum storage parameters,
+      as well as the following planner related parameters:
       effective_io_concurrency, parallel_workers, seq_page_cost
       random_page_cost, n_distinct and n_distinct_inherited.
      </para>
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index e9c2c49533..ad0c02c651 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1146,6 +1146,26 @@ FROM ( { <replaceable class="PARAMETER">numeric_literal</replaceable> | <replace
    </varlistentry>
 
    <varlistentry>
+    <term><literal>toast_tuple_target</> (<type>integer</>)</term>
+    <listitem>
+     <para>
+      The toast_tuple_target specifies the threshold of tuple length at which
+      we try to move long column values into TOAST tables, and is also the
+      target length we try to reduce the length below once toasting begins.
+      This affects columns marked as either External or Extended only
+      and applies only to new tuples - there is no effect on existing rows.
+      By default this parameter is set to allow at least 4 tuples per block,
+      which with default blocksize will be 2040 bytes. Valid values are
+      between 128 bytes and the (blocksize - header), by default 8160 bytes.
+      Changing this value may not be useful for very short or very long rows.
+      Note that the default setting is useful, so it is possible that setting
+      this parameter could have negative effects in some cases.
+      This parameter cannot be set for TOAST tables.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><literal>parallel_workers</> (<type>integer</>)</term>
     <listitem>
      <para>
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index ec10762529..ea1c8ccc3a 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -23,6 +23,7 @@
 #include "access/nbtree.h"
 #include "access/reloptions.h"
 #include "access/spgist.h"
+#include "access/tuptoaster.h"
 #include "catalog/pg_type.h"
 #include "commands/defrem.h"
 #include "commands/tablespace.h"
@@ -292,6 +293,15 @@ static relopt_int intRelOpts[] =
 	},
 	{
 		{
+			"toast_tuple_target",
+			"Sets the target tuple length at which external columns will be toasted",
+			RELOPT_KIND_HEAP,
+			ShareUpdateExclusiveLock
+		},
+		TOAST_TUPLE_TARGET, 128, TOAST_TUPLE_TARGET_MAIN
+	},
+	{
+		{
 			"pages_per_range",
 			"Number of pages that each page range covers in a BRIN index",
 			RELOPT_KIND_BRIN,
@@ -1344,6 +1354,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
 		offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, multixact_freeze_table_age)},
 		{"log_autovacuum_min_duration", RELOPT_TYPE_INT,
 		offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, log_min_duration)},
+		{"toast_tuple_target", RELOPT_TYPE_INT,
+		offsetof(StdRdOptions, toast_tuple_target)},
 		{"autovacuum_vacuum_scale_factor", RELOPT_TYPE_REAL,
 		offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_scale_factor)},
 		{"autovacuum_analyze_scale_factor", RELOPT_TYPE_REAL,
diff --git a/src/backend/access/heap/tuptoaster.c b/src/backend/access/heap/tuptoaster.c
index 458180bc95..a75e764a15 100644
--- a/src/backend/access/heap/tuptoaster.c
+++ b/src/backend/access/heap/tuptoaster.c
@@ -730,7 +730,7 @@ toast_insert_or_update(Relation rel, HeapTuple newtup, HeapTuple oldtup,
 		hoff += sizeof(Oid);
 	hoff = MAXALIGN(hoff);
 	/* now convert to a limit on the tuple data size */
-	maxDataLen = TOAST_TUPLE_TARGET - hoff;
+	maxDataLen = RelationGetToastTupleTarget(rel, TOAST_TUPLE_TARGET) - hoff;
 
 	/*
 	 * Look for attributes with attstorage 'x' to compress.  Also find large
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 4bc61e5380..68fd6fbd54 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -277,6 +277,7 @@ typedef struct StdRdOptions
 {
 	int32		vl_len_;		/* varlena header (do not touch directly!) */
 	int			fillfactor;		/* page fill factor in percent (0..100) */
+	int			toast_tuple_target; /* target for tuple toasting */
 	AutoVacOpts autovacuum;		/* autovacuum-related options */
 	bool		user_catalog_table; /* use as an additional catalog relation */
 	int			parallel_workers;	/* max number of parallel workers */
@@ -286,6 +287,14 @@ typedef struct StdRdOptions
 #define HEAP_DEFAULT_FILLFACTOR		100
 
 /*
+ * RelationGetToastTupleTarget
+ *		Returns the relation's toast_tuple_target.  Note multiple eval of argument!
+ */
+#define RelationGetToastTupleTarget(relation, defaulttarg) \
+	((relation)->rd_options ? \
+	 ((StdRdOptions *) (relation)->rd_options)->toast_tuple_target : (defaulttarg))
+
+/*
  * RelationGetFillFactor
  *		Returns the relation's fillfactor.  Note multiple eval of argument!
  */
diff --git a/src/test/regress/expected/strings.out b/src/test/regress/expected/strings.out
index 35cadb24aa..3a42ef77be 100644
--- a/src/test/regress/expected/strings.out
+++ b/src/test/regress/expected/strings.out
@@ -1166,6 +1166,53 @@ SELECT substr(f1, 99995, 10) from toasttest;
  567890
 (4 rows)
 
+TRUNCATE TABLE toasttest;
+INSERT INTO toasttest values (repeat('1234567890',400));
+INSERT INTO toasttest values (repeat('1234567890',400));
+INSERT INTO toasttest values (repeat('1234567890',400));
+INSERT INTO toasttest values (repeat('1234567890',400));
+SELECT pg_relation_size('toasttest')/current_setting('block_size')::integer as blocks;
+ blocks 
+--------
+      1
+(1 row)
+
+select pg_relation_size('pg_toast.pg_toast_'||(select oid from pg_class where relname = 'toasttest'))/current_setting('block_size')::integer as blocks;
+ blocks 
+--------
+      3
+(1 row)
+
+SELECT pg_total_relation_size('toasttest')/current_setting('block_size')::integer as blocks;
+ blocks 
+--------
+      9
+(1 row)
+
+TRUNCATE TABLE toasttest;
+ALTER TABLE toasttest set (toast_tuple_target = 4080);
+INSERT INTO toasttest values (repeat('1234567890',400));
+INSERT INTO toasttest values (repeat('1234567890',400));
+INSERT INTO toasttest values (repeat('1234567890',400));
+INSERT INTO toasttest values (repeat('1234567890',400));
+SELECT pg_relation_size('toasttest')/current_setting('block_size')::integer as blocks;
+ blocks 
+--------
+      2
+(1 row)
+
+select pg_relation_size('pg_toast.pg_toast_'||(select oid from pg_class where relname = 'toasttest'))/current_setting('block_size')::integer as blocks;
+ blocks 
+--------
+      0
+(1 row)
+
+SELECT pg_total_relation_size('toasttest')/current_setting('block_size')::integer as blocks;
+ blocks 
+--------
+      6
+(1 row)
+
 DROP TABLE toasttest;
 --
 -- test substr with toasted bytea values
diff --git a/src/test/regress/sql/strings.sql b/src/test/regress/sql/strings.sql
index f9cfaeb44a..6396693f27 100644
--- a/src/test/regress/sql/strings.sql
+++ b/src/test/regress/sql/strings.sql
@@ -366,6 +366,25 @@ SELECT substr(f1, 99995) from toasttest;
 -- string length
 SELECT substr(f1, 99995, 10) from toasttest;
 
+TRUNCATE TABLE toasttest;
+INSERT INTO toasttest values (repeat('1234567890',400));
+INSERT INTO toasttest values (repeat('1234567890',400));
+INSERT INTO toasttest values (repeat('1234567890',400));
+INSERT INTO toasttest values (repeat('1234567890',400));
+SELECT pg_relation_size('toasttest')/current_setting('block_size')::integer as blocks;
+select pg_relation_size('pg_toast.pg_toast_'||(select oid from pg_class where relname = 'toasttest'))/current_setting('block_size')::integer as blocks;
+SELECT pg_total_relation_size('toasttest')/current_setting('block_size')::integer as blocks;
+
+TRUNCATE TABLE toasttest;
+ALTER TABLE toasttest set (toast_tuple_target = 4080);
+INSERT INTO toasttest values (repeat('1234567890',400));
+INSERT INTO toasttest values (repeat('1234567890',400));
+INSERT INTO toasttest values (repeat('1234567890',400));
+INSERT INTO toasttest values (repeat('1234567890',400));
+SELECT pg_relation_size('toasttest')/current_setting('block_size')::integer as blocks;
+select pg_relation_size('pg_toast.pg_toast_'||(select oid from pg_class where relname = 'toasttest'))/current_setting('block_size')::integer as blocks;
+SELECT pg_total_relation_size('toasttest')/current_setting('block_size')::integer as blocks;
+
 DROP TABLE toasttest;
 
 --
#13Simon Riggs
simon@2ndquadrant.com
In reply to: Simon Riggs (#12)
1 attachment(s)
Re: MAIN, Uncompressed?

On 29 August 2017 at 07:58, Simon Riggs <simon@2ndquadrant.com> wrote:

On 26 August 2017 at 05:40, Mark Kirkwood <mark.kirkwood@catalyst.net.nz> wrote:

On 26/08/17 12:18, Simon Riggs wrote:

On 25 August 2017 at 20:53, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Greg Stark <stark@mit.edu> writes:

I think this is a particularly old piece of code and we're lucky the
default heuristics have served well for all this time because I doubt
many people fiddle with these storage attributes. The time may have
come to come up with a better UI for the storage attributes because
people are doing new things (like json) and wanting more control over
this heuristic.

Yeah, I could get behind a basic rethinking of the tuptoaster control
knobs. I'm just not in love with Simon's specific proposal, especially
not if we can't think of a better name for it than "MAINU".

Extended/External would be just fine if you could set the toast
target, so I think a better suggestion would be to make "toast_target"
a per-attribute option .

+1, have thought about this myself previously....thank you for bringing it
up!

OK, so table-level option for "toast_tuple_target", not attribute-level option

The attached patch and test shows this concept is useful and doesn't
affect existing data.

For 4x 4000 byte rows:
* by default we use 1 heap block and 3 toast blocks
* toast_tuple_target=4080 uses 2 heap blocks and 0 toast blocks

New patch, v2, since one line in the docs failed to apply because of
recent changes.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

toast_tuple_target.v2.patchapplication/octet-stream; name=toast_tuple_target.v2.patchDownload
diff --git a/doc/src/sgml/ref/alter_table.sgml b/doc/src/sgml/ref/alter_table.sgml
index 0fb385ece7..208cb28214 100644
--- a/doc/src/sgml/ref/alter_table.sgml
+++ b/doc/src/sgml/ref/alter_table.sgml
@@ -618,7 +618,7 @@ ALTER TABLE [ IF EXISTS ] <replaceable class="PARAMETER">name</replaceable>
 
      <para>
       <literal>SHARE UPDATE EXCLUSIVE</literal> lock will be taken for
-      fillfactor and autovacuum storage parameters, as well as the
+      fillfactor, toast and autovacuum storage parameters, as well as the
       following planner related parameters:
       <varname>effective_io_concurrency</>, <varname>parallel_workers</>, <varname>seq_page_cost</>,
       <varname>random_page_cost</>, <varname>n_distinct</> and <varname>n_distinct_inherited</>.
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 824253de40..439bc8171a 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1168,6 +1168,26 @@ FROM ( { <replaceable class="PARAMETER">numeric_literal</replaceable> | <replace
    </varlistentry>
 
    <varlistentry>
+    <term><literal>toast_tuple_target</> (<type>integer</>)</term>
+    <listitem>
+     <para>
+      The toast_tuple_target specifies the threshold of tuple length at which
+      we try to move long column values into TOAST tables, and is also the
+      target length we try to reduce the length below once toasting begins.
+      This affects columns marked as either External or Extended only
+      and applies only to new tuples - there is no effect on existing rows.
+      By default this parameter is set to allow at least 4 tuples per block,
+      which with default blocksize will be 2040 bytes. Valid values are
+      between 128 bytes and the (blocksize - header), by default 8160 bytes.
+      Changing this value may not be useful for very short or very long rows.
+      Note that the default setting is useful, so it is possible that setting
+      this parameter could have negative effects in some cases.
+      This parameter cannot be set for TOAST tables.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><literal>parallel_workers</> (<type>integer</>)</term>
     <listitem>
      <para>
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index ec10762529..ea1c8ccc3a 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -23,6 +23,7 @@
 #include "access/nbtree.h"
 #include "access/reloptions.h"
 #include "access/spgist.h"
+#include "access/tuptoaster.h"
 #include "catalog/pg_type.h"
 #include "commands/defrem.h"
 #include "commands/tablespace.h"
@@ -292,6 +293,15 @@ static relopt_int intRelOpts[] =
 	},
 	{
 		{
+			"toast_tuple_target",
+			"Sets the target tuple length at which external columns will be toasted",
+			RELOPT_KIND_HEAP,
+			ShareUpdateExclusiveLock
+		},
+		TOAST_TUPLE_TARGET, 128, TOAST_TUPLE_TARGET_MAIN
+	},
+	{
+		{
 			"pages_per_range",
 			"Number of pages that each page range covers in a BRIN index",
 			RELOPT_KIND_BRIN,
@@ -1344,6 +1354,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
 		offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, multixact_freeze_table_age)},
 		{"log_autovacuum_min_duration", RELOPT_TYPE_INT,
 		offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, log_min_duration)},
+		{"toast_tuple_target", RELOPT_TYPE_INT,
+		offsetof(StdRdOptions, toast_tuple_target)},
 		{"autovacuum_vacuum_scale_factor", RELOPT_TYPE_REAL,
 		offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_scale_factor)},
 		{"autovacuum_analyze_scale_factor", RELOPT_TYPE_REAL,
diff --git a/src/backend/access/heap/tuptoaster.c b/src/backend/access/heap/tuptoaster.c
index 5a8f1dab83..c74945a52a 100644
--- a/src/backend/access/heap/tuptoaster.c
+++ b/src/backend/access/heap/tuptoaster.c
@@ -727,7 +727,7 @@ toast_insert_or_update(Relation rel, HeapTuple newtup, HeapTuple oldtup,
 		hoff += sizeof(Oid);
 	hoff = MAXALIGN(hoff);
 	/* now convert to a limit on the tuple data size */
-	maxDataLen = TOAST_TUPLE_TARGET - hoff;
+	maxDataLen = RelationGetToastTupleTarget(rel, TOAST_TUPLE_TARGET) - hoff;
 
 	/*
 	 * Look for attributes with attstorage 'x' to compress.  Also find large
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 4bc61e5380..68fd6fbd54 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -277,6 +277,7 @@ typedef struct StdRdOptions
 {
 	int32		vl_len_;		/* varlena header (do not touch directly!) */
 	int			fillfactor;		/* page fill factor in percent (0..100) */
+	int			toast_tuple_target; /* target for tuple toasting */
 	AutoVacOpts autovacuum;		/* autovacuum-related options */
 	bool		user_catalog_table; /* use as an additional catalog relation */
 	int			parallel_workers;	/* max number of parallel workers */
@@ -286,6 +287,14 @@ typedef struct StdRdOptions
 #define HEAP_DEFAULT_FILLFACTOR		100
 
 /*
+ * RelationGetToastTupleTarget
+ *		Returns the relation's toast_tuple_target.  Note multiple eval of argument!
+ */
+#define RelationGetToastTupleTarget(relation, defaulttarg) \
+	((relation)->rd_options ? \
+	 ((StdRdOptions *) (relation)->rd_options)->toast_tuple_target : (defaulttarg))
+
+/*
  * RelationGetFillFactor
  *		Returns the relation's fillfactor.  Note multiple eval of argument!
  */
diff --git a/src/test/regress/expected/strings.out b/src/test/regress/expected/strings.out
index 35cadb24aa..3a42ef77be 100644
--- a/src/test/regress/expected/strings.out
+++ b/src/test/regress/expected/strings.out
@@ -1166,6 +1166,53 @@ SELECT substr(f1, 99995, 10) from toasttest;
  567890
 (4 rows)
 
+TRUNCATE TABLE toasttest;
+INSERT INTO toasttest values (repeat('1234567890',400));
+INSERT INTO toasttest values (repeat('1234567890',400));
+INSERT INTO toasttest values (repeat('1234567890',400));
+INSERT INTO toasttest values (repeat('1234567890',400));
+SELECT pg_relation_size('toasttest')/current_setting('block_size')::integer as blocks;
+ blocks 
+--------
+      1
+(1 row)
+
+select pg_relation_size('pg_toast.pg_toast_'||(select oid from pg_class where relname = 'toasttest'))/current_setting('block_size')::integer as blocks;
+ blocks 
+--------
+      3
+(1 row)
+
+SELECT pg_total_relation_size('toasttest')/current_setting('block_size')::integer as blocks;
+ blocks 
+--------
+      9
+(1 row)
+
+TRUNCATE TABLE toasttest;
+ALTER TABLE toasttest set (toast_tuple_target = 4080);
+INSERT INTO toasttest values (repeat('1234567890',400));
+INSERT INTO toasttest values (repeat('1234567890',400));
+INSERT INTO toasttest values (repeat('1234567890',400));
+INSERT INTO toasttest values (repeat('1234567890',400));
+SELECT pg_relation_size('toasttest')/current_setting('block_size')::integer as blocks;
+ blocks 
+--------
+      2
+(1 row)
+
+select pg_relation_size('pg_toast.pg_toast_'||(select oid from pg_class where relname = 'toasttest'))/current_setting('block_size')::integer as blocks;
+ blocks 
+--------
+      0
+(1 row)
+
+SELECT pg_total_relation_size('toasttest')/current_setting('block_size')::integer as blocks;
+ blocks 
+--------
+      6
+(1 row)
+
 DROP TABLE toasttest;
 --
 -- test substr with toasted bytea values
diff --git a/src/test/regress/sql/strings.sql b/src/test/regress/sql/strings.sql
index f9cfaeb44a..6396693f27 100644
--- a/src/test/regress/sql/strings.sql
+++ b/src/test/regress/sql/strings.sql
@@ -366,6 +366,25 @@ SELECT substr(f1, 99995) from toasttest;
 -- string length
 SELECT substr(f1, 99995, 10) from toasttest;
 
+TRUNCATE TABLE toasttest;
+INSERT INTO toasttest values (repeat('1234567890',400));
+INSERT INTO toasttest values (repeat('1234567890',400));
+INSERT INTO toasttest values (repeat('1234567890',400));
+INSERT INTO toasttest values (repeat('1234567890',400));
+SELECT pg_relation_size('toasttest')/current_setting('block_size')::integer as blocks;
+select pg_relation_size('pg_toast.pg_toast_'||(select oid from pg_class where relname = 'toasttest'))/current_setting('block_size')::integer as blocks;
+SELECT pg_total_relation_size('toasttest')/current_setting('block_size')::integer as blocks;
+
+TRUNCATE TABLE toasttest;
+ALTER TABLE toasttest set (toast_tuple_target = 4080);
+INSERT INTO toasttest values (repeat('1234567890',400));
+INSERT INTO toasttest values (repeat('1234567890',400));
+INSERT INTO toasttest values (repeat('1234567890',400));
+INSERT INTO toasttest values (repeat('1234567890',400));
+SELECT pg_relation_size('toasttest')/current_setting('block_size')::integer as blocks;
+select pg_relation_size('pg_toast.pg_toast_'||(select oid from pg_class where relname = 'toasttest'))/current_setting('block_size')::integer as blocks;
+SELECT pg_total_relation_size('toasttest')/current_setting('block_size')::integer as blocks;
+
 DROP TABLE toasttest;
 
 --
#14Andrew Dunstan
andrew.dunstan@2ndquadrant.com
In reply to: Simon Riggs (#13)
1 attachment(s)
Re: [HACKERS] MAIN, Uncompressed?

On 09/12/2017 12:11 PM, Simon Riggs wrote:

OK, so table-level option for "toast_tuple_target", not attribute-level option

The attached patch and test shows this concept is useful and doesn't
affect existing data.

For 4x 4000 byte rows:
* by default we use 1 heap block and 3 toast blocks
* toast_tuple_target=4080 uses 2 heap blocks and 0 toast blocks

New patch, v2, since one line in the docs failed to apply because of
recent changes.

This has bitrotted ever so slightly. I fixed that and took the
opportunity to edit the docs text slightly to improve the clarity a bit.
Revised patch attached - adjust to your taste.

The code is extremely simple and small and straightforward, and the
tests are good.

Marking as Ready for Committer,

cheers

andrew

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

toast_tuple_target.v3.patchtext/x-patch; name=toast_tuple_target.v3.patchDownload
diff --git a/doc/src/sgml/ref/alter_table.sgml b/doc/src/sgml/ref/alter_table.sgml
index 3b19ea7..92db00f 100644
--- a/doc/src/sgml/ref/alter_table.sgml
+++ b/doc/src/sgml/ref/alter_table.sgml
@@ -629,7 +629,7 @@ ALTER TABLE [ IF EXISTS ] <replaceable class="parameter">name</replaceable>
 
      <para>
       <literal>SHARE UPDATE EXCLUSIVE</literal> lock will be taken for
-      fillfactor and autovacuum storage parameters, as well as the
+      fillfactor, toast and autovacuum storage parameters, as well as the
       following planner related parameters:
       <varname>effective_io_concurrency</varname>, <varname>parallel_workers</varname>, <varname>seq_page_cost</varname>,
       <varname>random_page_cost</varname>, <varname>n_distinct</varname> and <varname>n_distinct_inherited</varname>.
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index bbb3a51..83eef7f 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1201,6 +1201,27 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
    </varlistentry>
 
    <varlistentry>
+    <term><literal>toast_tuple_target</> (<type>integer</>)</term>
+    <listitem>
+     <para>
+      The toast_tuple_target specifies the minimum tuple length required before
+      we try to move long column values into TOAST tables, and is also the
+      target length we try to reduce the length below once toasting begins.
+      This only affects columns marked as either External or Extended
+      and applies only to new tuples - there is no effect on existing rows.
+      By default this parameter is set to allow at least 4 tuples per block,
+      which with the default blocksize will be 2040 bytes. Valid values are
+      between 128 bytes and the (blocksize - header), by default 8160 bytes.
+      Changing this value may not be useful for very short or very long rows.
+      Note that the default setting is often close to optimal, and
+      it is possible that setting this parameter could have negative
+      effects in some cases.
+      This parameter cannot be set for TOAST tables.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><literal>parallel_workers</literal> (<type>integer</type>)</term>
     <listitem>
      <para>
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 3d0ce9a..aa9c0f1 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -23,6 +23,7 @@
 #include "access/nbtree.h"
 #include "access/reloptions.h"
 #include "access/spgist.h"
+#include "access/tuptoaster.h"
 #include "catalog/pg_type.h"
 #include "commands/defrem.h"
 #include "commands/tablespace.h"
@@ -292,6 +293,15 @@ static relopt_int intRelOpts[] =
 	},
 	{
 		{
+			"toast_tuple_target",
+			"Sets the target tuple length at which external columns will be toasted",
+			RELOPT_KIND_HEAP,
+			ShareUpdateExclusiveLock
+		},
+		TOAST_TUPLE_TARGET, 128, TOAST_TUPLE_TARGET_MAIN
+	},
+	{
+		{
 			"pages_per_range",
 			"Number of pages that each page range covers in a BRIN index",
 			RELOPT_KIND_BRIN,
@@ -1344,6 +1354,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
 		offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, multixact_freeze_table_age)},
 		{"log_autovacuum_min_duration", RELOPT_TYPE_INT,
 		offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, log_min_duration)},
+		{"toast_tuple_target", RELOPT_TYPE_INT,
+		offsetof(StdRdOptions, toast_tuple_target)},
 		{"autovacuum_vacuum_scale_factor", RELOPT_TYPE_REAL,
 		offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_scale_factor)},
 		{"autovacuum_analyze_scale_factor", RELOPT_TYPE_REAL,
diff --git a/src/backend/access/heap/tuptoaster.c b/src/backend/access/heap/tuptoaster.c
index 5a8f1da..c74945a 100644
--- a/src/backend/access/heap/tuptoaster.c
+++ b/src/backend/access/heap/tuptoaster.c
@@ -727,7 +727,7 @@ toast_insert_or_update(Relation rel, HeapTuple newtup, HeapTuple oldtup,
 		hoff += sizeof(Oid);
 	hoff = MAXALIGN(hoff);
 	/* now convert to a limit on the tuple data size */
-	maxDataLen = TOAST_TUPLE_TARGET - hoff;
+	maxDataLen = RelationGetToastTupleTarget(rel, TOAST_TUPLE_TARGET) - hoff;
 
 	/*
 	 * Look for attributes with attstorage 'x' to compress.  Also find large
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 4bc61e5..68fd6fb 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -277,6 +277,7 @@ typedef struct StdRdOptions
 {
 	int32		vl_len_;		/* varlena header (do not touch directly!) */
 	int			fillfactor;		/* page fill factor in percent (0..100) */
+	int			toast_tuple_target; /* target for tuple toasting */
 	AutoVacOpts autovacuum;		/* autovacuum-related options */
 	bool		user_catalog_table; /* use as an additional catalog relation */
 	int			parallel_workers;	/* max number of parallel workers */
@@ -286,6 +287,14 @@ typedef struct StdRdOptions
 #define HEAP_DEFAULT_FILLFACTOR		100
 
 /*
+ * RelationGetToastTupleTarget
+ *		Returns the relation's toast_tuple_target.  Note multiple eval of argument!
+ */
+#define RelationGetToastTupleTarget(relation, defaulttarg) \
+	((relation)->rd_options ? \
+	 ((StdRdOptions *) (relation)->rd_options)->toast_tuple_target : (defaulttarg))
+
+/*
  * RelationGetFillFactor
  *		Returns the relation's fillfactor.  Note multiple eval of argument!
  */
diff --git a/src/test/regress/expected/strings.out b/src/test/regress/expected/strings.out
index 35cadb2..3a42ef7 100644
--- a/src/test/regress/expected/strings.out
+++ b/src/test/regress/expected/strings.out
@@ -1166,6 +1166,53 @@ SELECT substr(f1, 99995, 10) from toasttest;
  567890
 (4 rows)
 
+TRUNCATE TABLE toasttest;
+INSERT INTO toasttest values (repeat('1234567890',400));
+INSERT INTO toasttest values (repeat('1234567890',400));
+INSERT INTO toasttest values (repeat('1234567890',400));
+INSERT INTO toasttest values (repeat('1234567890',400));
+SELECT pg_relation_size('toasttest')/current_setting('block_size')::integer as blocks;
+ blocks 
+--------
+      1
+(1 row)
+
+select pg_relation_size('pg_toast.pg_toast_'||(select oid from pg_class where relname = 'toasttest'))/current_setting('block_size')::integer as blocks;
+ blocks 
+--------
+      3
+(1 row)
+
+SELECT pg_total_relation_size('toasttest')/current_setting('block_size')::integer as blocks;
+ blocks 
+--------
+      9
+(1 row)
+
+TRUNCATE TABLE toasttest;
+ALTER TABLE toasttest set (toast_tuple_target = 4080);
+INSERT INTO toasttest values (repeat('1234567890',400));
+INSERT INTO toasttest values (repeat('1234567890',400));
+INSERT INTO toasttest values (repeat('1234567890',400));
+INSERT INTO toasttest values (repeat('1234567890',400));
+SELECT pg_relation_size('toasttest')/current_setting('block_size')::integer as blocks;
+ blocks 
+--------
+      2
+(1 row)
+
+select pg_relation_size('pg_toast.pg_toast_'||(select oid from pg_class where relname = 'toasttest'))/current_setting('block_size')::integer as blocks;
+ blocks 
+--------
+      0
+(1 row)
+
+SELECT pg_total_relation_size('toasttest')/current_setting('block_size')::integer as blocks;
+ blocks 
+--------
+      6
+(1 row)
+
 DROP TABLE toasttest;
 --
 -- test substr with toasted bytea values
diff --git a/src/test/regress/sql/strings.sql b/src/test/regress/sql/strings.sql
index f9cfaeb..6396693 100644
--- a/src/test/regress/sql/strings.sql
+++ b/src/test/regress/sql/strings.sql
@@ -366,6 +366,25 @@ SELECT substr(f1, 99995) from toasttest;
 -- string length
 SELECT substr(f1, 99995, 10) from toasttest;
 
+TRUNCATE TABLE toasttest;
+INSERT INTO toasttest values (repeat('1234567890',400));
+INSERT INTO toasttest values (repeat('1234567890',400));
+INSERT INTO toasttest values (repeat('1234567890',400));
+INSERT INTO toasttest values (repeat('1234567890',400));
+SELECT pg_relation_size('toasttest')/current_setting('block_size')::integer as blocks;
+select pg_relation_size('pg_toast.pg_toast_'||(select oid from pg_class where relname = 'toasttest'))/current_setting('block_size')::integer as blocks;
+SELECT pg_total_relation_size('toasttest')/current_setting('block_size')::integer as blocks;
+
+TRUNCATE TABLE toasttest;
+ALTER TABLE toasttest set (toast_tuple_target = 4080);
+INSERT INTO toasttest values (repeat('1234567890',400));
+INSERT INTO toasttest values (repeat('1234567890',400));
+INSERT INTO toasttest values (repeat('1234567890',400));
+INSERT INTO toasttest values (repeat('1234567890',400));
+SELECT pg_relation_size('toasttest')/current_setting('block_size')::integer as blocks;
+select pg_relation_size('pg_toast.pg_toast_'||(select oid from pg_class where relname = 'toasttest'))/current_setting('block_size')::integer as blocks;
+SELECT pg_total_relation_size('toasttest')/current_setting('block_size')::integer as blocks;
+
 DROP TABLE toasttest;
 
 --